
"first mass produced AMP architecture"

Nope. Remember the Cell? The processor in the PlayStation 3? One main CPU with 8 little CPUs and no shared memory, just channels.

The PlayStation 4 isn't an AMP machine because programming the Cell was so hard.




> The PlayStation 4 isn't an AMP machine because programming the Cell was so hard.

The PS4 could be an AMP design if you consider the (closely coupled) GPU a processor.

It doesn't require the same gymnastics as the PS3, though, because both the main processor and the GPU are more capable. On the PS3, the SPUs had to perform computation that the anemic PPU could not handle, as well as fill in where the GPU, which predates the unified shader model, was unable to keep pace.

Memory was shared on the PS3, but accessing it from the SPUs required explicit put and fetch operations.
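
To make "explicit put and fetch" concrete, here is a toy model in Python rather than real SPU code. The names `ToySPU`, `dma_get`, `dma_put` and `add_one` are made up for illustration and are not the Cell SDK API. The only things taken from this thread are that an SPU works on its own local store and has to copy data in and out explicitly.

```python
# Toy model only, not Cell SDK code. Every name below is hypothetical.
LOCAL_STORE_BYTES = 256 * 1024  # per-SPE local store size mentioned in this thread


class ToySPU:
    """A core that can only compute on its own private local store."""

    def __init__(self, main_memory: bytearray):
        self.main_memory = main_memory
        self.local_store = bytearray(LOCAL_STORE_BYTES)

    def _check(self, main_addr: int, local_addr: int, size: int) -> None:
        # Reject transfers that would fall outside either memory.
        if size < 0 or main_addr < 0 or local_addr < 0:
            raise ValueError("negative address or size")
        if local_addr + size > len(self.local_store):
            raise ValueError("transfer does not fit in the local store")
        if main_addr + size > len(self.main_memory):
            raise ValueError("transfer runs past the end of main memory")

    def dma_get(self, main_addr: int, local_addr: int, size: int) -> None:
        """Explicit fetch: copy main memory into the local store."""
        self._check(main_addr, local_addr, size)
        self.local_store[local_addr:local_addr + size] = \
            self.main_memory[main_addr:main_addr + size]

    def dma_put(self, local_addr: int, main_addr: int, size: int) -> None:
        """Explicit put: copy the local store back to main memory."""
        self._check(main_addr, local_addr, size)
        self.main_memory[main_addr:main_addr + size] = \
            self.local_store[local_addr:local_addr + size]

    def add_one(self, local_addr: int, size: int) -> None:
        """Stand-in compute kernel; it touches only the local store."""
        for i in range(local_addr, local_addr + size):
            self.local_store[i] = (self.local_store[i] + 1) % 256


main_memory = bytearray([1, 2, 3, 4])
spu = ToySPU(main_memory)
spu.dma_get(main_addr=0, local_addr=0, size=4)  # fetch
spu.add_one(local_addr=0, size=4)               # compute locally
spu.dma_put(local_addr=0, main_addr=0, size=4)  # put results back
```

Main memory changes only when `dma_put` runs, so the kernel never sees or touches it directly. That is the difference from a CPU core that can just dereference a pointer.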


Cell is essentially a distributed-memory cluster on a single chip, because each SPU has its own address space and cannot directly access main memory. I'm not sure what the exact definition of AMP is, but Cell does not quite match my feeling of what AMP should be. In this regard the Wii seems more like an AMP system, with two completely different CPUs (PPC and ARM) sharing what essentially amounts to the same address space (and in the Wii U there are 3 PPC cores, one of which is slightly different from the other two, and cache coherency between them can only be described as broken).

There is no question that programming for Cell is hard, but I think that's mostly about middleware support (probably because the platform is so different from the PC and the Xbox 360).


The real problem with the Cell was that each SPE has only 256K of local memory. It has bulk DMA access to main memory, but that's more like I/O. 256K is too small for a video frame, a game level, or much else in a modern game. So everything has to be done on an assembly-line basis, where data is pumped into an SPE, processed, and pumped out. Great for audio, terrible for everything else. In comparison, the main processor had access to 256MB of RAM.

If they'd had, say, 16MB per processor, it might have worked out. One CPU for collision detection and physics, one for NPC management and AI, etc. But giving each SPE only about 0.1% of the total memory space was too constraining.
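
A quick back-of-the-envelope check of those numbers, as a Python sketch. The 1280x720 frame at 4 bytes per pixel is my own example, not a figure from the hardware docs, and `frame_bytes`, `chunks_needed` and `process_streaming` are hypothetical helper names. The chunk count is a lower bound, since code and working buffers also have to live in the same 256K.

```python
LOCAL_STORE_BYTES = 256 * 1024        # per-SPE local memory (from the comment above)
MAIN_RAM_BYTES = 256 * 1024 * 1024    # main RAM figure (from the comment above)


def frame_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed frame; the pixel format is an assumption."""
    return width * height * bytes_per_pixel


def chunks_needed(total_bytes: int, chunk_bytes: int = LOCAL_STORE_BYTES) -> int:
    """Ceiling division: how many local-store-sized pieces the data splits into."""
    return -(-total_bytes // chunk_bytes)


def process_streaming(data: bytes, chunk_bytes: int, kernel) -> bytes:
    """Assembly-line processing: pump one chunk in, process it, pump it out."""
    out = bytearray()
    for start in range(0, len(data), chunk_bytes):
        chunk = data[start:start + chunk_bytes]  # "pumped in"
        out += kernel(chunk)                     # processed in isolation
    return bytes(out)                            # "pumped out"


frame = frame_bytes(1280, 720)               # 3,686,400 bytes
pieces = chunks_needed(frame)                # 15 chunks at the very least
share = LOCAL_STORE_BYTES / MAIN_RAM_BYTES   # 1/1024, i.e. roughly 0.1%
print(frame, pieces, f"{share:.4%}")
```

The streaming helper also shows why this suits audio better than game logic: the kernel only ever sees one chunk, so anything that needs random access across the whole data set has to be restructured.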



