New 167-processor chip is super-fast, ultra energy-efficient (esciencenews.com)
46 points by Anon84 on April 22, 2009 | 14 comments



URL to the project: http://www.ece.ucdavis.edu/vcl/asap/ [last updated a year ago, does not address this latest silicon]

Hotchips presentation: http://www.hotchips.org/archives/hc18/2_Mon/HC18.S5/HC18.S5T... [also of the previous 36 processor silicon]

Looks like 64 words of instruction memory and 128 words of data memory for each processor. You will want to think "dataflow of simple DSPs" rather than "cluster of Linux".

The power use is amazing: 84mW while 100% active on their 36-processor, 475MHz unit.


Would an electrical engineer please explain why the number of cores is prime?


It is a 13x13 array with 5 chopped off to make space. That goes down to 164, but there are three special purpose processors added back in to get up to 167.
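
The arithmetic above is easy to check, along with the primality the parent asked about. A quick sketch in Python (the helper name `is_prime` is mine, purely for illustration, and the counts are the ones quoted in this comment):

```python
def is_prime(n: int) -> bool:
    """Trial division; plenty for numbers this small."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

grid = 13 * 13        # full 13x13 array of tiles
general = grid - 5    # five tiles chopped off to make space
total = general + 3   # three special-purpose processors added back

print(grid, general, total, is_prime(total))
```

If those counts are right, the total being prime looks like a side effect of the floorplan rather than a design goal.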

If you can open a PowerPoint presentation you can read about the chip at http://www.ece.ucdavis.edu/vcl/pubs/2008.06.symp.vlsi/vlsi_s...

Also worth noting that although they do an 802.11a receiver in 20-30 cores, they use two big chunks of special purpose silicon for a Viterbi decoder and an FFT. I'd be curious to know how many more cores it takes without the special purpose hardware, if it is even possible.


I wonder if this could fly.

Because the other very-many-cores processor design that I know of was by Intellasys (http://www.intellasys.net/), the latest company of Forth's inventor Chuck Moore. And it seems to have gone nowhere: http://www.colorforth.com/S40.htm

The problem with this design, as I see it, is that tiny cores can be too small to do anything practically useful on their own; inter-processor communication and I/O eat cycles. In Intellasys' case I guess the problems were multiplied by the fact that it would be harder to get traction because the chip was designed to be programmed in Forth, and Chuck Moore would settle for nothing less.

Maybe the University of California will have a better chance of success, e.g. if it has more runway to tweak its design.

Does anyone know of other very-many-cores processor chips?


Rigel (https://rigel.crhc.illinois.edu/) is one such project, though the silicon implementation is decidedly less far-along than AsAp. The goal there is 1024+ cores in 45nm. The targeted applications are more like "things that are massively parallel but don't run well on GPUs" than "DSP kernels that decompose into pipelines", but they are both accelerators nonetheless. A more detailed paper on the architecture itself can be found here: https://netfiles.uiuc.edu/jkelm2/www/papers/kelm-isca2009.pd... .

Full Disclosure: I am affiliated with this project. However, I am making no claims about the relative merits of AsAp and Rigel, just pointing out another manycore research project.

Also, you may be interested in commercial chips from Cavium (http://www.cavium.com/OCTEON_MIPS64.html) and Azul Systems (http://www.azulsystems.com/products/compute_appliance.htm). Of course, modern GPUs and Intel's forthcoming Larrabee are manycore chips as well, though GPUs have some restrictions and special-purpose hardware that makes them slightly less general than others.


Not as helpful as I'd want, but yeah, I recall one that was getting some traction as a signal-processing aid for tasks like, e.g., radar or video analysis and possibly DPI.

Totally blanking on the name, it had a pretty similar kit: it seemed like you wrote the raw code in C or a C-like lang, then there was some software assist for laying out dataflow between cores.

Edit: I was thinking of Tilera.


IIRC the seaforth chips were designed to have blocking reactive IO at ports on each processor, so there was no need to waste cycles waiting on IO or explicitly driving peripherals, you just coded as if the stream was continuous.
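
To make the "code as if the stream was continuous" idea concrete, here is a rough software analogy in Python, not SEAforth code. Each thread stands in for a core and each bounded queue for a port between neighbours; a read blocks until data arrives and a write blocks until the neighbour has taken the previous value. All names here (`stage`, `run_pipeline`, `DONE`) are made up for the sketch.

```python
import queue
import threading

DONE = object()  # end-of-stream marker (illustrative only)

def stage(fn, inbox, outbox):
    """One 'core': block on the input port, apply fn, write to the output port."""
    while True:
        item = inbox.get()        # blocks until the upstream neighbour writes
        if item is DONE:
            outbox.put(DONE)
            return
        outbox.put(fn(item))      # blocks while the downstream port is full

def run_pipeline(fns, samples):
    # One single-slot queue per port between neighbouring stages.
    ports = [queue.Queue(maxsize=1) for _ in range(len(fns) + 1)]
    workers = [
        threading.Thread(target=stage, args=(fn, ports[i], ports[i + 1]))
        for i, fn in enumerate(fns)
    ]

    def feed():
        for s in samples:
            ports[0].put(s)
        ports[0].put(DONE)

    workers.append(threading.Thread(target=feed))
    for w in workers:
        w.start()

    out = []
    while True:
        item = ports[-1].get()
        if item is DONE:
            break
        out.append(item)
    for w in workers:
        w.join()
    return out

result = run_pipeline([lambda x: x * 2, lambda x: x + 1], [1, 2, 3])
print(result)
```

Note that no stage polls or spins: the per-stage code is just read, compute, write. As I understand the claim above, the hardware version of that blocking is what keeps a waiting core from burning cycles.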


Yes, you're right. In Seaforth's case inter-processor communication and I/O shouldn't eat cycles on their own.

I wish we could use some of this technology.


Motorola used to do devkits for their 56k series that featured USB or RS232 connections to a box containing the DSP and a pair of audio i/o jacks with average DACs.

This plus a front-end DSP environment like http://synthmaker.co.uk/ would sell like hot cakes. Update to this now please: http://www.ece.ucdavis.edu/vcl/asap/asap_demo_boards.html


Someone's come up with a One Instruction Set Computer for this purpose -- to enable reconfigurable arrays of small processors for massively parallel signal processing.

http://en.wikipedia.org/wiki/One_instruction_set_computer
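
For anyone who hasn't seen one, here is a minimal sketch of a one-instruction machine in Python. I'm assuming the common "subleq" (subtract and branch if less than or equal to zero) variant described in the Wikipedia article; I don't know which instruction the design mentioned above actually uses. The memory layout and the `add` wrapper are made up for the example.

```python
def run_subleq(mem, max_steps=10_000):
    """Run a subleq program in place; stop when the program counter leaves memory."""
    pc = 0
    steps = 0
    while 0 <= pc and pc + 2 < len(mem):
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]                     # the only instruction: subtract...
        pc = c if mem[b] <= 0 else pc + 3    # ...then branch if the result <= 0
        steps += 1
    return mem

def add(x, y):
    """Compute x + y with three subleq instructions and one scratch cell."""
    X, Y, Z = 9, 10, 11                      # data addresses after the code
    mem = [
        X, Z, 3,     # Z -= X   (Z becomes -x); continue at 3 either way
        Z, Y, 6,     # Y -= Z   (Y becomes y + x); continue at 6 either way
        Z, Z, -1,    # Z -= Z   (Z becomes 0, so branch to -1, which halts)
        x, y, 0,     # X, Y, Z
    ]
    return run_subleq(mem)[Y]

print(add(7, 5))
```

The appeal for arrays of tiny processors is presumably that the decode logic is close to nothing, at the cost of needing several instructions for even a simple add.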


Not sure if this fits the bill, but your comment reminded me of it:

http://www.cellmatrix.com/entryway/products/concepts/intro1....


Sounds like the ideal target for the spatial composition of Haskell's "lava" ( http://raintown.org/lava/ ).


> just three months to write "a fully compliant Wi-Fi transmitter"

That seems like a long time. But I don't know how hard a fully compliant "Wi-Fi" transmitter is.




