
Couldn't you implement a BitNet kernel and use that as a co-processor to a PC? Or is the I/O bandwidth so low that it wouldn't be worth it?



Since I don't have a board with a PCIe port, the fastest I could get is 100 Mbit Ethernet, I think. Or I could use the Microchip board, which has a hard RISC-V quad-core processor connected to the FPGA fabric via an AXI bus. The CPU itself runs at only 625 MHz, so there is huge potential to speed up some fancy computation.
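
To put a number on the I/O question, here is a rough back-of-the-envelope sketch. It assumes an ideal 100 Mbit/s link with no protocol overhead; the function name and the 1 MB payload are made up for illustration.

```python
def transfer_seconds(num_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to move num_bytes over a link, ignoring protocol overhead."""
    return num_bytes * 8 / link_bits_per_second


# Hypothetical payload: 1 MB of data shipped from the PC to the FPGA.
payload_bytes = 1_000_000
fast_ethernet = 100e6  # 100 Mbit/s, the link mentioned above

millis = transfer_seconds(payload_bytes, fast_ethernet) * 1000
print(f"{millis:.0f} ms per MB")
```

That works out to roughly 80 ms for every megabyte moved, before any real-world overhead, so an offload only pays off if the FPGA does a lot of work per byte transferred.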


Even with a PCIe FPGA card you're still going to be memory bound during inference. When running llama.cpp on straight CPU, memory bandwidth, not CPU power, is always the bottleneck.

Now if the FPGA card had a large amount of GPU-tier memory, then that would help.
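
A quick sketch of why memory bandwidth sets the ceiling: if every weight has to be read once per generated token, tokens per second can't exceed bandwidth divided by model size. The model size and bandwidth figures below are assumed round numbers for illustration, not measurements of any particular hardware.

```python
def max_tokens_per_second(bandwidth_bytes_per_s: float, model_bytes: float) -> float:
    """Upper bound on decode speed when every weight is read once per token."""
    return bandwidth_bytes_per_s / model_bytes


# Assumed example: 7e9 parameters at 4 bits each = 3.5e9 bytes of weights.
model_bytes = 7e9 * 4 / 8

# Assumed round-number bandwidths, not specs of a specific device.
scenarios = [
    ("desktop DRAM, ~50 GB/s", 50e9),
    ("GPU-tier memory, ~1 TB/s", 1e12),
]

for label, bandwidth in scenarios:
    bound = max_tokens_per_second(bandwidth, model_bytes)
    print(f"{label}: at most {bound:.0f} tokens/s")
```

The bound ignores compute entirely, which is the point: faster arithmetic on the FPGA doesn't raise it, only faster memory does.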



