
Well, I keep seeing all models quantized, and with 4-bit, 2-bit, and 1-bit quantization I got very good inference performance (in either throughput or latency) on CNNs and some RNNs on Alveo boards using FINN (so, mostly high-level synthesis and very little actual FPGA wrangling). No idea about the current status of all this; will read the paper though :-)


