Hacker News
low_tech_punk
24 days ago
on:
RWKV Language Model
Thanks! The 0.1B version looks perfect for embedded systems. What is the key benefit of an attention-free architecture?
pico_creator
24 days ago
Lower compute cost, especially at longer sequence lengths. Depending on context length, it's 10x, 100x, or even 1000x+ cheaper (cost grows linearly with sequence length instead of quadratically).
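To make the quadratic vs linear point concrete, here is a rough toy model (not a benchmark, and not RWKV's actual cost formula). It only counts token-to-token interaction steps and ignores constants, model width, and memory effects. The function names are made up for illustration.

```python
def pairwise_steps(n: int) -> int:
    """Toy count for full attention: every token interacts with every token."""
    return n * n


def recurrent_steps(n: int) -> int:
    """Toy count for an attention-free model: one fixed-size state update per token."""
    return n


def cost_ratio(n: int) -> float:
    """How many times more interaction steps the quadratic scheme needs."""
    return pairwise_steps(n) / recurrent_steps(n)


if __name__ == "__main__":
    # In this simplified model the gap grows in step with the context length.
    for n in (10, 100, 1000):
        print(f"context={n:>5}  ratio={cost_ratio(n):.0f}x")
```

In this simplified count the ratio is just the context length itself, which is where figures like 10x, 100x, and 1000x come from. Real-world savings depend on constants the toy model leaves out.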