
Awesome to see this published.

Work on transformer alternatives, especially parallelizable ones like this, is incredibly important - it would suck if we got stuck in a local optimum in architecture without actually looking at nearby viable alternatives.




Yup. I’m all for infinite scaling of context size.



