
Yes! Check out this paper describing how Linear Transformers are secretly Fast Weight Programmers: https://arxiv.org/abs/2102.11174.
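For a rough feel of the claim, here is a minimal sketch, not the paper's code. It assumes unnormalized causal linear attention and an ELU(x)+1 feature map; the feature map choice, the function names, and the toy list-of-lists layout are all my own illustrative assumptions. It computes the same outputs two ways: as attention over all past keys, and as a "fast weight" matrix that is updated by an outer product at each step and then applied to the query.

```python
import math

def phi(x):
    # Assumed positive feature map: elementwise ELU(x) + 1.
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_view(keys, values, queries):
    # y_t = sum over i <= t of (phi(k_i) . phi(q_t)) * v_i  (no normalization)
    outputs = []
    for t, q in enumerate(queries):
        fq = phi(q)
        out = [0.0] * len(values[0])
        for i in range(t + 1):
            w = dot(phi(keys[i]), fq)
            out = [o + w * v for o, v in zip(out, values[i])]
        outputs.append(out)
    return outputs

def fast_weight_view(keys, values, queries):
    # W_t = W_{t-1} + outer(v_t, phi(k_t));  y_t = W_t phi(q_t)
    d_v, d_k = len(values[0]), len(keys[0])
    W = [[0.0] * d_k for _ in range(d_v)]  # the "fast weights"
    outputs = []
    for k, v, q in zip(keys, values, queries):
        fk = phi(k)
        for r in range(d_v):
            for c in range(d_k):
                W[r][c] += v[r] * fk[c]  # additive outer-product write
        fq = phi(q)
        outputs.append([dot(row, fq) for row in W])
    return outputs
```

Both functions should agree up to floating-point error on any input sequence. The second one needs only a fixed-size matrix as state, regardless of sequence length.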



Jürgen Schmidhuber reminds me of Richard Feynman in one very specific way: while everybody else in their respective fields uses math to hide the deep insights they've been mining for publications, Schmidhuber and Feynman simply tell you the big insight, then refine and illuminate it with math.



