Being an RNN, there's another trick: caching a long prompt. A transformer attends over the whole sequence, but an RNN only looks back one step, compressing everything it has seen into a fixed-size hidden state. So you can run your long context through once, save the final state, and reuse it for many different continuations.
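
A minimal sketch of what that looks like, using a toy PyTorch GRU as a stand-in for a real RNN language model. The model, tokenization, and sizes here are illustrative assumptions, not any particular library's API:

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    vocab_size, hidden = 256, 64
    embed = nn.Embedding(vocab_size, hidden)        # stand-in embedding layer
    rnn = nn.GRU(hidden, hidden, batch_first=True)  # stand-in for the RNN LM

    def run(tokens, state=None):
        # Feed tokens through the RNN; return outputs and final hidden state.
        x = embed(torch.tensor([tokens]))
        return rnn(x, state)

    # Pay for the long prompt once: the entire context is compressed
    # into a single fixed-size hidden state.
    long_prompt = list(range(100))  # stand-in for tokenized context
    _, cached_state = run(long_prompt)

    # Reuse the cached state for many continuations; each one only
    # processes its own new tokens, never the prompt again.
    for continuation in ([7, 8, 9], [42, 43], [1]):
        out, _ = run(continuation, cached_state.clone())

Contrast with a transformer, where reusing a prompt means caching per-token KV tensors that grow with context length; here the cache is one fixed-size state no matter how long the prompt was.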


Yup. This is commonly done in the community for chat models as well, since the shared prompt gets reused for every reply.



