Hacker News new | past | comments | ask | show | jobs | submit login

I'm quite interested in repeng [0] (representztion engineering) for steerability of (so fzr transformer based) LLMs and was wondering if anyone had tried such methods on rwkv (or mamba for that matter). Maybe there are some low hanging fruits about it.

[0] https://github.com/vgel/repeng/issues




One of the interesting "new direction" for RWKV and Mamba (or any recurrent model), is the monitoring and manipulation of the state in between token. For steerability, alignment, etc =)

Not saying its a good or bad idea, but pointing out that having a fixed state in between has interesting applications in this space




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: