Hacker News

Interesting, but very cryptic for a simple user like me. I wonder whether it's useful today, and for what purposes.



Currently the strongest RWKV model is 32B in size: https://substack.recursal.ai/p/q-rwkv-6-32b-instruct-preview

This is a full drop-in replacement for any transformer model use case at model sizes 32B and under, as it has equal performance to existing open 32B models on most benchmarks.

We are working on a 70B, which will be a full drop-in replacement for most text use cases.
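For readers wondering why an RNN-family model like RWKV can serve as a drop-in replacement for a transformer: it replaces growing-with-context attention with a fixed-size recurrent state. Below is a toy, scalar version of the WKV recurrence (in the spirit of RWKV-4; the actual RWKV-6 kernel is matrix-valued and more elaborate, and the parameter values here are made up for illustration):

```python
import math

def wkv_sequence(ks, vs, w=0.1, u=0.5):
    """Toy scalar WKV recurrence, illustrating RWKV's constant-size state.

    ks, vs: per-token "key" and "value" scalars.
    w: per-step decay applied to past contributions.
    u: bonus weight given to the current token.

    Unlike transformer attention, the whole state is two scalars (a, b),
    so per-token compute and memory stay constant with sequence length.
    """
    a, b = 0.0, 0.0  # running weighted sums: numerator and denominator
    outs = []
    for k, v in zip(ks, vs):
        # Output: decayed average of past values, plus the current token
        # weighted by exp(u + k).
        num = a + math.exp(u + k) * v
        den = b + math.exp(u + k)
        outs.append(num / den)
        # Update the state: decay the past, fold in the current token.
        decay = math.exp(-w)
        a = decay * a + math.exp(k) * v
        b = decay * b + math.exp(k)
    return outs

outs = wkv_sequence([0.0, 1.0, -0.5], [1.0, 2.0, 3.0])
```

Each output is a weighted average of the values seen so far, so the model can attend over its history while carrying only O(1) state per layer, versus a transformer's KV cache that grows linearly with context.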


Why aren't you on the lmarena (formerly Chatbot Arena) leaderboard?


It's kinda on the todo list; the model is open source on HF for anyone willing to make it work with lmarena.


how about finetuning your 32B to be R1QWQKV?


There is currently a lack of "o1-style" reasoning datasets in the open-source space. QwQ did not release their dataset, so it will take some time for the community to prepare one.

It's definitely something we are tracking to do as well =)



