This is what I like to use for comparing models: https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...
It is an Elo rating system based on users voting on LLM answers to real questions.
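For anyone unfamiliar with Elo, here is a minimal sketch of how pairwise votes turn into a ranking. This is the textbook Elo update, not necessarily what the leaderboard itself runs; the model names, the starting rating of 1000, and K = 32 are all assumptions made up for illustration.

```python
from collections import defaultdict

K = 32.0        # assumed step size per vote (hypothetical choice)
START = 1000.0  # assumed starting rating for every model (hypothetical choice)


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rate(votes, k: float = K, start: float = START) -> dict:
    """Replay (winner, loser) votes in order and return the final ratings."""
    ratings = defaultdict(lambda: start)
    for winner, loser in votes:
        exp_win = expected_score(ratings[winner], ratings[loser])
        # An upset (low exp_win) moves ratings more than an expected win.
        delta = k * (1.0 - exp_win)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


# Hypothetical votes: each tuple means a user preferred the first model's answer.
votes = [
    ("model-a", "model-b"),
    ("model-a", "model-c"),
    ("model-c", "model-b"),
]

for name, rating in sorted(rate(votes).items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

The point is that only relative preferences go in, so a small model can outrank a bigger one if users keep preferring its answers.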
> what is Llama-7b equivalent to in OpenAI land?
I don't think Llama 7B compares with OpenAI models, but if you look at the ranking I linked above, there are some 7B models that rank higher than early versions of GPT-3.5. Those models are Mistral 7B fine-tunes.