User to LLM: "Rate this response to the following prompt on a scale of 1-10, where 1 is a poor response and 10 is a great response: [response]"
Each LLM rates the responses of all the other LLMs.
We then average the scores each response received. The LLMs whose responses rank in the top 50% respond again, and the rating round repeats until a single highest-scoring response remains.
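The elimination loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: `generate` and `rate` are hypothetical callables standing in for real LLM calls, model names are assumed unique, the top 50% is rounded up when the number of survivors is odd, and ties keep the earlier model in the list. None of these details are specified in the notes.

```python
import math
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical callables (assumptions, not part of the original notes):
#   generate(model) -> str          asks `model` for a response to the task
#   rate(rater, response) -> float  asks `rater` for a 1-10 score of `response`
Generate = Callable[[str], str]
Rate = Callable[[str, str], float]


def average_scores(responses: Dict[str, str], rate: Rate) -> Dict[str, float]:
    """Average the ratings each response receives from every other model."""
    return {
        author: mean([rate(rater, text) for rater in responses if rater != author])
        for author, text in responses.items()
    }


def run_tournament(models: List[str], generate: Generate, rate: Rate) -> Tuple[str, str]:
    """Repeat respond-and-rate rounds, keeping the top half, until one model remains."""
    if len(models) < 2:
        raise ValueError("need at least two models so each response has a rater")
    survivors = list(models)
    while len(survivors) > 1:
        # Every surviving model responds again in each round.
        responses = {m: generate(m) for m in survivors}
        scores = average_scores(responses, rate)
        # Stable sort: models with equal scores keep their existing order.
        ranked = sorted(survivors, key=lambda m: scores[m], reverse=True)
        # Assumption: round the top 50% up when the count is odd.
        survivors = ranked[: math.ceil(len(ranked) / 2)]
    winner = survivors[0]
    return winner, responses[winner]
```

The function returns the last surviving model together with its response from the final round. Because every model rates only the responses of the others, a model never scores its own output, which matches the steps above.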