All of these benchmarks have gotten out of hand, which is itself telling. Benchmarks exist as an indicator of quality and proliferate when other indicators of quality fail. Their very prominence implies that observers are having a difficult time assessing LLM performance in context, which hints at limited utility or, more precisely, at a feedback loop that never closes at the level of utility. (You know a burger tastes really good when you eat it, no benchmarks required.)
Perhaps LLM development really does happen at such a rarefied, abstract level that the development team cannot be immersed in the application context, but I doubt it. More likely, the performance observed in context is either so dispiriting, so difficult to assess, or simply nonexistent that teams return again and again to the more generously validating benchmarks.
> You know a burger tastes really good when you eat it, no benchmarks required.
I'd say this is a good example of the opposite, where the problem is quantifying an ultimately subjective experience. Take three restaurant reviewers to a burger joint and you might end up with four different opinions.
Benchmarks proliferate because many LLM domains defy easy, quantitative measurement, yet LLM development and deployment are so expensive that they need to be guided by independent and quantitative (even if not fully objective) measures.