Latency (TTFT, time to first token) would be a nice metric.

Gcam · on Jan 16, 2024

We have this (and other, more detailed metrics) on the models page, https://artificialanalysis.ai/models, if you scroll down, and for individual hosts if you click into a model (via the nav, or by clicking one of the model bars/bubbles) :)

There are some interesting views of throughput vs. latency: some models are slower to the first chunk but faster for subsequent chunks, and vice versa, so they suit different use cases (e.g. if you just want a true/false answer vs. more detailed model responses).
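To make that trade-off concrete, here is a rough sketch. It assumes a simplified model where total response time is roughly TTFT plus output tokens divided by throughput. The two models and their numbers are made up for illustration and are not measurements from the site.

```python
def total_response_time(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Estimated seconds to receive a full response (simplified linear model)."""
    return ttft_s + output_tokens / tokens_per_s


def break_even_tokens(ttft_a: float, tps_a: float, ttft_b: float, tps_b: float) -> float:
    """Output length at which two models take the same total time.

    Assumes the two models have different throughput (tps_a != tps_b).
    """
    return (ttft_b - ttft_a) / (1.0 / tps_a - 1.0 / tps_b)


# Hypothetical numbers: model A starts quickly but streams slowly,
# model B starts slowly but streams quickly.
A = {"ttft_s": 0.25, "tokens_per_s": 32.0}
B = {"ttft_s": 1.0, "tokens_per_s": 128.0}

for n in (1, 32, 512):
    time_a = total_response_time(A["ttft_s"], A["tokens_per_s"], n)
    time_b = total_response_time(B["ttft_s"], B["tokens_per_s"], n)
    print(f"{n:>4} tokens: A={time_a:.3f}s  B={time_b:.3f}s")

crossover = break_even_tokens(A["ttft_s"], A["tokens_per_s"], B["ttft_s"], B["tokens_per_s"])
print(f"break-even at about {crossover:.0f} output tokens")
```

With these made-up numbers, the low-latency model wins for a one-token true/false answer, the high-throughput model wins for a long response, and the two tie at 32 output tokens.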

throwawaymaths · on Jan 16, 2024

Thanks!