Hacker News new | past | comments | ask | show | jobs | submit login

Latency (ttft) would be a nice metric.



We have this (and other more detailed metrics) on the models page https://artificialanalysis.ai/models if you scroll down and for individual hosts if you click into a model (nav or click one of the model bars/bubbles) :)

There are some interesting views of throughput vs. latency whereby some models are slower to the first chunk but faster for subsequent chunks and vice versa, and so suit different use cases (e.g. if just want a true/false vs. more detailed model responses)


Thanks!




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: