I wish more places showed Time To First Token. For scenarios real time human interaction, the important part is how long until the first token is returned, and are tokens generated faster than people consume them.
Hey com2kid - if you're still there, we did end up adding boxplots to show variance. Can be seen on the models page https://artificialanalysis.ai/models and on each models page where you view hosts by clicking one of the models. They are toward the end of the page under 'Detailed performance metrics'
Sadly very few benchmarks bother to track this.