
Actually, doh, you're right, my bad. It turns out sample quantiles are only asymptotically unbiased, and a better estimator would take some sort of weighted average of P99 and P99.5, with the weights depending on the sample size. (You still don't need the full population and you don't have to merge histograms, though.)
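To put a number on that bias, here is a minimal sketch. It assumes Exp(1) data and the nearest-rank estimator (the order statistic at rank ceil(p*n)); both choices are mine for illustration, not something stated above. For exponential samples the expected k-th order statistic has a closed form, so no simulation is needed. The function names are hypothetical.

```python
import math

def expected_order_stat_exp(n, k):
    """Exact E[X_(k)] for n i.i.d. Exp(1) draws: sum_{j=0}^{k-1} 1/(n-j)."""
    return sum(1.0 / (n - j) for j in range(k))

def nearest_rank_bias(n, percent=99):
    """Expected nearest-rank sample percentile minus the true percentile.

    Assumes Exp(1) data; `percent` is an integer so the rank is computed
    with exact integer arithmetic.
    """
    k = -(-percent * n // 100)               # ceil(percent * n / 100)
    true_q = -math.log(1 - percent / 100)    # true quantile of Exp(1)
    return expected_order_stat_exp(n, k) - true_q

for n in (100, 1_000, 10_000):
    print(n, round(nearest_rank_bias(n), 4))
```

Under these assumptions the estimator reads low for small samples and the gap shrinks as n grows, which is what "asymptotically unbiased" means in practice. Other quantile definitions (interpolating ones, for instance) will give different numbers.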



I find this part of statistics very hard to argue about. Simulations only work with well-known distributions, and even then not with all of them. Even when you have a simulation that confirms hypothesis A1 using distribution D1, it says little about some hypothetical distribution D2.

Working it out with mathematical rigor might take years to get right, and often there is no easy answer.

What I've been looking for is something that is fast and works "well enough". Estimating the 95th percentile and averaging the estimates did not work well in my use case. The histogram method does work well, although it certainly is not perfect.
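A toy illustration of that difference, as a sketch only. The two "hosts", their latencies, the 10 ms bucket width, and all function names are made up for the example; a real setup will have its own bucket layout and percentile definition.

```python
from collections import Counter

WIDTH = 10  # assumed bucket width in ms

def nearest_rank(values, percent):
    """Nearest-rank percentile of raw values (percent is an integer)."""
    ordered = sorted(values)
    rank = -(-percent * len(ordered) // 100)  # ceil(percent * n / 100)
    return ordered[rank - 1]

def to_histogram(values, width=WIDTH):
    """Count values per fixed-width bucket; bucket i covers [i*width, (i+1)*width)."""
    return Counter(v // width for v in values)

def histogram_percentile(hist, percent, width=WIDTH):
    """Upper edge of the first bucket whose cumulative count reaches the rank."""
    total = sum(hist.values())
    rank = -(-percent * total // 100)
    seen = 0
    for bucket in sorted(hist):
        seen += hist[bucket]
        if seen >= rank:
            return (bucket + 1) * width
    raise ValueError("empty histogram")

# Hypothetical data: a busy fast host and a nearly idle slow one.
host_a = [10] * 950 + [20] * 50   # 1000 requests
host_b = [500] * 10               # 10 requests

averaged = (nearest_rank(host_a, 95) + nearest_rank(host_b, 95)) / 2
merged = to_histogram(host_a) + to_histogram(host_b)  # histograms add bucket by bucket
from_histogram = histogram_percentile(merged, 95)
exact = nearest_rank(host_a + host_b, 95)

print(averaged, from_histogram, exact)
```

With this data the averaged P95 is dominated by the ten slow requests, while the merged histogram lands within one bucket width of the exact pooled value. That matches "works well but is not perfect": the error is bounded by the bucket width rather than by how unevenly traffic is spread.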



Thanks, this is informative and pretty similar to what I've been implementing in JavaScript. I guess I'll have to take a deeper look into the literature when I next update that implementation.




