> The article argues that the graphs supposedly demonstrating this fact, can als...

bena · on April 15, 2022

To be fair, in the random data, you have people of all actual skill levels estimating their ability at 0.

That's going to skew the data somewhat, because people don't work that way. You will likely see some zeros, but I wouldn't expect any from the skilled population, and I'd expect fewer from the rest of the population than the random data produces.

That's going to make the "difference in estimation" line higher overall, which would make it intersect higher and more to the right.
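
One way to read this point, as a rough sketch: if self-estimates are random and independent of skill, the average-estimate line is flat at the mean estimate and crosses the actual-score line where the two are equal. Removing the lowest estimates raises that mean, so the crossing moves up and to the right. The 0-100 scale and the floor of 20 below are illustrative assumptions, not values from the article, and a single floor for everyone is a simplification of the skill-dependent thinning described above.

```python
import random
from statistics import mean


def crossing_point(floor, n=20_000, seed=0):
    """Mean of random self-estimates drawn uniformly from [floor, 100].

    Because these estimates ignore skill, the average-estimate line is flat
    at this value and meets the actual-score line (estimate == score) at
    the point (value, value).
    """
    rng = random.Random(seed)
    return mean([rng.uniform(floor, 100) for _ in range(n)])


# Estimates can go all the way down to 0, as in the random data.
with_zeros = crossing_point(floor=0)
# Hypothetical alternative: nobody rates themselves below 20.
without_lows = crossing_point(floor=20)

print(f"crossing with zeros allowed: {with_zeros:.1f}")
print(f"crossing with a floor of 20: {without_lows:.1f}")
```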

_dain_ · on April 15, 2022

Okay well in that case the article is deficient (I skimmed it, assuming it was making the same case I already knew about[1] -- my bad). You can actually reproduce the DK graph without supposing that estimation bias depends on skill. There is an interactive visualization here[2] that does precisely that (click on the "line plot w/ centiles" tab). You can adjust the parameters to see how it behaves.
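
A minimal sketch of that idea, written independently of the linked app (none of this is its code): the test score and the self-estimate are both noisy readings of the same underlying skill, and the noise does not depend on skill. The normal distributions, the noise level, and the names `simulate` and `percentile_ranks` are assumptions made for illustration.

```python
import random
from statistics import mean


def percentile_ranks(values):
    """Map each value to its percentile rank (0-100) within the sample."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = 100.0 * position / (len(values) - 1)
    return ranks


def simulate(n=10_000, noise=1.0, seed=0):
    """Return (mean actual percentile, mean estimated percentile) per score quartile."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n)]
    # Both measurements add noise that is independent of skill:
    # no one is modelled as more biased than anyone else.
    score = [s + rng.gauss(0, noise) for s in skill]
    estimate = [s + rng.gauss(0, noise) for s in skill]

    score_pct = percentile_ranks(score)
    estimate_pct = percentile_ranks(estimate)

    # Group people by the quartile of their measured score.
    quartiles = [[] for _ in range(4)]
    for sp, ep in zip(score_pct, estimate_pct):
        quartiles[min(int(sp // 25), 3)].append((sp, ep))

    return [
        (mean([sp for sp, _ in q]), mean([ep for _, ep in q]))
        for q in quartiles
    ]


rows = simulate()
for number, (actual, estimated) in enumerate(rows, start=1):
    print(f"quartile {number}: actual {actual:5.1f}, estimated {estimated:5.1f}")
```

Under these assumptions the bottom quartile's average estimate sits above its average score percentile and the top quartile's sits below it, which is the familiar shape of the DK plot, even though nothing in the model ties bias to skill.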

[1] https://news.ycombinator.com/item?id=31038386

[2] http://emilkirkegaard.dk/understanding_statistics/?app=Dunni...

nbernard · on April 15, 2022

Thanks for the link!

I see what you mean, but is a model that uses the "general overestimation" slider actually unbiased? I don't really understand what the slider is supposed to represent. Looking at the code, it seems to be only an additive term that shifts the plot upwards, which does not explain anything by itself (in fact, the variable is called "up_bias" in the code, and it could correspond to an actual DK effect...).

_dain_ · on April 15, 2022

Per DK's hypothesis, the added bias would be a decreasing function of underlying skill, not a constant as it is here. I think this is the only way to define it sensibly. It's easier to see what the slider is doing if you look at the scatterplot, although it's kind of annoying because the axes are flipped and it keeps the view centred on the data, so the points stay still and the axes move around them. Basically it's just translating all the points uniformly along the estimated score axis.
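
To make the distinction concrete, here is a small sketch contrasting the two kinds of bias. Only the name `up_bias` comes from the app's code as quoted above; the linear form of the skill-dependent bias and the parameters `max_bias` and `slope` are hypothetical, chosen just to have something that decreases with skill.

```python
import random


def constant_bias(skill, up_bias):
    """Same offset for everyone: a uniform shift along the estimate axis."""
    return up_bias


def skill_dependent_bias(skill, max_bias, slope):
    """Hypothetical DK-style bias: largest at low skill, shrinking as skill rises."""
    return max_bias - slope * skill


rng = random.Random(1)
skills = sorted(rng.uniform(0, 100) for _ in range(5))
noise = [rng.gauss(0, 10) for _ in skills]

unbiased = [s + e for s, e in zip(skills, noise)]
shifted = [s + e + constant_bias(s, 15.0) for s, e in zip(skills, noise)]
dk_style = [
    s + e + skill_dependent_bias(s, 30.0, 0.3) for s, e in zip(skills, noise)
]

# How far each person's estimate moved relative to the unbiased case.
shift_gaps = [b - a for a, b in zip(unbiased, shifted)]
dk_gaps = [b - a for a, b in zip(unbiased, dk_style)]

for s, g_const, g_dk in zip(skills, shift_gaps, dk_gaps):
    print(f"skill {s:5.1f}: constant shift {g_const:5.1f}, skill-dependent shift {g_dk:5.1f}")
```

With the constant term every point moves by the same amount, so the cloud is translated without changing shape. With the skill-dependent term the low-skill points move further than the high-skill ones, which is the extra assumption DK's hypothesis adds.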