I can independently attest to everything in your comment from my own internal research. Well summarised.
I also think fitting the exact distribution is unimportant and uninteresting. The important thing is that you have roughly the right distribution and that it has the fat-tailed property of only slowly converging to normal.
The reason the distribution doesn't matter beyond that, I think, is that finding the exact 95 % upper bound is probably impossible to do with any statistical significance, because exceedances are so rare and the contributing factors change over time. Getting in the right ballpark, however, matters a great deal.
Verifying that you are in the right ballpark (only 1 in 20 blow their committed date, and they do so independently in time, etc., the usual value-at-risk stuff) is fortunately also trivial, no matter the underlying distribution and its parametrisation at the time.
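A minimal sketch of that kind of check, in case it helps make it concrete. The function name, the input format (two equal-length lists of durations in days, in chronological order) and the back-to-back count as a stand-in for an independence test are all my own assumptions for illustration, not a standard recipe:

```python
from math import comb


def breach_report(actual_days, committed_days, target_rate=0.05):
    """Backtest commitments that were meant to be 95 % upper bounds.

    Hypothetical helper: both inputs are equal-length lists, ordered in time.
    """
    # A breach is a task that took longer than its committed duration.
    breaches = [a > c for a, c in zip(actual_days, committed_days)]
    n = len(breaches)
    k = sum(breaches)

    # Binomial tail: probability of seeing k or more breaches in n tasks
    # if the true breach rate really were target_rate.
    p_at_least_k = sum(
        comb(n, i) * target_rate**i * (1 - target_rate) ** (n - i)
        for i in range(k, n + 1)
    )

    # Crude independence check: count breaches that directly follow a breach.
    # With independent breaches, roughly (n - 1) * target_rate**2 are expected.
    back_to_back = sum(
        1 for prev, cur in zip(breaches, breaches[1:]) if prev and cur
    )

    return {
        "n": n,
        "breaches": k,
        "rate": k / n,
        "p_at_least_k": p_at_least_k,
        "back_to_back": back_to_back,
    }


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    actual = [3, 5, 8, 2, 13, 4, 6, 7, 3, 5, 4, 2, 9, 3, 5, 6, 4, 3, 2, 30]
    committed = [10] * 20
    print(breach_report(actual, committed))
```

Nothing here depends on the shape of the underlying distribution: it only counts how often the commitment was blown and whether the misses cluster. A very small tail probability, or many back-to-back breaches, would suggest the bounds are not behaving like 95 % bounds.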
Thanks, and I'm glad to hear you've found the same thing! So including the article, that makes at least three of us, it seems.
I agree that it might not be terribly important practically, but I think it's interesting because it might hint at something about the underlying nature of task planning. Sort of like how if you were measuring radioactive decay counts and found a Poisson distribution, you might learn something about the underlying nature of radioactivity.
What it is exactly, I'm still not sure, but I do think there's something there to poke at.
One thing that makes software special and might contain a kernel of an explanation is that software is scale-free: big software is built from many small pieces of software, which in turn are built from even smaller pieces of software. This makes it different from e.g. houses, which are not built from many miniature houses.
I speculate this self-similar nature drives a lot of the other odd properties we observe in software development, but I haven't come up with a specific model yet.