As for the chosen time period, it's really very simple. We had limited time, and English-language Wikipedia has individual year pages going back as far as 500 BC. Before that, years are grouped into decades and centuries. It would have been possible to parse those too, but hey, limited time.
Also, the bias highlights the flaw in the underlying data, which is the English version of Wikipedia.