I currently have a subset of one year's worth of all HN stories and comment trees, organized by story, but it's on my local machine. Where is a good place to post it? It's quite big, on the order of multiple GB.
The problem (if you want an easy scraper) is that the HN API limits you to 1k requests per hour, so it took me about 10 days of near-continuous running, with restarts after random crashes, to get all the data.
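For anyone attempting the same thing, here is a minimal sketch of a loop that stays under a 1k-requests-per-hour limit and can be restarted after a crash without refetching. It is not the scraper described above. The names `crawl` and `load_done`, the JSON-lines checkpoint file, and the caller-supplied `fetch` function are all assumptions made for illustration, and the real HTTP call is left to whatever `fetch` you pass in. At 1k requests per hour the pacing works out to one request every 3.6 seconds, or at most about 240k requests in 10 days.

```python
import json
import time
from pathlib import Path

REQUESTS_PER_HOUR = 1000                   # limit quoted in the comment above
MIN_INTERVAL = 3600 / REQUESTS_PER_HOUR    # 3.6 seconds between requests


def load_done(checkpoint):
    """Return the ids already saved, so a restart can skip them."""
    done = set()
    if checkpoint.exists():
        with checkpoint.open() as f:
            for line in f:
                try:
                    done.add(json.loads(line)["id"])
                except (ValueError, KeyError, TypeError):
                    # Ignore blank lines or a half-written line left by a crash.
                    continue
    return done


def crawl(ids, fetch, checkpoint, min_interval=MIN_INTERVAL,
          sleep=time.sleep, clock=time.monotonic):
    """Fetch each id at most once, spacing requests by min_interval seconds.

    `fetch` is a hypothetical caller-supplied function: id -> JSON-serializable
    item. Each result is appended to `checkpoint` right away, so progress
    survives a crash. Returns the number of items fetched in this run.
    """
    done = load_done(checkpoint)
    fetched = 0
    last = None
    with checkpoint.open("a") as out:
        for item_id in ids:
            if item_id in done:
                continue
            if last is not None:
                wait = min_interval - (clock() - last)
                if wait > 0:
                    sleep(wait)
            last = clock()
            item = fetch(item_id)
            out.write(json.dumps({"id": item_id, "item": item}) + "\n")
            out.flush()
            fetched += 1
    return fetched
```

Rerunning `crawl` with the same checkpoint path picks up where the last run stopped, which is the part that matters when the process dies every so often.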