
"public domain... but can [only] be accessed in small pieces"

Sounds like the wonderful world of "APIs" on the www.

This sort of data should be on an FTP server.

I can build my own "apps". Give me the option of raw data.

Just my opinion, nothing more.




That's not quite what is going on. I had to do a paper for class recently that was looking at economic indicators. If I needed data on "Agriculture & Rural Development":

http://data.worldbank.org/topic/agriculture-and-rural-develo...

Down at the bottom is a link that will give you a csv file:

http://api.worldbank.org/v2/en/topic/1?downloadformat=csv

I thought it was really easy. I ended up having to visit multiple download links, but stitching the data together was simple using Python.
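For illustration, the stitching step might look roughly like the sketch below. It assumes each download has already been read into a string and that every file has a header row; the function name `stitch_csv` and the sample values are made up for the example and are not real World Bank data.

```python
import csv
import io


def stitch_csv(texts):
    """Combine several CSV documents into one list of row dicts.

    Columns are the union of all headers in first-seen order; a row
    gets an empty string for any column its source file lacked.
    """
    columns = []
    rows = []
    for text in texts:
        reader = csv.DictReader(io.StringIO(text))
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        rows.extend(reader)
    return columns, [{c: r.get(c, "") for c in columns} for r in rows]


# Hypothetical sample data standing in for two separate downloads.
part_a = "country,year,indicator_a\nAA,2000,1.5\nBB,2000,2.5\n"
part_b = "country,year,indicator_b\nAA,2000,10\n"

columns, rows = stitch_csv([part_a, part_b])
```

This only concatenates rows and aligns columns. Joining rows on a shared key such as country and year would need an extra step.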

Without looking at the paper, I'm not sure what exactly they have. Have they just done the stitching already and are providing the complete data set? Or is there additional data not covered in the csv downloads?

Edit: Having looked at the paper, it looks like the data they scraped was not in the CSVs, but I really cannot tell. If that is the case, I don't know why only some data is available as a bulk download and other data is not. So...back to your original point.


You're misquoting the article.

"...is not in the public domain, but can be accessed in small pieces..." (emphasis added)

Besides, this data should absolutely be provided in an API. An exporter tool that queries the API is superior to a static ftp site.


Sorry for the misquote. I guess that my misquote completely misrepresents the issue?

I never said the data should not be provided in an API.

If you read closely (for more than only "errors"), then you would observe the word "option".

That word is there for a reason.

The role of the FTP site (or whatever protocol you prefer) is to transfer the raw data in bulk to my local media.

Then I move it into my own database of choice and write my own code to access it.
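As a rough sketch of that workflow, assuming the bulk file is a CSV with a header row and that SQLite is the database of choice: the helper name `load_csv_into_sqlite`, the table name, and the sample rows are all hypothetical.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(conn, table, csv_text):
    """Create a table from the CSV header and insert every data row.

    Columns are left untyped, so values are stored as text.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    cols = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)
    conn.commit()


# Made-up rows standing in for a bulk download already on local disk.
raw = "country,year,value\nAA,2000,1.5\nAA,2001,1.7\nBB,2000,2.5\n"

conn = sqlite3.connect(":memory:")
load_csv_into_sqlite(conn, "indicators", raw)

# "My own code to access it": an ordinary local query, no rate limits.
result = conn.execute(
    "SELECT year, value FROM indicators WHERE country = ? ORDER BY year",
    ("AA",),
).fetchall()
```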

Anyway, fear not. Your preferred www "API" world is not in jeopardy.

Cogent arguments why raw, bulk _public data_ _should never be provided_ in addition to rate-limited, by-the-slice snippets via "APIs" and third-party "app developers" are welcome.



