"The API key is to prevent someone from coming and taking all the bread at once and leaving none for everyone else (i.e., computation/network capacity)."
I disagree. A data dump gives the developer (or user) the ability to download the data and make it available locally on the user's device, perhaps even via directly attached storage, thereby obviating the need for network connectivity to access the data.
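For example, here is a minimal sketch (in Python) of what that local access looks like, assuming the provider publishes its dump as an SQLite file; the file and table names here are hypothetical:

    import sqlite3

    # One-time step (already done): the dump, a hypothetical SQLite file
    # "articles.db", was downloaded or shipped on attached storage.
    # From this point on, no network connection is needed at all.
    conn = sqlite3.connect("articles.db")

    # Every query is answered from the user's own disk, not a remote API.
    for (title,) in conn.execute(
        "SELECT title FROM articles WHERE title LIKE ? LIMIT 10", ("A%",)
    ):
        print(title)

    conn.close()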
Consider all the resources needed to stay online and serve myriad API requests 24/7 to consumers all over the globe. Add the resources needed to develop and maintain an API key system. Then consider behavioral issues like denial-of-service attacks.
Now contrast this with the resources needed to periodically upload and host data dumps via FTP, BitTorrent, or some other well-suited protocol; the dumps can also be mirrored, reducing the strain on any one set of servers and providing some level of DoS protection.
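As a rough sketch (the mirror URLs are made up), a client can fall back across mirrors, so no single host bears the whole load or becomes a single point of failure:

    import urllib.request

    # Hypothetical mirrors; any third party can host a copy of the dump.
    MIRRORS = [
        "https://dumps.example.org/data-latest.tar.gz",
        "https://mirror1.example.net/data-latest.tar.gz",
        "https://mirror2.example.com/data-latest.tar.gz",
    ]

    def fetch_dump(dest="data-latest.tar.gz"):
        # Try each mirror in turn; one overloaded or downed host does
        # not take the whole dataset offline.
        for url in MIRRORS:
            try:
                urllib.request.urlretrieve(url, dest)
                return dest
            except OSError:
                continue
        raise RuntimeError("all mirrors failed")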
The key difference is that a data dump can be mirrored by third parties, while every API call must go to the one original source.
Downloading a data dump is not "taking all the bread and leaving none for everyone else". Bread was a poor analogy, because customers do not take the original loaf; they merely take a copy of it. If the "bread" is data, then the "baker" never runs out.
The finite resource you allude to is the computation and network capacity.
With a data dump, the developer can provide users with another source for the data: the user's own local storage. As more developers make the data available, user access increases, and the costs of the computation and network resources are redistributed. With an API, developers merely point users at the same data source, and that sole source has to foot the bill for the computation and network resources.
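A back-of-envelope comparison, with entirely made-up numbers, shows where the load lands in each model:

    # Made-up numbers, purely to illustrate who pays for what.
    users = 10_000
    queries_per_user_per_day = 100
    mirrors = 10

    # API model: the sole source answers every request, every day.
    origin_requests_per_day = users * queries_per_user_per_day  # 1,000,000

    # Dump model: the origin uploads one copy to each mirror per release;
    # users fetch from the mirrors, and every query after that is a
    # local read paid for by the user's own hardware.
    origin_uploads_per_release = mirrors  # 10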
It is difficult to discuss this at length without referring to concrete examples. Keep in mind I gave Wikipedia as one, and let me be clear: I only have public data in mind (e.g., government data or user-generated content). I am not suggesting that anyone give away proprietary data in a data dump. What I'm saying is that the way in which people access a great deal of publicly available data today (via "APIs") confuses me; it makes poor use of the computational and network resources available.
I'll admit I'm biased: I like working with data dumps. I do not "reinvent the wheel" when I work with data dumps (as someone else suggested); rather, I use a simpler, more robust wheel than any "API". I like giving users speed, simplicity, and reliability. Data dumps give me the freedom to do that in a way that APIs do not.
Anyway, thanks for the comment.