I'd really like to see the source of the data and what exactly is being measured. The implication of the title is that these are new physical books. If that's the case, I guess I'm a bit skeptical.
I suppose it's possible that there is an outsize number of books published between about 100 and 150 years ago that get read for school and the like, which would account for that big spike. But these books are mostly not all that cheap just because they're in the public domain. And, presumably, the number of different titles still being read from that era is relatively small. And the top n% (where n is a small number) most popular titles from any era are mostly still in print, whether they're out of copyright or not.
Now, on the other hand, if this includes e-books, there's a pretty simple explanation. Many of the pre-copyright titles are available for free or at very low cost, and most of us with Kindles and Kindle apps have downloaded quite a few even if we haven't gotten around to reading all of them.
(I'm not arguing against shorter copyright terms, by the way. But my personal experience is at odds with the suggestion that there are vast numbers of 20th century books that people would be reading in droves if only they were in print.)
"I'd really like to see the source of the data and what exactly is being measured."
Following the source link in Yglesias' post [1], the graph seems to come from a presentation that Paul Heald (a professor at the University of Illinois) [2] did at the University of Canterbury [3]. A video is available on YouTube; the graph in question (with a different title) seems to be available at the 12:50 mark: http://www.youtube.com/watch?v=-DpfZcftI00#t12m50s
I think the mistake may be in discounting the long tail: Even if the top n% are still in print, the bottom (100-n)% aren't. The volume from a million out-of-print old books that would each sell on average fifty copies a year can easily dwarf that of a few thousand in-print old books that each sell a thousand copies a year.
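To make the long-tail arithmetic concrete, here is a rough back-of-the-envelope sketch. The million titles, fifty copies, and thousand copies are the hypothetical figures from the comment above; "a few thousand" is pinned at 3,000 purely as an assumption for illustration. None of these are measured numbers.

```python
# Back-of-the-envelope comparison of long-tail vs. head sales volume.
# All figures are hypothetical, taken from the comment's own example.
out_of_print_titles = 1_000_000   # "a million out-of-print old books"
out_of_print_copies_each = 50     # "fifty copies a year" on average
in_print_titles = 3_000           # ASSUMPTION: stands in for "a few thousand"
in_print_copies_each = 1_000      # "a thousand copies a year"

long_tail_volume = out_of_print_titles * out_of_print_copies_each
head_volume = in_print_titles * in_print_copies_each

print(f"long tail: {long_tail_volume:,} copies/year")
print(f"head:      {head_volume:,} copies/year")
print(f"ratio:     {long_tail_volume / head_volume:.1f}x")
```

Under these assumptions the tail comes to 50 million copies a year against 3 million for the head. The conclusion is only as good as the guessed inputs, of course.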
As for the e-book explanation, even if that turns out to be the case, isn't that an even stronger argument for shorter copyright terms? Now that public domain books are "free" you get a lot more public benefit out of something being in the public domain, because instead of the price going from $17 to $15 it goes from $17 to $0, which results in far more people getting access to the book.
If the million books are out of print, Amazon certainly can't sell them as physical copies. (OK, there's PoD but I can't see that skewing these numbers in a big way.) And even Gutenberg has "only" 36,000 e-books, which I assume corresponds pretty closely to all the available public domain works that have been converted to text. That's a big number but it's still a fairly small slice of all published books even allowing for the fact that the number of published books has gone up over time.
> As for the e-book explanation, even if that turns out to be the case, isn't that an even stronger argument for shorter copyright terms?
Sure, at least in principle. Insofar as copyright law is supposedly about balancing the public good with encouraging the creation of new works, making the good relatively more valuable (through wider distribution) once out of copyright would presumably change the balance point. As I said, I'm certainly not going to argue for the current copyright regime.
Surely whether a book is under copyright or not has some effect on the publisher's decision to publish, the price at which it is published, and the consumer's choice to buy. You seem to be saying that there's no way these effects can be significant, because the top books will always sell and all other old books will never sell.
I don't think there's a clear dividing line between the "top books" and the rest. The line is fuzzy, and will shift in response to market pressures. For a random example: The Production of Commodities by Means of Commodities was a rather influential book in economics. It is out of print and listed for over $100 used on Amazon. David Ricardo's Principles of Political Economy, which probably appeals to the same smallish group of econ geeks, is available in a reprint for $10.