Precisely. While "all NYT data since 1851" sounds like a lot, and >8660 days sounds like a long retention period for a Kafka topic, this, like most systems in the world, is not a Big Data application. One of the key insights from the post is that there are interesting architectural considerations, unrelated to data size, that make immutable logs a good idea.