
It may be unfair to describe setting a documented configuration parameter as "fiddling." Retention is seven days by default. It is trivial to set it to arbitrarily long periods of time. To my knowledge, this functionality isn't really in question. Whether logs are a good unifying abstraction on which to build systems is in dispute among reasonable people, but whether Kafka randomly deletes stuff is not. :)
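For concreteness, here is a rough sketch of how the retention setting resolves, as I understand the documented precedence: a topic-level `retention.ms` wins over the broker settings, and among broker settings `log.retention.ms` beats `log.retention.minutes`, which beats `log.retention.hours` (default 168). The function name and the dict-based config are made up for illustration, and this is a simplified model, not Kafka's code. It also ignores size-based deletion (`retention.bytes`).

```python
from typing import Mapping, Optional

# Kafka's documented broker default: 168 hours, i.e. seven days.
DEFAULT_RETENTION_HOURS = 168


def effective_retention_ms(
    broker: Mapping[str, str], topic: Mapping[str, str]
) -> Optional[int]:
    """Resolve time-based retention in milliseconds (hypothetical helper).

    Returns None when the resolved value is negative, which is how
    "no time limit" is expressed (for example retention.ms=-1).
    """
    if "retention.ms" in topic:
        # A topic-level override takes priority over anything on the broker.
        ms = int(topic["retention.ms"])
    elif "log.retention.ms" in broker:
        ms = int(broker["log.retention.ms"])
    elif "log.retention.minutes" in broker:
        ms = int(broker["log.retention.minutes"]) * 60_000
    else:
        hours = int(broker.get("log.retention.hours", DEFAULT_RETENTION_HOURS))
        ms = hours * 3_600_000
    return None if ms < 0 else ms


if __name__ == "__main__":
    print(effective_retention_ms({}, {}))                      # nothing set anywhere
    print(effective_retention_ms({}, {"retention.ms": "-1"}))  # keep forever
```

With nothing set anywhere, the seven-day default applies; with `retention.ms=-1` on the topic, time-based deletion is off.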



I don't claim that Kafka randomly deletes things. Just that it automatically does so.

The danger is not that Kafka will choose not to respect the configuration value. It is that the default setting will find a way to creep back in without the admin noticing it, and then a quick reboot, maybe even an unplanned one caused by a power trip or a kernel crash, will be sayonara to the system of record. Sure, there are backups, but who needs that aggravation? (p.s.: there probably aren't actually any workable backups)
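To make the failure mode concrete: the kind of guard I'd want is one that refuses to proceed when the retention key is simply absent, because absence means the default is back in force. A minimal sketch, assuming a `server.properties`-style text file and checking only the broker-level `log.retention.ms` key. The function and exception names are hypothetical, and the parser is deliberately simplified (it handles `key=value` lines and `#` comments only).

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


class RetentionGuardError(RuntimeError):
    """Raised when the config would allow time-based deletion."""


def require_unlimited_retention(
    properties_text: str, key: str = "log.retention.ms"
) -> None:
    """Fail loudly unless `key` is explicitly set to -1 (hypothetical guard)."""
    props = parse_properties(properties_text)
    if key not in props:
        # A missing key is the dangerous case: the seven-day default applies.
        raise RetentionGuardError(f"{key} is not set; the default retention applies")
    if props[key] != "-1":
        raise RetentionGuardError(f"{key}={props[key]} permits time-based deletion")


if __name__ == "__main__":
    require_unlimited_retention("# system of record\nlog.retention.ms=-1\n")
    try:
        require_unlimited_retention("log.retention.hours=168\n")
    except RetentionGuardError as exc:
        print("refusing to start:", exc)
```

Nothing like this ships with the broker as far as I know; you'd have to bolt it onto your own deploy tooling, which is rather the point.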

In the RDBMS world, MySQL's automatic and silent truncation of VARCHARs down to the character limit of the column was seen as a sign of its badness. That demonstrates the difference in paradigm.

Anyway, the argument doesn't really hinge on whether or not there's an automatic eviction model in the software. It's just a clear, loud signal that the software is not really intended for long-term storage, and that you are, at best, entrusting decades of mission-critical data to a less-tested configuration on a young, maturing product. This should not be appealing by itself, and that's a very optimistic perspective on the choice to forgo the data integrity features provided by a traditional RDBMS.

Developers just cannot seem to grok that just because something appears to store data across server restarts does not mean it is necessarily a safe permanent parking spot. I'm not a DBA, and I've had my share of serious squabbles with them, but when this is what happens with unsupervised developers at the helm, it's hard not to be sympathetic to their aggressive, almost hostile, feelings around developer input into the data model.


Agreed that this is a newer architectural paradigm and a younger product. Of that there is no doubt, and there is always risk there. There is also return, of course: someone had to deploy an RDBMS for the first time too, and everyone is glad they did.

But I still don't follow the argument. If the eviction model is a loud, clear signal that this is the wrong solution, why isn't the mutability of RDBMS data the same sort of signal? Claiming that the presence of a DELETE statement in SQL rules out relational databases as durable data stores would not get me too far. And nor should it!

You are 100% right that this is a new approach. You are also right that it is possible to make configuration errors that will break the system. But this is true of all nontrivial systems. At the end of all of this, we still have a very interesting sequence of events (all NYT content ever) stored in an immutable log. This seems reasonable. Maybe the NYT team is blazing a trail, but it's not prima facie a crazy one. :)


SQL provides users with a lot of facilities and mechanisms to limit, control, supervise, and if performed within a transaction, even undo overzealous DELETE statements. I discussed some of these at https://news.ycombinator.com/item?id=15188619.
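As a small illustration of the "undo" point, here is a sketch using Python's built-in `sqlite3` module with an in-memory database. SQLite is only a stand-in for whatever RDBMS you actually run, and the `articles` table and the function name are invented for the example.

```python
import sqlite3


def delete_then_rollback():
    """Delete every row inside a transaction, then roll the delete back."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO articles (title) VALUES (?)",
        [("first",), ("second",), ("third",)],
    )
    conn.commit()  # the three rows are now durable for this connection

    count = "SELECT COUNT(*) FROM articles"
    before = conn.execute(count).fetchone()[0]

    # The sqlite3 module opens a transaction implicitly before the DELETE.
    conn.execute("DELETE FROM articles")
    during = conn.execute(count).fetchone()[0]

    # Nothing was committed, so the overzealous DELETE can be undone.
    conn.rollback()
    after = conn.execute(count).fetchone()[0]

    conn.close()
    return before, during, after


if __name__ == "__main__":
    print(delete_then_rollback())
```

The rows vanish inside the transaction and are back after the rollback. This covers only the transactional undo; permissions, triggers, and the like are separate mechanisms.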

AFAIK, with Postgres, there are no known circumstances where restarting your server will result in the purge of your database, but that's really just icing on the cake, not the core of the argument. The core of the argument is that SQL provides not only a rational design paradigm for long-term storage, with design choices that show its implementers take that job seriously, but also an extremely strong feature set for data management and integrity.

As I've said numerous times now, SQL isn't invincible. But it's inarguably more resilient than Kafka, and it provides the controls necessary to keep some sanity over data in the long run.



