
First, I've made one sinful over-simplification in my own post, in conflating NoSQL systems with eventual consistency. While that's usually the case, it's absolutely not an intrinsic property: my apologies!

True SERIALIZABLE-level ACID pretty much just works - and if you're using Postgres the performance hit really isn't too bad. Of course, you're giving up replication then (Postgres doesn't support SERIALIZABLE on hot standbys), so whether it's suitable for your needs may vary rather!

Dynamo-based systems have 'tunable consistency', but that's almost always over a single key: multi-key operations are usually inconsistent. That makes them 'easy to use' only for applications with a very simple data model. My experience is that most applications of any real complexity will at some point want to do some kind of multi-key operation, at which point you're probably on the hook for a pretty expensive programmer.
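To make the multi-key problem concrete, here's a toy illustration (plain Python, not any real Dynamo client - the store and the 'transfer' operation are hypothetical). Each single-key write is atomic, but an invariant that spans two keys can still be observed broken, because nothing makes the pair of writes atomic:

```python
# Toy key-value store: writes are atomic per key, but there is no
# multi-key transaction, just like a Dynamo-style store.
store = {"alice": 100, "bob": 100}  # invariant: balances sum to 200

def transfer(src, dst, amount):
    # Two separate single-key writes; no way to make them one atomic unit.
    store[src] = store[src] - amount   # write 1 (atomic for 'src' only)
    snapshot = dict(store)             # a concurrent reader arriving here...
    store[dst] = store[dst] + amount   # write 2 (atomic for 'dst' only)
    return snapshot

seen = transfer("alice", "bob", 30)
print(sum(seen.values()))   # the mid-transfer reader sees the invariant broken
print(sum(store.values()))  # the invariant only holds again after both writes
```

A serializable SQL database makes both writes one transaction, so no reader can observe the intermediate state; here the application would have to build that guarantee itself.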

I'm vaguely aware that this doesn't strictly apply to Cassandra, which has a limited notion of transactions - last I checked they didn't work very well at all ( https://aphyr.com/posts/294-call-me-maybe-cassandra/ ), but that may well have changed.

I do appreciate your blog post in general - I think there's an awful lot of oversimplifying of this stuff out there. Part of the problem is that high-speed, concurrent, distributed data storage is a topic that is, at its heart, pretty damn complicated.




Well, I've never used Postgres, but from the v9.4 documentation it does not look that different from the other DB engines: 'Read Committed is the default isolation'; 'applications using this level must be prepared to retry transactions due to serialization failures' (plus the one you mentioned already: 'Serializable transaction isolation level has not yet been added to Hot Standby replication targets').

Not exactly what I like to call 'simply works' ;-)

But I didn't want to say that a NoSQL database is always better than a traditional one - just that isolation is complex on traditional systems when dealing with volume and concurrency. And, typically, transactions across tables or even rows are difficult or impossible for a distributed database, as these rows can live on different nodes (the 'multi-key operations' you mentioned).


Postgres is quite different in one respect: it has truly serializable snapshot isolation (SSI), at an acceptable performance cost (single-digit percentage, generally). Other DB systems are either not truly serializable, or have lock-based implementations that are sometimes more difficult to work with for web apps.

> 'applications using this level must be prepared to retry transactions due to serialization failures'

True of any serializable system that supports concurrent access, AFAIK - not quite sure that's a fair criticism :-).
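For what it's worth, the retry itself is pretty mechanical. A minimal sketch of the pattern in Python - the exception class here is a hypothetical stand-in for whatever your driver raises on a serialization failure (in Postgres that's SQLSTATE 40001), and the toy transaction just fails once to exercise the loop:

```python
class SerializationFailure(Exception):
    """Stand-in for the driver's serialization-failure error."""

def run_with_retry(txn, attempts=5):
    # Standard pattern: rerun the whole transaction when the database
    # aborts it with a serialization failure.
    for attempt in range(attempts):
        try:
            return txn()
        except SerializationFailure:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            # real code would back off here (e.g. sleep with jitter)

# Toy transaction: fails on the first call, succeeds on the second.
calls = {"n": 0}
def txn():
    calls["n"] += 1
    if calls["n"] == 1:
        raise SerializationFailure
    return "committed"

print(run_with_retry(txn))  # committed, after one retry
```

The key point is that the transaction body must be safe to re-execute from the top - which is exactly the property you need anyway for any serializable system under concurrency.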

------

> And, typically, transactions between tables or even rows are difficult/impossible for a distributed database

Depends what you mean by 'distributed', really - Oracle RAC is very much distributed, and supports normal transactional behaviour. On the other hand you won't get that working across a large geographical area.

I accept that understanding the impact of isolation levels can be complex - I'm just very much of the opinion that you'll take a lot more pain trying to maintain consistency in a typical NoSQL system.


I can agree with these facts; I just give them a different weight than you do. I don't like the uncertainty of the 'serialization failures': it depends on a workload that can be difficult to predict, especially if you're a software vendor. YMMV :-) Thanks for the constructive feedback.


Thanks for a good read/conversation!



