Dynamo vs. Cassandra: Systems Design of NoSQL Databases (2018)

etaioinshrdlu · on Dec 14, 2019

This is the key idea behind these databases. It is a good way to design a hash table to scale seamlessly. https://en.wikipedia.org/wiki/Consistent_hashing
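
For anyone who wants to see the idea in code, here is a minimal sketch of a consistent-hash ring. It is not how Dynamo or Cassandra implement it; the `HashRing` class, the node names, the use of MD5, and the 64 virtual nodes per server are all assumptions made for illustration.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Map a string to a point on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) tuples
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each physical node owns several points on the ring.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_position(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise KeyError("ring is empty")
        # First ring point at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._ring, (_position(key),))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}

ring.add_node("node-d")
after = {k: ring.get_node(k) for k in keys}

# Only keys claimed by the new node change owner; everything else stays put.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```

The point of the design is that adding or removing a server only remaps the keys adjacent to that server's points on the ring, rather than rehashing everything as `hash(key) % n` would.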

tyingq · on Dec 14, 2019

I suppose interest in this is rising since AWS just announced managed Cassandra.

tlunter · on Dec 14, 2019

I didn't see if there was a re:Invent talk about it, but is it actually Cassandra under the hood? It seemed like it might just be the Cassandra API, akin to Aurora being MySQL.

tyingq · on Dec 14, 2019

It happens to be Cassandra, but that did make me think about the way Amazon brands the Postgres compatible Aurora as "Aurora PostgreSQL".

That's pretty lousy of them to take advantage of the name. I imagine the uptake would be lower if it weren't in the name, and they had to settle for just saying "Postgres Compatible" in the description.

I also imagine AWS would come after me if I launched "XYZ Fargate" or similar.

scarface74 · on Dec 14, 2019

There are two separate offerings. AWS offers Aurora/Postgres which is a fork of Postgres with Amazon’s own code and there is regular RDS/Postgres which is basically managed Postgres.

tyingq · on Dec 14, 2019

I'm not talking about RDS.

I'm talking about "Amazon Aurora PostgreSQL". That's what they call it. See this page, for example: https://aws.amazon.com/quickstart/architecture/aurora-postgr...

As mentioned, they likely wouldn't tolerate a "Tyingq Typhoon Fargate" that was my Fargate clone.

scarface74 · on Dec 14, 2019

It is Postgres. It’s using the same source code as a base and it’s compatible with all Postgres tools.

tyingq · on Dec 14, 2019

The storage backend isn't Postgres, and I assume the repeated use of the words "compatible" and "wire protocol" is on purpose, so they can continue to change it.

scarface74 · on Dec 14, 2019

Who cares about the “storage backend”? How does that affect clients?

sjwright · on Dec 14, 2019

I realise the parent comment is being voted down, but I think it’s a good point. If you’re buying a managed service, why should it matter if Amazon twiddled with how it stores data? What matters is that it behaves exactly like PostgreSQL in every way—which I’m led to believe it does.

There may be marginal resulting differences in performance characteristics, but they're unlikely to be significant or wildly non-linear. My understanding is that this isn't a storage engine rewrite, but a modification to the IO layer at the bottom of the storage engine.

Still, if you want “pure” anything, run it yourself.

voidfunc · on Dec 14, 2019

Clients no, but if you have had a Postgres DBA optimize your database to take advantage of known Postgres storage backend behavior you may be in for unexpected performance degradation under the assumption "it's just Postgres".

That said, I doubt that's pretty uncommon.

scarface74 · on Dec 14, 2019

Well, you get the same problem if you have “network administrators” who took one AWS certification and call themselves “AWS Consultants”.

In both cases you end up with suboptimal solutions. The lesson to learn is not that AWS shouldn’t be making storage optimizations, it’s that you don’t depend on a bunch of old school net ops “lift and shifters” who didn’t take the time to learn the environment and who think that the cloud is just an overpriced colo.

scarface74 · on Dec 14, 2019

Aurora is basically a fork of MySQL that is more tightly integrated with AWS.

On the other hand, DocumentDB with Mongo Support doesn’t use any code from Mongo.

paulddraper · on Dec 14, 2019

AFAIK Aurora MySQL is largely MySQL code.

That's certainly the case at least for Aurora PostgreSQL, but then again, Aurora PostgreSQL lags MySQL significantly [1] in features; maybe that is related.

[1] https://github.com/pauldraper/aws-aurora-sql

antman · on Dec 14, 2019

I remember in another thread here on HN an engineer chimed in and said it is Cassandra.

redwood · on Dec 14, 2019

DynamoDB under the hood

tnolet · on Dec 14, 2019

Citation needed...

mmbleh · on Dec 14, 2019

https://twitter.com/_msw_/status/1201924979647905792

tnolet · on Dec 15, 2019

The tweet literally says “it’s Apache Cassandra under the hood”. Then it says they reused some tech from DynamoDB, which can mean anything.

scarface74 · on Dec 14, 2019

It seems just like Aurora then. The code is MySQL/Aurora with AWS’s own storage layer.

alexnewman · on Dec 14, 2019

Yearly reminder: Dynamo != DynamoDB

GreekPete · on Dec 14, 2019

Hmm, looks like Dynamo is a storage system. Interesting, never heard about it before.

https://en.wikipedia.org/wiki/Dynamo_(storage_system)

snak · on Dec 16, 2019

Thanks for the reminder. Should be mentioned in the article.

ripvanwinkle · on Dec 14, 2019

In the last-write-wins mechanism, if two writes (V1 and V2) happen concurrently to different replicas, why is V2 considered the last write in the article?

Reading the explanation and lead-up, I was left with the impression that the last update to each column (for a columnar store like Cassandra) would take effect, so the final value would be

{"street" : "Cubbon", "city" : "Bombay"}
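
To make the two readings concrete, here is a rough sketch comparing whole-record last-write-wins with per-column last-write-wins. The timestamps, the pre-existing values ("MG Road", "Bangalore"), and the function names are hypothetical and are not taken from the article; only the "Cubbon"/"Bombay" result comes from the example above.

```python
def lww_record(a, b):
    """Whole-record LWW: keep whichever record has the larger timestamp."""
    return a if a["ts"] >= b["ts"] else b


def lww_columns(a, b):
    """Per-column LWW: each cell is (value, ts); keep the newer cell per column."""
    merged = dict(a)
    for col, (val, ts) in b.items():
        if col not in merged or ts > merged[col][1]:
            merged[col] = (val, ts)
    return merged


# Whole-record view: V2 has the later (hypothetical) timestamp, so it wins outright.
v1 = {"ts": 2, "data": {"street": "Cubbon", "city": "Bangalore"}}
v2 = {"ts": 3, "data": {"street": "MG Road", "city": "Bombay"}}
print(lww_record(v1, v2)["data"])
# {'street': 'MG Road', 'city': 'Bombay'}

# Per-column view: V1 only touched "street" (ts=2), V2 only touched "city" (ts=3).
c1 = {"street": ("Cubbon", 2), "city": ("Bangalore", 1)}
c2 = {"street": ("MG Road", 1), "city": ("Bombay", 3)}
merged = lww_columns(c1, c2)
print({col: val for col, (val, _) in merged.items()})
# {'street': 'Cubbon', 'city': 'Bombay'}
```

Under these assumptions the per-column merge produces the result I expected, while whole-record LWW silently drops V1's street update.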

FpUser · on Dec 14, 2019

Is it just my inadequate vision, or is the visual design of the article challenging? Light grey text on a white background, dark red and blackish colors on a dark grey background.