
No, it's nothing like them in implementation. MemcacheDB is like a giant hash table where your app server can figure out where something is based on what the key hashes to. An individual item is stored on one server (no redundancy) and which server it gets written to is a matter of the hash.
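
To make the hash-table point concrete, here is a minimal sketch of client-side key placement. The server names, the `server_for` helper, and the choice of MD5 modulo the server count are all assumptions for illustration, not how any particular MemcacheDB client actually does it.

```python
import hashlib

# Hypothetical server list; the names are illustrative only.
SERVERS = ["cache-a", "cache-b", "cache-c"]

def server_for(key: str, servers=SERVERS) -> str:
    """Pick the single server that owns `key`, based only on the key's hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Assumed scheme: first 8 bytes of the digest, modulo the number of servers.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

if __name__ == "__main__":
    for key in ["user:1", "user:2", "user:3"]:
        print(key, "->", server_for(key))
```

Each key maps to exactly one server, and nothing about the key's name beyond its hash influences where it lands.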

Google's BigTable is a single-master, column-based storage system with redundancy. That's a huge difference. In fact, there's very little that is similar.

While Amazon hasn't published as much about SimpleDB, it's most definitely not a giant hash table. It's likely either a column store like BigTable/HBase or a document store like CouchDB.

They're completely different tools. MemcacheDB isn't anything like the other two, and for scalability purposes it's important to understand why, so that you can choose the correct tool for the job.

If you're interested in a tool like the App Engine Datastore or SimpleDB, there's HBase, CouchDB, and HyperTable which will all fit the bill.




You are right. I was under the impression that BigTable is a distributed hash table (like MemcacheDB), but it's not.

I revisited the Google I/O talk on the App Engine Datastore, where they say so explicitly [1]. They call it a sharded sorted array.

Here "sorted" is the key difference: it means you can do efficient prefix and range scans of contiguous areas in one sweep without extra disk seeks.

This wouldn't be the case for MemcacheDB even if you created some clever key naming scheme, as locality there is defined by their hash function.
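
A rough sketch of the difference, assuming an in-memory sorted list stands in for one shard of the sorted array. The key names and the `prefix_scan` and `hash_bucket` helpers are hypothetical, and the MD5-modulo bucket choice is just one possible hash placement.

```python
import bisect
import hashlib

# Hypothetical keys; a real store would hold (key, value) rows.
KEYS = sorted(["post:1", "user:1", "user:2", "user:3", "video:9"])

def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with `prefix` as one contiguous slice."""
    # Binary search finds where the prefix range begins.
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = lo
    # Matching keys sit next to each other, so one forward sweep covers them.
    while hi < len(sorted_keys) and sorted_keys[hi].startswith(prefix):
        hi += 1
    return sorted_keys[lo:hi]

def hash_bucket(key, buckets=4):
    """Hash placement: the bucket depends on the hash, not on key order."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets

if __name__ == "__main__":
    print(prefix_scan(KEYS, "user:"))
    print({k: hash_bucket(k) for k in KEYS})
```

With the sorted layout, a prefix query is one seek plus a sequential read. With hash placement, finding every "user:" key means asking every bucket, whatever the naming scheme.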

That is, of course, in addition to the many other features that the GAE Datastore has and MemcacheDB lacks.

I mentioned the similarity because of how fundamental the "key->value" aspect is to both MemcacheDB and the GAE Datastore, as opposed to traditional relational databases accessed via SQL.

[1] http://sites.google.com/site/io/under-the-covers-of-the-goog...



