ZooKeeper: Wait-free coordination for Internet-scale systems (muratbuffalo.blogspot.com)
100 points by mad44 on Sept 29, 2014 | 9 comments



Obligatory shout-out to Apache Curator: http://curator.apache.org/

Curator implements a bunch of algorithms often implemented on top of ZooKeeper. I like to think of ZooKeeper as the distributed systems equivalent of peer-reviewed implementations of cryptographic primitives. Curator is like a whole cryptographic protocol / cryptosystem. In both cases: don't implement your own!


That is a fantastic analogy (and a great library). In my experience ZooKeeper works well but takes ongoing maintenance effort (log cleaning, etc.). Also, it acts as a canary in the coal mine for other networking problems.

I've also used it in a WAN setup for low-throughput, transactional data for which I needed solid exactly-once semantics. Docs for WAN settings were non-existent, but eventually it worked as intended, thanks to the help of the community.


I use ZooKeeper in production for snitch.io.

There are some interesting new alternatives such as etcd and Serf/Consul - but at the time ZooKeeper had the best track record (under Jepsen analysis). Things might have changed since then.

Aphyr has done a bunch of analysis of these systems as part of his Jepsen tool: http://aphyr.com/tags/jepsen and http://aphyr.com/posts/291-call-me-maybe-zookeepe

If you are going to use ZooKeeper, I strongly suggest looking at both Apache Curator and Netflix Exhibitor (they are complementary).

The examples bundled with ZK don't handle all errors/edge cases...

Curator is a library of common patterns available to use mostly out of the box.

Exhibitor is a ZooKeeper "aware" supervisor system: https://github.com/Netflix/exhibitor

Also, always remember that your ensemble should have an odd number of nodes (3, 5, 7).
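
To spell out the arithmetic behind the odd-number advice, here is a minimal sketch. It assumes the usual majority-quorum rule (more than half of the ensemble must be up); the function names are made up for illustration and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare odd and even ensemble sizes side by side.
    for n in range(3, 8):
        print(f"{n} nodes: quorum {quorum_size(n)}, "
              f"survives {tolerated_failures(n)} failure(s)")
```

Under that assumption, going from 3 to 4 nodes (or 5 to 6) raises the quorum size without letting you lose any more machines, so the even node buys no extra fault tolerance.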


If you enjoyed this, I highly recommend Mikito Takada's "Distributed systems for fun and profit" http://book.mixu.net/distsys

The "Partition-tolerant consensus algorithms: Paxos, Raft, ZAB" section is relevant, along with the "Further Reading" which follows it.



I've used ZooKeeper not only as a service registry but also as a fairly small message queue. I wanted to be sure that each message would be delivered at least once, and thanks to the LockingQueue recipe in Kazoo (a Python ZK library) I got what I wanted really easily, with all the benefits of ZK's clustered nature.
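
For anyone curious what that looks like, here is a rough sketch of the consumer side, not the code I actually ran. It assumes Kazoo is installed and an ensemble is reachable at the placeholder address; `open_queue`, `process_one`, the host string, and the queue path are illustrative names. The idea is that `get()` locks an item, and the item is only removed by `consume()` after the handler succeeds, so a crash in between leaves it for another consumer (hence at-least-once, with possible duplicates).

```python
def open_queue(hosts="127.0.0.1:2181", path="/queues/demo"):
    """Connect to ZooKeeper and return (client, queue).

    Assumption: host and path are placeholders. Kazoo is imported here so
    the rest of the sketch can be read and tested without it installed.
    """
    from kazoo.client import KazooClient
    from kazoo.recipe.queue import LockingQueue

    client = KazooClient(hosts=hosts)
    client.start()
    return client, LockingQueue(client, path)


def process_one(queue, handler, timeout=5):
    """Take one item, run handler on it, and remove it only on success.

    Returns True if an item was handled, False if none arrived in time.
    """
    item = queue.get(timeout=timeout)  # locks the item for this consumer
    if item is None:
        return False
    try:
        handler(item)
    except Exception:
        queue.release()  # give the item back so it can be retried
        raise
    queue.consume()  # delete the item now that it has been handled
    return True
```

A producer would call `queue.put(b"payload")` on the same path; values are bytes. Handlers should be idempotent, since an item can be seen more than once.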


The article doesn't give any background: ZK is a sort of Chubby, Google's distributed lock manager. Locks are small files.


Has anyone had real-world experience comparing ZooKeeper with Consul? Is Consul considered production-ready yet?


I can't speak for ZooKeeper, but I've been using Consul in production since 1.0. I've never had an issue with it - it works great for me! I've written a Ruby gem to interface with its API called Diplomat (here: https://github.com/WeAreFarmGeek/diplomat) and a whole bunch of Ansible scripts to set up checks for Postgres, nginx, etc.



