Etcd – The Road to 1.0 (coreos.com)
136 points by bmizerany on April 14, 2014 | 26 comments



Is there any benefit to using CoreOS when you don't need a million machines? How much work is it to get started if, for example, you have no idea what your scaling needs will be in the future?


No work at all; basic CoreOS is just Docker managed by systemd. Can your application be docker-ized? Were you planning on using process management? If you answered yes to both of these questions... :)

Further, CoreOS nudges you toward writing applications in a 12-factor-y way, so when the time does come for you to use a million machines, you won't need to make huge adjustments to your deployments (just plug the container and init script into fleet and let it roll).
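For instance, a fleet unit is just a systemd unit that wraps your container; something roughly like this (service and image names are made up, and the exact options depend on your fleet version):

  [Unit]
  Description=My app in a container
  After=docker.service
  Requires=docker.service

  [Service]
  ExecStart=/usr/bin/docker run --name myapp -p 80:80 example/myapp
  ExecStop=/usr/bin/docker stop myapp

Then fleetctl start myapp.service schedules it onto some machine in the cluster.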


The main benefit is that it's an easy-to-update, minimal OS, with Docker already set up for you and updated regularly.

Our cloud-config implementation is designed so that you can use the same config to set up a new machine to match an existing one, and have it automatically join the cluster. Start with 3 machines and scale up as you need.
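For example, a minimal config looks something like this (the discovery token is a placeholder; generate a fresh one at https://discovery.etcd.io/new):

  #cloud-config

  coreos:
    etcd:
      discovery: https://discovery.etcd.io/<token>
      addr: $private_ipv4:4001
      peer-addr: $private_ipv4:7001
    units:
      - name: etcd.service
        command: start
      - name: fleet.service
        command: start

Boot three machines with the same file and they find each other through the discovery URL.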

Learning systemd is one bump you'll have to get over, but all of the major distros will be using it, so there's no better time to try it out.


It's not a lot of work to get started; here's a short and simple tutorial that goes from nothing to a fully running app with a database on CoreOS: http://www.centurylinklabs.com/building-your-first-app-on-co...


I'm interested in it because I plan to use docker to ease application management, and with CoreOS I barely have to maintain the operating system because it's so lightweight. So it greatly simplifies things :)

If you think docker is a good fit for your application, this is kind of the next step.


The only significant benefit over traditional operating systems/network services is that it's designed to work on flaky hardware and networks. The only use case I've found for decentralized, distributed networks of application services (other than obscure stuff like parallel processing of large datasets) is when you have no guarantees of availability. As for "scaling", you don't need CoreOS to build a scalable network (and I've seen no performance benchmarks of CoreOS running on thousands of machines at a time, so I have no idea how well it scales).


"Recoverable system upgrades" - https://coreos.com/blog/recoverable-system-upgrades/

Run from root partition A and apply updates only to partition B, essentially via chroot. The downtime required for an update is the cost of a reboot. If an error occurs during boot, the system reboots into the other partition, which holds your last known good config.

Seems simple and straightforward, but it's difficult to pull off with unmodified Debian-based distributions, so they've addressed it as a key feature of CoreOS.


I know that there are lots of heavyweight folks who swear by Zookeeper as both a reliable and powerful tool, and for good reason. Unfortunately the docs can be fairly inscrutable, even for experts, and it typically requires the maintenance of a separate cluster of Zookeeper nodes.

So I like that etcd is a fundamental component of CoreOS, with these features (a quick sketch of the HTTP interface follows the list):

  1. Written from scratch in Go
  2. Implements the Raft protocol for node coordination
  3. Has a useful command line app
  4. Runs on every node in a cluster
  5. Enables auto-discovery
  6. Allows CoreOS nodes to share state
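
To give a flavor of 3-6: every peer speaks the same plain HTTP API, so you can talk to etcd from any node. A minimal sketch in Go (the key name and local address are just illustrative):

  package main

  import (
      "fmt"
      "io/ioutil"
      "net/http"
      "net/url"
      "strings"
  )

  func main() {
      // 4001 is etcd's default client port.
      base := "http://127.0.0.1:4001/v2/keys/message"

      // Set a key: PUT a form-encoded "value".
      req, err := http.NewRequest("PUT", base,
          strings.NewReader(url.Values{"value": {"hello"}}.Encode()))
      if err != nil {
          panic(err)
      }
      req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
      if _, err := http.DefaultClient.Do(req); err != nil {
          panic(err)
      }

      // Read it back: the response is a JSON document describing the
      // node, e.g. {"action":"get","node":{"key":"/message","value":"hello",...}}
      resp, err := http.Get(base)
      if err != nil {
          panic(err)
      }
      defer resp.Body.Close()
      body, _ := ioutil.ReadAll(resp.Body)
      fmt.Println(string(body))
  }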


No offense, I just don't get it: Why is 1) a feature for you? Everything else on the list kinda makes sense (I understand that this describes something I'd call a feature), but 'written from scratch' or 'in Go'?

Can you explain what excites you about that?


I listed what I like. "Excites" is your word.

I guess what I meant by "from scratch" is that they aren't burdened by legacy code, and aren't limited to using the Paxos algorithm.

If you look at Deis, for example, it basically outsources a lot of node management to Chef Server, which in my view creates a great deal of technical debt on day one.


You read negative connotations into 'excites'. That wasn't intended.

I was just curious, since 'from scratch' can just as well mean 'untested', although I certainly agree that it sometimes is the Right Thing. The reference to Go was another thing that threw me off, since I rarely (admittedly... sometimes) judge software projects by the language they're written in.

Thank you for the answer and some more references.


> and aren't limited to using the Paxos algorithm

ZK uses the ZAB protocol, which is similar but not the same as Paxos.

https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab+vs...


Probably the fact that you get a single binary (unlike with interpreted languages), and that it isn't a mix of Go and C, meaning I don't have to worry about C libraries.


Just to note, zookeeper is written in Java.


Yes, I'm aware. I was just answering the question, not comparing it to anything else.

Still, Java requires another dependency: the JVM. Go binaries require... well, nothing.


Personally, my preference for ZooKeeper comes from the API. To me, the ZK API and docs were far more understandable than the etcd ones. The etcd API docs appear to be a collection of examples, not reference docs. They do a poor job of explaining the possible operations and what the various options do, particularly which combinations of options are allowed.

In fact, I had to resort to running test queries against a running etcd server just to work out the proper semantics of some of the arguments.
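
For example, here's what I pieced together about conditional updates: prevValue (or prevIndex/prevExist) turns a PUT into a compare-and-swap, and a failed condition comes back as HTTP 412 Precondition Failed. A rough Go sketch of my understanding (names are illustrative):

  package main

  import (
      "fmt"
      "net/http"
      "net/url"
      "strings"
  )

  // casPut sets key to newVal only if its current value is oldVal,
  // using the v2 API's prevValue parameter. In my tests etcd answers
  // 200 on success and 412 when the condition doesn't hold.
  func casPut(key, oldVal, newVal string) (bool, error) {
      u := "http://127.0.0.1:4001/v2/keys/" + key +
          "?prevValue=" + url.QueryEscape(oldVal)
      req, err := http.NewRequest("PUT", u,
          strings.NewReader(url.Values{"value": {newVal}}.Encode()))
      if err != nil {
          return false, err
      }
      req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
      resp, err := http.DefaultClient.Do(req)
      if err != nil {
          return false, err
      }
      resp.Body.Close()
      return resp.StatusCode == http.StatusOK, nil
  }

  func main() {
      ok, err := casPut("message", "hello", "world")
      if err != nil {
          panic(err)
      }
      fmt.Println("swap succeeded:", ok)
  }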


> 4. Runs on every node in a cluster

Is that so? When I looked at it, I distinctly remember it advising running a set of 3-9 etcd nodes (not necessarily separate from other things).


The functionality to handle this is mentioned in the blog post: "standby" peer mode.

"Our upcoming release, etcd 0.4, adds a new feature, standby mode. It is an important first step in allowing etcd clusters to grow beyond just the peers that participate in consensus."


Fair enough, it just sounded like the GP was describing something inherent rather than something new, and it didn't mesh with my understanding of how etcd worked to date.


I don't know that that's necessarily a good idea.

As you (perhaps automatically) expand and collapse the cluster, you'll need to make sure to communicate to all nodes what the new cluster size is. If some nodes don't know the correct quorum count, split-brain!
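
The arithmetic behind that is simple but unforgiving: a write commits only once a strict majority of the known cluster size acknowledges it, so nodes that disagree about the size compute different majorities. A quick sketch:

  package main

  import "fmt"

  // quorum is the minimum number of members that must agree before a
  // write commits: a strict majority of the known cluster size.
  func quorum(n int) int { return n/2 + 1 }

  func main() {
      for _, n := range []int{3, 5, 9} {
          fmt.Printf("cluster=%d quorum=%d tolerates %d failures\n",
              n, quorum(n), n-quorum(n))
      }
      // If some nodes believe the cluster is 3 (quorum 2) while others
      // believe it is 5 (quorum 3), two disjoint groups can each reach
      // "quorum" and accept conflicting writes -- the split-brain above.
  }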

Also, coordination services are typically critical, so it's important to isolate them from the bugs in the ad-hoc code you're writing for your web tier, a crazy query in your database, etc.

It's much easier and safer in practice to just have 3 or 5 nodes running the coordination in isolation.

Edit: more reasons -- It's easier to deploy a coordination service to 5 nodes than 500. It's easier to debug 5 nodes than 500.


I probably should have said that it "can" run on any node. Yes, currently it does run on every node, but their roadmap doesn't have the requirement that every node be actively participating in elections.

I'm sure that you have seen fleets of dedicated Zookeeper nodes. I rather like that etcd is simply a service that can run on any node, and does not require a separate role-specific fleet of servers just to do coordination. That was the point I was attempting to make.


What's the community's opinion on Serf? How does it measure up against etcd?


While similar, Serf and etcd solve different problems. etcd is strongly consistent (all nodes will see the same data; however, a partition may cause the system to stop accepting writes), while Serf is eventually consistent (nodes are not guaranteed to see the same data; however, the system will always accept writes).

So something like a distributed lock is impossible to implement with Serf.
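
To make that concrete, a lock on a strongly consistent store is just an atomic create. A rough sketch against etcd's v2 HTTP API in Go (key name and TTL are arbitrary; the status codes are my understanding of the API):

  package main

  import (
      "fmt"
      "net/http"
      "net/url"
      "strings"
  )

  // tryLock atomically creates the lock key. prevExist=false means the
  // PUT succeeds only if the key doesn't exist yet (201 Created); if
  // someone already holds it, etcd refuses with 412. The ttl expires
  // the lock in case the holder dies without releasing it.
  func tryLock(name, holder string, ttlSecs int) (bool, error) {
      u := "http://127.0.0.1:4001/v2/keys/" + name + "?prevExist=false"
      form := url.Values{"value": {holder}, "ttl": {fmt.Sprint(ttlSecs)}}
      req, err := http.NewRequest("PUT", u, strings.NewReader(form.Encode()))
      if err != nil {
          return false, err
      }
      req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
      resp, err := http.DefaultClient.Do(req)
      if err != nil {
          return false, err
      }
      resp.Body.Close()
      return resp.StatusCode == http.StatusCreated, nil
  }

  func main() {
      got, err := tryLock("my-lock", "worker-1", 30)
      if err != nil {
          panic(err)
      }
      fmt.Println("acquired:", got)
  }

Against an eventually consistent store, two nodes could both "succeed" at that create during a partition, which is exactly the problem.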


Very interesting, I wasn't aware that etcd is strongly consistent. This makes a lot of sense in terms of Brewer's theorem (CAP). Thanks for clarifying!


What is etcd?

Don't write about something without defining what it is for those not familiar with it (or you, or your service(s)).


Config data accessible via HTTP instead of files.

Although it's not necessarily limited to config data.

I suggest reading http://sysadvent.blogspot.com/2013/12/day-20-distributed-con...



