Service discovery with Docker is still a pain point. Serf [1] and etcd [2] are tools that manage a cluster of services and help solve the problem described in the article.
It actually seems like Consul might still be an easier way to go than this. Dnsmasq is certainly easy, but Consul isn't exactly challenging either, and it's happy to run in its own container when local dev is key.
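For local dev, something like this is enough (a sketch assuming the official consul image; 8500 is the HTTP API, 8600 the DNS interface):

    # throwaway single-node Consul agent in dev mode
    docker run -d --name consul-dev \
      -p 8500:8500 -p 8600:8600/udp \
      consul agent -dev -client=0.0.0.0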
Why have a different methodology for local dev and deploy when you could have the SAME methodology for local and deploy with almost no extra cost?
I'm using that approach for Reesd. Instead of a single dnsmasq instance on the host, I have one instance (actually a Docker container) for each logical group of containers. Containers can reference db.storage.local and different groups will actually talk to their own database. This is pretty cool for integration tests, or for spinning up a new version of the group before promoting it into production.
Before that, I was using SkyDNS, but I wasn't happy about having to maintain/reset the TTL.
Side note: when you use --dns on a container and commit the result to an image, that image will have the --dns value baked into its /etc/resolv.conf (meaning you still have to run a DNS server at that address, or pass --dns again to supply a new one).
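For example (the resolver address and image names here are just placeholders):

    # run with a custom resolver; on the Docker version I'm on this ends up
    # in the container's /etc/resolv.conf
    docker run --dns 172.17.42.1 --name base ubuntu true
    docker commit base myapp-base
    # the committed image keeps 172.17.42.1, so either keep a DNS server
    # listening there or override it again at run time:
    docker run --rm --dns 10.0.0.53 myapp-base cat /etc/resolv.conf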
The idea is that I have a few containers that need to know about each other. Say an application server `app` and a PostgreSQL database `db`. While conceptually I only need to spawn `app` and `db`, I have a script that actually spawns `ns` (i.e. dnsmasq), `app`, and `db`. That script registers the IPs of `app` and `db` with dnsmasq (as you describe in the post).
This means that I can run that script more than once on my laptop and have multiple triples (`app`, `db`, `ns`) side by side without fear of crosstalk between logical groups.
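Roughly, the script looks something like this (image names and the `.local` suffix are placeholders, not my exact setup):

    #!/bin/sh
    # Spawn one logical group: a dnsmasq "ns" container plus app and db.
    # GROUP lets several groups live side by side on the same laptop.
    GROUP=${1:-group1}

    # hosts file that this group's dnsmasq will serve
    HOSTSDIR=/var/lib/docker-groups/$GROUP
    mkdir -p "$HOSTSDIR" && : > "$HOSTSDIR/hosts"

    # assumes an image whose entrypoint is dnsmasq, kept in the foreground
    docker run -d --name "$GROUP-ns" -v "$HOSTSDIR":/dnsmasq \
      my-dnsmasq-image -k --no-hosts --addn-hosts=/dnsmasq/hosts

    NS_IP=$(docker inspect -f '{{ .NetworkSettings.IPAddress }}' "$GROUP-ns")

    # app and db resolve names through this group's dnsmasq only
    docker run -d --name "$GROUP-db"  --dns "$NS_IP" postgres
    docker run -d --name "$GROUP-app" --dns "$NS_IP" my-app-image

    # register the group's containers under stable names
    for c in db app; do
      IP=$(docker inspect -f '{{ .NetworkSettings.IPAddress }}' "$GROUP-$c")
      echo "$IP $c.local" >> "$HOSTSDIR/hosts"
    done
    docker kill -s HUP "$GROUP-ns"   # dnsmasq re-reads addn-hosts on SIGHUP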
I'd love to hear about other people's workflows when it comes to addressing that issue - are you simply re-provisioning the whole shebang if a critical component has to be updated or changed? Or relying on etcd/zookeeper as mentioned? Or ...
We use Consul. Each instance has its own BIND setup with DNS forwarding: normal traffic is forwarded on through the regular DNS servers, and anything in .consul goes to the Consul client running locally.
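The relevant bit of the BIND config is roughly this (assuming the local Consul agent's DNS interface is on its default 127.0.0.1:8600):

    // hand anything under .consul to the local Consul agent;
    // everything else goes out through the normal forwarders
    zone "consul" {
        type forward;
        forward only;
        forwarders { 127.0.0.1 port 8600; };
    };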
Just wrote something very similar, but targeted more at development sandboxes. The host runs a script that uses rubydns. A zone cut delegates dev.domain.com to the host. rubydns then uses docker-api (with a small redis cache) to respond to DNS requests for *.dev.domain.com. Developers can now reference their sandboxes as foo.dev.domain.com from within the host, or anywhere else, including their own machines. More to it than this of course, but that's the DNS portion.
One shortcoming of the OP's approach is the out-of-band process for updating dnsmasq's hosts file. If you want to use something like fig to start a cluster of containers, you still have to wrap a script around the fig command to update the hosts file. I wonder how hard it would be to modify SkyDock to update dnsmasq (update the hosts file and then kill -HUP dnsmasq).
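Something like this wrapper is what I mean (the hosts-file path and name suffix are made up; point it at whatever file dnsmasq's addn-hosts option reads):

    #!/bin/sh
    # start the cluster, then refresh dnsmasq's extra hosts file out of band
    set -e
    fig up -d

    HOSTS=/etc/docker-hosts
    : > "$HOSTS"
    for id in $(docker ps -q); do
      ip=$(docker inspect -f '{{ .NetworkSettings.IPAddress }}' "$id")
      name=$(docker inspect -f '{{ .Name }}' "$id" | sed 's|^/||')
      echo "$ip $name.docker.local" >> "$HOSTS"
    done
    pkill -HUP dnsmasq   # dnsmasq reloads addn-hosts files on SIGHUP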
OP here, the post is only a simple trick that may be of some use to some people.
Eventually we merged that logic into an in-house app we built back in the days when container orchestration tools were still "rare"; if I recall, https://github.com/toscanini/maestro had just launched but wasn't really usable yet...
Does anyone else just bind Docker containers to specific ports and/or groups of ports on the same interface, thereby not needing to care about what IP address each container thinks it has?
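i.e. something along these lines, so clients only ever need the host's address and a well-known port (images and ports are just examples):

    docker run -d --name db  -p 5432:5432 postgres
    docker run -d --name app -p 8080:8080 my-app-image
    # or pin a container's port to one specific host interface:
    docker run -d --name admin -p 10.0.0.5:9000:9000 my-admin-image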
That only makes sense in small deployments where you aren't multihomed.
What's frustrating is that I don't think Docker currently has an elegant way to make you not care whether the remote service is local or not. What'd be interesting is if there were a special tun interface for that kind of communication, and you could bind that to local containers or just dump the traffic back onto your LAN, NAT'd to a remote host.
Eh, I think it makes sense for any service/app setup where every server with docker containers has an identical quantity of services running on it, when the server is healthy.
At that point, you aren't really doing anything different than you were before with managing servers & load balancing. I don't care if I have 1 app server or 10,000 app servers as long as they are functionally identical and interchangeable. One of them loses a container? Restart the container. That fails? Kill the instance and rebuild.
It seems rare to me that there are systems in normal app deployment where there is NO remote shared state between app (and therefore physical) nodes. Given that, your production deploy will almost always have at least a few datacenter-local remote resources.
Besides compute and indexing farms, what sort of deploys demonstrate that pattern?
I don't use Docker for anything that stores state.
I store it in a separate pool of servers that are solely for database nodes. At which point, I'm back to "Service Discovery does not require a Docker-aware design."
I think the issue here is I'm used to different assumptions than other people. I virtualize application/service containers. I don't virtualize the datastores but run those on bare metal whenever possible.
So it goes LB -> pool of docker containers -> Database Cluster(s). The database cluster(s) are known in advance and are restored to the same DNS location when they are revived/moved/transferred. [e.g. I have a cluster of 5 DB servers, they are always something like galera1-servicegroup through galera5-servicegroup, or whatever]
I'm perfectly fine with the DNS/DHCP approach, just wondering what the alternatives out there are. That dnsmasq approach worked like a charm for several of our projects for over a year.
I always wonder about the extra round-trip on DNS requests when we could have a hardcoded value in the host. We're talking local network, so the latency isn't much of an issue, but still. Then there's the possibility of a dnsmasq restart while a request occurs; caching a la nscd could work, but then we're in for a lot more trouble when it comes to expiring that cache!
[1] Serf is by the guys behind Vagrant. - http://www.serfdom.io/
[2] etcd - http://coreos.com/using-coreos/etcd/