A few details on the "standard linux support" part.
To remove the hard dependency on the AUFS patches, we moved it to an optional storage driver, and shipped a second driver which uses thin LVM snapshots (via libdevmapper) for copy-on-write. The big advantage of devicemapper/lvm, of course, is that it's part of the mainline kernel.
If your system supports AUFS, Docker will continue to use the AUFS driver. Otherwise it will pick lvm. Either way, the image format is preserved and all images on the docker index (http://index.docker.io) or any instance of the open-source registry will continue to work on all drivers.
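If you're curious which driver a particular host ended up with, docker info should tell you (I believe it reports the storage driver as of 0.7):

    # check which storage driver the daemon picked on this machine
    docker info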
No, docker sets up its own partition in a loop-mounted sparse file. There is a slight performance tradeoff, but in production you should assume the copy-on-write layer is too slow anyway and use volumes for your performance-critical directories: http://docs.docker.io/en/master/use/working_with_volumes/
In the future the drivers should support using actual lvm-enabled devices if you have them.
Any drawbacks to using lvm compared to AUFS? I'd like to switch to Debian from Ubuntu - will I notice any performance differences or other restrictions?
There is a slight performance overhead with lvm, but nothing dramatic. The other advantage of the AUFS driver is that it is more proven. If you have a way to get aufs on your system (and I believe debian does), my pragmatic ops advice would be to use that and give the other drivers some time to get hardened. Of course my advice as a maintainer is that all of them are equally awesome :)
how would you compare aufs/lvm performance vs with the upcoming btrfs support?
That is the one that I presume will have the most long-term continuity, because btrfs is becoming the default sooner rather than later (isn't it already for opensuse/fedora?)
Not yet, but that's coming very soon. We've been artificially limiting the number of architectures supported to limit the headache of managing cross-arch container images. We're reaching the point where that will no longer be a problem - meanwhile the Docker on Raspberry Pi community is growing restless and we want to make them happy :)
The boundary between "kernel" and "libraries like libc" is very stable and doesn't change often. That means that often, the kernel distributed by Arch can work reasonably well in an Ubuntu system, and vice versa.
With that in mind: The "ubuntu" image ships the "ubuntu-glibc" and "ubuntu-bash" and "ubuntu-coreutils" and so on, but they continue to work on your Arch host because the system calls don't ever change.
You can't link (say) ubuntu-glibc into arch-bash though, which is why containers are built off of a "base ubuntu image" in the first place.
Containers come with their libraries though; you don't have to "add" anything. You'd just apt-get it within the container and it would pull down its dependencies.
This, by the way, is the reason we artificially prevent docker from running on multiple archs. If half of the containers you download end up not running on your arch, and there's no elegant way for you to manage that by filtering your results, and producing the same build for multiple archs (and what does it even mean to do that?) - then all of a sudden using docker would become much more frustrating.
This sounds like a clothing manufacturer who only makes clothes in black because trying to match colors would be too frustrating for customers. At the end of the day, lots of people have multiarch networks, and don't want to have to choose between using a tool or supporting just one platform. Removing functionality does not make their lives easier.
Could someone explain the logistics of Docker in a distributed app development scenario? I feel like I am on the outskirts of understanding.
My goal is having a team of developers use Docker to have their local development environments match the production environment. The production environment should use the same Docker magic to define its environment.
Is the idea that developers define their Docker environment in the Dockerfile, and then on app deployment, the production environment builds its world from the same Dockerfile? How does docker push/pull of images factor into that, if at all?
Or is the idea that developers push a container, which contains the app code, up to production?
What happens when a developer makes changes to his/her environment from the shell rather than scripted in the Dockerfile?
What about dealing with differences in configuration between production and dev? (Eg. developers need a PostgreSQL server to develop, but on production, the Postgres host is separate from the app server - ideally running PG in a Docker container, but the point being multiple apps share a PG server rather than each running their own individual PG instance). Is the idea that in local dev, the app server and PG are in two separate Docker containers, and then in deployment, that separation allows for the segmentation of app server and PG instance?
I see the puzzle pieces but I am not quite fitting them together into a cohesive understanding. Or possibly I am misunderstanding entirely.
> Could someone explain the logistics of Docker in a distributed app development scenario?
Docker lets you build an environment (read: put together a bunch of files) for you to run an app in. It also has other features, like reducing space if lots of your apps [on the same host] use the same docker images, and networking stuff.
> My goal is having a team of developers use Docker to have their local development environments match the production environment
You run a container. You use images to distribute files. A Dockerfile is a loose set of instructions to build the images.
The basic idea is that, any single program that you want to be able to run anywhere, you make into a container. You can run it here, you can run it there, you can run it anywhere. The whole "running it anywhere" concept comes from the idea that all of your containers are created based on the same images, so no matter what kind of crazy mix of machines you have, your containers will just work - because you're literally shipping them a micro linux distribution in which to run your application. And since all the applications are running in isolated little identical containers, you can run as many of them as you want, independent of each other, in whatever configuration you want.
You'll have a PSQL container and an App container, and you'll manage them separately, even if they share images - the changes they make get saved off to a temp folder so they don't impact each other. Your environment stays the same only as long as the containers and images you're using are the same.
There will always be differences between development and production. You have to focus on managing those differences so you have confidence that what you're shipping to production actually works. The only thing Docker really does there is make sure the files are basically the same.
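As a rough sketch of that layout (image names, ports and the link alias below are all made up, and the app is assumed to read its database location from the link or an environment variable):

    # run the database in its own container
    docker run -d -name pg my/postgres-image
    # run the app in a second container, told where to find the database via a link
    docker run -d -p 8080:8080 -link pg:db my/app-image
    # in production, skip the local pg container and point the app at the shared server instead
    docker run -d -p 8080:8080 -e DB_HOST=pg.internal.example.com my/app-image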
Since no one has answered you, I'll take a stab. I haven't actually used Docker yet, but I've been lurking extensively. Hopefully someone will correct me if I get this wrong.
A developer will use a Dockerfile to build the main dependencies that you need, and then save an image. Then to deploy code (or other frequently changing data), you pull that base image, add or update whatever else is needed, and save it again. (You can add layers to an image as often as desired.)
Then this saved image is pushed into a production environment. It seems like configuration data is often saved in images, so they require zero configuration. However, you can pass as many command line parameters as desired to run the image in production, so you can keep your configuration separated (like http://12factor.net/config) if you want.
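In shell terms the loop looks roughly like this (repository name and config values are placeholders):

    # developer: build the image from the Dockerfile, then push it to a registry
    docker build -t myteam/myapp .
    docker push myteam/myapp
    # production: pull the exact same image and run it, injecting config at run time
    docker pull myteam/myapp
    docker run -d -p 80:8080 -e DATABASE_URL=postgres://db.example.com/myapp myteam/myapp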
I looked at it several times but never really got it. Can I use Docker to isolate different servers (think http, xmpp, another http on another port) on a server so that if one of them was exploited, the attacker would be constrained to inside the container? Or is it "just" a convenient way to put applications into self-contained packages?
It's both. Everyone using docker benefits from the "software distribution" feature. Some people using docker also benefit from the security and isolation features - it depends on your needs and the security profile of your application. Because the underlying namespacing features of the kernel are still young, it's recommended to avoid running untrusted code as root inside a container on a shared machine. If you drop privileges inside the container, use an additional layer of security like SELinux, grsec, apparmor etc. then it is absolutely possible and viable in production.
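For example, a minimal sketch of the drop-privileges part (the user and image names are made up; the user has to exist in the image already, e.g. via a RUN useradd in the Dockerfile):

    # start the service as an unprivileged user instead of root
    docker run -d -u app my/image /usr/local/bin/start-service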
It's only a matter of time before linux namespaces get more scrutiny and people grow more comfortable with them as a first-class security mechanism. It took a while for OpenVZ to get there, but now there are hosting providers using it for VPS offerings.
On top of namespacing, docker manages other aspects of the isolation including firewalling between containers and the outside world (currently optional but soon to be mandatory), whitelisting of traffic between certain containers, etc.
To my knowledge there are currently no known exploits. It's more a matter of risk management: newer codebases are less secure because we had less time to find bugs and spread best practices. That problem is amplified for larger codebases (but diminished by a more active developer community).
Yes, it is trivial for a root user in an LXC container to break out. One can load a kernel module from within a container, for example. LXC containers do not provide security partitioning at all.
This is just totally wrong. Any decent container configuration (including the default docker configuration) will aggressively drop capabilities, preventing you from doing this or any other script-kiddie attack.
See my other comment in this thread for a more accurate answer.
Yes, you probably need a proper kernel vulnerability, one you can exploit from a reduced environment. Not trivial, but not impossible - scanning this year's CVEs, some would probably be sufficient (e.g. ones that only need socket access).
That's the idea. Docker is based on LXC which provides configurable isolation between process groups. So, you could run your http in one container, your database in another, and your application server in a 3rd and have reasonable confidence that they can only talk to each other the way you want them to.
I have heard two opinions. One of them is paranoid: "if you give up root on the container you give up root on the host system." I am not sure I believe this (I can't show you how to do the compromise and escalation), but I know there is a -privileged mode you can use on your containers in which the root user can do things like create device nodes, reset the system clock, and presumably other things you should not allow if you are concerned about this kind of attack.
The other opinion (given that you don't enable -privileged mode) is that lxc namespaces _do_ protect your host and other containers from being exploited due to compromise of a contained process, and cgroups _do_ protect your precious i/o and cpu cycles from being fully consumed and deadlocking your whole system when a rogue container decides to start infinite looping.
The competent answer of course is "try it": you should know that there are ways your processes can be compromised, and you should check whether the exploits you know about result in the same or different privilege escalations when you apply them to a service hosted within a docker instance. That being said, you can't really prove a negative...
Yes, Docker should be able to help you. It is definitely not just for large companies.
Docker can help you streamline your development and deployment process across all the machines, from the development laptop to the production server. The result should be that your development environment is easier to setup and replicate, more similar to production, and easier to customize as you evaluate new tools, hosting providers and software components.
I just went through the Getting Started tutorial. Thanks for a great intro to Docker. Is this tutorial open-sourced too? What would really help is to include a few lines on how the commands can be used in the real world.
You will be able to have multiple Ubuntu servers running on one Ubuntu VPS on Digital Ocean.
Each of them may have some initial state (some installed software, configuration) that you will be able to spawn as a separate virtual machine. The "inside" machine will consume a minimal amount of memory (just for the running applications), not for a whole OS as in the case of a "real" virtual machine. Also, you will be able to spawn and destroy new VMs very fast (under 1s).
What does the VM abstraction buy you vs processes? If one entity "owns" all the servers, what is the draw in separating the servers out into their own VMs?
You don't have to think of them as VMs, that's just a simplification for the sake of the conversation. They're resource-isolated process groups. It's exactly the same idea as the original Unix-paradigm security approach of making each service run under a separate system-uid, but with a much more complete isolation.
Well, Ubuntu can do that without Docker; just with LXC. Docker uses LXC and does other things with it, but if all you want to do is run multiple lightweight Ubuntu VMs, then just use LXC.
Is it an easier-to-use chroot? Because I'd just use chroot if I wanted that, or maybe OpenVZ. Or just spin up a virtual node dedicated to the task I need it to perform.
It is definitely designed to be easier in several ways. One way is that it defines a format for portable container images, and an infrastructure for moving images around machines and sharing them with others. See http://www.docker.io/gettingstarted/ for an online tutorial and https://index.docker.io for the index of all community images.
I am a newbie at this, but from what I'm reading you need to create a bridge called lxbr0, assign some IP on it and docker should be able to figure out IPs to use for the containers.
FWICT, storage limits are still not directly supported.
You can however create a user per container and limit that user's quota.
This is the relevant user comment (but the full discussion is insightful): https://github.com/dotcloud/docker/issues/471#issuecomment-2...
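From what I can tell it boils down to something like this (user name, limits and image are placeholders; it relies on quotas being enabled on the filesystem that holds /var/lib/docker, and on the fact that uids inside the container map straight onto host uids):

    # dedicate a host uid to this container and give it a disk quota
    useradd -M container1
    setquota -u container1 500000 600000 0 0 /   # filesystem that holds /var/lib/docker
    # run the container's process under that uid so its writes count against the quota
    docker run -u container1 my/image ./run.sh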
I'd be interested in this as well. The workarounds mentioned seem to require a bit too much hacking for my liking.
Are there any actual plans for implementing simple disk-usage (and perhaps network) limiting?
Can anyone please tell me about the overhead of Docker compared to a no-container scenario (not against a fat-VM scenario)? I am a "dev", not "ops", but we might make use of Docker in our rapidly growing service-oriented backend... Thanks
For process isolation, the overhead is approximately zero. It adds a few milliseconds to initial execution time, then CPU and memory consumption of your process should be indistinguishable from a no-container scenario.
For disk IO, as long as you use data volumes [1] for performance-critical directories (typically the database files), then overhead is also zero. For other directories, you incur the overhead of the underlying copy-on-write driver. That overhead varies from negligible (aufs) to very small (devicemapper). Either way that overhead ends up not mattering in production because those files are typically read/written very infrequently (otherwise they would be in volumes).
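For example (image name and paths are placeholders):

    # keep the write-heavy database directory on a volume, outside the copy-on-write layer
    docker run -d -v /var/lib/postgresql my/postgres-image
    # or bind-mount a host directory into the container at that path instead
    docker run -d -v /data/pg:/var/lib/postgresql my/postgres-image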
Very good answer, thank you. So in my scenario, I was thinking of packaging our service apps (which run on the JVM). If I make sure the OS has the JVM and all the needed stuff ready, then shipping a package and deploying the service is just transferring the Docker container and running it - which happens to use the same process model as the OS itself. So in other terms, Docker is a convenience layer (a glorified FS abstraction). I am not saying this to undermine its utility, just trying to figure out if it solves more problems than the complexity it brings (which is one more moving part in your toolchain).
Will it be possible to run Docker containers on Android? I may be asking this incorrectly, so correct me if I'm mistaken. My question might be "Will it be possible to run Docker containers on the Dalvik VM?" or "Can I run Android in a Docker container?"
I have not tried personally, but I think it should be possible. People already do crazy things with Docker :) For example: http://resin.io/docker-on-raspberry-pi/
You cant "run docker on Dalvik VM" as it runs on Linux not on a JVM. You also can't run Android on a Docker container as Android needs some system calls not in a standard Linux kernel, although there may be some ways around that.
You might be able to run docker on Android though, but you may need to compile a kernel with containers (namespace) support as I don't know that Android ships with them (I forget).
Is it possible to have multiple instances of the same app running in Docker containers with read-only access to a "global" memory-mapped file? What I'm trying to achieve is having sandboxed consumers with access to some shared resource.
Yes, it's idiomatic, but typically you will use shared filesystem access rather than shared memory. By default nothing is shared, and you can specify exceptions using shared directories called "volumes": http://docs.docker.io/en/master/use/working_with_volumes/
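A quick sketch of that pattern (all names made up):

    # one container declares the shared directory as a volume
    docker run -d -name producer -v /shared my/producer-image
    # consumers mount the producer's volumes and read from /shared
    docker run -d -volumes-from producer my/consumer-image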
Hmm, this is good, but my build environment is kind of wonky.
Can I just hand it a .tar.gz and say, put it in /usr/asdf, and let it run? What about Python scripts? Maybe I just give it the location of an RPM, like in this document?
You can use the ADD build instruction in the Dockerfile to upload any part of your source repo into the container. Then more RUN instructions to manipulate that, for example compile the code etc.
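Something like this should work as a starting point (paths and commands are placeholders; note that ADD will generally unpack a local tarball into the destination directory, so check the builder docs for your version):

    # write a minimal Dockerfile next to your tarball, then build and run the image
    cat > Dockerfile <<'EOF'
    FROM ubuntu
    ADD app.tar.gz /usr/asdf/
    RUN cd /usr/asdf && ./install.sh
    CMD ["/usr/asdf/run"]
    EOF
    docker build -t my/app .
    docker run my/app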
I couldn't reply to your deeper comment so I'm going to do it here. You can use the Dockerfile to copy files from you local system onto the image and then run various commands on the image. You have access to a full linux environment inside the docker image so you can do anything on the image that you can do in your normal development environment.
> I couldn't reply to your deeper comment so I'm going to do it here. You can use the Dockerfile to copy files from you local system onto the image and then run various commands on the image.
I wonder if this would be good for (large) embedded systems. Thanks for the tip.
This is crazy talk of course, but I wonder if there'd be some way to use rsync or git to support distributed development of images the way git does with code?
I mean, it'd be neat to be able to do a "pull" of diffs from one image into another related image. Merge branches and so on. I don't know, possibly this would be just too unreliable, but I would have previously thought that what docker is doing right now would be too unreliable for production use, and lo and behold we have it and it's awesome.
There has been discussion of taking the git analogy further. We actually experimented with a lot of that early on (https://github.com/dotcloud/cloudlets) and I can tell you it's definitely possible to take the scm analogy too far :)
I do think we can still borrow a few interesting things from git. Including, potentially, their packfile format and cryptographic signatures of each diff. We'll see!
Documentation is definitely part of the process :) Apparently the documentation service triggered an incorrect build overnight. Until we fix that you can browse the latest version of the docs from the master branch: http://docs.docker.io/en/master
Quick update: until we figure out what broke the build on our ReadTheDocs.org setup, we switched the default branch to master. So if you visit http://docs.docker.io you will get the bleeding edge build of the documentation, which happens to be accurate since we released it this morning :)
Sorry about that. One more lesson learned on our quest to ultimate quality!
Is the warning "This is a community contributed installation path. The only ‘official’ installation is using the Ubuntu installation path. This version may be out of date because it depends on some binaries to be updated and published." still true? Fedora also has this warning (and no instructions). The "Look for your favorite distro in our installation docs!" link does not give me up-to-date instructions for any of my favorite Linux distros. I can't even see where in that installation documentation it says how to install from source code on generic Linux. What am I missing? (Of course I can get the source code and build it, but I want the documentation to be great :-D)
Nice to see Docker 0.7 hit with some very useful changes.
I see lots of people are getting some generic Docker questions answered in here, and want to ask one I have been wondering about.
What is the easiest way to use Docker containers like I would virtual machines? I want to boot an instance, make some changes (e.g. apt-get install something or edit config files), shut down the instance, and have the changes available next time I boot that instance. Unless I misunderstand something, Docker requires me to take snapshots of the running instance before I shut it down, which takes an additional terminal window if I started the instance with something like docker run -i -t ubuntu /bin/bash. I know there are volumes that I can attach/detach to instances, but this doesn't help for editing something like /etc/ssh/sshd_config.
There's no need to manually take snapshots, docker does this automatically every time you run a process inside a container. In your example of running /bin/bash, after you exit bash and return to the host machine docker will give you the id for the container which has your changes. You can restart the container or run a new command inside it and your changes will still be there. If you want to access it more easily later, you can run 'docker commit' which will create an image from the container with a name you can reference. You can also use that new image as a base for other containers.
This is great for development or playing around with something new, but the best practice for creating a reusable image with your custom changes would be to write a Dockerfile which describes the steps necessary to build the image: http://docs.docker.io/en/latest/use/builder/
Yes, my goal here is ease of playing around with something new. I would set up a Dockerfile after I knew exactly what setup I wanted.
You're right, I misunderstood what docker was doing when shutting down the container. Seems like I can start and reattach just fine. Here is an example workflow for anyone curious:
root@chris-VM:~# docker run -i -t ubuntu /bin/bash
root@0a8f96822140:/# cd /root
root@0a8f96822140:/root# ls
root@0a8f96822140:/root# vim shouldStayHere
bash: vim: command not found
root@0a8f96822140:/root# apt-get install -qq vim
...<snipped>...
Setting up vim (2:7.3.429-2ubuntu2) ...
root@0a8f96822140:/root# vim shouldStayHere
...Not exactly necessary, but I added a line to the file so I could identify it...
root@0a8f96822140:/root# exit
root@chris-VM:~# docker ps
ID IMAGE COMMAND CREATED STATUS PORTS
root@chris-VM:~# docker ps -a
ID IMAGE COMMAND CREATED STATUS PORTS
0a8f96822140 ubuntu:12.04 /bin/bash About a minute ago Exit 0
root@chris-VM:~# docker attach 0a8f96822140
2013/11/26 10:29:41 Impossible to attach to a stopped container, start it first
root@chris-VM:~# docker start 0a8f96822140
0a8f96822140
root@chris-VM:~# docker attach 0a8f96822140
ls
bin boot dev etc home lib lib64 media mnt opt proc root run sbin selinux srv sys tmp usr var
root@0a8f96822140:/# cd /root
root@0a8f96822140:/root# ls
shouldStayHere
root@0a8f96822140:/root# cat shouldStayHere
Hello World!
root@0a8f96822140:/root#
So, if I did some heavylifting to set something up and wanted to keep this as a base for later work, now I would do e.g.
docker commit 0a8f96822140 <some identifier>
Yes, I wrote a long wordy response and neglected to mention "docker start" which is a perfectly good way to come back to a stopped container after the first "docker run".
I prefer to never keep anything important in a stopped container (for very long) without committing it back to an image, and I don't like dealing with numeric ids.
Recently (it looks like you don't have this change yet) docker added an automatic naming scheme giving every container a random name of some "color_animal" pair, which I think reinforces the point: stopped containers are not a place to store meaningful/persistent state information for very long.
This mishmash gets run almost every day on my docker hosts to clean up after terminated experiments:
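something along these lines, give or take the exact filtering:

    # list all containers, drop the header line and anything still running, remove the rest
    docker ps -a | egrep -v 'ID|Up' | awk '{print $1}' | xargs docker rm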
You can assign permanent names to containers. We're designing the naming system to make it completely OK to keep persistent containers. They're just named directories; docker will never remove them on its own.
For example a common pattern is to create a placeholder database container with reference data as a volume, but no actual process running and no network ports. Then successive versions of the database container are started on the side, with shared access to the placeholder's volume. Later you might run a backup container on the same volume. In other words you can point to a particular dataset as a container of its own, separate from the various applications which might access it. All of these interactions are visible to docker so it can authenticate, restrict, log, or hook them in all the standard ways.
In fact internally images and containers are stored side-by-side. In future versions we are going to accentuate that similarity.
Indeed, if I had done anything important I would certainly commit the changes to the container. It's great to have some version control for my playful discovery.
The changes you mention sound nice. It's no surprise I don't have them:
root@chris-VM:~# docker version
Client version: 0.5.3
Server version: 0.5.3
Go version: go1.1
It was the easiest VM I had access to at the moment of posting. I should update the docker in there.
I have used docker ps -a | awk '{print $1}' | xargs docker rm a couple of times to clean up after playing around. I was slightly annoyed that it tried to docker rm a (nonexistent) container with the id "ID". Thanks for reminding me to throw an egrep -v 'ID|Up' in front of the awk.
tl;dr They are going to pick a new pair of things for every major release of Docker. This is meant to let you keep track of more containers over a long time. Apparently you are in fact meant to keep them around if they're still in working order, and remember them "by ID" or by name.
I have not updated my own docker in a long time, I use CoreOS now, which comes with automatic updating via chaos monkeys. It's always a pleasant surprise when I see my system is about to go down for a reboot, and trying to find what's changed when it comes back up!
You don't have to take the snapshot before you shut it down. The container persists after its main process is terminated. It's safe to take the snapshot any time after.
Your question is very close to home for me; I published a teeny-tiny script called 'urbinit'[1] for doing exactly what you are asking. It's meant for use with urbit, where you start by creating a 'pier' and use it to begin 'ships'. The ships must not be discarded (unless you don't mind having no real identity, and many parts of urbit require you to maintain a persistent identity). If you redo the key-exchange dance from a fresh pier to identify yourself as the same 'destroyer' (having discarded the previous incarnation), you wind up with what looks like a perfectly viable ship, but its packet sequence numbers are out of order and it can't really receive any messages or acknowledge any packets that are waiting to be delivered in the correct order from your neighbor ships.
Anyway, urbinit is a manifestation of the fact that Docker doesn't really have anything to help you deal with this (very common and not at all unique to Urbit) problem. Docker's answer is that you can use a volume: make the state you need to persist part of a volume. Unfortunately this means it won't be present in your images. Or you can do roughly what urbinit does, which is a very simple, classic pattern that all Computer Scientists past year 1 should already know.
1. Launch the container in detached mode (-d) and save the ID.
2. Create a lock file. Urbinit won't do step 1 if there is already a lock file created that was never cleared. This is to prevent you from launching the same ship twice at one time with two addresses, which is very confusing to the network.
3. Attach if you need to interact through the terminal, or affect the container however you want by sending packets to it (e.g. log in via ssh and do stuff).
4. Terminate your process or let it end when it's ready to end naturally, or the power goes out. If you're using urbinit (and the power did not go out), this is the point where the commit process comes along and commits your container ID back to the named image it was created from, simultaneously clearing the lock created in step 2.
If you're doing this with several different images on one host that are meant to run simultaneously, a nice trick is to name your lock file after the actual docker image that's used in the commit step.
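A stripped-down sketch of that loop, with made-up names and paths (docker wait just blocks until the container exits):

    #!/bin/sh
    IMAGE=myuser/mypier
    LOCK=/var/lock/$(echo "$IMAGE" | tr / _).lock
    # step 2: refuse to launch a second copy while a lock from a previous run is present
    [ -e "$LOCK" ] && { echo "lock present, refusing to start"; exit 1; }
    # step 1: launch detached and remember the container ID
    CID=$(docker run -d "$IMAGE")
    echo "$CID" > "$LOCK"
    # steps 3-4: wait for the container to exit, commit its state back to the named image,
    # and clear the lock
    docker wait "$CID"
    docker commit "$CID" "$IMAGE"
    rm -f "$LOCK"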
So, I assume that if you aren't using AUFS then you don't have to deal with potentially bumping up against the 42 layer limit? Or does this update also address the issue with AUFS?
The 42 layer limit is still there... But not for long! There is a fix underway to remove the limitation from all drivers, including aufs.
Instead of lifting the limit for some drivers first (which would mean some images on the index could only be used by certain drivers - something we really want to avoid), we're artificially enforcing the limit on all drivers until it can be lifted altogether.
(This pull request is on my personal fork because that's where the storage driver feature branch lived until it was merged. Most of the action is usually on the main repo).
One thing I've been wondering is whether the 42 limit was related to performance considerations. If so, I actually somewhat like it - I'm a proponent of making systems behave in a way that won't come around and bite you. Will a container based on, say, 200 layers still load and run with reasonable performance?
That depends on the backend a bit, I guess. devicemapper "flattens" each layer, so the depth should have zero effect on performance.
For aufs, I have no real data, but I assume that a 42-layer image is somewhat slower than a 1-layer image.
The fix for going to > 42 layers is to recreate a full image snapshot (using hard links to share file data) every N layers, so the performance would be somewhere between that of 1 and 42 layers, depending on how many AUFS layers you end up with.
Doesn't 'flatten' mean that there's going to be a tradeoff between IO performance and storage size? If that's the case I'd like to have some control over what happens, at least with one of the implementations. Say, during development of a Dockerfile just use the layers as before, potentially without any depth limits, but when an image is ready being able to call 'flatten' manually.
Some background: using 0.6.5 it took me several days to develop a Docker image with manually compiled versions of V8, PyV8, CouchDB, Flask, Bootstrap and some JS libraries[1]. With no 42 limit it wouldn't have taken so long, since I would have had way more caching points - however I'd also be afraid about performance.
[1] the image is available on the repository as ategra/xforge-dependencies. I can upload the Dockerfile to github if anyone is interested.
"flatten" is a simplification, what I mean is that the devicemapper thin provisioning module uses a data structure on disk that has a complexity that is independent of the snapshot depth. I did not mean that the Docker image itself is somehow flattened.
I don't think there will be any performance problems with deep layering of images, either on dm or aufs.
That's some great news, thanks a lot for the heads up and the great work you guys are doing! I'm really looking forward to what Docker and its ecosystem are becoming - I think it's already quite obvious that it will revolutionize the way people think about linux application rollout, from both the user's and the application developer's perspective. It might even make 2014 the year of the linux desktop ;-).
The 42 limit comes from the aufs code itself. My guess is that it's because of stack depth: if you stack these kernel data structures too many levels deep, you might trigger an overflow somewhere. The hardcoded limit is probably a pragmatic way to avoid that.
Of course if you had infinite memory and those kernel data structures were designed to never overflow, then you would have a performance problem on your hands instead of a kernel panic :)
I was pretty sure that the requirement for AUFS would stick for a long time -- I was resigned to use a special kernel. But again, you folks surprise me!
Docker really replaces the need for Chef in a sense. You don't need Chef for configuration of your container, because ideally your container should be saved in an image which you use to deploy. This keeps things consistent between your dev environment, staging and production.
Chef is based on re-running the same commands with various different options depending on the environment, and even without anything in the cookbooks/attributes/environments changing, Chef still cannot guarantee that this run will produce the same results as a run that happened yesterday, simply because it isn't like an image.
I'm new to these tools. Given your explanation, how does Docker replace a packaged Vagrant machine[0] with all the software already pre-installed (without using Chef)?
Can someone please explain what docker does and brings to the table, what all the fuss seems to be about? I've looked into it several times and really can't tell from any of what I've found.
Also, when writing about -p option (e.g. -p 8080:8080), you could choose different port numbers for container and host, so we know the latter is for the host and former is for container.
It's pretty easy to develop new drivers, and there is a btrfs one on the way: https://github.com/shykes/docker/pull/65
If you want to hack your own driver, there are basically 4 methods you need to implement: Create, Get, Remove and Cleanup. Take a look at the graphdriver/ package: https://github.com/dotcloud/docker/tree/master/graphdriver
As usual don't hesitate to come ask questions on IRC! #docker/freenode for users, #docker-dev/freenode for aspiring contributors.