A few details on the "standard linux support" part.
To remove the hard dependency on the AUFS patches, we moved it to an optional storage driver, and shipped a second driver which uses thin LVM snapshots (via libdevmapper) for copy-on-write. The big advantage of devicemapper/lvm, of course, is that it's part of the mainline kernel.
If your system supports AUFS, Docker will continue to use the AUFS driver. Otherwise it will pick lvm. Either way, the image format is preserved and all images on the docker index (http://index.docker.io) or any instance of the open-source registry will continue to work on all drivers.
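If you're curious which driver a particular host ended up with, docker info should tell you (I believe it reports the storage driver as of 0.7):

    # check which storage driver the daemon picked on this machine
    docker info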
No, docker sets up its own partition in a loop-mounted sparse file. There is a slight performance tradeoff, but in production you should assume the copy-on-write layer is too slow anyway and use volumes for your performance-critical directories: http://docs.docker.io/en/master/use/working_with_volumes/
In the future the drivers should support using actual lvm-enabled devices if you have them.
Any drawbacks to using lvm compared to AUFS? I'd like to switch to Debian from Ubuntu - will I notice any performance differences or other restrictions?
There is a slight performance overhead with lvm, but nothing dramatic. The other advantage of the AUFS driver is that it is more proven. If you have a way to get aufs on your system (and I believe debian does), my pragmatic ops advice would be to use that and give the other drivers some time to get hardened. Of course my advice as a maintainer is that all of them are equally awesome :)
how would you compare aufs/lvm performance vs with the upcoming btrfs support?
That is the one that I presume will have the most long-term continuity, because btrfs is becoming the default sooner rather than later (isn't it already for opensuse/fedora?)
Not yet, but that's coming very soon. We've been artificially limiting the number of architectures supported to limit the headache of managing cross-arch container images. We're reaching the point where that will no longer be a problem - meanwhile the Docker on Raspberry Pi community is growing restless and we want to make them happy :)
The boundary between "kernel" and "libraries like libc" is very stable and doesn't change often. That means that often, the kernel distributed by Arch can work reasonably well in an Ubuntu system, and vice versa.
With that in mind: The "ubuntu" image ships the "ubuntu-glibc" and "ubuntu-bash" and "ubuntu-coreutils" and so on, but they continue to work on your Arch host because the system calls don't ever change.
You can't link (say) ubuntu-glibc into arch-bash though, which is why containers are built off of a "base ubuntu image" in the first place.
Containers come with their libraries though; you don't have to "add" anything. You'd just apt-get it within the container and it would pull down its dependencies.
This, by the way, is the reason we artificially prevent docker from running on multiple archs. If half of the containers you download end up not running on your arch, and there's no elegant way for you to manage that by filtering your results, and producing the same build for multiple archs (and what does it even mean to do that?) - then all of a sudden using docker would become much more frustrating.
This sounds like a clothing manufacturer who only makes clothes in black because trying to match colors would be too frustrating for customers. At the end of the day, lots of people have multiarch networks, and don't want to have to choose between using a tool or supporting just one platform. Removing functionality does not make their lives easier.
Could someone explain the logistics of Docker in a distributed app development scenario? I feel like I am on the outskirts of understanding.
My goal is having a team of developers use Docker to have their local development environments match the production environment. The production environment should use the same Docker magic to define its environment.
Is the idea that developers define their Docker environment in the Dockerfile, and then on app deployment, the production environment builds its world from the same Dockerfile? How does docker push/pull of images factor into that, if at all?
Or is the idea that developers push a container, which contains the app code, up to production?
What happens when a developer makes changes to his/her environment from the shell rather than scripted in the Dockerfile?
What about dealing with differences in configuration between production and dev? (Eg. developers need a PostgreSQL server to develop, but on production, the Postgres host is separate from the app server - ideally running PG in a Docker container, but the point being multiple apps share a PG server rather than each running their own individual PG instance). Is the idea that in local dev, the app server and PG are in two separate Docker containers, and then in deployment, that separation allows for the segmentation of app server and PG instance?
I see the puzzle pieces but I am not quite fitting them together into a cohesive understanding. Or possibly I am misunderstanding entirely.
> Could someone explain the logistics of Docker in a distributed app development scenario?
Docker lets you build an environment (read: put together a bunch of files) for you to run an app in. It also has other features, like reducing space if lots of your apps [on the same host] use the same docker images, and networking stuff.
> My goal is having a team of developers use Docker to have their local development environments match the production environment
You run a container. You use images to distribute files. A Dockerfile is a loose set of instructions to build the images.
The basic idea is that, any single program that you want to be able to run anywhere, you make into a container. You can run it here, you can run it there, you can run it anywhere. The whole "running it anywhere" concept comes from the idea that all of your containers are created based on the same images, so no matter what kind of crazy mix of machines you have, your containers will just work - because you're literally shipping them a micro linux distribution in which to run your application. And since all the applications are running in isolated little identical containers, you can run as many of them as you want, independent of each other, in whatever configuration you want.
You'll have a PSQL container and an App container, and you'll manage them separately, even if they share images - the changes they make get saved off to a temp folder so they don't impact each other. Your environment stays the same only as long as the containers and images you're using are the same.
There will always be differences between development and production. You have to focus on managing those differences so you have confidence that what you're shipping to production actually works. The only thing Docker really does there is make sure the files are basically the same.
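As a rough sketch of that layout (image names, ports and the link alias below are all made up, and the app is assumed to read its database location from the link or an environment variable):

    # run the database in its own container
    docker run -d -name pg my/postgres-image
    # run the app in a second container, told where to find the database via a link
    docker run -d -p 8080:8080 -link pg:db my/app-image
    # in production, skip the local pg container and point the app at the shared server instead
    docker run -d -p 8080:8080 -e DB_HOST=pg.internal.example.com my/app-image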
Since no one has answered you, I'll take a stab. I haven't actually used Docker yet, but I've been lurking extensively. Hopefully someone will correct me if I get this wrong.
A developer will use a Dockerfile to build the main dependencies that you need, and then save an image. Then to deploy code (or other frequently changing data), you pull that base image, add or update whatever else is needed, and save it again. (You can add layers to an image as often as desired.)
Then this saved image is pushed into a production environment. It seems like configuration data is often saved in images, so they require zero configuration. However, you can pass as many command line parameters as desired to run the image in production, so you can keep your configuration separated (like http://12factor.net/config) if you want.
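In shell terms the loop looks roughly like this (repository name and config values are placeholders):

    # developer: build the image from the Dockerfile, then push it to a registry
    docker build -t myteam/myapp .
    docker push myteam/myapp
    # production: pull the exact same image and run it, injecting config at run time
    docker pull myteam/myapp
    docker run -d -p 80:8080 -e DATABASE_URL=postgres://db.example.com/myapp myteam/myapp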
I looked at it several times but never really got it. Can I use Docker to isolate different servers (think http, xmpp, another http on another port) on a server so that if one of them was exploited, the attacker would be constrained to inside the container? Or is it "just" a convenient way to put applications into self-contained packages?
It's both. Everyone using docker benefits from the "software distribution" feature. Some people using docker also benefit from the security and isolation features - it depends on your needs and the security profile of your application. Because the underlying namespacing features of the kernel are still young, it's recommended to avoid running untrusted code as root inside a container on a shared machine. If you drop privileges inside the container, use an additional layer of security like SELinux, grsec, apparmor etc. then it is absolutely possible and viable in production.
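For example, a minimal sketch of the drop-privileges part (the user and image names are made up; the user has to exist in the image already, e.g. via a RUN useradd in the Dockerfile):

    # start the service as an unprivileged user instead of root
    docker run -d -u app my/image /usr/local/bin/start-service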
It's only a matter of time before linux namespaces get more scrutiny and people grow more comfortable with them as a first-class security mechanism. It took a while for OpenVZ to get there, but now there are hosting providers using it for VPS offerings.
On top of namespacing, docker manages other aspects of the isolation including firewalling between containers and the outside world (currently optional but soon to be mandatory), whitelisting of traffic between certain containers, etc.
To my knowledge there are currently no known exploits. It's more a matter of risk management: newer codebases are less secure because we had less time to find bugs and spread best practices. That problem is amplified for larger codebases (but diminished by a more active developer community).
Yes, it is trivial for a root user in an LXC container to break out. One can load a kernel module from within a container, for example. LXC containers do not provide security partitioning at all.
This is just totally wrong. Any decent container configuration (including the default docker configuration) will aggressively drop capabilities, preventing you from doing this or any other script-kiddie attack.
See my other comment in this thread for a more accurate answer.
Yes, you probably need a proper kernel vulnerability, one you can exploit from a reduced environment. Not trivial, but not impossible - scanning this year's CVEs, some would probably be sufficient (e.g. ones that only need socket access).
That's the idea. Docker is based on LXC which provides configurable isolation between process groups. So, you could run your http in one container, your database in another, and your application server in a 3rd and have reasonable confidence that they can only talk to each other the way you want them to.
I have heard two opinions. One of them is paranoid: "if you give up root on the container you give up root on the host system." I am not sure I believe this (I can't show you how to do the compromise and escalation), but I know there is a -privileged mode you can use on your containers in which the root user can do things like create device nodes, reset the system clock, and presumably other things you should not allow if you are concerned about this kind of attack.
The other opinion (given that you don't enable -privileged mode) is that lxc namespaces _do_ protect your host and other containers from being exploited due to compromise of a contained process, and cgroups _do_ protect your precious i/o and cpu cycles from being fully consumed and deadlocking your whole system when a rogue container decides to start infinite looping.
The competent answer of course is "try it": you should know that there are ways your processes can be compromised, and you should check whether the exploits you know about result in the same or different privilege escalations when you apply them to a service hosted within a docker instance. That being said, you can't really prove a negative...
Yes, Docker should be able to help you. It is definitely not just for large companies.
Docker can help you streamline your development and deployment process across all the machines, from the development laptop to the production server. The result should be that your development environment is easier to setup and replicate, more similar to production, and easier to customize as you evaluate new tools, hosting providers and software components.
I just went through the Getting Started tutorial. Thanks for a great intro to Docker. Is this tutorial open-sourced too? What would really help is to include a few lines on how the commands can be used in the real world.
You will be able to have multiple Ubuntu servers running on one Ubuntu VPS on Digital Ocean.
Each of them may have some initial state (some installed software, configuration) that you will be able to spawn as a separate virtual machine. The "inside" machine will consume a minimal amount of memory (just for the running applications), not for a whole OS as in the case of a "real" virtual machine. Also, you will be able to spawn and destroy new VMs very fast (under 1s).
What does the VM abstraction buy you vs processes? If one entity "owns" all the servers, what is the draw in separating the servers out into their own VMs?
You don't have to think of them as VMs, that's just a simplification for the sake of the conversation. They're resource-isolated process groups. It's exactly the same idea as the original Unix-paradigm security approach of making each service run under a separate system-uid, but with a much more complete isolation.
Well, Ubuntu can do that without Docker; just with LXC. Docker uses LXC and does other things with it, but if all you want to do is run multiple lightweight Ubuntu VMs, then just use LXC.
Is it an easier-to-use chroot? Because I'd just use chroot if I wanted that, or maybe OpenVZ. Or just spin up a virtual node dedicated to the task I need it to perform.
It is definitely designed to be easier in several ways. One way is that it defines a format for portable container images, and an infrastructure for moving images around machines and sharing them with others. See http://www.docker.io/gettingstarted/ for an online tutorial and https://index.docker.io for the index of all community images.
I am a newbie at this, but from what I'm reading you need to create a bridge called lxbr0, assign some IP on it and docker should be able to figure out IPs to use for the containers.
FWICT, storage limits are still not directly supported.
You can however create a user per container and limit that user's quota.
This is the relevant user comment (but the full discussion is insightful): https://github.com/dotcloud/docker/issues/471#issuecomment-2...
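From what I can tell it boils down to something like this (user name, limits and image are placeholders; it relies on quotas being enabled on the filesystem that holds /var/lib/docker, and on the fact that uids inside the container map straight onto host uids):

    # dedicate a host uid to this container and give it a disk quota
    useradd -M container1
    setquota -u container1 500000 600000 0 0 /   # filesystem that holds /var/lib/docker
    # run the container's process under that uid so its writes count against the quota
    docker run -u container1 my/image ./run.sh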
I'd be interested in this as well. The workarounds mentioned seem to require a bit too much hacking for my liking.
Are there any actual plans for implementing simple disk-usage (and perhaps network) limiting?
Can anyone please tell me about the overhead of Docker compared to a no-container scenario (not against a fat-VM scenario)? I am a "dev", not "ops", but we might make use of Docker in our rapidly growing service-oriented backend... Thanks
For process isolation, the overhead is approximately zero. It adds a few milliseconds to initial execution time, then CPU and memory consumption of your process should be indistinguishable from a no-container scenario.
For disk IO, as long as you use data volumes [1] for performance-critical directories (typically the database files), then overhead is also zero. For other directories, you incur the overhead of the underlying copy-on-write driver. That overhead varies from negligible (aufs) to very small (devicemapper). Either way that overhead ends up not mattering in production because those files are typically read/written very infrequently (otherwise they would be in volumes).
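For example (image name and paths are placeholders):

    # keep the write-heavy database directory on a volume, outside the copy-on-write layer
    docker run -d -v /var/lib/postgresql my/postgres-image
    # or bind-mount a host directory into the container at that path instead
    docker run -d -v /data/pg:/var/lib/postgresql my/postgres-image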
Very good answer, thank you. So in my scenario, I was thinking of packaging our service apps (which run on the JVM). If I make sure the OS has the JVM and all the needed stuff ready, then shipping a package and deploying the service is just transferring the Docker container and running it - which happens to use the same process model as the OS itself. So in other terms, Docker is a convenience layer (a glorified FS abstraction). I am not saying this to undermine its utility, just trying to figure out if it solves more problems than the complexity it brings (which is one more moving part in your toolchain).
Will it be possible to run Docker containers on Android? I may be asking this incorrectly, so correct me if I'm mistaken. My question might be "Will it be possible to run Docker containers on the Dalvik VM?" or "Can I run Android in a Docker container?"
I have not tried personally, but I think it should be possible. People already do crazy things with Docker :) For example: http://resin.io/docker-on-raspberry-pi/
You cant "run docker on Dalvik VM" as it runs on Linux not on a JVM. You also can't run Android on a Docker container as Android needs some system calls not in a standard Linux kernel, although there may be some ways around that.
You might be able to run docker on Android though, but you may need to compile a kernel with containers (namespace) support as I don't know that Android ships with them (I forget).
Is it possible to have multiple instances of the same app running in Docker containers with read-only access to a "global" memory-mapped file? What I'm trying to achieve is having sandboxed consumers with access to some shared resource.
Yes, it's idiomatic, but typically you will use shared filesystem access rather than shared memory. By default nothing is shared, and you can specify exceptions using shared directories called "volumes": http://docs.docker.io/en/master/use/working_with_volumes/
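A quick sketch of that pattern (all names made up):

    # one container declares the shared directory as a volume
    docker run -d -name producer -v /shared my/producer-image
    # consumers mount the producer's volumes and read from /shared
    docker run -d -volumes-from producer my/consumer-image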
Hmm, this is good, but my build environment is kind of wonky.
Can I just hand it a .tar.gz and say, put it in /usr/asdf, and let it run? What about Python scripts? Maybe I just give it the location of an RPM, like in this document?
You can use the ADD build instruction in the Dockerfile to upload any part of your source repo into the container. Then more RUN instructions to manipulate that, for example compile the code etc.
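Something like this should work as a starting point (paths and commands are placeholders; note that ADD will generally unpack a local tarball into the destination directory, so check the builder docs for your version):

    # write a minimal Dockerfile next to your tarball, then build and run the image
    cat > Dockerfile <<'EOF'
    FROM ubuntu
    ADD app.tar.gz /usr/asdf/
    RUN cd /usr/asdf && ./install.sh
    CMD ["/usr/asdf/run"]
    EOF
    docker build -t my/app .
    docker run my/app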
I couldn't reply to your deeper comment so I'm going to do it here. You can use the Dockerfile to copy files from you local system onto the image and then run various commands on the image. You have access to a full linux environment inside the docker image so you can do anything on the image that you can do in your normal development environment.
> I couldn't reply to your deeper comment so I'm going to do it here. You can use the Dockerfile to copy files from you local system onto the image and then run various commands on the image.
I wonder if this would be good for (large) embedded systems. Thanks for the tip.
This is crazy talk of course, but I wonder if there'd be some way to use rsync or git to support distributed development of images the way git does with code?
I mean, it'd be neat to be able to do a "pull" of diffs from one image into another related image. Merge branches and so on. I don't know, possibly this would be just too unreliable, but I would have previously thought that what docker is doing right now would be too unreliable for production use, and lo and behold we have it and it's awesome.
There has been discussion of taking the git analogy further. We actually experimented with a lot of that early on (https://github.com/dotcloud/cloudlets) and I can tell you it's definitely possible to take the scm analogy too far :)
I do think we can still borrow a few interesting things from git. Including, potentially, their packfile format and cryptographic signatures of each diff. We'll see!
Documentation is definitely part of the process :) Apparently the documentation service triggered an incorrect build overnight. Until we fix that you can browse the latest version of the docs from the master branch: http://docs.docker.io/en/master
Quick update: until we figure out what broke the build on our ReadTheDocs.org setup, we switched the default branch to master. So if you visit http://docs.docker.io you will get the bleeding edge build of the documentation, which happens to be accurate since we released it this morning :)
Sorry about that. One more lesson learned on our quest to ultimate quality!
Is the warning "This is a community contributed installation path. The only ‘official’ installation is using the Ubuntu installation path. This version may be out of date because it depends on some binaries to be updated and published." still true? Fedora also has this warning (and no instructions). The "Look for your favorite distro in our installation docs!" link does not give me up-to-date instructions for any of my favorite Linux distros. I can't even see where in that installation documentation it says how to install from source code on generic Linux. What am I missing? (Of course I can get the source code and build it, but I want the documentation to be great :-D)
Nice to see Docker 0.7 hit with some very useful changes.
I see lots of people are getting some generic Docker questions answered in here, and want to ask one I have been wondering about.
What is the easiest way to use Docker containers like I would virtual machines? I want to boot an instance, make some changes (e.g. apt-get install something or edit config files), shut down the instance, and have the changes available next time I boot that instance. Unless I misunderstand something, Docker requires me to take snapshots of the running instance before I shut it down, which takes an additional terminal window if I started the instance with something like docker run -i -t ubuntu /bin/bash. I know there are volumes that I can attach/detach to instances, but this doesn't help for editing something like /etc/ssh/sshd_config.
There's no need to manually take snapshots, docker does this automatically every time you run a process inside a container. In your example of running /bin/bash, after you exit bash and return to the host machine docker will give you the id for the container which has your changes. You can restart the container or run a new command inside it and your changes will still be there. If you want to access it more easily later, you can run 'docker commit' which will create an image from the container with a name you can reference. You can also use that new image as a base for other containers.
This is great for development or playing around with something new, but the best practice for creating a reusable image with your custom changes would be to write a Dockerfile which describes the steps necessary to build the image: http://docs.docker.io/en/latest/use/builder/
Yes, my goal here is ease of playing around with something new. I would set up a Dockerfile after I knew exactly what setup I wanted.
You're right, I misunderstood what docker was doing when shutting down the container. Seems like I can start and reattach just fine. Here is an example workflow for anyone curious:
root@chris-VM:~# docker run -i -t ubuntu /bin/bash
root@0a8f96822140:/# cd /root
root@0a8f96822140:/root# ls
root@0a8f96822140:/root# vim shouldStayHere
bash: vim: command not found
root@0a8f96822140:/root# apt-get install -qq vim
...<snipped>...
Setting up vim (2:7.3.429-2ubuntu2) ...
root@0a8f96822140:/root# vim shouldStayHere
...Not exactly necessary, but I added a line to the file so I could identify it...
root@0a8f96822140:/root# exit
root@chris-VM:~# docker ps
ID IMAGE COMMAND CREATED STATUS PORTS
root@chris-VM:~# docker ps -a
ID IMAGE COMMAND CREATED STATUS PORTS
0a8f96822140 ubuntu:12.04 /bin/bash About a minute ago Exit 0
root@chris-VM:~# docker attach 0a8f96822140
2013/11/26 10:29:41 Impossible to attach to a stopped container, start it first
root@chris-VM:~# docker start 0a8f96822140
0a8f96822140
root@chris-VM:~# docker attach 0a8f96822140
ls
bin boot dev etc home lib lib64 media mnt opt proc root run sbin selinux srv sys tmp usr var
root@0a8f96822140:/# cd /root
root@0a8f96822140:/root# ls
shouldStayHere
root@0a8f96822140:/root# cat shouldStayHere
Hello World!
root@0a8f96822140:/root#
So, if I did some heavylifting to set something up and wanted to keep this as a base for later work, now I would do e.g.
docker commit 0a8f96822140 <some identifier>
Yes, I wrote a long wordy response and neglected to mention "docker start" which is a perfectly good way to come back to a stopped container after the first "docker run".
I prefer to never keep anything important in a stopped container (for very long) without committing it back to an image, and I don't like dealing with numeric ids.
Recently (it looks like you don't have this change yet) docker added an automatic naming scheme giving every container a random name of some "color_animal" pair, which I think reinforces the point: stopped containers are not a place to store meaningful/persistent state information for very long.
This mishmash gets run almost every day on my docker hosts to clean up after terminated experiments:
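something along these lines, give or take the exact filtering:

    # list all containers, drop the header line and anything still running, remove the rest
    docker ps -a | egrep -v 'ID|Up' | awk '{print $1}' | xargs docker rm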
You can assign permanent names to containers. We're designing the naming system to make it completely OK to keep persistent containers. They're just named directories; docker will never remove them on its own.
For example a common pattern is to create a placeholder database container with reference data as a volume, but no actual process running and no network ports. Then successive versions of the database container are started on the side, with shared access to the placeholder's volume. Later you might run a backup container on the same volume. In other words you can point to a particular dataset as a container of its own, separate from the various applications which might access it. All of these interactions are visible to docker so it can authenticate, restrict, log, or hook them in all the standard ways.
In fact internally images and containers are stored side-by-side. In future versions we are going to accentuate that similarity.
Indeed, if I had done anything important I would certainly commit the changes to the container. It's great to have some version control for my playful discovery.
The changes you mention sound nice. It's no surprise I don't have them:
root@chris-VM:~# docker version
Client version: 0.5.3
Server version: 0.5.3
Go version: go1.1
It was the easiest VM I had access to at the moment of posting. I should update the docker in there.
I have used docker ps -a | awk '{print $1}' | xargs docker rm a couple of times to clean up after playing around. I was slightly annoyed that it tried to docker rm a (nonexistent) container with the id "ID". Thanks for reminding me to throw an egrep -v 'ID|Up' in front of the awk.
tl;dr They are going to pick a new pair of things for every major release of Docker. This is meant to let you keep track of more containers over a long time. Apparently you are in fact meant to keep them around if they're still in working order, and remember them "by ID" or by name.
I have not updated my own docker in a long time, I use CoreOS now, which comes with automatic updating via chaos monkeys. It's always a pleasant surprise when I see my system is about to go down for a reboot, and trying to find what's changed when it comes back up!
You don't have to take the snapshot before you shut it down. The container persists after its main process is terminated. It's safe to take the snapshot any time after.
Your question is very close to home for me; I published a teeny-tiny script called 'urbinit'[1] for doing exactly what you are asking. It's meant for use with urbit, where you start by creating a 'pier' and use it to begin 'ships'. The ships must not be discarded (unless you don't mind having no real identity, and many parts of urbit require you to maintain a persistent identity). If you redo the key-exchange dance from a fresh pier to identify yourself as the same 'destroyer' (having discarded the previous incarnation), you wind up with what looks like a perfectly viable ship, but its packet sequence numbers are out of order and it can't really receive any messages or acknowledge any packets that are waiting to be delivered in the correct order from your neighbor ships.
Anyway, urbinit is a manifestation of the fact that Docker doesn't really have anything to help you deal with this (very common and not at all unique to Urbit) problem. Docker's answer is that you can use a volume: make the state you need to persist part of a volume. Unfortunately this means it won't be present in your images. Or you can do roughly what urbinit does, which is a very simple, classic pattern that all Computer Scientists past year 1 should already know.
1. Launch the container in detached mode (-d) and save the ID.
2. Create a lock file. Urbinit won't do step 1 if there is already a lock file created that was never cleared. This is to prevent you from launching the same ship twice at one time with two addresses, which is very confusing to the network.
3. Attach if you need to interact through the terminal, or affect the container however you want by sending packets to it (e.g. log in via ssh and do stuff).
4. Terminate your process or let it end when it's ready to end naturally, or the power goes out. If you're using urbinit (and the power did not go out), this is the point where the commit process comes along and commits your container ID back to the named image it was created from, simultaneously clearing the lock created in step 2.
If you're doing this with several different images on one host that are meant to run simultaneously, a nice trick is to name your lock file after the actual docker image that's used in the commit step.
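A stripped-down sketch of that loop, with made-up names and paths (docker wait just blocks until the container exits):

    #!/bin/sh
    IMAGE=myuser/mypier
    LOCK=/var/lock/$(echo "$IMAGE" | tr / _).lock
    # step 2: refuse to launch a second copy while a lock from a previous run is present
    [ -e "$LOCK" ] && { echo "lock present, refusing to start"; exit 1; }
    # step 1: launch detached and remember the container ID
    CID=$(docker run -d "$IMAGE")
    echo "$CID" > "$LOCK"
    # steps 3-4: wait for the container to exit, commit its state back to the named image,
    # and clear the lock
    docker wait "$CID"
    docker commit "$CID" "$IMAGE"
    rm -f "$LOCK"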
So, I assume that if you aren't using AUFS then you don't have to deal with potentially bumping up against the 42 layer limit? Or does this update also address the issue with AUFS?
The 42 layer limit is still there... But not for long! There is a fix underway to remove the limitation from all drivers, including aufs.
Instead of lifting the limit for some drivers first (which would mean some images on the index could only be used by certain drivers - something we really want to avoid), we're artificially enforcing the limit on all drivers until it can be lifted altogether.
(This pull request is on my personal fork because that's where the storage driver feature branch lived until it was merged. Most of the action is usually on the main repo).
One thing I've been wondering is whether the 42 limit was related to performance considerations. If so, I actually somewhat like it - I'm a proponent of making systems behave in a way that won't come around and bite you. Will a container based on, say, 200 layers still load and run with reasonable performance?
That depends on the backend a bit, I guess. devicemapper "flattens" each layer, so the depth should have zero effect on performance.
For aufs, I have no real data, but I assume that a 42-layer image is somewhat slower than a 1-layer image.
The fix for going to > 42 layers is to recreate a full image snapshot (using hard links to share file data) every N layers, so the performance would be somewhere between that of 1 and 42 layers, depending on how many AUFS layers you end up with.
Doesn't 'flatten' mean that there's going to be a tradeoff between IO performance and storage size? If that's the case I'd like to have some control over what happens, at least with one of the implementations. Say, during development of a Dockerfile just use the layers as before, potentially without any depth limits, but when an image is ready being able to call 'flatten' manually.
Some background: using 0.6.5 it took me several days to develop a Docker image with manually compiled versions of V8, PyV8, CouchDB, Flask, Bootstrap and some JS libraries[1]. With no 42 limit it wouldn't have taken so long, since I would have had way more caching points - however I'd also be afraid about performance.
[1] the image is available on the repository as ategra/xforge-dependencies. I can upload the Dockerfile to github if anyone is interested.
"flatten" is a simplification, what I mean is that the devicemapper thin provisioning module uses a data structure on disk that has a complexity that is independent of the snapshot depth. I did not mean that the Docker image itself is somehow flattened.
I don't think there will be any performance problems with deep layering of images, either on dm or aufs.
That's some great news, thanks a lot for the heads up and the great work you guys are doing! I'm really looking forward to what Docker and its ecosystem are becoming - I think it's already quite obvious that it will revolutionize the way people think about linux application rollout, from both the user's and the application developer's perspective. It might even make 2014 the year of the linux desktop ;-).
The 42 limit comes from the aufs code itself. My guess is that it's because of stack depth: if you stack these kernel data structures too many levels deep, you might trigger an overflow somewhere. The hardcoded limit is probably a pragmatic way to avoid that.
Of course if you had infinite memory and those kernel data structures were designed to never overflow, then you would have a performance problem on your hands instead of a kernel panic :)
I was pretty sure that the requirement for AUFS would stick for a long time -- I was resigned to use a special kernel. But again, you folks surprise me!
Docker really replaces the need for Chef in a sense. You don't need Chef for configuration of your container, because ideally your container should be saved in an image which you use to deploy. This keeps things consistent between your dev environment, staging and production.
Chef is based on re-running the same commands with various different options depending on the environment, and even without anything in the cookbooks/attributes/environments changing, Chef still cannot guarantee that this run will produce the same results as a run that happened yesterday, simply because it isn't like an image.
I'm new to these tools. Given your explanation, how does Docker replace a packaged Vagrant machine[0] with all the software already pre-installed (without using Chef)?
Can someone please explain what docker does and brings to the table, what all the fuss seems to be about? I've looked into it several times and really can't tell from any of what I've found.
Also, when writing about -p option (e.g. -p 8080:8080), you could choose different port numbers for container and host, so we know the latter is for the host and former is for container.
It's pretty easy to develop new drivers, and there is a btrfs one on the way: https://github.com/shykes/docker/pull/65
If you want to hack your own driver, there are basically 4 methods you need to implement: Create, Get, Remove and Cleanup. Take a look at the graphdriver/ package: https://github.com/dotcloud/docker/tree/master/graphdriver
As usual don't hesitate to come ask questions on IRC! #docker/freenode for users, #docker-dev/freenode for aspiring contributors.