
Docker uses LXC containers. In Linux, these aren't VMs; they're lightweight user-land separations that use things like cgroups and lots of really special kernel modules for security.

Unfortunately, this means Docker only runs on Linux ... and not even plain Linux: a special Docker-flavored kernel setup (all the features they need are in the stock kernel tree, but it's still a lot of modules). On Windows/Mac, you still need to run it in a virtual machine.

Even with this update ... you still need to run it in a virtual machine. It's not actually running Docker natively. It can't, even on Mac, which has a (not really) *NIX-ish base. You then have to use the docker0 network interface to connect to all your Docker containers.

In Linux, you can just go to localhost. I _think_ FreeBSD has native Docker support with some custom kernel modules. I'm not sure...I've only looked at the Readme. I haven't tried it.

So even on Windows/Mac, all your containers run in one VM (whereas with the traditional stuff you mentioned, you'd need a VM for each thing). Docker containers are meant to handle one application (which runs as root within its container as the init process ... cause wtf?). With VMs, you'd typically want some type of configuration management (Puppet, Ansible, Chef, etc.) that sets up apps on each VM/server. With Docker, each app should be its own container, and you link the containers together using things like Docker Compose or by running them on CoreOS or Mesos.
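To make that concrete, a minimal docker-compose file in this one-app-per-container style might look like the sketch below (the service names and the mycorp/web image are made up):

```yaml
# docker-compose.yml -- one app per container, linked together
web:
  image: mycorp/web:latest   # hypothetical app image
  links:
    - db                     # Compose injects db's address into web
  ports:
    - "8080:80"
db:
  image: postgres:9.4
```

`docker-compose up` then starts both containers and wires them together.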

In my work with Docker, I'm not sure how I feel. LXC containers have had a lot of security issues. Right now, Docker doesn't have any glaring security holes, and LXC has tightened up security quite a bit. CoreOS is pretty neat, and I wouldn't use Docker in production without it or another container manager (the docker command by itself still can't prune unused images. After a while you get a shit ton of images that just waste space. CoreOS prunes these at regular intervals. A docker command to do this is still a GitHub issue. Writing one yourself with docker-py is horribly difficult because of image dependencies).

Oh and images. Docker uses images to build things up like building blocks. That's a whole thing I don't want to go into, but look it up. It's actually kind of interesting and allows for base image updates to fix security issues (although you still need to rebuild your containers against the new images ... I think...I haven't looked into that yet).

Docker is ... interesting. I find it lazy in some ways. I think it's better to build packages (rpms, debs). FPM makes this really easy now. Combine packages with a configuration management solution (haha ... yeah, they all suck. Puppet, Ansible, CFEngine ... they're different levels of horrible. Ansible has pissed me off the least so far) and you can have a pretty solid deployment system. In that sense, Docker does kinda make more sense than juggling packages: you throw your containers on CoreOS/Mesos, use Consul for environment variables, and you can have a pretty smooth system.

I dunno. I'm trying to actually like Docker. I've only made fun of it in the past, but now I work for a shop that uses it in production. O_o

:-P




There are no custom kernel modules, everything is in a stock kernel since 3.10 (which not-so-coincidentally is the minimum supported kernel version).

Containers are run with whatever user you tell it to run as, the default is root because that's the only guarantee.

LXC is also something different. LXC is a set of userland tooling for interacting with cgroups and namespaces (which Docker used to exec out to). LXC != Linux containers (and indeed there isn't really such a thing as "a container" the way there is a zone on Solaris or a jail on BSD; it's made up). Also, again: no custom kernel modules on BSD.


Docker on Windows (with native Windows containers) is also a very real thing and will ship with the next Windows Server release (you can download a technical preview from Microsoft now).


The glaring security hole in Docker is that it has not designed a solution for keeping secret data necessary to build an image from being in the image at run time.

They also haven't solved the general case of keeping transient build data out of the final image either, but that's a broader problem that doesn't necessarily involve security concerns.

For now not a lot of people are concerned about either problem so it's not getting the attention it deserves. But they've been steadily peppered with inquiries about these issues for a year or two now and they still don't have an answer, which is concerning. I believe this is one of the reasons the CoreOS guys wandered off to do their own thing.

Fortunately for us and unfortunately for them, they have the design aesthetics of the Marquis de Sade, and until they start giving even half a thought to ergonomics, Docker is perfectly safe.


They have build args for this now. Thus, you'd do something like:

    docker build --build-arg OAUTH_TOKEN=blah -t example .
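On the Dockerfile side, the arg has to be declared before it can be consumed. A sketch (the repo URL and paths are made up):

```dockerfile
FROM alpine:3.3
RUN apk add --no-cache git
# Declare the build-time argument; the value comes from --build-arg
ARG OAUTH_TOKEN
# Consume it directly in a RUN step
RUN git clone https://oauth2:${OAUTH_TOKEN}@git.example.com/private/repo.git /src
# DON'T do this -- ENV would bake the secret into the image metadata:
# ENV OAUTH_TOKEN=${OAUTH_TOKEN}
```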


I think you just proved my point. We're all of us running around with our pants down because we think Docker is taking care of this stuff but it's merely a bunch of features that look like they should be fit for that purpose but aren't.

And this is why I'm stuck with separate build and package phases: I have to have that separation between the data available at build time and what ends up shipped. But even there I'm pretty sure I'm making mistakes, due to some of the design decisions Docker made thinking they were helping but that actually made things worse.

For instance, there's no really solid mechanism for guaranteeing that none of your secret files end up in your docker image, because they decided that symlinks were forbidden. So I have to maintain a .dockerignore file and I can never really be sure from one build to the next that I haven't screwed it up somehow. Which I will, sooner or later.

I'm always one bad merge away from having to revoke my signing keys. It's a backlash waiting to happen.


I don't see how that's a compelling argument at all.

All that's keeping you from committing your credentials is a .gitignore file. They have the file, it works reliably, don't worry about it.


You should know there was a pretty big bug fixed in .dockerignore in just the last release. [edit] That bug was in the logic for white-listing files, which is generally the safest way to keep from accidentally publishing things (that is, if it works).

And a similar issue may still exist in docker-compose; that one is still open.

.gitignore keeps me from checking my files into git, but it doesn't keep me from publishing them in a docker image. So now I have a second way to screw up.


Can you link to this bug? I thought .dockerignore specifically didn't allow whitelisting and only allowed for blacklisting files that weren't to be included.

Are you saying that docker would include files that should have been excluded by .dockerignore? I'd be interested to learn more. Thanks in advance.


You could probably whitelist with a .dockerignore like

    *           # exclude everything
    !README.md  # include the README.
    !run.sh     # include the initiation script
You would want to check exactly what the globbing rules are for the .dockerignore file, though. I don't know whether '*' will catch .dotfiles, for instance.

  https://docs.docker.com/engine/reference/builder/#dockerignore-file
  https://golang.org/pkg/path/filepath/#Match


That's it, thanks for following up in my absence.

There are a couple of frameworks where all of the production files end up in, say, /dist and one other directory. Rather than having to constantly blacklist everything, you just say "ignore everything except X and Y."


I'm sorry, things got hectic and I bailed on the discussion. I thought I had a handy link to the bug I was thinking of, but I couldn't find a back-link from the issue I'm watching to the one in docker/docker.

I think but am not 100% certain this is the issue I was thinking of, but it seems the most likely, and it was just fixed in 1.10: https://github.com/docker/docker/issues/17911

Some day I'm sure .dockerignore will be solid, but my confidence level isn't high enough yet (it's getting there) to rely on it.

My point was that there are other ways that directory structures and what's visible to COPY could have played out where vigilance is less of a problem. It's usually immediately obvious when a file you actually needed is missing from a build, but much less obvious whether a file you categorically did NOT want to be there is actually absent.

Because the system runs in one of those scenarios and dies conspicuously in the other.


From the horse's mouth:

  The build-time environment variables were not designed to handle secrets. 
  By lack of other options, people are planning to use them for this. 
  To prevent giving the impression that they are suitable for secrets, 
  it's been decided to deliberately not encrypt those variables in the process.


How would they "encrypt" them that wouldn't be trivial to undo?

I think people aren't concerned about it because it doesn't make sense to try to put secrets into container images. Whatever you're using to deploy your Docker containers should make those secrets available to the appropriate instances at runtime. This is how Kubernetes handles secrets and provides them.

http://kubernetes.io/docs/user-guide/secrets/

(For example, what if you have two instances of a service and they need to have different SSL certs? Are you going to maintain two different containers that have different certs? Or would you have a generic container and mount the appropriate SSL cert as a volume at runtime?)
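Per the linked Kubernetes docs, the runtime-mount approach looks roughly like this pod spec sketch (all names here are made up):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-a
spec:
  containers:
  - name: web
    image: mycorp/web:latest      # one generic image for both instances
    volumeMounts:
    - name: ssl-cert
      mountPath: /etc/ssl/private # cert appears here at runtime only
      readOnly: true
  volumes:
  - name: ssl-cert
    secret:
      secretName: site-a-cert     # swap for site-b-cert on the other instance
```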


I've actually read that. For context, it's a comment made before the feature was complete. Said feature, according to the manual, doesn't persist the value, so it's probably suitable for passing a build-time secret.

From my testing though, as long as you set the build-arg and consume it directly, it doesn't seem to persist. That said, it's super easy to fuck that up if the tool you consume it with then goes on to save the secret somewhere.

Thus it's no doubt best to use expiring tokens or keep your build separate. Also, don't use it to seed a runtime secret; that'd force you to treat the image itself as a secret.


I linked to that because it cross references to the PR where the build-args feature was added. If they're out of sync that's 1) news to me and 2) confusing and should be fixed.

I think one of the things we're seeing is that Docker is opinionated, a number of powerful dev tools and frameworks are also opinionated, and us poor developers are stuck between a rock and a hard place when those opinions differ.

For instance I'm still not clear how you'd use the docker-compose 'scale' argument with nginx. Nginx needs to know what its upstreams are, and there's IIRC still an open issue about docker-compose renumbering links for no good reason, and some Docker employee offering up how that's a feature not a bug. I could punch him.

Single use auth tokens and temporary keys sure would fix quite a few things, to be certain, but those opinions keep coming in and messing up good plans :/


I'm not sure we should really be having a go at them for what's in their GitHub discussions versus what's in their documentation. I'd presume the documentation is canonical; I'd rather they weren't muting their discussions just to stay consistent.

That said, as I said previously, --build-args are dangerous; it's trivially easy to store and then publish a secret, so it makes sense they weren't jumping for joy about implementing it. I'd say it's needed, though, and thus it's now a thing.


The two most recent technical previews for Windows Server support containers natively. You don't need a VM to run containers on Windows.


It supports Windows containers. You still need a Linux VM to run Linux containers.


Docker does not use LXC. It's a separate project. LXC is similar, but has gone in a different direction.


> Docker uses LXC containers.

Nope, we've been using our own implementation of a container runtime for 2 years (libcontainer). LXC is not supported anymore and it was always a hacky execdriver.

> In Linux, these aren't VMs and are light weight user-land separations that use things like cgroups and lots of really special kernel modules for security.

They're kernel-space separations, since the kernel understands namespaces (though it doesn't understand the concept of a container, and some things aren't namespaced).

> Unfortunately, this means Docker only runs on Linux .. not even Linux...special Docker Kernel Linux (all the features they need are in the stock Kernel tree, but it's still a lot of modules).

Almost all modern distros have support for all of the modules required to run Docker.

> In Linux, you can just go to localhost. I _think_ FreeBSD has native Docker support with some custom kernel modules. I'm not sure...I've only looked at the Readme. I haven't tried it.

FreeBSD is not supported as a daemon.

> So even in Windows/Mac, all your containers do run in one VM (where as with traditional stuff you mentioned, you'd need a VM for each thing).

Actually, recent versions of Docker can run as a daemon on Windows using some proprietary features I don't care about.

> Docker containers are meant to handle one application (that it runs as root within its container as the init process ... cause wtf?).

All machines have a single process running as root as the init. You can run a proper init inside your container (in fact it's recommended), and run many processes inside the same container. It's discouraged for scalability reasons to stuff your database and front-end in the same container because then it's hard to spin up more than one front-end connected to the same backend.
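One way to run a proper init as PID 1 is sketched below (dumb-init is used here just because it's pip-installable; tini is another option, and the app command is hypothetical):

```dockerfile
# Run a real init as PID 1 so signals are forwarded and zombies get reaped
FROM python:2.7
RUN pip install dumb-init
ENTRYPOINT ["dumb-init", "--"]
CMD ["myapp"]   # hypothetical application command
```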

> In my work with Docker, I'm not sure how I feel. LXC containers have had a lot of security issues. Right now, Docker doesn't have any blaring security holes and LXC has increased security quite a bit.

Again: Docker doesn't use LXC and hasn't for quite a while. In addition, Docker has default SELinux, seccomp, and AppArmor profiles that increase security (seccomp lets us disable syscalls that aren't namespaced). There is a concern on the kernel side: they don't appear to care about going the Zones or Jails route, i.e. actually making the kernel aware of containers so that it can properly namespace things.

> After a while you get a shit ton of images that just waste space you're not using. CoreOS prunes these at regular intervals. A docker command to do this is still a Github issue. Writing one yourself with docker-py is horribly difficult because of image dependencies).

ahem:

    docker images | awk '/^<none>/ { print $3 }' | xargs docker rmi

Sure, it's not a single command but it isn't impossible to do and doesn't require docker-py. Besides, you should be using engine-api.

> Oh and images. Docker uses images to build things up like building blocks. That's a whole thing I don't want to go into, but look it up. It's actually kind of interesting and allows for base image updates to fix security issues (although you still need to rebuild your containers against the new images ... I think...I haven't looked into that yet).

There are also tools like zypper-docker that allow hot-patching of images.



