Distroless: Language focused docker images, minus the operating system

macrael · on Oct 6, 2017

Minimal Docker images feel like the Right Choice for deployment artifacts. For a project I worked on recently, that minimal image was a standalone Go binary, with nothing else in the image at all. But I was not sure how minimal I could get for a Python project I was working on. Having standard base images for different languages will be great.

yokaze · on Oct 7, 2017

Funnily enough, I found it rather paradoxical (if you are developing in-house).

You choose a base, and that one is shared across your projects, so the size doesn't matter much within reason. Say you go for Debian: you count 50MB of download and 100MB of disk space once across all your containers / apps, thanks to layer sharing. Your stuff then goes on top, potentially unshared (say, multiple applications).

Which brings me to the next contradiction: you then go for a Go binary, which by itself easily weighs in at over 100MB of binary code. Unshared. Any advantage of a Go binary's self-containment (being statically linked) becomes more of a liability, since you get the same functionality you gain from Docker images (self-containment), without the sharing of layers.

If you are focusing on delivering it to random people, who might care about having to install 100MB on top of whatever you are delivering, then I could see your point, but then you could just distribute the Go binaries themselves.
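To make the sharing arithmetic above concrete, here is a rough sketch. The function name and the input sizes are only illustrative (the 100MB figures are the ones claimed in this comment, not measurements), and it ignores compression and registry-side storage.

```python
def total_disk_mb(base_mb, app_mbs, shared_base=True):
    """Approximate disk used on one host by images built on the same base.

    base_mb: size of the base layer on disk (hypothetical input).
    app_mbs: sizes of the per-application layers stacked on top.
    shared_base: True if the host stores the base layer only once.
    """
    if shared_base:
        # The base layer is stored once; only the app layers add up.
        return base_mb + sum(app_mbs)
    # Every image carries its own copy of the base.
    return sum(base_mb + app for app in app_mbs)


apps = [100, 100, 100]  # three 100MB Go binaries, per the claim above
print(total_disk_mb(100, apps, shared_base=True))
print(total_disk_mb(100, apps, shared_base=False))
print(total_disk_mb(0, apps))  # "FROM scratch": no base layer at all
```

With these inputs the shared base adds 100MB once, while the unshared application layers account for 300MB either way, which is the point being made: the base stops mattering once the per-app payload dominates.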

pebers · on Oct 7, 2017

Most Go binaries aren't over 100MB unless they have a bunch of data packed in them. 10-15MB seems more usual to me unless it's a particularly large project.

BraveNewCurency · on Oct 8, 2017

> You then go for a Go binary, which by itself easily weighs in at over 100MB of binary code.

Hah, I'd love something that small. Right now, we have dozens of Node.js apps, each one needing over 500MB of npm libraries.

mattomata · on Oct 7, 2017

Right, "nothing else" = "FROM scratch".

In the talk, I call this "FROM scratch -- for the rest of us."

That's a half-truth, because this also tries to set up a minimal compatible environment for common cases, e.g. it sets up ca-certs, a /tmp directory, and an /etc/passwd file.
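For anyone wondering what "a minimal compatible environment" might look like on disk, here is a small sketch that stages those three pieces into a directory. This is not how the distroless images are actually built; the function name, the passwd entries, and the empty CA bundle are assumptions for illustration. The bundle path is the usual Debian location.

```python
import os
import tempfile


def stage_minimal_rootfs(root, ca_bundle=b""):
    """Create /tmp, /etc/passwd and a CA bundle under the directory `root`.

    Hypothetical helper: the passwd lines and bundle contents are
    placeholders, not what any real base image ships.
    """
    os.makedirs(os.path.join(root, "etc/ssl/certs"), exist_ok=True)

    # /tmp should be world-writable with the sticky bit set (mode 1777).
    tmp_dir = os.path.join(root, "tmp")
    os.makedirs(tmp_dir, exist_ok=True)
    os.chmod(tmp_dir, 0o1777)

    # Many runtimes look up the current user, so provide a tiny passwd file.
    with open(os.path.join(root, "etc/passwd"), "w") as f:
        f.write("root:x:0:0:root:/root:/sbin/nologin\n")
        f.write("nobody:x:65534:65534:nobody:/nonexistent:/sbin/nologin\n")

    # TLS clients need trusted roots; the caller supplies the PEM bundle.
    with open(os.path.join(root, "etc/ssl/certs/ca-certificates.crt"), "wb") as f:
        f.write(ca_bundle)
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        stage_minimal_rootfs(d)
        print(sorted(os.listdir(d)))
```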

macrael · on Oct 9, 2017

That makes a lot of sense. How did you decide where to draw the line on what to include and what to leave out?

marmaduke · on Oct 7, 2017

How is a single go bin in an image better than that bin by itself?

macrael · on Oct 9, 2017

Fundamentally it's not, but because Docker is a standard, you can throw that anywhere docker images can be consumed and run it. So, for instance, I was deploying to GKE which was easy to do. My docker deployment image was tiny (just the size of the binary) and I could run it anywhere.

maslam · on Oct 6, 2017

Be oh so careful. We ran into issues with alpine node images that would only show up in minimal images like that. Switched to Debian and never looked back.

jacobparker · on Oct 7, 2017

Was it because of musl libc? These images use glibc.

mattomata · on Oct 7, 2017

Yeah, that's exactly why we chose to start from debs.

jacobparker · on Oct 6, 2017

The images still have some "weird" stuff in them, like man pages, because they use deb packages from Debian for common packages. That's not core to the idea, though (and note that apt is not included inside the container).

catern · on Oct 6, 2017

This project is still reliant on Debian to build all the packages. This project merely selects a set of packages that Debian has built.

jacobparker · on Oct 7, 2017

They mention that in the video. It's an easy way currently to get built packages.

Bazel has plans on their roadmap (https://www.bazel.build/roadmap.html) to open-source rules for common packages ("Repository of BUILD files for third party OSS libraries open to the community") as a P2 for v1.0. This would presumably switch to using those when they're available.

Note that apt is not actually installed inside the container. Bazel just has a rule that knows how to unpack a deb.
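To illustrate why no apt is needed: a .deb is an ar archive whose data.tar.* member holds the files to install. The sketch below is not Bazel's rule, just a minimal stand-in with made-up function names. It only lists the payload and relies on whatever compression Python's tarfile can auto-detect.

```python
import io
import tarfile

AR_MAGIC = b"!<arch>\n"


def iter_ar_members(data):
    """Yield (name, payload) for each member of an in-memory ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    offset = len(AR_MAGIC)
    while offset + 60 <= len(data):
        header = data[offset:offset + 60]
        if header[58:60] != b"`\n":
            raise ValueError("bad ar member header")
        # Name is 16 bytes, space padded; GNU ar may append a "/".
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        # Size is a decimal string in bytes 48..57 of the header.
        size = int(header[48:58].decode("ascii").strip())
        start = offset + 60
        yield name, data[start:start + size]
        # Member data is padded to an even length.
        offset = start + size + (size % 2)


def list_deb_files(deb_bytes):
    """Return the file names stored in the data.tar.* member of a .deb."""
    for name, payload in iter_ar_members(deb_bytes):
        if name.startswith("data.tar"):
            with tarfile.open(fileobj=io.BytesIO(payload), mode="r:*") as tar:
                return tar.getnames()
    raise ValueError("no data.tar.* member found")
```

Calling `list_deb_files(open("some.deb", "rb").read())` on a real package (the file name here is a placeholder) would show the paths that end up in the image layer once the tarball is unpacked.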

catern · on Oct 7, 2017

It's not just "an easy way to get built packages". It makes this project trivially easy. No need for Bazel, you can do what this project does with a small shell script.

I'm a bit disappointed to see Google releasing a project which ultimately is nothing more than a helpful API around Debian, and claiming it's some exciting new thing "minus the operating system". The README doesn't credit Debian at all.

dlor · on Oct 7, 2017

We should definitely credit Debian more in the readme, but note that the existing package manager rules are actually a bit decoupled from the distroless images themselves, via bazel.

We happen to build these base images with the Debian rules, but we plan to add support for more package managers soon.

You'll be able to start from our base image and install packages via yum/dnf/nix/whatever, or start from a different, more standard base image and install package via bazel.

Disclosure: I'm one of the TLs working on this project.

catern · on Oct 7, 2017

You don't find it a bit odd that a project built using distribution packages, with distribution package managers, with (eventually) a choice between which of several distributions to use, is called "Distroless"?

It's also pretty weird if you do start to "support" Nix, considering Nix already is capable of building the same kind of distroless Docker images on its own, in a much more rigorous way. That is, Nix tracks and builds the entire dependency tree instead of just using existing binaries, and has a uniform system for expressing dependencies on components. This allows, for example, using multiple different languages in the same container, which it doesn't look like Distroless can do?

The way that this project would be genuinely interesting is if you were actually building the system from scratch with Bazel, rather than using existing black-box Debian binary packages. That would be like what Nix and Guix are already capable of, but it would be interesting to get some competition in that space from a different class of tool. Of course, I don't know if Bazel is even capable of doing that and producing an image; maybe a large opaque "distroless" base runtime is the finest level of dependency resolution it's capable of in this area?

yjftsjthsd-h · on Oct 6, 2017

I expect that makes it easier to maintain

earlybike · on Oct 7, 2017

Docker already recommends the tiny 5MB Alpine distro as the default for all containers, and they hired Alpine's creator, Natanael Copa[2]. Alpine is minimal but still has an awesome package manager[1], is maintained/proven/solid and provides a great UX as a container OS.

So what is the advantage of distroless vs Alpine besides the 5MB? It feels a bit like reinventing the wheel, or I missed something.

[1] https://pkgs.alpinelinux.org/packages

[2] https://www.brianchristner.io/docker-is-moving-to-alpine-lin...

mattomata · on Oct 7, 2017

Alpine is built around musl libc, which has numerous compatibility differences from a traditional glibc: http://wiki.musl-libc.org/wiki/Functional_differences_from_g...

Some folks solve this by adding glibc to Alpine (IIUC this is what Envoy is building upon).

It has a package manager, but it is far from as comprehensive. The security database is still essentially an experiment with much less richness than Ubuntu, Debian, Redhat, ...

If what you want is a package manager, you probably want minideb from the Bitnami folks.

earlybike · on Oct 7, 2017

> numerous compatibility differences

I never experienced any.

> It has a package manager, but it is far from as comprehensive

Still better than the missing package manager of a distroless container (this was the comparison). However, I think it's quite good.

> The security database is still essentially an experiment with much less richness than Ubuntu, Debian, Redhat, ...

Do you have some sources proving it's an experiment?

mattomata · on Oct 7, 2017

When I reached out to ncopa to report issues in their feed, he responded:

> the secdb has so far been an experiment, but seems like people are actually using it, so I should set up some proper automated testing.

I doubt there is a better source :)

earlybike · on Oct 7, 2017

Thanks, do you have any link?

mattomata · on Oct 7, 2017

Nope, it was over email.

I can point you at the various fixes for things I've reported since this first became available, but given your skepticism I'm sure it would not help since he seems to exclusively use the changeset description: "[add] various fixes" with no attribution.

Here's the link for that last bit: https://git.alpinelinux.org/cgit/alpine-secdb/log/

¯\_(ツ)_/¯

alexnewman · on Oct 6, 2017

Bazel is the worst part of google. It ruined tensorflow as well

CHANCECHANEL · on Oct 6, 2017

Could not agree more. Let's install Java to build something that does not need Java at all. They could have used Go, Python, hell, even something like CMake.

jitl · on Oct 6, 2017

This seems like a packaging problem. In Java 9, it's possible to produce a single binary containing the JRE and application code; same way as go producing a single binary containing the Go runtime and application code.

> Python

Then you have to install the right version of python, and you get even more of a nightmare than a Java install. At least the Java ecosystem doesn't suggest a separate tool (eg, virtualenv or similar) to support things like JAVA_HOME.

This applies equally to any non-statically-compiled application distribution; I don't see CMake files + some specially blessed CMake version that you also have to install as coming close to that.

skybrian · on Oct 7, 2017

It's not immediately useful, but you might find this interesting:

https://github.com/google/skylark

austinjp · on Oct 7, 2017

I may be missing something here, since I've not heard of Skylark (the Python variation itself, not Google's Go-based interpreter for Skylark), but my brief searching suggests it's abandoned..?

skybrian · on Oct 8, 2017

Skylark is the name of the build language used by Bazel, Google's build system. The main version is written in Java, but isn't available separately from Bazel as far as I know. Definitely not abandoned; most teams at Google use it.

The Go version is new and I don't know what it's used for.

alexnewman · on Oct 7, 2017

That is so f! Cool!!! I have been looking for this!

steren · on Oct 7, 2017

Are you on Windows? If so, Bazel now bundles the JDK so you should not have to worry about Java.

macrael · on Oct 9, 2017

How did it ruin Tensorflow? My interaction with it has been largely that it is a good idea but too new to be relied upon.

nikolay · on Oct 6, 2017

Despite being distroless, it's still a larger base image than many distro-based ones.

mattomata · on Oct 7, 2017

The video explains why. The goal isn't small, it's minimal.

By pursuing small at the expense of compatibility we got things like musl libc (which has tons of compat issues).

If all we wanted was small we could have started from musl libc and unpacked alpine packages the way we unpack debs.

nikolay · on Oct 13, 2017

There are more minimalistic (and smaller!) and more compatible images out there, though. So, we get the goal of distroless, but a good execution of it is yet to be seen!