In terms of the daemon model of Docker, I guess it does look a bit complicated, and is not explained very well.
In production you will do `docker run -d nginx`, not run it in the foreground, so the client (docker) process is not really in the picture. If you run in the foreground, it is just there to stream the standard IO, and so you can kill the process with ^C from the shell.
The docker daemon (dockerd) is there to listen for new requests, but since 1.11 it no longer runs containers. Since 1.12 you can restart it without killing your containers (with the right config option), so you can e.g. do a daemon upgrade without downtime (see https://docs.docker.com/engine/admin/live-restore/ ). It is still handling some things, e.g. logs, so it is best if it does restart.
The process that actually runs containers is containerd. This is a very simple daemon with a gRPC socket interface. It uses runc (the OCI standard runner), but runc does not stay running; only a small process called containerd-shim does, which is there to act as a parent for the actual container process, so that containerd can be restarted.
You can use containerd as a runtime, with runc containers, but runc is not that user friendly. You can use https://github.com/jfrazelle/riddler to get something you can run from a docker container. You could also use runc from systemd if you want. However, runc doesn't do a lot of setup, e.g. the layered filesystem handling is all part of how dockerd sets things up for runc, so you would have to do that yourself if you don't want to waste a lot of disk space.
It does sound a bit complicated, but it is just separation of concerns and breaking up the once monolithic docker binary into a client and a set of servers that all do smaller tasks and which can be restarted independently.
Such structures can be hard to document with both clarity and brevity. Take a look at how Wietse Venema describes the elements of Postfix, for a nuanced masterclass in the art. http://www.postfix.org/OVERVIEW.html
Both the architecture and the approach to documenting it were pioneered by Bernstein's qmail, which is the first Unix program of comparable ambition to be structured in this way --- it's crazy to think that there was a time when this implementation strategy was groundbreaking, but, it was.
(Fun fact: Venema and Bernstein had a long-running feud, and Postfix exists pretty much entirely because Venema appreciated qmail's architecture but couldn't stomach working with anything Bernstein produced.)
Julia writes: "I think "violates a lot of normal Unix assumptions about what normally happens to normal processes" is basically the whole story about containers."
This is a key point. Lots and lots of standard Unix invariants are violated in the name of abstraction and simplification, and the list of those violations is not popularized; and most of the current systems have different lists.
For example, in Kubernetes (my current love affair), the current idea of PetSets (basically, containers that you want to be carefully pampered, like paxos members, database masters, etc. -- stuff that needs care) /still/ has the notion that a netsplit can cause the orchestrator to create (1 .. #-of-nodes) exact doppelgangers of your container, all of which believe they are the one true master. You can imagine what this means for database masters and paxos members, and that is going to be, as the kids say, surprising af to the first enterprise oracle db admin who encounters this situation.
If you believe in containers, then one thing that you really do have to get to, is that most of your existing apps should not be in them yet, and that if your app is not (a) stateless (b) strongly 12-factor (c) designed for your orchestrator and (d) written not to do things like fork() or keep strong references to IP addresses, then you should probably wait 3-4 years and use VMs in the meantime.
Oracle has had multi-homed master-master RDBMS setups for > 10 years. I'm pretty sure a half-competent Oracle administrator wouldn't be really 'surprised af' at functionality that's been in Oracle for at least a decade.
For things that need 'care', this has been a solved problem for decades. Banks[0] homed in the WTC on Sept 11 kept on running because OpenVMS has had NUMA clusters and multi-node replication since the DEC Alpha days. This is with 100% transactional integrity maintained and DC failovers measured within the order of 500ms to 5s. (Obviously banks don't all run on VMS.)
Platforms like IBM z Systems let you live upgrade z/OS in a test environment hosted within the mainframe to see if anything breaks, in complete isolation from production of course, revert snapshots, and do basically everything the whole ESX suite does (from things like VMotion live migrations to newer stuff like growing RAID arrays transparently / virtual storage solutions where you can add FC storage dynamically and transparently to the end user). Their stock systems let you live upgrade entire mainframes without a blip. They're built to withstand total system failure (i.e. literally processors, RAM, NICs, and PSUs could all fail on one z13 and you'd have fail-over to a hot backup without losing any clients attached to the server). HP's NonStop, with which I have no experience, offers a similar comprehensive set of solutions.
[0] On Sept 11, a bunch of servers went down with those buildings.
* “Because of the intense heat in our data center, all systems crashed except for our AlphaServer GS160... OpenVMS wide-area clustering and volume-shadowing technology kept our primary system running off the drives at our remote site 30 miles away.” --Werner Boensch, Executive Vice President, Commerzbank, North America*
http://ttk.mirrors.pdp-11.ru/_vax/ftp.hp.com/openvms/integri...
I'm saying that an arbitrary number of exact replicas of a master can magically appear on the network believing they are the one true master, identifying themselves as such, and expecting to act that way. Additionally, an arbitrary number of database masters expecting to participate in the cluster may show up or leave at any time. That is somewhat nontrivial for even modern databases to deal with.
Why run your database inside kubernetes though? We've always white gloved our database (and a few other special services). You don't have to put 100% of your infrastructure in docker/kubernetes.
If you're running multiple copies of anything that cares about the concept of a master it better have its own consensus algorithm. Luckily such things exist and are open source.
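To make the consensus point a bit more concrete, here is a minimal sketch of the usual majority rule: a replica only acts as master while it can reach a strict majority of the configured cluster, so at most one side of a netsplit can qualify. The function names are hypothetical and this is not how any particular database or Kubernetes implements it. It also assumes a fixed, known cluster size, which is exactly what arbitrary joins and leaves undermine.

```python
def has_quorum(reachable_members: int, cluster_size: int) -> bool:
    """Return True if this node sees a strict majority of the cluster.

    reachable_members counts this node itself plus the peers it can reach.
    """
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    return reachable_members > cluster_size // 2


def may_act_as_master(believes_master: bool, reachable_members: int,
                      cluster_size: int) -> bool:
    """A node that thinks it is master must also hold quorum before acting."""
    return believes_master and has_quorum(reachable_members, cluster_size)
```

With a 5-member cluster split 3/2, only the side that sees 3 members passes the check; in an even 2/2 split of a 4-member cluster, neither side does.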
I think Kubernetes does a good job creating a normal "Unix process environment".
The Pod concept allows for:
- Container processes share localhost, mount points, etc
- Providing a "normal" IP address that is routable
- Ensuring a PID1 can monitor the group of processes (as done by rkt integration)
- Allowing for normal POSIX IPC (signals, etc)
As for PetSets I do agree that they need more work to support things that are replicated but not cluster aware. It doesn't magically solve the issues of distributed systems. Also, natively cluster aware things might be better served by controllers. See this demo of an etcd controller:
It definitely does better than many of the rest, in my experience, and for sure it has better defaults and chooses its violations carefully and generally wisely. In fact, I wrote the first draft of a paper on this specific topic:
Having been inside Google when Docker started to get big, there's a really simple explanation for all of this:
Kubernetes is a well designed descendant of a well-designed API with pretty specific tradeoffs for distributed systems (that mostly still work at the small scale).
Docker is a reverse-engineered mishmash of experiments attempting to replicate the same ancestor. Things like the horrible network abstraction layer - Google had the advantage of being able to move all their apps to a well understood naming scheme, rather than treating IP addresses as immutable. That any app does this is technical debt, but it worked for a long time. Now it doesn't.
Docker has tried to fix these things by wrapping them, not fixing the underlying debt. That only ever accumulates more debt, and rarely even provides the stopgap solution that is required. It's an admirable effort, and they've done a fantastic job - but a fantastic job at a fool's errand is still not behavior to emulate.
It seems that isolation is frequently the cause. E.g.:
* Better developer environment. Actually, I'm not sure anymore. It totally makes sense for testing (all the CI/CD stuff), and - thanks to the packaging aspect - it's easy to set up external dependencies (like databases), but I just wasn't able to grasp how the actual development is better with Docker. Developers tinker with stuff, containers and images are all about isolation and immutability, and those stand in one's way.
* PID1. Obviously, isolation is the cause for this. With `--pid=host` it's gone, but no one does that, probably because of the nearly complete lack of UID/GID management, and thus the security drawbacks. I guess it has roots in the "all hosts are the same" idea, as UID/GID have to be a shared resource and they're harder to manage than just spawning things into a new PID namespace so processes won't mess with each other.
* Networking. Yes, as it was pointed out, it makes sense due to port conflicts, but usually it's an inferior, over-complicated version of moving port numbers to environment variables. Instead of binding your httpd to [::]:80 and setting up port mapping, bind it to [::]:${LISTEN_PORT:-80}. All the same stuff, but - IMHO - much more straightforward. Sure, there are (somewhat unusual) cases where a separate network namespace is a necessity (or just a good thing), but I don't think they're all that common.
So, I think, the question is also: is there a need (and if so, why) for isolation in the way Docker does it? Doesn't the way it does it unnecessarily complicate things?
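For what it's worth, the `${LISTEN_PORT:-80}` idea from the networking point above looks roughly like this inside an app. This is only a sketch: `LISTEN_PORT` is the variable name used in the comment, and the helper names are made up for illustration. Like the shell form, it falls back to the default when the variable is unset or empty.

```python
import os
import socket


def listen_port(default: int = 80) -> int:
    """Return the port from LISTEN_PORT, or default when it is unset or empty."""
    raw = os.environ.get("LISTEN_PORT", "")
    return int(raw) if raw else default


def open_listener(port: int) -> socket.socket:
    """Bind an IPv6 TCP socket on all addresses ([::]) and start listening."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("::", port))
    sock.listen()
    return sock
```

Something would then call `open_listener(listen_port())` at startup, and whoever launches the process decides the port by setting the variable.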
Developer environment/experience is vastly better in my opinion.
All of our dev environments are docker images. Setting up a machine for a developer is install source control, IDE & docker, then pull the latest dev image and they are done. Pre-docker it was several pages of documentation and tracking down various coworkers to make sure you installed&configured things correctly. While yes, scripts helped, people always forgot to update something in the script and didn't notice until someone needed to install the dev environment. The immutability forces people to actually update the dockerfiles with the new dependency/tool/config as that is the only way to do it.
Developers tinker with code, but most of the time you don't tinker with the output of that code, like hot patching your binaries or whatever. Same with systems: you build a container from a Dockerfile and maybe a Makefile, and you don't then go and change a few things; you change the source code. We are just pushing the immutability boundaries further and getting more reproducible environments as we do it.
It depends on the project, I guess. Sometimes, it's not that easy.
For scripting languages that don't have a compile step, the code is what gets executed. So with Docker there's either the necessity to rebuild the container (extra delays, and quite noticeable ones) or the necessity to maintain a separate Dockerfile.dev and bind-mount the code into the container a-la Vagrant.
Even for compiled stuff, it can be a nuisance with that "Sending build context to Docker daemon" phase. Like when you have a fair chunk of artwork assets next to the code. And the advantage of having the intermediate compiler results is also either lost (adding extra build time) or requires extra tricks to make things smooth and nice.
And either way, it also means extra work setting up your debugger toolset to jump over the isolation boundaries so you can dig into live processes' guts. One's probably going to abandon PID space isolation.
Those consequences are quite rarely mentioned when the immutability aspects of Docker are advertised. It's usually told as "you'll have a reproducible environment" (yay! great!) but never "you may lose that heartwarming experience of having a new build ready to be tested while you switch from the editor to the terminal/browser/whatever window".
You can debug from the host or from another container using `--pid=container:id` which puts you in the process namespace of a running container.
Build time is important; if you can use build layer caching it helps a lot, but how to structure it depends on your project. I don't use Dockerfile.dev myself, but I do sometimes mount the code into the container to build and run it directly. I think more blogs and examples of how to do these things would definitely help, as there is a lot of room for improvement.
At scale a single host can be running maybe 20 containers, and port collision becomes a real problem. So imagine if a container opened a port directly on the host: we have to be careful that they don't step on each other's toes.
Even if all containers used some sort of contract about which port they are going to use, there are all sorts of corner cases waiting to happen, such as ephemeral ports (the port you bind to when you connect externally) taking over a port that a real server app expects to use.
I have seen two approaches being used to solve this problem:
1. Using Smartstack (http://nerds.airbnb.com/smartstack-service-discovery-cloud/), the applications running inside a container can run on any port, but the port on which they are externally available is decided by the orchestration service. Typically, no one talks to the application inside the container directly; they go through the haproxy configured on localhost. The advantage is that smartstack can remove a service if it is failing its healthcheck etc.
2. The kubernetes/openshift approach of software-defined networking (https://github.com/coreos/flannel). Although they also integrate with load balancers, so that is not the only way.
I know that if someone is just getting started with containers, it seems a bit overwhelming to digest all this. But having worked in some large companies which are using containers at scale, it kinda makes sense.
The default setup of Docker does not make any assumptions about the host setup, so it assumes it might only have one IP address, so there is only one set of ports.
It is perfectly ok if you have lots of IPs to put routed IPs on the `docker0` bridge, and never use port publishing at all, or to use some of the other optional setups, such as the new macvlan and ipvlan setups https://github.com/docker/docker/blob/master/experimental/vl... which are the kind of production setups you may want if you run your own networking. But Docker cannot assume anything about the network setup in the default configuration, hence the use of published ports, which is kind of inconvenient but always works in any environment.
> Using smartstack the applications running inside container can run on any port but the port on which they are externally available is decided by orchestration service
AFAIK this is not the case: with smartstack you configure your application to listen on some port, and you configure nerve to register <this machine's ip>:<the app's port> under <service name> in zk/etcd/whatever. You have to do both of these things; smartstack itself doesn't configure your app for you or make sure that nerve and your app agree on the port.
I'm not aware of any such efforts to make smartstack aware of orchestration yet, but I haven't gone looking recently.
It'd be entirely possible to make the app and nerve aware of network orchestration: have the orchestration layer 1. pick a port, 2. tell the app about it, and 3. configure nerve accordingly. In smartstack, each instance of <service name> can have different port numbers, so they could all be arbitrary high-numbered ports.
edit: You could also just have your orchestration layer inform the service discovery layer itself (in the case of smartstack, by writing the app's ip:port to zk/etcd/whatever). nerve contains some local healthchecking (sanity checking that the app is up before registering it) and maybe a few other things, however, so I think if you're already using / considering smartstack it would make sense to keep using nerve instead of having orchestration do it.
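A rough sketch of the three steps described above (pick a port, tell the app, configure registration) might look like the following. Everything here is hypothetical: the function names, the `LISTEN_PORT` variable, the port range, and especially the shape of the registration record, which is not nerve's actual config format.

```python
from typing import Dict, Set


def pick_port(used: Set[int], low: int = 20000, high: int = 30000) -> int:
    """Return the first port in [low, high) that is not already assigned."""
    for port in range(low, high):
        if port not in used:
            return port
    raise RuntimeError("no free port left in the configured range")


def plan_instance(service_name: str, host_ip: str, used: Set[int]) -> Dict[str, dict]:
    """Pick a port, then derive both the app's environment and the
    service-discovery registration from that single choice."""
    port = pick_port(used)
    used.add(port)  # remember the assignment so the next instance gets another port
    return {
        # Step 2: what the orchestrator would pass to the app.
        "app_env": {"LISTEN_PORT": str(port)},
        # Step 3: what it would hand to the registration side (hypothetical shape).
        "registration": {"service": service_name, "host": host_ip, "port": port},
    }
```

The point of deriving both outputs from one place is that the app and the registration can no longer disagree about the port, which is the manual step described above.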
"Installing stuff on computers so you can run your program on them really sucks. It's easy to get wrong! It's scary when you make changes! Even if you use Puppet or Chef or something to install the stuff on the computers, it sucks."
I think a lot of people feel this way. I think that fear is born of ignorance, and we should fix that.
Let's say you are working on an application in NewPopularLanguage 2.3.1, using CoolFramework version 3.3. Your Linux distro ships NPL 2.1.7 and CF2.8, which don't support a really nifty feature that you would like to have.
Important questions to ask: what is the distro's support record? Do they have a dedicated security team? Is there significant support for NPL and CF in the distro, or just a single package maintainer?
If the distro's security and NPL packaging team are good, you might want to use their versions even if it means giving up use of the really nifty feature until sometime in the unknowable future. Making an explicit, considered decision is worthwhile.
But if you really need the new versions, you should use a repeatable build system that generates OS packages exactly the way you want them. You should put them into a local repo so that when you install or upgrade a new machine, you get the version you specify, not whatever has just hit trunk upstream. And you may want your versions to be placed in a non-(system)-standard location, so that your application has to specify the path -- but be guaranteed that you can install several versions in parallel, and use the right one.
It feels like a lot of overhead, but it can save you lots of debugging and deployment time. Once you have the infrastructure tools in place, using them is not much of a burden, and pays for itself many times over.
> But if you really need the new versions, you should use a repeatable build system that generates OS packages exactly the way you want them. You should put them into a local repo so that when you install or upgrade a new machine, you get the version you specify, not whatever has just hit trunk upstream. And you may want your versions to be placed in a non-(system)-standard location, so that your application has to specify the path -- but be guaranteed that you can install several versions in parallel, and use the right one.
Exactly. You have to be fucking careful. Or you can just use a container. That's his point.
> I think that fear is born of ignorance, and we should fix that.
Actually I think it's born from having a lot of experience of installing things and it being a total nightmare..
You're right, obviously we should stick to packaged versions of libraries whenever possible, but as you say, it is not always possible.
It's easy to aim at the low hanging fruit of specifying explicit version numbers when installing packages via Puppet, Chef, Ansible or Salt.
This should be common sense because if you build servers/containers at different points in time, it's possible to have 4-5 different versions of libxyz in use depending on when that instance was spun up.
However, if you're writing code in Ruby, Python, Node, Go, or even Java, you're using a version manager for the base interpreter (e.g. rvm, rbenv, conda, etc) because the distribution-packaged version is typically a year behind, or not present at all.
Then you're using the language's package manager (Rubygems, PIP, npm, go get, mvn) to install packages.
Then a lot of these framework maintainers are bundling the necessary libraries with their package for consistent builds (e.g. nokogiri on Ruby, libv8 on Ruby, etc).
You're also making the assumption that CoolFramework uses things like autoconf/automake (which generally have the reputation nowadays of being "bloated") for enabling consistent compilation across OS variants.
It's hard to maintain explicit versions in separate locations when a typical web application nowadays has at least 100 dependencies, and the typical web site has several components (the web app, a queue, a scheduler, maybe some separate workers).
This all sounds great in theory, but I feel it is very hard to maintain in practice with a fast moving ecosystem, which almost all of the above languages are.
Another issue with Docker: it does not interact well with process supervision (say systemd). The "docker run" process that you run with systemd is only a proxy for the real container process, which is started by the Docker daemon - so in reality, you have two init systems, Docker _and_ systemd. This means that many supervision features won't work (signals, seccomp, cgroups...).
cgroups, seccomp etc are set by docker so they do work. I think it is weird to view these as exclusively owned by the init process.
Docker works on systems without systemd (indeed, it runs on Windows), so relying on features that systemd has (currently, many are only recent additions) is not really an option.
I think these are good questions and I am interested in the answers. At least some of the answers are not obvious or not generally agreed by the experts, it seems.
You can use authorization plugins to control what commands are allowed.
However generally you don't give people access to run any docker command in production, you have some system that lets them deploy containers with predetermined settings, which don't include being able to set --privileged or add capabilities or change security policies.
In production, you usually don't want to have users running around and spawning containers anyway. You will likely have an orchestrator like Kubernetes or Mesos or Swarm, who will be running as root and spawning containers for you.
Of course, that just replaces the question of "how is access to the docker daemon secured?" with "how is access to the orchestrator API secured?".
How about something that isn't spawned by an orchestrator - for instance the marathon load-balancer - they provide only a docker image which clearly is meant to be run using `docker run`
You are right that docker currently runs as root. There is some phenomenal work that Aleksa Sarai is doing on getting runc to work as an unprivileged user [1] that Docker should be able to take advantage of at some point. There are still a lot of places which need love without root in the Docker world, but it's a huge step forward.
I would have thought that if you want to restrict developers from doing that, you would add commands to the sudoers file? Isn't that the whole point of sudo?
Correct me if I'm wrong, I'm interested to know? I'm currently looking into docker and the advantages it brings to deployment instead of using a VM.
Kubernetes is the answer to all of your questions.
You shouldn't directly use "docker run" in production. At least not yet.
Think of the docker binary and daemon as development tools not a production platform.
Develop your apps one process per container, microservice style. If you can't do that you should probably use vms.
When it comes time to deploy, kubernetes handles scheduling for you automatically across your fleet.
Kubernetes secrets can be mounted inside the containers so you don't leak them like you can with env vars.
Kubernetes will eventually support other runtimes like rkt. But this is abstracted away.
Kubernetes assumes a flat networking space, but this is taken care of with stuff like flannel.
You should probably use Dockerfiles to create containers in your build process. Packer can create them, but I would only recommend that way if you have other tooling that does that. Spinnaker can leverage that bake-centric stuff very nicely.
Maybe Docker networking gets more complicated later on, but for what I do with it I find it pretty easy and useful. Docker compose makes it pretty simple to control which ports get exposed on the host and which are limited to the docker network.
> My coworker told me a very surprising thing about containers. If you run just one process in a container, then it apparently gets PID 1?
That's true for Docker (and possibly rkt, I don't know) but not for LXC. Docker is intended to provide isolation for a single service/app, so having an init process (arguably) doesn't make sense. For LXC, it's more like a separate OS, so it does need an init process.
These two operating models are referred to as "application containers" and "system containers". It seems that the former is more popular for service deployment situations, but if you want a virtual dev environment / sandbox to play in, I would think the latter is a better choice.
So how do you handle what she addressed under secrets? How do you share passwords between containers? For quick and dirty stuff, I use environment variables that are set in my docker-compose file, but I have no experience running docker in production.
Kubernetes has first-class support for secrets: key-value lists that are stored on the API server, and can be mounted into a container as a directory of files (key -> filename, value -> content). See http://kubernetes.io/docs/user-guide/secrets/#security-prope...
How do you securely consume those secrets though? From everything I have seen with vault or consul, you end up with the secret as an environment variable that is then visible in a ps listing.
Kubernetes, for instance, bind mounts secrets by default on read-only in-memory filesystems (and on Red Hat systems, with unique SELinux labels) that disappear on reboot. You can of course use secrets in env vars if you want, since sometimes it is easier. The hard part is that a lot of handy public docker images use env by default, so you end up being tempted into env for convenience.
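On the consuming side, the "directory of files" layout (key -> filename, value -> content) can be read with a few lines. This is a minimal sketch; the function name is hypothetical and the mount path is whatever you chose in your pod spec.

```python
from pathlib import Path
from typing import Dict


def load_secrets(secret_dir: str) -> Dict[str, str]:
    """Map each regular file in secret_dir to its contents (filename -> value)."""
    values: Dict[str, str] = {}
    for entry in sorted(Path(secret_dir).iterdir()):
        # Skip subdirectories and dot-prefixed entries, which a secret mount
        # may use for its own bookkeeping rather than for keys.
        if entry.name.startswith(".") or not entry.is_file():
            continue
        values[entry.name] = entry.read_text()
    return values
```

Nothing ends up in the process environment this way, so the values do not show up in places where the environment can be inspected.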
And does that Docker instance need a token to read the password out of key/value store somewhere? How then do you securely distribute the token? It seems like that would just be pushing the problem elsewhere.
Also I am assuming that something is preventing that tmpfs filesystem from swapping to disk?
The initial secret can be passed using the cubbyhole technique -- a time- and use- limited token that retrieves the actual token from a 'cubbyhole'.
The long-term secret can be accessed through the native clients for many PLs, which are basically just wrappers around the HTTP(S) API. The long-term secret is never exposed.
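As a toy model of the cubbyhole idea (not Vault's actual API, and all names here are made up), the key properties are that the wrapping token works once and only for a limited time:

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class Cubbyhole:
    """In-memory stand-in for a store of single-use, time-limited wrappers."""

    def __init__(self) -> None:
        self._slots: Dict[str, Tuple[str, float]] = {}

    def put(self, long_term_token: str, ttl_seconds: float) -> str:
        """Store the real token and return a random wrapper token for it."""
        wrapper = secrets.token_urlsafe(16)
        self._slots[wrapper] = (long_term_token, time.monotonic() + ttl_seconds)
        return wrapper

    def take(self, wrapper: str) -> Optional[str]:
        """Return the real token once; None if unknown, already used, or expired."""
        slot = self._slots.pop(wrapper, None)
        if slot is None:
            return None
        value, expires_at = slot
        return value if time.monotonic() < expires_at else None
```

If the wrapper leaks and someone else redeems it first, the legitimate container's `take` fails, which at least makes the theft detectable.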
Environment variables are problematic, as they can potentially be read by other processes. Vault or another secrets management tool is a better option. A secrets management solution integrated into Docker is planned, as it is difficult to get right without tooling support.
I never quite understood how Docker lets developers share the same development environment. Most Dockerfiles that I have seen are a series of apt-get install commands. If different people build images using the same Dockerfile at different times, isn't there a chance that they will pick up different package versions? What am I missing?
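One way to see how exposed a given install line is to this kind of drift is to check which packages carry no explicit `=version` pin. A small sketch follows; the function name and the version string in the example are invented, and it only understands a single `apt-get install` command, not chained shell.

```python
import shlex
from typing import List


def unpinned_apt_packages(command: str) -> List[str]:
    """Return package arguments of one `apt-get install` command that have no
    explicit `=version` pin. Option flags (starting with '-') are ignored."""
    tokens = shlex.split(command)
    if "install" not in tokens:
        return []
    args = tokens[tokens.index("install") + 1:]
    return [t for t in args if not t.startswith("-") and "=" not in t]
```

For example, `unpinned_apt_packages("apt-get install -y curl=7.47.0-1 git")` reports `git` as the package that will float to whatever version the repository serves at build time.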
Create a dockerfile that performs installation of all the tools that you need, execute that once to create an image, and then share the image with everyone else who needs it, possibly through a private registry.
We use that approach now to store some build environments for embedded systems, where our prebuilt and shared images contain all 3rd party dependencies (which are only slowly changing). We then use those images to build our own software. Depending on the use case we create new images from it, or only spawn containers for compiling something, copy the artifacts outside of the container and remove them again. Works really well for us.
I build base images and tag them with a hash of the packages installed in them (this is quite easy using Alpine Linux, I use sha1sum /lib/apk/db/installed), and then explicitly use those. If a package is upgraded or a new package installed in the base image then the image tag is updated.
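The same idea as `sha1sum /lib/apk/db/installed`, sketched in Python in case you want to compute the tag from a build script. The function name and the `base-` prefix are just illustrative choices.

```python
import hashlib


def image_tag_from_package_db(path: str, prefix: str = "base") -> str:
    """Build an image tag from the SHA-1 of the installed-package database file."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return f"{prefix}-{digest.hexdigest()}"
```

Any change to the installed package set changes the file, hence the hash, hence the tag, so anything that pins the tag keeps getting the exact same base image.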
People just don't care about those small differences. (Since solutions of pedantic version pinning or vendoring are known, anyone who doesn't adopt them clearly doesn't care that much.)
Or you can have all the devs pull images built by a central CI system, but you'll still have package differences creep in over time.
1) packaging: this is the feature that's easiest to see benefits from. Having a single artifact that can be run on your CI infrastructure, development machine, and production environment is a massive win.
2) scheduling: there are big cost savings to be had by packing your application processes more efficiently onto your infrastructure. This might not be a big deal if you're a startup, and you haven't yet hit scale.
3) dev environment: It's powerful to be able to run exactly what's been deployed to prod, on your local machine. I've not found developing in a container to be great though; I still use the Django local dev server for fast code-loading. (It's possible to mount your working directory into your built container; this is just personal taste).
4) security: containers are not as robust a security boundary as hypervisors, so they are less suitable for multi-tenant architectures. The most common use-case is to run your containers in a VM, so this isn't necessarily a problem. As an additional defense-in-depth perimeter, containers are great.
5) networking: think of network namespaces as a completely isolated network stack for each container. You can run your containers in the host namespace using `--net=host`, but this is insecure [1]. Using host networking can be useful for development though. In general the port forwarding machinery allows your orchestrator to deploy multiple copies of the same container next to each other, without the deployed apps having to know about other container's port allocations. This makes it easier to pack your containers densely. (More concretely, your app just needs to listen on port 8000, even if Kubernetes is remapping one copy of it to 33112, and another copy to 33111 on the host).
6) secrets: containers force you to be more rigorous with your handling of secrets, but most of the best practices have been established for some time [2]. The general paradigm is to mount your keys/secrets as files, and consume them in the container; Kubernetes makes this easy with their "Secrets" API. You can also map secret values into env variables if you prefer.
7) container images: the Dockerfile magic is a pretty big win for building artifacts; the build process caches layers that haven't changed, which can make builds very fast when you're just updating code (leaving OS deps untouched). Having written and optimized a build pipeline that produced VMDK images, and having experienced the pain of cloning and caching those artifacts, I can attest that this is a very nice 80/20 solution out of the box.
Personally, I really like her writing style. She's very honest about the things she doesn't know, which I find a lot easier to relate to than blogposts by some random internet gurus. Behind the self-deprecation she has a lot of really interesting things to say and is really enthusiastic about learning new technologies. Her full archive is well worth reading.
I don't think this is ranting. It reads to me more like a formatted dump of someone's evaluation notes. Some stream-of-consciousness is to be expected.
It's like "I bought this car and it has this engine and stuff and I have to put oil in it. I'm not sure why but why can't it just run on sunlight? And why do we need 4 wheels because I heard that bikes can travel around 2. Not exactly sure why but that must be better."
I admit I'm a Docker fanboi, but I don't think you can simultaneously analyse and critique something when you admit you do not understand it.
I understand that it can come off like that, but from reading the rest of her articles you can tell these kinds of questions arise from sincerity and not some kind of conceited dismissal.
This kind of approach is really useful for someone coming in with zero prior knowledge of the situation - these are the exact questions you'd ask if you were oblivious to the subject. Something like "I don't understand what the Docker daemon is for" isn't meant to dismiss the daemon, but just sincerely pointing out that she doesn't understand what it's for.
I feel posting articles like this is a good way to move forward in learning something. When learning something and writing about it, exposing your ignorance directly rather than posturing puts you in a good position. Less knowledgeable readers know to take this point into consideration, and more knowledgeable (or angry) readers may point out the errors in your statements.
Being proved wrong is a great way to learn more, as long as one can accept it without taking it personally (hence pointing out unknowns in the post). And as being proved wrong is a luxury that doesn't really exist in the programming world outside of education, exposing your views to the community is an important way to learn from others.
I love that she posts stuff like this. It's not meant to be a review of anything, it's a journal of how she learns things, and I've learnt a lot through her writings because of it. Docker is confusing and almost everything written about it is written by those who have already achieved enlightenment so to speak, instead of by those who are in the process of achieving enlightenment.
I don't think that's particularly fair on Julia. I have found her posts very useful, even the ones written in this style. Documenting her journey from newbie to proficiency is very useful to others wanting to learn the topic in question as they can relate to the questions in posts like these and see whether she found answers later on in her archive.
But these are exactly the questions that I want answered. Much too often I have people talking about certain parts of a system that are so far above my current understanding that I'm too intimidated to jump in and ask "But why <basic-design-decision>?".
Docker is still pretty new. If you were the owner of a horse and someone was trying to sell you one of those newfangled cars you would have a bunch of questions about why exactly this car is better than your horse. And should you be going with the cheaper Ford, or one of the more expensive models?
She's not writing a critique; she's asking questions about tradeoffs from the viewpoint of someone who doesn't know the answers.
I agree. The content is most definitely there and I learned a few things, but the signal-to-noise ratio is a bit low for my tastes. The "like"s and whatnot are distracting, I find.
I would humbly suggest that Julia work on her writing style. She's in a good position to do so: there's stuff to be said and now it's just a question of tweaking the manner in which it's said.
Edit: why the hate? I didn't think I was being a jerk...
She wrote a blog post, put it online for ... who knows? It could be literally any reason. For her own later reference, to help crystallize her thoughts, or just for fun. Somebody else posts it to Hacker News. Dudes (always dudes!) tell her to change her writing style. Why? Why should she? Why do they care?
Again, I thought the piece was informative. I figured I'd voice my opinion in the form of suggestions, since this is a forum for such things.
Lesson learned. The only feedback one can give is positive.
>dudes, always dudes
What does sex/gender have to do with this? From your comment, it follows that you'd be less upset if women had commented. That feels wrong and regressive.