If they had write access, then leaked personal data is the least of anyone's worries. The real concern is how close the hackers came to infiltrating the image source for virtually every modern microservices system. If you could put a malicious image in say alpine:latest for even a minute, there's no telling how many compromised images would have been built using the base in that time.
Yes, it's a huge poisoning target, made worse by the fact that images/tags are not immutable: you really have no idea what you are fetching straight from Docker Hub, and one pull of the same image/tag may be different from the next. Most people blindly fetch without verifying anything anyway, across multiple images of varying quality for the same software packages.
Tags are not immutable, but images (manifests) are, much like git commits vs. branches/tags. That is why the best practice is to resolve a Docker image tag into its "@sha256:..." digest and pull that instead of the tag. It guarantees that the image you are pulling stays byte-for-byte the same.
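For illustration, here's a rough Python sketch of that tag-to-digest resolution against Docker Hub's registry API (the repo and tag are just examples, and it needs the `requests` package; locally, `docker inspect --format '{{index .RepoDigests 0}}' alpine:latest` gives the same answer once the image has been pulled):

    # Rough sketch: resolve a Docker Hub tag to its immutable manifest digest,
    # so you can pin "image@sha256:..." instead of a mutable tag.
    import requests

    def resolve_digest(repo="library/alpine", tag="latest"):
        # Anonymous pull token for the repository
        token = requests.get(
            "https://auth.docker.io/token",
            params={"service": "registry.docker.io", "scope": f"repository:{repo}:pull"},
        ).json()["token"]
        # Fetch the manifest; the registry returns its digest in a response header
        resp = requests.get(
            f"https://registry-1.docker.io/v2/{repo}/manifests/{tag}",
            headers={
                "Authorization": f"Bearer {token}",
                "Accept": "application/vnd.docker.distribution.manifest.v2+json",
            },
        )
        resp.raise_for_status()
        return resp.headers["Docker-Content-Digest"]  # "sha256:..."

    digest = resolve_digest()
    print(f"docker pull library/alpine@{digest}")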
You can't. Not without end-to-end integrity with non-repudiation. Checksums aren't anywhere near enough. But that's Docker: security optional, and run random, untrusted code from the internet.
And Docker has a signing system, but it's only enabled for the official-library builds! So all user images are completely unsigned, despite all the discussion of how secure the Notary project might be.
And even if you hosted your own distribution and notary (like we now do for openSUSE and SUSE), you can't force Docker to check the signatures of all images from that server!
Only docker.io/library/* has enforced image signing, and the only other option is to globally enforce image signing, which means "docker build" will produce unusable images out of the box.
If you look at something like OBS (the Open Build Service that openSUSE and SUSE use to provide packages as well as host user repos), the signing story is far better (and OBS was written a very long time ago). All packages are signed without exception, and each project gets its own key that is managed and rotated by OBS. zypper will scream very loudly if you try to install a package that is unsigned, or if the key for a repo changes without the corresponding rollover setup. And keys are associated with projects, so a valid rpm from a different project will also produce a warning and resolution prompt. That's how the Docker Hub should've been designed from the start.
(Disclaimer: I work for SUSE, on containers funnily enough.)
The company I work for, Sylabs, is taking what I think to be a pretty great approach to solving this problem. Essentially we've introduced a container image format where the actual runtime filesystem can be cryptographically signed (you can read about that here: https://www.sylabs.io/2018/03/sif-containing-your-containers...). The Singularity container runtime we develop treats this concept of "end-to-end integrity" as a core philosophy. Our docker hub analogue, the container Library, is working to make cryptographic signing one of the fundamentals of a container workflow. We're also actively working on container image encryption, which I think will bump container integrity up a few notches.
Well, I was approaching it from the point of view that you verify the image is correct, and then you guarantee you'll always use that image and not some other version, given that tags are mutable.
If you're the entity who created the image you can retain the original hash and verify it against the downloaded copies. But that kind of defeats the purpose of being able to download docker images across distributed hosts.
They'd really need to be signatures attached to the images, not just hashes.
No, I'm saying you only need integrity to validate you are getting the same thing each time. If I checked and made sure an image is safe, then I can save that hash and know that as long as the hash matches, I'm always getting that same safe image.
This is useless without authentication though. You're opening yourself up to attacks on the first retrieve. Sure, you can make sure you're getting the file they want you to have, but you don't know _who_ is giving you that file.
You can tamper with data protected by checksums: they are not designed to resist that, just to be fast to calculate and good at detecting accidental errors, not deliberate manipulation.
But that's not the point here. The point is that you choose an image, verify that it is safe, and then pin the hash. So I can pull that hash a thousand times over from whatever source I want and be sure it is always the same image that I originally verified. I don't care who has it or who now has control over the site, because if the image doesn't match the hash, then it isn't mine.
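A minimal sketch of that workflow in Python (the file name and pinned value are placeholders; the idea is just to compare against the digest you recorded when you vetted the image):

    # Verify a downloaded artifact (e.g. an image tarball from `docker save`)
    # against the digest recorded when the image was originally vetted.
    import hashlib

    PINNED_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"  # placeholder

    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    if sha256_of("alpine.tar") != PINNED_SHA256:
        raise SystemExit("does not match the pinned digest; refusing to use it")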
I think y'all are using the terminology differently from each other in this thread. "Checksum" historically did not imply resilience against intentional modifications.
Nowadays, it's arguably a best-practice when designing a new protocol or storage format to simply make all checksums cryptographically strong unless there's a reason not to. I think that might be where the confusion is coming from.
The issue is, how do you verify the checksum you are using is valid. If you obtain the checksum from the same place you get the image, then an attacker can simply calculate a new checksum for the malicious image and publish it too.
I guess if you were really sure you had obtained a checksum prior to the service compromise, then that would give reasonable assurance the image was not tampered with.
Checksums/fingerprints can help mitigate the problem of _changing_ images people already use. As you correctly point out, they don't solve the problem of authenticated distribution.
Assuming you have fetched a given image and captured its sha in a config file in your version control (e.g. a Kubernetes manifest), then whenever you deploy a container you are sure that you're not affected by exploits happening _after_ you saved the fingerprint.
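If you want to enforce that in CI, a small Python sketch along these lines works (assumes PyYAML; the manifest path is just an example):

    # Fail if any container image in a Kubernetes manifest is referenced by a
    # mutable tag instead of an immutable @sha256 digest.
    import sys
    import yaml

    def images_in(obj):
        # Recursively yield every "image:" value found in the manifest
        if isinstance(obj, dict):
            for key, value in obj.items():
                if key == "image" and isinstance(value, str):
                    yield value
                else:
                    yield from images_in(value)
        elif isinstance(obj, list):
            for item in obj:
                yield from images_in(item)

    unpinned = [
        image
        for doc in yaml.safe_load_all(open("deployment.yaml"))
        for image in images_in(doc)
        if "@sha256:" not in image
    ]
    if unpinned:
        sys.exit(f"images not pinned by digest: {unpinned}")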
You create the Docker image on your local computer, create a checksum, and write it down / remember it. Then you just use this checksum when downloading the image on other computers to check it's the same one. This only works for images created and uploaded by you, of course; for images created by other people it does not work.
Second preimage mostly, which is harder than collision with most common algorithms (even MD5 might still be secure for this, not that anyone should use it for anything at this point). Collision resistance is only important if someone can create two versions at the same time that have the same hash and wants to quietly switch them later.
Using SHA-256 as you describe works well and is widely used by package systems to ensure that the source code being built does not change. Signatures can be better for determining whether the initial image is valid, provided the signing is done away from the distribution infrastructure, since development machines might be more secure than the distribution infrastructure (and if they aren't, you will have problems either way). You still need to get the correct public key to start with. However, if you do have a good image, by luck or because the distribution infrastructure really is secure enough, then SHA-256 will certainly let you know that you get the same one later. Signatures almost always sign a cryptographic hash of the message, not the message itself.
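To make that last point concrete, a toy sketch with the Python `cryptography` package: the publisher signs the image digest on a build machine, and a consumer verifies it with nothing but the publisher's public key (key distribution is hand-waved and the digest is a placeholder):

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature

    digest = b"sha256:0123abcd..."          # placeholder image/manifest digest

    # Publisher side (ideally on a machine separate from the registry)
    private_key = Ed25519PrivateKey.generate()
    signature = private_key.sign(digest)    # what gets signed is the digest, not the image
    public_key = private_key.public_key()

    # Consumer side: needs the correct public key out of band, then verifies
    try:
        public_key.verify(signature, digest)
        print("digest was signed by the publisher")
    except InvalidSignature:
        print("signature check failed; do not trust this image")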
I think his point is that for some checksums it could be trivial (and for some, tools already exist). Checksums aren't designed to resist this, while secure hashing is. As a result, authors of hashing algorithms often attempt to mathematically prove their strength and resistance to collisions.
Docker Hub's GitHub integration requires both read and write access for private repositories. In fact, if you want it to set up deploy keys, I think it requires Admin access.
I'd like to mention that Docker recently changed their automated builds to require giving them access to GitHub instead of just using a webhook. Glad I disabled access but no telling how long this was undiscovered.
I don't know for sure, but I would imagine it has something to do with wanting a unified solution for managing things. But I see a lot of great options, such as setting up a free GitLab pipeline to build and push your image. You don't even have to use Docker if you want a Kubernetes-native image builder like kaniko, and there are great registries that can be deployed in Kubernetes, like Harbor, with automated security scanning. This can all be done in GitLab as well with paid features. I also recommend checking out building and deploying rootless containers for builds.
Pretty sure (don't quote me) those are read-only and repo-specific, but those repos could contain all sorts of juicy info depending on how lax you are with the security of configs in private repos.
Even then just read access to code often allows enough info for leveraging/escalating privilege.
When you connect your GitHub account to Docker Hub, that will give DH full access to all repos (https://i.imgur.com/4jJWrez.png). I'm not even sure if GitHub's permission model supports adding only read access to private repositories.
I'm not 100% sure if Docker Hub uses deploy keys for repos it has access to through the integration, but at least previously there was an option to manually add one to a repository if it couldn't be accessed otherwise.
> I'm not even sure if GitHub's permission model supports adding only read access to private repositories.
Their newer GitHub Apps permission model allows fine-grained access to only specific repos (and read-only access, for example).
However, their older OAuth flow only allows full access to everything. And 99% of GH integrations still seem to use the older authentication method.
This is also something that many CI providers suffer from. Only a few already support GitHub Apps.