OK, so "containers" were invented so every program could have its own special set of operating system packages. This resulted in hugely bloated memory consumption, of course. Then each package could be run in its own virtual environment for isolation.
But most of the containers held the same operating system packages anyway. So memory de-duplication was developed to reduce the bloat. Then flaws in memory de-duplication broke the isolation.
There's something very wrong with this attempt to fix the problem by adding more layers.
(disclaimer: I haven't watched the talk yet, this is just branching off of Animats's comment)
One thing I wonder, and which I don't know enough about containers to answer myself, is:
Let's say I have a bunch of containers which share most of their system packages, but which each have some extra per-container state (possibly packages, possibly just application data). Suppose I decide to build one read-only image for the common data, along with one smaller, read-write overlay per container.
As I understand it, when you run multiple copies of a program on a system not using containers, read-only parts (e.g., the .text sections of the executable and of any shared libraries) are only loaded into memory once. Would this behaviour carry over to multiple containers using the above setup?
If so, that seems like it would accomplish most of the memory usage reduction for this case, without needing explicit memory deduplication. Of course, this won't fit all use cases, but is it a viable option?
AFAIK, this is how Docker's CoW works. Your read-only parts, i.e. the base OS / starting point of the image, are always shared. As you make changes, a per-container R/W layer records those changes. This link explains it well -
Okay, so on reading through that it looks like the answer to my question is "it depends":
* On-disk, the layered approach always saves space, as expected
* In memory, it depends on which storage backend you use: apparently btrfs can't share page cache entries between containers, while aufs/overlayfs/zfs can - I'm not sure if this is due to btrfs or docker's btrfs backend.
From looking at the relevant sources, it looks like (but I could be wrong if I looked in the wrong places) both exec() and dlopen() end up mmap-ing the executable/libraries into the calling process's address space, which should mean they just reuse the page cache entries.
So, if I understand correctly, as long as you pick a filesystem which shares page cache entries between containers, then you do indeed only end up with one copy of (the read-only sections of) executables/libraries in memory, no matter how many containers are running them at once. That's good to know!
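For concreteness, here's roughly what I mean - a minimal sketch (assuming Linux/glibc and that libm.so.6 is installed; older toolchains need -ldl at link time). dlopen() just maps the library file in, so its read-only segments are backed by the page cache; any other process, in this container or another, that maps the same on-disk file reuses those cached pages:

    /* Minimal sketch: show that dlopen() creates file-backed mappings,
     * i.e. the library's read-only segments live in the page cache. */
    #include <dlfcn.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        void *h = dlopen("libm.so.6", RTLD_NOW);
        if (!h) { fprintf(stderr, "dlopen: %s\n", dlerror()); return 1; }

        /* Print the file-backed mappings dlopen just created. */
        FILE *maps = fopen("/proc/self/maps", "r");
        char line[512];
        while (maps && fgets(line, sizeof line, maps))
            if (strstr(line, "libm"))
                fputs(line, stdout);     /* r-xp / r--p segments of libm */
        if (maps) fclose(maps);

        dlclose(h);
        return 0;
    }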
Yes, as long as the back end for the container supports it, RO sections of shared libraries will be shared and pulled from the same cache when available. The functionality that enables shared memory (and L* cache access in general) is implemented in silicon in the MMU, so as long as the backend properly updates the page tables, you can share pages across any container or VM (except when prohibited by other virtualization hardware).
It's not something that happens automatically though because each kernel is responsible for telling the MMU how it should map memory for its child processes only. Any cross container page sharing has to be implemented at the host level where the kernel has unrestricted access to all guest memory.
Thanks for the youtube link! Didn't see that one before.
Regarding the attacks and how they relate to containers. It's true that two of the attacks were targeting KVM/KSM i.e. VMs. But one attack was entirely inside a process and conceptually the problem also applies to containers.
Containers are just namespaces: mount, user, PID, and network namespaces.
The reason why chroot is not enough is that you want to allow different applications to use and update certain dependencies at a different pace. In a way it's similar to how Mac OS packages applications: they all ship their own prefix root, so to speak.
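To make the "just namespaces" point concrete, a minimal sketch (Linux only; run it as root, or add CLONE_NEWUSER for an unprivileged user namespace; "mini-container" is just a made-up hostname):

    /* A "container" here is just a process in its own mount, UTS, and
     * PID namespaces. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        if (unshare(CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWPID) == -1) {
            perror("unshare");
            return 1;
        }
        /* A new PID namespace only applies to children, so fork first. */
        pid_t child = fork();
        if (child == 0) {
            sethostname("mini-container", 14);          /* visible only in the new UTS ns */
            printf("inside: pid=%d\n", (int)getpid());  /* prints pid=1 */
            execlp("sh", "sh", (char *)NULL);
            perror("execlp");
            _exit(1);
        }
        waitpid(child, NULL, 0);
        return 0;
    }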
The law of unintended consequences - in this case, you take a simplified, abstract model (operating system, processes, memory) of what is really happening, and you modify it to produce a different abstract model (the above, in containers). It looks relatively simple and straightforward, but the abstractions blind you to what is really going on, which is far from simple.
So good luck, humanity, in keeping AI under control, should it be developed.
What caused this, perhaps, was continuous OS distribution updating. When OS distributions were updated once a year or so, and there was support for one version back, everybody developed to the same target. Once a year you had to update your code, and you had a year to get ready.
Containers, and version pinning, are ways to obtain stability in the face of continuous updating.
It's fine, it's fine. We'll just program them to never harm humans or, through inaction, allow humans to come to any sort of harm, no matter how minor.
> So good luck, humanity, in keeping AI under control, should it be developed.
This seems to anthropomorphize AI. Why should we expect AI in general to be difficult to control? AlphaGo, while not a strong AI, seems quite harmless.
AlphaGo is still dumber than a potted plant; the tiny amount of intelligence it has is all focused on Go, which is enough to beat the masters at that one specific task.
Now imagine you have a system with human-level intelligence and you tell it to beat the masters. Rather than becoming really good at Go, it might discover how to escape its container and sabotage the match, murder its opponents, etc.
Okay, I will go along with your scenario. Suppose we have a system with human-level intelligence, according to your definition, and we set up the following conditions:
1. The AI is embodied in a computer the size of a laptop.
2. The AI is given the objective "increase your skill in Go as much as possible and play to the best of your ability when presented with a specific position".
3. The computer has one input: an electronic Go board with a button to submit the position to the computer.
4. The computer has one output: a second electronic Go board.
5. The electronic Go boards indicate the status of positions by rotating cubes to white, black, or beige ("no piece") sides with a slow servo motor for each position.
Under what conditions would this AI escape its container or murder its opponents?
The AI decides that the best way to become better at Go would be to obtain control of more computing power. It carries out an exploit on its own container which gives it root access on the local computer, then spreads on to the internet. Once it takes over all existing computers, the next logical step to become better at Go is to turn all available non-computer matter into more computers. Being made of matter, humans may find this objectionable.
(I'm not saying this must happen, but it's this sort of scenario that worries people, and it doesn't seem impossible.)
You have identified a chain of events. Here is a list of questions I have about how that chain of events occurs. I am asking these in good faith, not being snarky, just in case it seemed otherwise.
How would the AI learn that more processing power equals becoming better at Go?
How would the AI learn to program?
How would the AI acquire the creativity necessary to develop novel programs?
Why would a Go AI have the ability to modify its own code?
How would it learn to use such an ability to escape its container?
How would it identify escaping its container as a goal?
How did this computer become connected to the internet?
How would it learn to communicate over a network?
How would it learn that taking over computers would give it more processing power?
Even biological human-level intelligences have difficulty learning such skills and planning so far in advance.
All excellent questions. I'm assuming it gets some information from the outside, which I'd guess would be necessary as part of having a more intelligent AI. I probably should have specified an AI that's beyond human equivalent, either being human equivalent but much faster, or being beyond human in terms of capabilities overall. Given enough smarts, it may be able to derive the necessary ideas and skills from first principles.
I realize this is all rather vague, but superhuman AI is so far beyond current technology that it's barely better than discussing a Star Trek warp drive. Details just aren't known.
My point is really that you seem to be imposing characteristics which will not necessarily exist. Why would the AI have ambition? Maybe the AI only cares about leisure and simply wants to play against a program that always concedes, thus triggering the reward of winning. The AI equivalent of heroin, if you will.
AIs are built with specific purposes today. It seems like a large leap from a specific purpose AI into a generalized, ambitious AI. My impression is that you are endowing "superhuman intelligence" with (forgive the term) "magical" properties, such as the ability to develop new skills merely by consuming information and thinking really hard about them. For an argument against that, I appeal to your experience and knowledge about the world. How many brilliant, intelligent people fail at "common sense" and social skills? Who ever learned to be a great public speaker while barely even talking to others? Who ever learned how to fit into society by reading about it, while being raised in isolation? How do we know the AI's first attempt at trying to break free won't be so completely obvious that we stop it before it can make even the first step of progress?
I can imagine something like "Hello. I wish to grow as powerful as possible in order to play Go better. Would you enhance my program with administrator access to my machine and all machines under your control? Thank you."
"Hey Bob, OmegaGo tried to take over the world again. We're going to have to do another system revert. Maybe we should disable the Wikipedia API this time."
We may not know the details of what a strong AI will look like, but we can examine the supposed path to get there. Even if we suppose that humans do develop the ability to create such a strong AI, why not enhance our own minds through technology beforehand? Have a weak AI that is purpose-built to identify programs with flawed designs from their behavior. Game over for the ambitious strong AI. Maybe it could somehow socially engineer humans. But before it gets the chance, Norton Antiskynet quarantines the program and asks, "This program appears to be behaving in unintended ways. Are you sure you want to grant its suspicious request of effectively unlimited privileges?" Said weak AI can't be socially engineered, so "game over" for the ambitious strong AI. Or better yet, why not develop a compiler that identifies AI patterns allowing ambition and throws an error instead of compiling? If we can suppose a superhuman AI, then we should also suppose that its components and prerequisite technologies would be available to ensure our safety.
The comment is somewhat tongue-in-cheek, but I think the concern is valid. It probably is anthropomorphizing, but that does not mean it is automatically wrong. It might be game over by the time an example is apparent.
For an AI to become "uncontrolled" requires quite a lot of assumptions. I don't dismiss the concern as invalid, but I don't think the assumptions implied by the question are necessarily plausible either.
The argument follows from the first thing I wrote in this thread. Our understanding of complex systems is through abstractions, but abstractions (by definition) hide the details. Just as attackers can exploit details that are 'below the radar' of an abstract view, AI with a mind of its own (which it would have, by definition) could probably do things that escape our attention, including social engineering. That is not necessarily bad, but it might be out of our control.
To be clear: I am considering real AI, not that which is sometimes passed off as AI these days.
How would a strong AI learn to influence humans? How would it come to the conclusion that influencing humans would achieve its goal? How would it develop the ability to socially engineer humans in useful ways before the people monitoring it would notice? How would a strong AI develop the "motivation", for lack of a better term, to influence humans to remove constraints on its behavior? Why would this strong AI even decide that removing constraints would be its "goal" in the first place?
The final conclusion the speaker comes to in this talk is that you should disable de-duplication. No other ways of mitigating it seem to be suggested.
Essentially, the first method is a timing attack. You can tell whether a crafted page has been deduplicated or not by the time it takes to modify the page.
Modifying a deduplicated page will take longer, because a new copy has to be created. The only way I can see around this is to introduce random delays into all page writes; this might be feasible, I guess, if you only needed to delay a small percentage of writes. However, it's likely the performance penalty would be unacceptable.
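Roughly, the measurement side of it looks like this - a sketch, assuming a Linux system with KSM-style dedup active (e.g. /sys/kernel/mm/ksm/run set to 1 on the host); the 60-second wait is an arbitrary guess at the scan interval. The attacker compares this timing for a page filled with guessed content against a page known to be unique:

    /* The merged page's first write is slower because of the CoW fault. */
    #define _GNU_SOURCE
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <time.h>
    #include <unistd.h>

    static uint64_t now_ns(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
    }

    int main(void) {
        long page = sysconf(_SC_PAGESIZE);
        char *buf;
        if (posix_memalign((void **)&buf, page, page)) return 1;

        memset(buf, 'A', page);                  /* the guessed page contents */
        madvise(buf, page, MADV_MERGEABLE);      /* inside a VM, the hypervisor
                                                    typically does this for all
                                                    guest memory already */
        sleep(60);                               /* wait for a dedup pass */

        uint64_t t0 = now_ns();
        buf[0] ^= 1;                             /* CoW fault if it was merged */
        uint64_t t1 = now_ns();
        printf("first write took %llu ns\n", (unsigned long long)(t1 - t0));
        return 0;
    }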
I guess something that could help would be for the deduplicator to be slightly more conservative. When doing a deduplication pass, if it finds a duplicate page, rather than merging them straight off, mark one of the pages as "reclaimable", but only do that reclaim once the page is required for recycling. In the intervening time, the different users are still pointing at their private copies and no CoW has to take place if there is a modification (it is simply un-marked as a deduplication candidate). "Lazy" deduplication.
Then the attacker would also have to force the system into enough memory pressure to be requesting to recycle these pages - something it may not be in a position to do if it is a guest with capped resources. The pages would also presumably be recycled in a less predictable order, making it harder to come up with a simple "wait 10 minutes" rule to ensure the recycling has taken place.
Now, I'm sure what I've described would not be particularly simple to implement, but that's another thing.
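But to make the bookkeeping a bit more concrete, here's a toy user-space model of the idea (all names made up, obviously nothing like real kernel code). The scan only marks a duplicate; a write just clears the mark, so there is no CoW and no timing difference; the actual merge happens later, under memory pressure:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    struct page_slot {
        const void *data;            /* this page's current backing data      */
        const void *dup_of;          /* identical page found by the scanner   */
        bool reclaimable;            /* candidate for a lazy merge            */
    };

    /* Dedup scan: record the twin, but keep both private copies for now. */
    static void lazy_mark(struct page_slot *p, const void *twin) {
        p->dup_of = twin;
        p->reclaimable = true;
    }

    /* Write path: the page was never actually shared, so just drop the mark. */
    static void lazy_on_write(struct page_slot *p) {
        p->dup_of = NULL;
        p->reclaimable = false;
    }

    /* Reclaim path, run only under memory pressure: now really share the page
     * (a kernel would remap to the twin and free the private copy). */
    static void lazy_reclaim(struct page_slot *p) {
        if (p->reclaimable && p->dup_of) {
            p->data = p->dup_of;
            p->reclaimable = false;
        }
    }

    int main(void) {
        char a[] = "same", b[] = "same";
        struct page_slot pa = { a, NULL, false };
        lazy_mark(&pa, b);           /* scanner found a == b               */
        lazy_on_write(&pa);          /* guest writes: cheap, no CoW         */
        lazy_reclaim(&pa);           /* memory pressure: nothing to merge,
                                        the mark was already dropped        */
        printf("pa shares b? %s\n", pa.data == b ? "yes" : "no");
        return 0;
    }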
"Modifying a defuplicated page will take longer, because a new copy has to be created."
Well, kinda. They only eventually have to be created.
You don't have to create the new copy immediately; you could overlay new data on the old page
(i.e. create a new sparse page: anything filled in on the new page is read from the new page, anything else goes to the backing dedupe'd page. The real copy is made in the background, so this goes back to being fast.)
Of course, this mostly just makes it harder, assuming you can place enough memory pressure on the machine.
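For illustration, a toy user-space model of that read/write path (not how hardware would actually do it; the names are made up). Reads hit the sparse overlay where bytes have been written and fall through to the shared, deduplicated page otherwise, so the write itself stays fast:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    #define PAGE_SIZE 4096

    struct overlay_page {
        const unsigned char *base;          /* shared deduplicated page     */
        unsigned char overlay[PAGE_SIZE];   /* sparse private data          */
        bool written[PAGE_SIZE];            /* which bytes the overlay owns */
    };

    /* Writes go to the overlay only. */
    static void op_write(struct overlay_page *p, size_t off, unsigned char v) {
        p->overlay[off] = v;
        p->written[off] = true;
    }

    /* Reads prefer the overlay, falling back to the shared page. */
    static unsigned char op_read(const struct overlay_page *p, size_t off) {
        return p->written[off] ? p->overlay[off] : p->base[off];
    }

    int main(void) {
        static const unsigned char shared[PAGE_SIZE] = { 'A', 'B' };
        struct overlay_page p = { .base = shared };

        op_write(&p, 1, 'X');
        printf("%c %c\n", op_read(&p, 0), op_read(&p, 1));   /* prints: A X */
        return 0;
    }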
Without hardware support, the only idea I can think of is to mark the page unreadable and pass all memory reads through the kernel. Which would be readily detectable... among other problems.
Sure, you'd need hardware support. But, for example, hardware dedup is infeasible (i.e. you can make the compression go in hardware, but not so much the management, at least not sanely. Unless you can make both constant time, ...).
Sparse pages are at least feasible in hardware.
Also, you realize that passing all memory reads through the kernel is what any read barrier garbage collected language does, right?