How a double-free bug in WhatsApp turns to remote code execution

pjmlp · on Oct 2, 2019

> WhatsApp parses it with a native library called libpl_droidsonroids_gif.so to generate the preview of the GIF file.

Another victim of Android not exposing the image codecs to the NDK.

I guess, like many, they decided to use libpl_droidsonroids_gif instead of having to deal with JNI to call BitmapFactory or ImageDecoder.

Because of these kinds of exploits, codecs have their own hardening processes, https://source.android.com/devices/media/framework-hardening, but naturally that doesn't help when applications bring their own native alternatives along instead.

izacus · on Oct 2, 2019

ImageDecoder didn't support animated gif for all APIs they support so you're again looking for reasoning in the wrong bush - exposing codecs wouldn't help.

pjmlp · on Oct 2, 2019

So I should look into careless programming, lack of fuzzing and proper testing practices bush instead?

Because that is what that tone reminds me of.

mentat · on Oct 2, 2019

Your proposed solution wouldn't work. You appear to have been unaware, and instead of saying "thanks, I didn't realize the limitation existed, that's unfortunate" you attack the messenger. All file parsers should be thoroughly fuzzed, yes. Not sure what tone has to do with it.

pjmlp · on Oct 2, 2019

That would have been indeed my answer if the comment didn't look like a snarky attack.

"so you're again looking for reasoning in the wrong bush"

thenewnewguy · on Oct 2, 2019

https://rationalwiki.org/wiki/Tone_argument

viraptor · on Oct 2, 2019

> In Android, a double-free of a memory with size N leads to two subsequent memory-allocation of size N returning the same address.

I get that freelists are fast and great for small chunk reuse, (I'm guessing that's the implementation that allows double return) but is there no way to protect against double-free here? I wish it crashed instead.

Edit: it does indeed look like a form of a freelist - https://blog.nsogroup.com/a-tale-of-two-mallocs-on-android-l...
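To make the quoted behaviour concrete, here is a minimal sketch of a per-size cache that keeps freed slots on a LIFO stack. It is a toy under stated assumptions: `toy_alloc`, `toy_free` and the slot pool are hypothetical names, and this is not Android's actual allocator. Because the free path does no duplicate check, freeing one slot twice pushes it twice, and the next two allocations pop the same address.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_SIZE 64
#define SLOT_COUNT 8

/* Toy size-class cache: a pool of equal-sized slots plus a LIFO stack of
 * free slot pointers. Hypothetical names; not Android's real allocator. */
static _Alignas(16) unsigned char pool[SLOT_COUNT][SLOT_SIZE];
static void *free_stack[2 * SLOT_COUNT];   /* room for duplicate entries */
static size_t free_top = 0;
static size_t next_fresh = 0;

/* Pop a recently freed slot if there is one, else hand out a fresh slot. */
void *toy_alloc(void)
{
    if (free_top > 0)
        return free_stack[--free_top];
    if (next_fresh < SLOT_COUNT)
        return pool[next_fresh++];
    return NULL;
}

/* Push the slot with no duplicate check: the fast but unsafe path. */
void toy_free(void *p)
{
    assert(p != NULL && free_top < 2 * SLOT_COUNT);
    free_stack[free_top++] = p;
}
```

After `toy_free(a); toy_free(a);` the next two `toy_alloc()` calls both return `a`, so two unrelated owners end up sharing one slot.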

rkangel · on Oct 2, 2019

At what level are you looking to cause the crash? Some options:

Compiler - insert code and memory usage to show area as free/not-free and check on malloc/free. Data has to be recorded at runtime because the type system does not provide enough information to be sure. Static analysis can only guess with C.

OS - Every malloc would need to be in a separate page

Libc - could do something similar to compiler

They all come with overhead in performance and/or memory usage, unless you change the technology stack in some way. Garbage collection is one approach. Adding extra type information to make it clear from static analysis (so you get something like Rust) is another. Or, some hardware support that gives more granular permissions.

mixedbit · on Oct 2, 2019

Perhaps malloc() could allocate 1 byte extra at the start of the allocated block with some magic number written to it and free() could assert that the number is correct and change the number to mark the block as freed.
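A rough sketch of that idea, with some loudly stated assumptions: the names `checked_malloc` and `checked_free` are hypothetical, the marker is 8 bytes inside a 16-byte header (rather than a single byte) so the payload stays 16-byte aligned, and the memory comes from a toy bump arena that is never reused or returned to the OS. That last point is what makes it safe to read the header of an already freed block here; a real free() could not rely on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arbitrary marker values chosen for this sketch. */
#define MAGIC_LIVE UINT64_C(0x4C4956454C495645)
#define MAGIC_FREE UINT64_C(0x4652454546524545)
#define ARENA_UNITS 256

/* A 16-byte header keeps the payload 16-byte aligned. */
struct header {
    uint64_t magic;
    uint64_t units;   /* payload length in 16-byte units */
};

/* Toy bump arena: blocks are never reused, so old headers stay readable. */
static _Alignas(16) struct header arena[ARENA_UNITS];
static size_t arena_used = 0;

void *checked_malloc(size_t size)
{
    if (size > (ARENA_UNITS - 1) * sizeof(struct header))
        return NULL;
    size_t units = (size + sizeof(struct header) - 1) / sizeof(struct header);
    if (units + 1 > ARENA_UNITS - arena_used)
        return NULL;
    struct header *h = &arena[arena_used];
    h->magic = MAGIC_LIVE;
    h->units = units;
    arena_used += units + 1;
    return h + 1;   /* payload starts right after the header */
}

/* Returns 0 on success, -1 on a detected double free, -2 on a bad pointer. */
int checked_free(void *p)
{
    assert(p != NULL);
    struct header *h = (struct header *)p - 1;
    if (h->magic == MAGIC_FREE)
        return -1;
    if (h->magic != MAGIC_LIVE)
        return -2;
    h->magic = MAGIC_FREE;
    return 0;
}
```

Calling `checked_free(p)` twice returns 0 the first time and -1 the second. In a real allocator the block could have been handed out again in between, at which point the header would read as live and the stale free would go undetected.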

vardump · on Oct 2, 2019

Problems: double free might still happen, except now some innocent memory block can be corrupted as well, which just happened to have the magic value in that address.

All memory allocations are now misaligned, decreasing performance. On some platforms misaligned access causes a bus error exception.

The marker would generally need to be at least 8 bytes. This would alleviate the problems, but also cause pretty significant overhead.

laughinghan · on Oct 2, 2019

On the other hand, the chances of an innocent memory block having a random 8-byte magic value is vanishingly small, so that would work even better. If free() took only 1ns to run and you called it 2^64 times, that would take >500 years.
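The arithmetic behind that figure can be checked with a few lines; `years_for_calls` is a hypothetical helper and it assumes a 365.25-day year.

```c
#include <assert.h>

/* Years needed for `calls` invocations at `ns_per_call` nanoseconds each,
 * using a 365.25-day year. */
double years_for_calls(double calls, double ns_per_call)
{
    assert(ns_per_call > 0.0);
    const double seconds = calls * ns_per_call * 1e-9;
    const double seconds_per_year = 365.25 * 24.0 * 60.0 * 60.0;
    return seconds / seconds_per_year;
}
```

With `years_for_calls(18446744073709551616.0, 1.0)` (that is, 2^64 calls at 1ns each) the result is roughly 584 years.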

saagarjha · on Oct 2, 2019

I feel like the allocation header might already have a flags bit field where this might go.

seppel · on Oct 2, 2019

The metadata (size of allocated block, and whether it is in use) is already there. I really wonder why this is not checked. In a normal libc you should get a "double free" panic.

kccqzy · on Oct 2, 2019

malloc() doesn't give you memory on the granularity of bytes. On x86_64 for example malloc has to assume you'll use the memory for 16-byte aligned access (say, movdqa). So your allocating an extra byte could become 16 extra bytes.
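A small sketch of why one byte turns into sixteen, assuming a simplified model in which the header and the payload are each rounded up to the alignment. `block_size` is a hypothetical helper; real allocators have their own size classes and bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Total footprint of a block whose header must be padded so that the
 * payload behind it still starts on an `align` boundary. */
size_t block_size(size_t request, size_t header, size_t align)
{
    return round_up(header, align) + round_up(request, align);
}
```

For a 24-byte request with 16-byte alignment, `block_size(24, 0, 16)` is 32 and `block_size(24, 1, 16)` is 48: the single marker byte costs a full 16 bytes.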

bluecalm · on Oct 2, 2019

Why not just always set the pointer to NULL after free is called on it? It would be fantastic if the operation were atomic as well, but even if it isn't I like doing it. A freed address is not something you ever want to use again anyway.

mixedbit · on Oct 2, 2019

There can be many pointers pointing to the same allocated memory. For example, if you have a cyclic list with two elements head.next and head.prev pointers will both point to the second element and free(head.next); free(head.prev); would be double-free.

bluecalm · on Oct 2, 2019

When many pointers point to the same memory then naked free shouldn't be used. Instead you implement a counter and decrease it (and free at zero when the last pointer dies). Imo code where a free is called on something other pointers point to is just a basic level design error.
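A minimal sketch of such a counter, assuming a single-threaded program. The `shared_buf` type and the `buf_*` functions are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical shared buffer: the count lives next to the data. */
struct shared_buf {
    size_t refs;
    size_t len;
    unsigned char data[];
};

struct shared_buf *buf_new(size_t len)
{
    if (len > SIZE_MAX - sizeof(struct shared_buf))
        return NULL;
    struct shared_buf *b = calloc(1, sizeof *b + len);
    if (b != NULL) {
        b->refs = 1;
        b->len = len;
    }
    return b;
}

/* Call once for every additional pointer that will keep the buffer. */
struct shared_buf *buf_retain(struct shared_buf *b)
{
    assert(b != NULL && b->refs > 0);
    b->refs++;
    return b;
}

/* Drops one reference; returns 1 if this call freed the buffer, else 0. */
int buf_release(struct shared_buf *b)
{
    assert(b != NULL && b->refs > 0);
    if (--b->refs > 0)
        return 0;
    free(b);
    return 1;
}
```

Each holder calls `buf_release` exactly once and only the last call reaches free(). The count does not catch a holder that releases more often than it retained, so the discipline still has to be right.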

jandrese · on Oct 2, 2019

Because in this case the problem is a second pointer to the same chunk of data. Nulling the first doesn't null the second.

viraptor · on Oct 2, 2019

I guess the naive solution would be to walk the freelist/tree on free() to avoid duplicates. I have no idea what the real world impact would be though. Probably very workload-dependent. At least checking for duplicate would be free on insert in a tree, but in a list that's both a long walk and lots of thrashing.

rkangel · on Oct 2, 2019

It also wouldn't work at all when free space is coalesced.

Memory allocations vary in size (obviously). The simplest algorithm would be to take the block of memory and start allocating chunks from the beginning. The problem is that if your app does a lot of small allocations, frees them and then wants to do a large allocation, the free list might not contain a block large enough. The job of your allocator is to manage that fragmentation. There are a lot of approaches, but fundamentally at some point it has to put two deallocations 'back together' to make a larger one.

If you free a chunk, and then that chunk is coalesced with some data just before it, then the free list will no longer contain a pointer that matches the one that you then try to double free.

viraptor · on Oct 2, 2019

I get that, but I believe this issue doesn't apply in this case (I could be wrong of course). Specifically this double-allocation seems to come directly from a freelist for a specific allocation size. Going with the description from the link I posted about the allocator, this exploit should work only if the freed space is not coalesced - otherwise you wouldn't be guaranteed that the same memory is returned twice.

gpderetta · on Oct 2, 2019

It would completely defeat the purpose of using a free list.

josefx · on Oct 2, 2019

A call to free could write a gravestone into the freed memory, each free call could check for it and only check the free list if it sees that value. This would make a check of the free list rare, especially if malloc overrides the value again.

anon4242 · on Oct 2, 2019

How would you know it's a gravestone? What if what you consider a gravestone is a perfectly legit byte sequence for a certain program?

josefx · on Oct 2, 2019

That is the reason you check the freelist in the unlikely case that you see it, the cost would average to near nothing except for the memory accesses to set, check and clear the value.

You could also xor the memory location into the marker to avoid multiple collisions for applications that allocate a large amount of similar objects.
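Roughly what that could look like, as a sketch under assumptions: a toy fixed-size slot allocator with hypothetical names (`slot_alloc`, `slot_free`), an arbitrary 8-byte constant as the gravestone, and the slot address XORed in. The free list is only walked when the first 8 bytes of the slot already match the marker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE 64
#define SLOT_COUNT 8
#define GRAVESTONE UINT64_C(0x9E3779B97F4A7C15)   /* arbitrary constant */

static _Alignas(16) unsigned char pool[SLOT_COUNT][SLOT_SIZE];
static void *free_stack[SLOT_COUNT];
static size_t free_top = 0;
static size_t next_fresh = 0;

/* Marker for a given slot: the constant XORed with the slot's address. */
static uint64_t marker_for(const void *p)
{
    return GRAVESTONE ^ (uint64_t)(uintptr_t)p;
}

void *slot_alloc(void)
{
    void *p = NULL;
    if (free_top > 0)
        p = free_stack[--free_top];
    else if (next_fresh < SLOT_COUNT)
        p = pool[next_fresh++];
    if (p != NULL)
        memset(p, 0, sizeof(uint64_t));   /* clear any stale gravestone */
    return p;
}

/* Returns 0 on success, -1 if the slot is already on the free list. */
int slot_free(void *p)
{
    assert(p != NULL);
    uint64_t seen;
    memcpy(&seen, p, sizeof seen);
    if (seen == marker_for(p)) {
        /* Rare slow path: confirm against the free list before rejecting. */
        for (size_t i = 0; i < free_top; i++)
            if (free_stack[i] == p)
                return -1;
    }
    assert(free_top < SLOT_COUNT);
    uint64_t mark = marker_for(p);
    memcpy(p, &mark, sizeof mark);
    free_stack[free_top++] = p;
    return 0;
}
```

A second `slot_free` on the same slot returns -1, while a live slot whose data merely happens to equal the marker is still freed normally after the list check. If something writes over the gravestone between the two frees, the duplicate goes unnoticed.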

laughinghan · on Oct 2, 2019

Besides, what's the minimum chunk size? Can't be less than the 64-bit word size can it? The chances of a random 8-byte gravestone appearing in legit memory is practically nil. If free() took only 1ns to run and you called it 2^64 times, that would take >500 years.

viraptor · on Oct 2, 2019

That's a really cool idea!

Someone · on Oct 2, 2019

free on insert in a tree isn’t really free, as the only reason to replace insert in a list by insert in a tree would be this check.

I think it also would make the memory manager less cache-friendly, as reusing just freed memory blocks would be impossible.

DoofusOfDeath · on Oct 2, 2019

One solution is to check each release candidate using extra instrumentation and/or enabled expensive checks.

pjmlp · on Oct 2, 2019

It won't help now, but because of issues like this, memory tagging is going to be a requirement for native code in Android.

https://security.googleblog.com/2019/08/adopting-arm-memory-...

Diggsey · on Oct 2, 2019

Using Rust may be the easiest option because there are existing crates for decoding GIF files (and other image formats) for example:

https://crates.io/crates/image

It should be straightforward to swap out the vulnerable C library with a thin wrapper around the above crate.

pjmlp · on Oct 2, 2019

The safest option would have been to use JNI to call Android hardened codecs instead of using their own decoding library in process.

pcwalton · on Oct 2, 2019

But it was pointed out upthread that animated GIFs aren't fully supported by the native Android libraries. So Rust would actually be a reasonable solution here.

It's true that there are costs associated with using Rust in the build process and so forth, but WhatsApp is trying to be a secure messenger and is FAANG-backed. The challenges can be overcome.

pjmlp · on Oct 2, 2019

Indeed, however Android does support other image formats with animation, which could have been another solution, much cheaper than introducing yet another language not directly supported by the OS SDK tooling.

Diggsey · on Oct 3, 2019

Changing to a different image format may not be possible because WhatsApp still needs to show historical images, and these images are not stored on a central server where they can be migrated.

Also, it's called the "Android NDK" and not the "Android C SDK" - a big part of what the NDK gives you is not C-specific. It makes almost no difference whether you compile your code with GCC vs rustc, at the end of the day you're creating a native shared library and loading it from a java application.

For example, these instructions from 2017 show how little work is involved https://mozilla.github.io/firefox-browser-architecture/exper... and support has only improved since then.

Using static analysis tooling on C/C++ code is more work than the above to set up, so I really do believe that Rust is the easiest way to reduce the chance of these kinds of bugs happening again.

pjmlp · on Oct 3, 2019

Then they could add a memory safe Java library dependency like Glide.

Yes it is called the NDK and the workflow that you describe doesn't support the bullet points I described.

If Rust wants to be embraced by Android developers it needs to up its game in Android Studio tooling, Binder generation and FFI integration.

Those instructions from 2017 are completely outdated given Android Studio 3.5 and NDK r20.

GCC is no longer around, C++ support has improved quite substantially, static analysis tooling is integrated into NDK, Android Studio 3.7 (planned for next year) will bring support for mixed mode NDK libraries in AAR format.

bluejekyll · on Oct 2, 2019

> not directly supported by the OS SDK tooling.

If that's an issue, I bet the Rust community is interested in making it easier and better supported.

pjmlp · on Oct 2, 2019

Think about the Android development experience, using Android Studio, Gradle + CMake/ndk-build, mixed mode debugging across Java/Kotlin and native, Android Studio project templates for native code, packaging Android libraries, Bundles and Instant Apps, NDK APIs and Google libraries for Android intended for NDK consumption like Oboe.

These are the expectations that any external language should meet versus what platform languages offer out of the box.

littlestymaar · on Oct 2, 2019

I haven't done any android dev in a while so it may have changed in between, but it sounds that your expectations are higher than what the Android experience looks like when JNI is involved.

pjmlp · on Oct 2, 2019

My expectations are what an Android developer using Studio 3.5 with NDK r20 would expect.

While the experience is still found lacking versus what iOS or UWP tooling are capable of, it still is much better than any third party language integration.

layoutIfNeeded · on Oct 2, 2019

Sure, the “easiest” option is bringing in a whole new language and its toolchain just to decode gifs. Lol!

stevekablink · on Oct 2, 2019

[flagged]

_bxg1 · on Oct 2, 2019

Much as I love Rust, some of its advocates have a tendency to drop into conversations like this and oversimplify the situation, suggesting it be immediately used for everything under the sun, with limited context.

There are many factors that go into deciding whether to integrate a new language into your project. For example:

- Tooling

- Build process

- Library support

- Developer expertise

- Interop

Any of these can cause lots of friction, especially when your codebase has more than one language. Not to mention hiring challenges, man-hours required to do a conversion, etc. That's not to say it's automatically the wrong decision, but it's certainly not a simple one.

The case may be stronger on a greenfield project - though some of the above still applies - but then in many cases you may care more about iteration speed than performance, landing you with a GCed language.

Edit: I've been informed that the parent comment was trolling

bluejekyll · on Oct 2, 2019

So that was a troll impersonating a well-known Rustacean, but your comment is interesting in that it tries to discount a valid suggestion: that perhaps Rust is a good option in this situation. (Though pjmlp makes a good point about using the OS toolkit).

But to your criticisms, Rust has had a lot of attention to making it a decent drop in tool for integrating with C. There are multiple tools to produce the correct ABI either from C or to C from Rust. This is to your point about tooling and interop.

Putting Rust into an existing project is generally as easy as linking against any foreign C library. I’ve personally done this enough to know it’s easy to do either as a static library or a dynamic library. That addresses your point about build process.

As to developer learning and choice of another language, there may be reasons why C was used in the first place that disqualify GC’ed languages. So if you want and need a safe C replacement that is easy to insert, Rust is a good option.

People shouldn’t disqualify it out of hand. In this particular place perhaps it’s not the right solution, and maybe a poor time to suggest it, but your reasons are not accurate to the realities of that work.

adwn · on Oct 2, 2019

You're inadvertently replying to a troll account impersonating steveklabnik [1] (note the position of the 'l' and 'n'). Please don't judge the Rust community based on such idiots – most of us are quite reasonable and don't go around proselytizing others :-)

[1] https://news.ycombinator.com/user?id=steveklabnik

_bxg1 · on Oct 2, 2019

Ah- well then I would say the fact that I couldn't distinguish the post from other, real comments I've seen on HN just reinforces my point ;)

But yeah, I personally know those are a minority of the Rust community; I'm in the same boat where I don't want the whole thing being judged by a vocal minority.

littlestymaar · on Oct 2, 2019

An interesting thing about aggressive Rust evangelists here on HN or on /r/programming is that many of them are actually troll accounts. Try looking at their comment history next time you encounter one.

jonreem · on Oct 2, 2019

We don’t appreciate impersonation on HN.

debatem1 · on Oct 2, 2019

Seems like you could use an invertible bloom filter or similar, but I'm not sure what the performance hit would look like.

amalcon · on Oct 2, 2019

Address sanitizer can detect this sort of thing, so yes, you can protect against it. It comes at a performance cost, though.

jwilk · on Oct 2, 2019

ASAN was designed for debugging, not for hardening:

https://seclists.org/oss-sec/2016/q1/363

Zitrax · on Oct 2, 2019

You wouldn't release with that on, and for testing you would have needed to craft a relevant input to detect it or fuzzed it.

kccqzy · on Oct 2, 2019

Why wouldn't you release with sanitizers on? The performance impact really is negligible depending on the workload. It's basically like that tiny slowdown when you go from C++ to Java. If you have benchmarks showing the slowdown is acceptable, just ship with it enabled. Saves lots of headaches.

pnako · on Oct 3, 2019

Might be true for UBSan and LSan, that have a moderate impact, but not for ASan and certainly not for MSan or TSan.

(as a _very_ crude approximation, with UBSan on, you have the performance of Java or Go, and with the other sanitizers you get closer to Ruby territory)

You can't combine the sanitizers anyway. They're really designed to be enabled for a testsuite and debug builds.

gpderetta · on Oct 2, 2019

Is it though? I suspect that the cost of the sanitizer is significantly higher than the overhead of switching, say, to Java.

Not that I don't appreciate sanitizers, I just used asan to find the cause of a memory corruption bug just a couple of hours ago and it was great. In the past I would have used valgrind which is significantly slower.

archi42 · on Oct 2, 2019

This is actually a good point. I suspect using asan on this lib as it is used by WA would have had a negligible performance impact on the overall UX.

(That is, assuming asan would have really captured this specific instance).

pjmlp · on Oct 2, 2019

Sure you would. These kinds of attacks keep forcing Google's security team to increasingly lock down Android native code, including shipping some sanitizers enabled in production devices.

https://security.googleblog.com/2019/05/queue-hardening-enha...

nroets · on Oct 2, 2019

free() could be wrapped in a macro, but it would not catch the case where multiple copies of the pointer are stored.

#define my_free(x) { free(x); (x) = NULL; }

Note that the extra assignment will often be optimized away because

(a) the pointer is assigned to a malloc() after the call to my_free() OR

(b) the pointer goes out of scope e.g. is popped off the stack.
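A slightly more defensive sketch of the same macro, using the usual `do { ... } while (0)` wrapper so it behaves as a single statement inside an unbraced if/else. `MY_FREE` and `release_slot` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* do { ... } while (0) makes the macro act as one statement, so it is
 * safe inside an unbraced if/else. Note that x is evaluated twice. */
#define MY_FREE(x) do { free(x); (x) = NULL; } while (0)

/* Frees *slot through the macro; returns 1 if *slot is NULL afterwards. */
int release_slot(char **slot)
{
    assert(slot != NULL);
    MY_FREE(*slot);
    return *slot == NULL;
}
```

A second `MY_FREE` on the same variable is harmless because free(NULL) is a no-op. Any other copy of the pointer still holds the old address, which is exactly the case the macro cannot catch.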

simias · on Oct 2, 2019

I guess it's better than nothing but that won't work if x is some temporary local variable instead of the long term storage location. It also won't help much if there are several pointers to the resource. As such I wonder if a macro is better than just explicitly nulling the pointer after free, at least there's no obfuscation of what's going on.

Generally I agree with you though, setting pointers to NULL after free is good practice and probably worth enforcing in the coding style.

bluecalm · on Oct 2, 2019

Well, if there are several pointers to the same memory you need some kind of logic to handle the number anyway. I don't think using free is a good idea in such cases. You need a wrapper which checks for the counter and then decreases it and frees on zero.

mgrviper · on Oct 2, 2019

At first I was wondering how one can get RCE out of a double-free, and then the author proceeded to drop a bomb: Android would reliably return the same address to the next two allocations of the same size as the freed memory. Android's behaviour here is simply unacceptable. One would expect (yeah) memory management bugs from user space applications, but returning the same memory from a default allocator twice because of a double-free is a terrible peculiarity, undefined behaviour or not.

umanwizard · on Oct 2, 2019

How do other malloc implementations avoid this? It seems natural if what “free” does involves adding the pointer to some free list. Obviously you wouldn’t want to scan the whole free list every time looking for duplicates - is there another way to avoid this behavior?

tedunangst · on Oct 2, 2019

Bitmaps don't require scanning.

saagarjha · on Oct 3, 2019

They do, you'd just be scanning a smaller thing.

tedunangst · on Oct 3, 2019

What are you scanning the bitmap looking for? Why not just index into it and look at the relevant bit?
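A sketch of what "just index into it" could mean for a slab of fixed-size slots, with hypothetical names (`slab_alloc`, `slab_free`) and a single 64-bit word as the bitmap. Finding a free slot on allocation does scan the bitmap, but the double-free check on free is a direct bit lookup computed from the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_SIZE 64
#define SLOT_COUNT 64   /* one bit per slot fits in a single uint64_t */

static _Alignas(16) unsigned char slab[SLOT_COUNT][SLOT_SIZE];
static uint64_t in_use = 0;   /* bit i set means slot i is allocated */

/* Allocation scans for a clear bit (a real allocator would use a
 * find-first-set instruction here). */
void *slab_alloc(void)
{
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (!(in_use & (UINT64_C(1) << i))) {
            in_use |= UINT64_C(1) << i;
            return slab[i];
        }
    }
    return NULL;
}

/* Returns 0 on success, -1 on a double free or a pointer outside the slab.
 * No scan: the slot index comes straight from the address. */
int slab_free(void *p)
{
    assert(p != NULL);
    uintptr_t offset = (uintptr_t)p - (uintptr_t)slab;
    if (offset >= sizeof slab || offset % SLOT_SIZE != 0)
        return -1;
    uint64_t bit = UINT64_C(1) << (offset / SLOT_SIZE);
    if (!(in_use & bit))
        return -1;   /* bit already clear: this slot is not allocated */
    in_use &= ~bit;
    return 0;
}
```

Freeing the same slot twice returns 0 and then -1, at the cost of one mask and one compare.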

shaklee3 · on Oct 2, 2019

This has happened to me in ubuntu 18.04 frequently. Do you have something showing that this is really that rare? If anything, it might help you track down bugs quicker.

not2b · on Oct 2, 2019

If the user hasn't messed up (with a double free), re-using the same block if the next malloc/new requests the same size block is the most efficient approach; it will have better caching behavior than selecting a completely different block. So this behavior isn't surprising. It seems you are asking for the allocator to spend extra cycles and produce worse caching behavior as a defensive measure. It might be possible to cheaply check for this particular error condition (the double free is two consecutive free calls with no intervening malloc or free) but the exploit writer will be able to see the code and work around it. The right solution is to guarantee that your codec doesn't do a double free.

nothrabannosir · on Oct 2, 2019

Malloc is a user space lib, not a syscall. The OS only deals in pages, on Linux accessed using brk and mmap.

lubesGordi · on Oct 2, 2019

You say this is a peculiarity, but then don't say what it should do instead. Is there some other widely used implementation that doesn't do this? Like others say, scanning the free list for dupes seems inefficient.

gpderetta · on Oct 2, 2019

The allocator in question runs in userspace.

sebastianconcpt · on Oct 2, 2019

Affected versions: The exploit works well until WhatsApp version 2.19.230. The vulnerability is officially patched in WhatsApp version 2.19.244.

The exploit works well for Android 8.1 and 9.0, but does not work for Android 8.0 and below. In the older Android versions, double-free could still be triggered. However, because of the malloc calls by the system after the double-free, the app just crashes before reaching to the point that we could control the PC register.

danso · on Oct 2, 2019

Even as someone who hasn't had to deal with memory allocation since college CS classes many years ago, I found this explainer to be easy to follow and enlightening. Well done!

tangent: Your "About" link goes to a 404: https://awakened1712.github.io/about/

jor-el · on Oct 2, 2019

The author of the blog post is my friend, I will ask him to fix it. The funny thing is, he doesn't even know about HN at all and he has no idea that his post is trending. :P

Dumbdo · on Oct 2, 2019

Yeah I thought the same, looked like he decided to delete the about page but forgot to delete the references: https://github.com/awakened1712/awakened1712.github.io/commi...

captn3m0 · on Oct 2, 2019

Somewhat related question: Does anyone know when the fixes for CVE-2019-11927 will be released for iOS? The advisory[0] says:

>This issue affects WhatsApp for iOS before version v2.19.100 and WhatsApp for Android before version 2.19.243.

But the latest version I see on the App Store is 2.19.92 (iPhone S3, iOS 13). The AppStore website says the same[1].

The Android version[2] seems updated (2.19.271)

[0]: https://www.facebook.com/security/advisories/cve-2019-11927

[1]: https://apps.apple.com/in/app/whatsapp-messenger/id310633997

[2]: https://www.whatsapp.com/android/

jannes · on Oct 2, 2019

Whenever Apple finishes the app review, I suppose. It's a bit of a shame that people are running vulnerable versions of WhatsApp because Apple is taking its time.

tinus_hn · on Oct 2, 2019

There is an expedited review process for this kind of thing. It’s impossible to tell who is to blame here. It could be that they abused the expedited review process before and are excluded, or they never requested it, or perhaps they were late in sending in the new version. Or Apple found a problem in the new version, or they were slow.

wil421 · on Oct 2, 2019

Can someone explain the RCE part? I understand the double free bug but not the exploit part.

l33tman · on Oct 2, 2019

The double-free bug means that a double-free of a chunk of size X might lead to subsequent allocs of that size X to return the same pointer.

The lib in question parses the GIF twice.

In the first run, it allocs an internal info struct of size X, then you trigger the double-free so this info struct is freed twice.

In the second run, it allocs the same internal info struct of size X and gets a ptr Y. Then, by crafting the GIF so one of the frames to be decoded is also of the size X, it will alloc intermediary space for this frame, but due to it being the same size, you'll get the same ptr Y returned, and the frame gets unpacked over the info structure...

So, you get your user-provided frame data placed in the internal info struct. Luckily for the exploiter, there are function pointers inside the info struct which are called a bit later.

So you can provide a memory address to jump to by putting it in the right place in the magic frame. You can't also put executable code in the magic frame due to restrictions, but you can place shell commands in the frame and shuffle registers by using already available executable chunks by selecting the right jump address (this is explained well in the post I think).
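The core of those steps can be sketched in a few lines, with everything benign and hypothetical: the struct layout, the function names and `decode_second_pass` are invented for illustration and are not the library's code. One static slot stands in for the address that the allocator hands out twice, and the "attacker" function just returns a number instead of reaching system().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the decoder's info struct. */
struct info {
    int (*rewind_fn)(void);
    unsigned width, height;
};

int legit_rewind(void)    { return 0; }
int attacker_target(void) { return 1337; }   /* benign placeholder */

/* One slot handed out twice, as after the double free. */
static struct info slot;

int decode_second_pass(const unsigned char *frame, size_t frame_len)
{
    /* First allocation of size X: the info struct. */
    struct info *info = &slot;
    info->rewind_fn = legit_rewind;
    info->width = 0;
    info->height = 0;

    /* Second allocation of size X: the frame buffer, same address. */
    unsigned char *raster = (unsigned char *)&slot;
    assert(frame_len <= sizeof slot);
    memcpy(raster, frame, frame_len);   /* frame data lands on top of info */

    /* Later indirect call goes wherever the frame data says. */
    return info->rewind_fn();
}
```

If the frame bytes are a copy of a `struct info` whose `rewind_fn` is `attacker_target`, the call returns 1337 instead of 0. In the real exploit the address would point at a gadget rather than a function of the attacker's own.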

lubesGordi · on Oct 2, 2019

Awesome explanation! Thank you! Does this mean also that function pointers inherently weaken security (by providing a means of code execution given other faults)?

l33tman · on Oct 2, 2019

They are a very common design-pattern in traditional C, as you kind of emulate object-oriented virtual function overloading by using them in structs. I don't think it's realistic to avoid them just by principle..

As in network server hardening, you can usually start by following where user-provided data goes, and harden its path.

In this case however the first problem was a clandestine bug (the double free in some cases), probably induced by the programmer trying to be a bit too clever with the management of these structs (if you're not threaded you could simply have a single static declaration for this info struct and forget the malloc/freeing, or at least malloc it just once upon each thread init, haven't looked at this code..).

Generally keeping track of all malloc/free pairs is tricky and you can go a long way by trying to simplify your logic, I mean even if you don't get exploited like this, you might simply crash sometime or leak memory or in general behave badly.

There are good reasons why there exist all kinds of memory-managed languages :)

saagarjha · on Oct 3, 2019

Well, yes, but the existence of function pointers is basically a given for most languages. For example, your C++ may not have an explicit function pointer in it anywhere, but its vtables are just as good.

rptr_87 · on Oct 3, 2019

Thanks a lot for the explanation. Can you suggest any blogs or articles for gaining such a deep understanding of security vulnerabilities? Thank you.

wyldfire · on Oct 2, 2019

> We need to first let PC jumps to an intermediate gadget,

IMO ROP gadgets are terribly clever. Since you can't just execute new malicious code but you do have control over which instruction to execute next, you have to scour the executable and its libraries for any content that would have the effects that you require.

Take a look at the output from ROPGadget [1] (look at "ROP Chain Generation" screenshot) to see an example of how it works.

[1] https://github.com/JonathanSalwan/ROPgadget

kardos · on Oct 2, 2019

ROP gadgets are indeed clever. Meanwhile there are efforts afoot [1,2] to mitigate this line of attack by removing all of the ROP gadgets, which is also rather clever.

[1] https://www.openbsd.org/papers/asiabsdcon2019-rop-paper.pdf

[2] https://www.openbsd.org/papers/eurobsdcon2018-rop.pdf

aloknnikhil · on Oct 2, 2019

Yea. They hand-waved over the key part of the RCE. I know they exploited the function call in the struct, but how exactly did they re-write the program counter to point to the address of the system call? They mention that's another vulnerability not part of this article. Also, in the demo, they read a WhatsApp DB file as if it was plain-text from the terminal. I thought WhatsApp DB files were encrypted? How was 'more' able to just read the file?

icebraining · on Oct 2, 2019

> how exactly did they re-write the program counter to point to the address of the system call

My understanding is: with the double-free, they managed to overwrite "info", including the location of the function "rewindFunction". When the parsing process calls that function, it will therefore actually call whatever function the attacker has pointed "rewindFunction" to.

aloknnikhil · on Oct 2, 2019

I see now. That's really clever. The fact that there's a double free AND that the WhatsApp gallery parses a GIF twice for some reason is a great catch.

l33tman · on Oct 2, 2019

Yes, it's not something you typically think about when writing code even if you're security-conscious. IMO it's not good that the double-free bug has these consequences.

codezero · on Oct 2, 2019

Because one can overwrite the struct with arbitrary content, and because, after the function that corrupts the struct is called, the code executes a function that lives inside that struct, the attacker can redefine that function to call out to the system (after a few other tricks, I think).

Someone · on Oct 2, 2019

If I understand this correctly, the underlying problem is in GIFLib, which calls reallocatearray, a wrapper around realloc that guards against overflows when computing the size of the memory buffer to reallocate from the number of items and item size.

However, reading https://github.com/aseprite/giflib/blob/master/lib/openbsd-r..., I don’t see how that could lead to a double-free, unless realloc double-frees, or unless a different reallocatearray gets linked in.

Also, that comment on how realloc isn’t portable feels scary. I can see that introducing subtle bugs in libraries used on a different platform from the one where they were developed.

Hence, I think one should forbid the use of raw ‘realloc’ in portable code.

totony · on Oct 2, 2019

Most likely this is getting linked https://code.woboq.org/userspace/glibc/malloc/reallocarray.c...

realloc on linux frees the ptr https://linux.die.net/man/3/realloc

jwilk · on Oct 2, 2019

I thought Android had its own BSDish libc, not glibc? (But the implementation is likely mostly-identical anyway.)

saagarjha · on Oct 3, 2019

Android uses its own implementation, called Bionic: https://en.wikipedia.org/wiki/Bionic_(software)

jwilk · on Oct 2, 2019

android-gif-drawable has an older copy of this lib:

https://github.com/koral--/android-gif-drawable/blob/dev/and...

Or perhaps the function from the libc was being used?

pratio · on Oct 2, 2019

The post is impressive with the demo and explanations

daenz · on Oct 2, 2019

That was surprisingly easy to follow. I always thought these exploits were quite a bit more magical, but it seems pretty straightforward. Scary.

tzs · on Oct 2, 2019

  int_fast32_t widthOverflow = gifFilePtr->Image.Width - info->originalWidth;
  int_fast32_t heightOverflow = gifFilePtr->Image.Height - info->originalHeight;
  const uint_fast32_t newRasterSize =
          gifFilePtr->Image.Width * gifFilePtr->Image.Height;
  if (newRasterSize > info->rasterSize || widthOverflow > 0 ||
      heightOverflow > 0) {
  ...
      info->rasterSize = newRasterSize;

OT, but isn't that test redundant? For rasterSize = height x width to increase, at least one of height or width must increase, and so anytime the first term of the || is true, at least one of the other terms will also be true, so the first term is redundant. It seems it could be simply this:

  int_fast32_t widthOverflow = gifFilePtr->Image.Width - info->originalWidth;
  int_fast32_t heightOverflow = gifFilePtr->Image.Height - info->originalHeight;
  if (widthOverflow > 0 || heightOverflow > 0) {

saagarjha · on Oct 3, 2019

> rasterSize = height x width to increase, at least one of height or width must increase

Not if they're negative.

tzs · on Oct 3, 2019

height and width are of type GifWord, which is typedefed to uint_fast16_t.
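
For unsigned dimensions the claim can be checked by brute force. The sketch below is only an illustration under two assumptions that the quoted snippet does not guarantee: dimensions fit in 16 bits, and the stored rasterSize always equals originalWidth * originalHeight (the snippet reassigns rasterSize, so that may not hold in the real code). The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The three-term test, with the old raster size taken as ow * oh. */
static bool full_test(uint16_t w, uint16_t h, uint16_t ow, uint16_t oh) {
    uint32_t new_size = (uint32_t)w * h;
    uint32_t old_size = (uint32_t)ow * oh;
    return new_size > old_size || w > ow || h > oh;
}

/* The proposed two-term test. */
static bool short_test(uint16_t w, uint16_t h, uint16_t ow, uint16_t oh) {
    return w > ow || h > oh;
}

/* Compare both tests for every combination of dimensions in [0, limit]
 * and return how many combinations disagree. */
unsigned long count_mismatches(uint16_t limit) {
    assert(limit < 64); /* keep the four nested loops small */
    unsigned long mismatches = 0;
    for (uint32_t ow = 0; ow <= limit; ow++) {
        for (uint32_t oh = 0; oh <= limit; oh++) {
            for (uint32_t w = 0; w <= limit; w++) {
                for (uint32_t h = 0; h <= limit; h++) {
                    if (full_test(w, h, ow, oh) != short_test(w, h, ow, oh))
                        mismatches++;
                }
            }
        }
    }
    return mismatches;
}
```

Under those assumptions the two tests agree, since w <= ow and h <= oh imply w * h <= ow * oh for non-negative values.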

swagonomixxx · on Oct 2, 2019

Does anyone know if such behavior is possible on iOS as well?

And maybe if Signal has a similar issue? I'm not sure if they use the same GIF decoding library on Android? And iOS?

If someone doesn't know off the top of their head maybe they can point me to some docs.

gok · on Oct 2, 2019

Is there a legitimate reason for the Android sandbox to allow calls to `system()`?

caleb-allen · on Oct 2, 2019

Is this issue related to what you're talking about?

https://issuetracker.google.com/issues/128554619

As of Android Q you can no longer execute native binaries that aren't shipped in the .apk, is that related? (not a systems dev, hence the question)

archi42 · on Oct 2, 2019

I suppose in case you ship some prebuilt executable to do stuff? E.g. ffmpeg. Just a guess, though.

asveikau · on Oct 2, 2019

IMO execve with its already-parsed-argv is a much safer way to invoke another program. You don't need the shell to interpret the boundaries of command line arguments, it invites trouble.

archi42 · on Oct 2, 2019

Doesn't execve() end in the same syscall as system()? Not as comfortable for exploitation, but maybe still feasible? (Embedded systems I take a look at usually don't have an OS with either of the two, so I honestly don't know what the implementations look like).

asveikau · on Oct 3, 2019

system() basically does a fork and an execve of sh -c 'arg passed to system()', then waitpid to block on the result. execve is a syscall, system is a libc function.
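
Roughly, as a simplified sketch rather than any real libc's implementation: the helper names are made up, the shell path /bin/sh is an assumption (Android keeps its shell at /system/bin/sh), and the signal handling a real system() does in the parent is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run `path` with an already-split argv, so no shell parses the arguments.
 * Returns the wait status, or -1 if fork() or waitpid() failed. */
int spawn_argv(const char *path, char *const argv[]) {
    assert(path != NULL && argv != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, environ);
        _exit(127); /* only reached if execve() failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}

/* system()-like helper: the whole command string is handed to the shell,
 * which then decides where the argument boundaries are. */
int simple_system(const char *command) {
    char *const argv[] = {"sh", "-c", (char *)command, NULL};
    return spawn_argv("/bin/sh", argv);
}
```

The difference in risk is visible in the two entry points: `spawn_argv` passes each argument through untouched, while `simple_system` lets the shell interpret quotes, semicolons and substitutions in the string.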

saagarjha · on Oct 3, 2019

execve("/bin/sh", NULL, NULL) is a quite common replacement for system() in exploits.

userbinator · on Oct 2, 2019

The GIF format is over 30 years old. One would think that decoding them was a problem solved long ago, and that decoding libraries would have all the bugs found and fixed by now.

As someone who has also written a GIF decoder, more for learning purposes than anything, I checked what mine would do with that GIF: it does reallocate twice, but since the first time already nulls the buffer pointer, it doesn't actually double-free (since free()'ing a NULL pointer is defined by the standard to have no effect.)
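
The pattern is roughly the following. This is a sketch with hypothetical names, not the code of my decoder or of the library in question:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner of a decoded frame's pixel buffer. */
typedef struct {
    unsigned char *buf;
    size_t size;
} FrameBuffer;

/* Release the buffer and forget the pointer. Calling this twice is
 * harmless because the second call ends up in free(NULL), a no-op. */
void frame_release(FrameBuffer *fb) {
    assert(fb != NULL);
    free(fb->buf);
    fb->buf = NULL;
    fb->size = 0;
}
```

This only protects against a second free through the same field; a copy of the pointer held elsewhere would still dangle.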

big_chungus · on Oct 2, 2019

The demo clip has been disabled by google drive due to too many plays. Does anyone have a copy/backup he'd be willing to share?

_fbpt · on Oct 3, 2019

I recall someone once designed an image format which stored (image width - 1), and added 1 to the image dimensions while decoding. That eliminates an edge case, but no clue what would happen with MAX_INT image size...
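
A sketch of how a decoder could sidestep the overflow, assuming a hypothetical format with a 16-bit stored field (I don't remember the actual format, so the width and names are made up): widen before adding 1, so the largest stored value decodes to 65536 instead of wrapping to 0.

```c
#include <assert.h>
#include <stdint.h>

/* Stored value is (dimension - 1); widen first so the +1 cannot wrap. */
uint32_t decode_dimension(uint16_t stored) {
    return (uint32_t)stored + 1u;
}

/* Inverse mapping; valid dimensions are 1..65536, so zero cannot be encoded. */
uint16_t encode_dimension(uint32_t actual) {
    assert(actual >= 1u && actual <= 65536u);
    return (uint16_t)(actual - 1u);
}
```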

gosuri · on Oct 3, 2019

> To prevent falling victim to this attack, it is highly recommended that all WhatsApp users update the app to the latest version.

Do I have to do this myself? Doesn't it update itself?

GrayShade · on Oct 4, 2019

It usually does, but you can disable updates in Play Store.

> The exploit works well until WhatsApp version 2.19.230. The vulnerability is official patched in WhatsApp version 2.19.244

nielsbot · on Oct 2, 2019

This is a very easy to follow explanation, even for those not familiar with these types of attacks. (me)

notzuck · on Oct 2, 2019

Assuming version numbers are acceptable, he should have sold this to Zerodium.

billpg · on Oct 2, 2019

Exactly how terrified should I be right now?

mruts · on Oct 2, 2019

How does one go about finding stuff like this? Also how much time did it take to research and develop the full exploit? Sounds like a lot of work, especially if you’re not getting paid.

archi42 · on Oct 2, 2019

I think the author can answer this better, but I'd guess it's a mixture of hobby, curiosity and luck (or rather good intuition where $stuff breaks). [edit] Can be anything from a Saturday if it is the first thing you poke into (and easily grasp the call/stack structure) to a few weekends trying different vectors (or figuring out how to lay out your fake memory).[/edit]

At least from my experience breaking stuff for security lectures/CTFs.

This is probably also a good way to make a name in the security industry (RCE against WA on CV should look pretty neat). [edit]So not getting paid is relative (and again, if it happens out of curiosity - not a huge difference spending a weekend binging some anime or breaking stuff).[/edit]

fourier_mode · on Oct 2, 2019

> 24 minute read

A simple (number_of_words)/(200WPM) isn't a great metric when words from code snippets are also counted.

hk__2 · on Oct 2, 2019

What would you suggest instead? We still read words from code snippets.

fourier_mode · on Oct 3, 2019

According to that metric you spend 1 minute to read this --

    notroot@osboxes:~/Desktop/gif$ ./exploit
    buffer = 0x7ffc586cd8b0 size = 266
    47 49 46 38 39 61 18 00 0A 00 F2 00 00 66 CC CC
    FF FF FF 00 00 00 33 99 66 99 FF CC 00 00 00 00
    00 00 00 00 00 2C 00 00 00 00 08 00 15 00 00 08
    9C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 00 00 00 84 9C 09 B0
    C5 07 00 00 00 74 DE E4 11 F3 06 0F 08 37 63 40
    C4 C8 21 C3 45 0C 1B 38 5C C8 70 71 43 06 08 1A
    34 68 D0 00 C1 07 C4 1C 34 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 54 12 7C C0 C5 07 00 00 00 EE FF FF 2C 00 00
    00 00 1C 0F 00 00 00 00 2C 00 00 00 00 1C 0F 00
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 00 00 2C 00 00 00 00
    18 00 0A 00 0F 00 01 00 00 3B