ImageDecoder didn't support animated gif for all APIs they support so you're again looking for reasoning in the wrong bush - exposing codecs wouldn't help.
Your proposed solution wouldn't work. You appear to have been unaware and instead of saying "thanks, I didn't realize the limitation existed, that's unfortunate" you instead attack the messenger. All file parsers should be throughly fuzzed, yes. Not sure what tone has to do with it.
> In Android, a double-free of a memory with size N leads to two subsequent memory-allocation of size N returning the same address.
I get that freelists are fast and great for small chunk reuse, (I'm guessing that's the implementation that allows double return) but is there no way to protect against double-free here? I wish it crashed instead.
At what level are you looking to cause the crash? Some options:
Compiler - insert code and memory usage to show area as free/not-free and check on malloc/free. Data has to be recorded at runtime because the type system does not provide enough information to be sure. Static analysis can only guess with C.
OS - Every malloc would need to be in a separate page
Libc - could do something similar to compiler
They all come with overhead in performance and/or memory usage, unless you change the technology stack in some way. Garbage collection is one approach. Adding extra type information to make it clear from static analysis (so you get something like Rust) is another. Or, some hardware support that gives more granular permissions.
Perhaps malloc() could allocate 1 byte extra at the start of the allocated block with some magic number written to it and free() could assert that the number is correct and change the number to mark the block as freed.
Problems: double free might still happen, except now some innocent memory block can be corrupted as well, which just happened to have the magic value in that address.
All memory allocations are now misaligned, decreasing performance. On some platforms misaligned access causes a bus error exception.
The marker would generally need to be at least 8 bytes. This would alleviate the problems, but also cause pretty significant overhead.
On the other hand, the chances of an innocent memory block having a random 8-byte magic value is vanishingly small, so that would work even better. If free() took only 1ns to run and you called it 2^64 times, that would take >500 years.
The metadata (size of allocated block, and whether it is in use) is already there. I really wonder why this is not checked. In a normal libc you should get a "double free" panic.
malloc() doesn't give you memory on the granularity of bytes. On x86_64 for example malloc has to assume you'll use the memory for 16-byte aligned access (say, movdqa). So your allocating an extra byte could become 16 extra bytes.
Why not just always set the pointer to NULL after free is called in it? It would be fantastic if the operation is atomic as well but even if it isn't I like doing it. Freed address is not something you ever want to just again anyway.
There can be many pointers pointing to the same allocated memory. For example, if you have a cyclic list with two elements head.next and head.prev pointers will both point to the second element and free(head.next); free(head.prev); would be double-free.
When many pointers point to the same memory then naked free shouldn't be used. Instead you implement a counter and decrease it (and free at zero when the last pointer dies). Imo code where a free is called on something other pointers point to is just a basic level design error.
I guess the naive solution would be to walk the freelist/tree on free() to avoid duplicates. I have no idea what the real world impact would be though. Probably very workload-dependent. At least checking for duplicate would be free on insert in a tree, but in a list that's both a long walk and lots of thrashing.
It also wouldn't work at all when free space is coalesced.
Memory allocations vary in size (obviously). The simplest algorithm would be to take the block of memory and start allocating chunks from the beginning. The problem is that if your app does a lot of small allocations, frees them and then wants to do a large allocation, the free list might not contain a block large enough. The job of your allocator is to manage that fragmentation. There are a lot of approaches, but fundamentally at some point it has to put two deallocations 'back together' to make a larger one.
If you free a chunk, and then that chunk is coalesced with some data just before it, then the free list will no longer contain a pointer that matches the one that you then try to double free.
I get that, but I believe this issue doesn't apply in this case (I could be wrong of course). Specifically this double-allocation seems to come directly from a freelist for a specific allocation size. Going with the description from the link I posted about the allocator, this exploit should work only if the freed space is not coalesced - otherwise you wouldn't be guaranteed that the same memory is returned twice.
A call to free could write a gravestone into the freed memory, each free call could check for it and only check the free list if it sees that value. This would make a check of the free list rare, especially if malloc overrides the value again.
That is the reason you check the freelist in the unlikely case that you see it, the cost would average to near nothing except for the memory accesses to set, check and clear the value.
You could also xor the memory location into the marker to avoid multiple collisions for applications that allocate a large amount of similar objects.
Besides, what's the minimum chunk size? Can't be less than the 64-bit word size can it? The chances of a random 8-byte gravestone appearing in legit memory is practically nil. If free() took only 1ns to run and you called it 2^64 times, that would take >500 years.
But it was pointed out upthread that animated GIFs aren't fully supported by the native Android libraries. So Rust would actually be a reasonable solution here.
It's true that there are costs associated with using Rust in the build process and so forth, but WhatsApp is trying to be a secure messenger and is FAANG-backed. The challenges can be overcome.
Indeed, however Android does support other image formats with animation, which could have been another solution, much cheaper than introducing yet another language not directly supported by the OS SDK tooling.
Changing to a different image format may not be possible because WhatsApp still needs to show historical images, and these images are not stored on a central server where they can be migrated.
Also, it's called the "Android NDK" and not the "Android C SDK" - a big part of what the NDK gives you is not C-specific. It makes almost no difference whether you compile your code with GCC vs rustc, at the end of the day you're creating a native shared library and loading it from a java application.
Using static analysis tooling on C/C++ code is more work than the above to set up, so I really do believe that Rust is the easiest way to reduce the chance of these kinds of bugs happening again.
Then they could add a memory safe Java library dependency like Glide.
Yes it is called the NDK and the workflow that you describe doesn't support the bullet points I described.
If Rust wants to be embraced by Android developers it needs to up its game in Android Studio tooling, Binder generation and FFI integration.
Those instructions from 2017 are completely outdated given Android 3.5 and NDK r20.
GCC is no longer around, C++ support has improved quite substantially, static analysis tooling is integrated into NDK, Android Studio 3.7 (planned for next year) will bring support for mixed mode NDK libraries in AAR format.
Think about the Android development experience, using Android Studio, Gradle + CMake/ndk-build, mixed mode debugging across Java/Kotlin and native, Android Studio project templates for native code, packaging Android libraries, Bundles and Instant Apps, NDK APIs and Google libraries for Android intended for NDK consumption like Oboe.
This are the expectations that any external language should meet versus what platform languages offer out of the box.
I haven't done any android dev in a while so it may have changed in between, but it sounds that your expectations are higher than what the Android experience looks like when JNI is involved.
My expectations are what an Android developer using Studio 3.5 with NDK r20 would expect.
While the experience is still found lacking versus what iOS or UWP tooling are capable of, it still is much better than any third party language integration.
Much as I love Rust, some of its advocates have a tendency to drop into conversations like this and oversimplify the situation, suggesting it be immediately used for everything under the sun, with limited context.
There are many factors that go into deciding whether to integrate a new language into your project. For example:
- Tooling
- Build process
- Library support
- Developer expertise
- Interop
Any of these can cause lots of friction, especially when your codebase has more than one language. Not to mention hiring challenges, man-hours required to do a conversion, etc. That's not to say it's automatically the wrong decision, but it's certainly not a simple one.
The case may be stronger on a greenfield project - though some of the above still applies - but then in many cases you may care more about iteration speed than performance, landing you with a GCed language.
Edit: I've been informed that the parent comment was trolling
So that was a troll impersonating a well known Rustacean, but your comment is interesting in that it tries to discount a valid suggestion perhaps Rust is a good option in this situation. (Though pjmlp makes a good point about using the OS toolkit).
But to your criticisms, Rust has had a lot of attention to making it a decent drop in tool for integrating with C. There are multiple tools to produce the correct ABI either from C or to C from Rust. This is to your point about tooling and interop.
Putting rust into an existing project, is generally as easy as linking against any foreign C library, I’ve personally done this enough to know it’s easy to do either as a static library or a dynamic library. To your point about build process.
As to developer learning and choice of another language, there may be reasons why C was used in the first place that disqualify GC’ed languages. So if you want and need a safe C replacement that is easy to insert, Rust is a good option.
People shouldn’t disqualify it out of hand. In this particular place perhaps it’s not the right solution, and maybe a poor time to suggest it, but your reasons are not accurate to the realities of that work.
You're inadvertently replying to a troll account impersonating steveklabnik [1] (note the position of the 'l' and 'n'). Please don't judge the Rust community based on such idiots – most of us are quite reasonable and don't go around proselytizing others :-)
Ah- well then I would say the fact that I couldn't distinguish the post from other, real comments I've seen on HN just reinforces my point ;)
But yeah, I personally know those are a minority of the Rust community; I'm in the same boat where I don't want the whole thing being judged by a vocal minority.
An interesting thing about aggressive Rust evangelists here on HN or on /r/programming is that many of them are actually trolls accounts. Try looking at their comment history next time you encounter one.
Why wouldn't you release with sanitizers on? The performance impact really is negligible depending on the workload. It's basically like that tiny slowdown when you go from C++ to Java. If you have benchmarks showing the slowdown is acceptable, just ship with it enabled. Saves lots of headaches.
Might be true for UBSan and LSan, that have a moderate impact, but not for ASan and certainly not for MSan or TSan.
(as a _very_ crude approximation, with UBSan on, you have the performance of Java or Go, and with the other sanitizers you get closer to Ruby territory)
You can't combine the sanitizers anyway. They're really designed to be enabled for a testsuite and debug builds.
Is it though? I suspect that the cost of the sanitizer is significantly higher than the overhead of switching, say, to Java.
Not that I don't appreciate sanitizers, I just used asan to find the cause of a memory corruption bug just a couple of hours ago and it was great. In the past I would have used valgrind which is significantly slower.
Sure you would. This kind of attacks keep forcing Google security team to increasingly lock down Android native code, including shipping some sanitizers enabled in production devices.
I guess it's better than nothing but that won't work if x is some temporary local variable instead of the long term storage location. It also won't help much if there are several pointers to the resource. As such I wonder if a macro is better that just explicitly nulling the pointer after free, at least there's no obfuscation of what's going on.
Generally I agree with you though, setting pointers to NULL after free is good practice and probably worth enforcing in the coding style.
Well, if there are several pointers to the same memory you need some kind of logic to handle the number anyway. I don't think using free is a good idea in such cases. You need a wrapper which checks for the counter and then decreases it and frees on zero.
At first i was wondering how one can get RCE out of double-free and then author proceed to drop a bomb - android would reliably return same adress to the next two allocations of same size as freed memory. Android behaviour here is simply unacceptable. One would expect (yeah) memory managment bugs from user space applications, but return same memory from a default allocator twice because of double-free is a terrible peculiarity, undefined behavour or not.
How do other malloc implementations avoid this? It seems natural if what “free” does involves adding the pointer to some free list. Obviously you wouldn’t want to scan the whole free list every time looking for duplicates - is there another way to avoid this behavior?
This has happened to me in ubuntu 18.04 frequently. Do you have something showing that this is really that rare? If anything, it might help you track down bugs quicker.
If the user hasn't messed up (with a double free), re-using the same block if the next malloc/new requests the same size block is the most efficient approach; it will have better caching behavior than selecting a completely different block. So this behavior isn't surprising. It seems you are asking for the allocator to spend extra cycles and produce worse caching behavior as a defensive measure. It might be possible to cheaply check for this particular error condition (the double free is two consecutive free calls with no intervening malloc or free) but the exploit writer will be able to see the code and work around it. The right solution is to guarantee that your codec doesn't do a double free.
You say this is a peculiarity, but then don't say what it should do instead. Is there some other widely used implementation that doesn't do this? Like others say, scanning the free list for dupes seems inefficient.
Affected versions
The exploit works well until WhatsApp version 2.19.230. The vulnerability is official patched in WhatsApp version 2.19.244
The exploit works well for Android 8.1 and 9.0, but does not work for Android 8.0 and below. In the older Android versions, double-free could still be triggered. However, because of the malloc calls by the system after the double-free, the app just crashes before reaching to the point that we could control the PC register.
Even as someone who hasn't had to deal with memory allocation since college CS classes many years ago, I found this explainer to be easy to follow and enlightening. Well done!
The author of the blog post is my friend, I will ask him to fix it. The funny thing is, he don't even knows about HN at all and he has no idea that his post is trending. :P
Whenever Apple finishes the app review, I suppose. It's a bit of a shame that people are running vulnerable versions of WhatsApp because Apple is taking its time.
There is an expedited review process for this kind of thing. It’s impossible to tell who is to blame here. It could be that they abused the expedited review process before and are excluded, or they never requested it, or perhaps they were late in sending in the new version. Or Apple found a problem in the new version, or they were slow.
The double-free bug means that a double-free of a chunk of size X might lead to subsequent allocs of that size X to return the same pointer.
The lib in question parses the GIF twice.
In the first run, it allocs an internal info struct of size X, then you trigger the double-free so this info struct is freed twice.
In the second run, it allocs the same internal info struct of size X and gets a ptr Y. Then, by crafting the GIF so one of the frames to be decoded is also of the size X, it will alloc intermediary space for this frame, but due to it being the same size, you'll get the same ptr Y returned, and the frame gets unpacked over the info structure...
So, you get your user-provided frame data placed in the internal info struct. Luckily for the exploiter, there are function pointers inside the info struct which are called a bit later.
So you can provide a memory address to jump to by putting it in the right place in the magic frame. You can't also put executable code in the magic frame due to restrictions, but you can place shell commands in the frame and shuffle registers by using already available executable chunks by selecting the right jump address (this is explained well in the post I think).
Awesome explanation! Thank you!
Does this mean also that function pointers inherently weaken security (by providing a means of code execution given other faults)?
They are a very common design-pattern in traditional C, as you kind of emulate object-oriented virtual function overloading by using them in structs. I don't think it's realistic to avoid them just by principle..
As in network server hardening, you can usually start by following where user-provided data goes, and harden its path.
In this case however the first problem was a clandestine bug (the double free in some cases), probably induced by the programmer trying to be a bit too clever with the management of these structs (if you're not threaded you could simply have a single static declaration for this info struct and forget the malloc/freeing, or at least malloc it just once upon each thread init, haven't looked at this code..).
Generally keeping track of all malloc/free pairs is tricky and you can go a long way by trying to simplify your logic, I mean even if you don't get exploited like this, you might simply crash sometime or leak memory or in general behave badly.
There are good reasons why there exist all kinds of memory-managed languages :)
Well, yes, but the existence of function pointers is basically a given for most languages. For example, your C++ may not have an explicit function pointer in it anywhere, but its vtables are just as good.
> We need to first let PC jumps to an intermediate gadget,
IMO ROP gadgets are terribly clever. Since you can't just execute new malicious code but you do have control over which instruction to execute next, you have to scour the executable and its libraries for any content that would have the effects that you require.
Take a look at the output from ROPGadget [1] (look at "ROP Chain Generation" screenshot) to see an example of how it works.
ROP gadgets are indeed clever. Meanwhile there are efforts afoot [1,2] to mitigate this line of attack by removing all of the ROP gadgets, which is also rather clever.
Yea. They hand-waved over the key part of the RCE. I know they exploited the function call in the struct, but how exactly did they re-write the program counter to point to the address of the system call? They mention that's another vulnerability not part of this article. Also, in the demo, they read a WhatsApp DB file as if it was plain-text from the terminal. I thought WhatsApp DB files were encrypted? How was 'more' able to just read the file?
> how exactly did they re-write the program counter to point to the address of the system call
My understanding is: with the double-free, they managed to overwrite "info", including the location of the function "rewindFunction". When the parsing process calls that function, it will therefore actually call whatever function the attacker has pointed "rewindFunction" to.
I see now. That's really clever. The fact that there's a double free AND that the WhatsApp gallery parses a GIF twice for some reason is a great catch.
Yes, it's not something you typically think about when writing code even if you're security-conscious. IMO it's not good that the double-free bug has these consequences.
Because one can overwrite the struct with arbitrary content, and that after the function that corrupts the struct is called, it executed a function that lives inside that struct, so the attacker can redefine the function to call out to the system (after a few other tricks - I think)
If I understand this correctly, the underlying problem is in GIFLib, which calls reallocatearray, a wrapper around realloc that guards against overflows when computing the size of the memory buffer to reallocate from the number of items and item size.
Also, that comment on how realloc isn’t portable feels scary. I can see that introduce subtle bugs in libraries used on a different platform from where it is developed.
Hence, I think one should forbid the use of raw ‘realloc’ in portable code.
OT, but isn't that test redundant? For rasterSize = height x width to increase, at least one of height or width must increase, and so anytime the first term of the || is true, at least one of the other terms will also be true, so the first term is redundant. It seems it could be simply this:
IMO execve with its already-parsed-argv is a much safer way to invoke another program. You don't need the shell to interpret the boundaries of command line arguments, it invites trouble.
Doesn't execve() end in the same syscall as system()? Not as comfortable for exploitation, but maybe still feasible? (Embedded systems I take a look at usually don't have an OS with either of the two, so I honestly don't know how the implementations look like).
system() basically does a fork and an execve of sh -c 'arg passed to system()', then waitpid to block on the result. execve is a syscall, system is a libc function.
The GIF format is over 30 years old. One would think that decoding them was a problem solved long ago, and that decoding libraries would have all the bugs found and fixed by now.
As someone who has also written a GIF decoder, more for learning purposes than anything, I checked what mine would do with that GIF: it does reallocate twice, but since the first time already nulls the buffer pointer, it doesn't actually double-free (since free()'ing a NULL pointer is defined by the standard to have no effect.)
I recall someone once designed an image format which stored (image width - 1), and added 1 to the image dimensions while decoding. That eliminates an edge case, but no clue what would happen with MAX_INT image size...
How does one go about finding stuff like this? Also how much time did it take to research and develop the full exploit? Sounds like a lot of work, especially if you’re not getting paid.
I think the author can answer this better, but I'd guess it's mixture of hobby, curiosity and luck (or rather good intuition where $stuff breaks). [edit] Can be anything from a Saturday if it is the first thing you poke into (and easily grasp the call/stack structure) to a few weekends trying different vectors (or figuring out how to layout your fake memory).[/edit]
At least from my experience breaking stuff for security lectures/CTFs.
This is probably also a good way to make a name in the security industry (RCE against WA on CV should look pretty neat). [edit]So not getting paid is relative (and again, if happens out of curiosity - not a huge difference spending a weekend binging some anime or breaking stuff).[/edit]
Another victim of Android not exposing the image codecs to the NDK.
I guess, like many, they decided to use libpl_droidsonroids_gif instead of having to deal with JNI to call BitmapFactory or ImageDecoder.
Because of these kind of exploits codecs have their own hardness processes, https://source.android.com/devices/media/framework-hardening, but naturally it doesn't help when applications bring their own native alternatives along instead.