Hacker News

I'm very curious about the jump obfuscation. Maybe somebody who's done more reverse-engineering can answer this for me:

  a) Are unconditional jumps common enough that they couldn't be filtered out with some set of pre-conditions?

  b) It seems like finding the end of a function would be easy, because there's a return.  Is there some way to analyze the stack so that you know where a function is returning to, then look for a call immediately preceding the return address?
Apologies if I'm wrong about how this works; I haven't done much x86 assembly programming.



There are some other cool tricks you can do, such as symbolically executing the binary with angr or another framework like miasm (https://github.com/cea-sec/miasm) to unflatten the control flow graph. You can also use Intel's PIN framework to do some interesting analysis. Some helpful articles here:

- https://calwa.re/reversing/obfuscation/binary-deobfuscation-...

- https://www.nccgroup.com/us/research-blog/a-look-at-some-rea...


Unconditional jumps are very common, and everything in x86 assembly gets very messy after optimization. Many functions do not end in ret.


How do functions that don't end in ret work?


A function with an unlikely slowpath can easily end up arranged as

    top part
    jxx slow
    fast middle part
  end:
    bottom part
    ret
  slow:
    slow middle part
    jmp end
There may be more than one slow part; the slow parts might actually be exiled from inside a loop rather than a simple linear code path, and can themselves contain loops, etc. Play with __builtin_expect and objdump --visualize-jumps a bit and you’ll encounter many variations.


In addition to what others said, I'd simply point out that all 'ret' does on x86 is pop an address off the top of the stack and jump to it. It's more of a "helper" than a special instruction, and its use is never required as long as you ensure the stack is kept correct (such as in a tail-call situation).


`ret` also updates the branch predictor’s shadow stack. Failing to balance `call` and `ret` can seriously impact performance.


If anyone else is looking for more information on this, like I was, this stack is called the “return stack buffer”.


Right, I didn't want to get into it, but using 'ret' "properly" definitely has big performance benefits. My point was just that it won't prevent your code from running; it's not like x86 will trigger an exception if call and ret don't match up.


RET does more these days. If Intel CET is enabled then it also updates the hardware shadow stack, and the program will crash if RET is bypassed unless the SSP is adjusted. IIRC Windows x64 also has pertinent requirements on how the function epilog restores registers and returns since it will trace portions of the instruction stream during stack unwinding.


The return is somewhere before the end of the function, e.g.

  loop:
    do stuff
    if some condition: return
    do more stuff
    goto loop
Alternatively, the function might end with a tail-call to another function, written as an unconditional branch.


There are things like compiling a tail call as JMP func_addr.


Would you not have to use a jump instead of a call for it to be a tail call at all? I.e., otherwise a new frame is created on each call.


the call is still in tail position whether or not it reuses the stack frame. there are also more involved ways to do tail call optimization than a direct single-jump compilation when you leave ret behind entirely, such as in forth-style threaded interpreters


I guess we're talking about optimising tail recursion. Would there be any reason to refer to a tail call other than in the context of that optimisation?

I’ll do some reading on the latter part of your post, thank you!


You don’t need recursion to make use of tail call elimination. In Scheme and SML all tail calls are eliminated. GCC also does it, but less often. Still, it’s not recursion that triggers it.


i only meant that "optimized/eliminated tail call" is more useful terminology than an uneliminated tail call not counting as "a tail call". i find this distinction useful when discussing clojure, for instance, where you have to explicitly trampoline recursive tail calls and there is a difference between an eliminated tail call and a call in tail position which is eligible for TCO

i'm not sure how commonly tail calls are eliminated in other forthlikes at the ~runtime level since you can just do it at call time when you really need it by dropping from the return stack, but i find it nice to be able to not just pop the stack doing things naively. basically since exit is itself a threaded word you can simply¹ check if the current instruction precedes a call to exit and drop a return address

in case it's helpful this is the relevant bit from mine (which started off as a toy 64-bit port of jonesforth):

  .macro STEP                                                                             
    lodsq                                                                               
    jmp *(%rax)                                                                         
  .endm  

  INTERPRET:                                                                              
    mov (%rsi), %rcx                                                                    
    mov $EXIT, %rdx                                                                     
    lea 8(%rbp), %rbx                                                                   
    cmp %rcx, %rdx     # tail call?                                                     
    cmovz (%rbp), %rsi # if so, we                                                      
    cmovz %rbx, %rbp   # can reuse                                                      
    RPUSH %rsi         # ret stack                                                      
    add   $8, %rax                                                                      
    mov %rax, %rsi                                                                      
    STEP
¹ provided you're willing to point the footguns over at the return stack manipulation side of things instead


Yes, I think the most common is a tail call. There can also, of course, be several rets in a single function.


My gut (it's been a while since I've been that low-level) says various forms of inlining and/or flow continuation (which is kind of inlining, except when we talk about obfuscation/protection schemes, where you might inline but then do fun stuff on the inlined version).


If compilation uses jmp2ret mitigation, a trailing ret instruction will be replaced by a jmp to a return thunk. It is up to the return thunk to do as it pleases with program state.


This video[1] on reverse-engineering parts of Guitar Hero 3 covers a few similar techniques that were used to heavily obfuscate the game code that you might find interesting.

[1] https://www.youtube.com/watch?v=A9U5wK_boYM


A few common issues:

1. Some jumps will be fake.

2. Some jumps will land inside another instruction. Disassemblers can't handle two instructions at the same location: with something like jmp 0x1234 where 0x1234 falls mid-instruction, you skip the jmp op and assume 0x1234 is a valid instruction start.

3. The stack will be fucked up in a branch, but intentionally, to cause an exception. You can nop out an instruction like lea RAX, [rsp + 0x99999999999] to fix decompilation, but then you may miss an intentional exception.

IDA doesn't handle stuff like this well, so I have a Binary Ninja license, and you can easily make a script that inlines functions for their decompiler. IDA can't really handle it, since a thunk (chunk of code between jmps) can only belong to one function, and the jmps will reuse chunks of code with each other. I think most people don't use it because there was a bug in Binary Ninja with Blizzard games, but they fixed it via a bug report a year or so ago.


Why can't you make the same script for IDA? Just curious. Anyway, I don't like them; Hex-Rays is a POS.


Yeah, it should be easy enough to filter these particular jumps out. It's an obfuscation designed to annoy people using common off-the-shelf tools (especially IDA Pro).

Most obfuscations are only trying to annoy people just enough that they move on to other projects.


What are the off-the-shelf tools/methods people use now? IDA was the pretty standard go-to when I was into RE.


Not much has changed, except there are more entrants: Binary Ninja, Ghidra, radare (the last two being open source). For debugging there's x64dbg; some use windbg and gdb (for non-Windows OSes). IDA is still king, though the others are catching up.

I evaluated entering the space by building something AI-native; however, the business case just didn't make sense.


I tried Ghidra recently and the decompilation seemed decent enough. The UI seemed a bit less complete than IDA's, though (I couldn't see a couple of things that IDA does/has, though they might just be hidden away in menus).



