Hacker News
Two handy GDB breakpoint tricks (nullprogram.com)
212 points by goranmoomin 12 months ago | 36 comments



Don't mess around with silly tricks like that. Do it the proper way:

    #include <signal.h>
    
    int main()
    {
        signal(SIGTRAP, SIG_IGN);
        raise(SIGTRAP);
    }
When outside a debugger, the `raise`d signal will be ignored (of course, if you want a coredump you can remove the `signal` line).

When inside a debugger, requests to ignore `SIGTRAP` have no effect, unlike other signals. That's because this is literally what `SIGTRAP` is for.


Yes, although for usage in an assertion like in TFA, you want SIGTRAP outside of a debugger to cause a core dump, which is the default, so you must not SIG_IGN it.

In priority order:

- msvc: __debugbreak()

- clang: __builtin_debugtrap()

- linux (and probably some other posix-ish systems): raise(SIGTRAP)

- gcc x86: asm("int3; nop")

- otherwise: you're out of luck, just abort()/__builtin_trap()
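A minimal sketch of that priority order as one macro. `DEBUG_BREAK`, `DEBUG_BREAK_KIND`, `ASSERT`, `debug_break_kind` and `checked_index` are hypothetical names made up for illustration, and the platform checks are assumptions that may need adjusting for a given toolchain:

```c
#include <stdlib.h>

/* Probe for Clang's builtin without tripping compilers that lack __has_builtin. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_debugtrap)
#    define HAVE_BUILTIN_DEBUGTRAP 1
#  endif
#endif

#if defined(_MSC_VER)
#  include <intrin.h>
#  define DEBUG_BREAK()     __debugbreak()
#  define DEBUG_BREAK_KIND  "__debugbreak"
#elif defined(HAVE_BUILTIN_DEBUGTRAP)
#  define DEBUG_BREAK()     __builtin_debugtrap()
#  define DEBUG_BREAK_KIND  "__builtin_debugtrap"
#elif defined(__linux__) || defined(__unix__)
#  include <signal.h>
#  define DEBUG_BREAK()     ((void)raise(SIGTRAP))
#  define DEBUG_BREAK_KIND  "raise(SIGTRAP)"
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#  define DEBUG_BREAK()     __asm__ volatile ("int3; nop")
#  define DEBUG_BREAK_KIND  "int3; nop"
#else
#  define DEBUG_BREAK()     abort()
#  define DEBUG_BREAK_KIND  "abort"
#endif

/* Assertion that stops at the defect using whichever mechanism was picked. */
#define ASSERT(c) do { if (!(c)) DEBUG_BREAK(); } while (0)

/* Reports which mechanism the preprocessor selected. */
const char *debug_break_kind(void) { return DEBUG_BREAK_KIND; }

/* Example use: traps only when the index is out of range. */
int checked_index(int i, int len)
{
    ASSERT(i >= 0 && i < len);
    return i;
}
```

Note that SIGTRAP is deliberately left at its default disposition here, so a failed check outside a debugger still produces a core dump.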


well you can always while (1) {} -- run the program, and control-C when you hear the CPU fan :-)


Just

  catch infinite-loop
in gdb and you're done!


Would anyone know how that works internally? Sounds very useful


Sadly it does not work at all, as it was a joke :/.

But I did think about how it could work. It could periodically sample the instructions the current thread is running and detect trivial infinite loops that way.

With more effort more complicated infinite loops could be detected, but in the end not all, due to the halting problem.

edit: Actually, maybe the halting problem is not (theoretically) involved if you have a lot of memory: take a snapshot after every instruction. Once you take a snapshot you have already taken previously, you have found a loop. However, you might still need to somehow account for (emulate and include in the snapshot?) or disallow external IO, such as time of day.
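A toy sketch of the snapshot idea, assuming a deterministic machine whose entire state fits in 16 bits and which does no external IO. All names here (`state_t`, `loops_forever`, `count_down`, `ring`) are invented for the example; a real process has vastly more state than this:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t state_t;                 /* the toy machine's whole state */
typedef state_t (*step_fn)(state_t);      /* one deterministic "instruction" */

/* Runs the machine from `start`; state 0 means "halted".
 * Returns true if a snapshot repeats, i.e. the machine can never halt. */
bool loops_forever(step_fn step, state_t start)
{
    static uint8_t seen[65536 / 8];       /* one bit per possible snapshot */
    memset(seen, 0, sizeof seen);

    state_t s = start;
    while (s != 0) {
        if (seen[s >> 3] & (1u << (s & 7)))
            return true;                  /* same snapshot twice: a loop */
        seen[s >> 3] |= (uint8_t)(1u << (s & 7));
        s = step(s);
    }
    return false;                         /* reached the halt state */
}

/* Always reaches 0 eventually. */
state_t count_down(state_t s) { return (state_t)(s - 1); }

/* Cycles through 1..7 and never reaches 0. */
state_t ring(state_t s) { return (state_t)(s % 7 + 1); }
```

With only 65536 possible states the bitmap is 8 KiB, which is why this works for a toy and gets impractical quickly for anything real.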


QuakeC automatically detected "infinite" loops (actually >100k loops). It was very useful!
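A rough sketch of that kind of guard, using a made-up three-instruction VM rather than anything from QuakeC itself. The opcodes, `run`, and the two sample programs are hypothetical; only the 100k budget comes from the comment above:

```c
#define RUNAWAY_LIMIT 100000L   /* instruction budget per run */

enum op { OP_DEC, OP_JNZ, OP_HALT };
struct insn { enum op op; int arg; };

/* Interprets prog with one integer register.
 * Returns 0 on OP_HALT, -1 if the budget runs out ("runaway loop"). */
int run(const struct insn *prog, int counter)
{
    long budget = RUNAWAY_LIMIT;
    int pc = 0;
    for (;;) {
        if (budget-- <= 0)
            return -1;                        /* probably an infinite loop */
        switch (prog[pc].op) {
        case OP_DEC:  counter--; pc++; break;
        case OP_JNZ:  pc = (counter != 0) ? prog[pc].arg : pc + 1; break;
        case OP_HALT: return 0;
        }
    }
}

/* Decrements to zero, then halts. */
const struct insn countdown_prog[] = { {OP_DEC, 0}, {OP_JNZ, 0}, {OP_HALT, 0} };

/* Jumps to itself forever while the register is non-zero. */
const struct insn spin_prog[] = { {OP_JNZ, 0}, {OP_HALT, 0} };
```

This is a heuristic, not a proof: a legitimately long computation trips it too, which is why "infinite" is in quotes.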


(deleted)


Are you sure? The article makes the point that the nop is actually required for this to work in GDB because the instruction pointer might otherwise point at an entirely different scope.

I have to admit I didn't try it out though. Maybe this changed in the meantime and it is not needed anymore.


See also this comment in the Unreal Engine code about putting a nop in before as well: https://github.com/EpicGames/UnrealEngine/blob/26677ca1b3c97...

    // Q: Why is there a __nop() before __debugbreak()?
    // A: VS' debug engine has a bug where it will silently swallow explicit
    // breakpoint interrupts when single-step debugging either line-by-line or
    // over call instructions. This can hide legitimate reasons to trap. Asserts
    // for example, which can appear as if the did not fire, leaving a programmer
    // unknowingly debugging an undefined process.
(This comment has been there for at least a couple of years, and I don't know if it still applies to the newest version of Visual Studio.)


The annoying aspect of this is that you can end up with a deep call stack. For example, on macOS:

      * frame #0: 0x00007ff818dd6fce libsystem_kernel.dylib`__pthread_kill + 10
        frame #1: 0x00007ff818e0d1ff libsystem_pthread.dylib`pthread_kill + 263
        frame #2: 0x00007ff818d1b2c8 libsystem_c.dylib`raise + 26
        frame #3: 0x0000000100003f6e sigtrap`main + 14
        frame #4: 0x000000010001552e dyld`start + 462
(I vaguely remember it being similar on Linux.)

Regular POSIX programmers might scoff at the very idea that this is an issue, but I always found it rather tedious having to piss about just to get back to the actual place where the break occurred so that you can inspect the state. The whole point of an assert is that you don't expect the condition to happen, so the last thing I want is to make things any more fiddly than necessary when it all goes wrong.


What if you're a library and don't want to alter global signal handling?


If you're about to fail an assertion (which is semantically similar to a SEGV), you probably shouldn't care about messing up signal handling anymore ;)

FWIW the actual answer is that libraries should be good at reporting / returning errors to the calling application in a reasonable manner instead of tripping assert()s or crashing the entire process. Which makes the question moot because ideally the library has few asserts to begin with.


Asserts are to make your code more sensitive to defects, for example, checking function invariants. Ideally the library has more, not less, since they show care and thought has gone in. They should never be used to handle errors.

Another feature of asserts is that `-DNDEBUG` disables them.
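One consequence worth spelling out: a disabled assert does not evaluate its argument at all, so the condition must not have side effects. A small sketch with the two expansions side by side (`ASSERT_ON` and `ASSERT_OFF` are hypothetical stand-ins for what `assert` becomes without and with `-DNDEBUG`):

```c
#include <stdlib.h>

#define ASSERT_ON(c)   ((c) ? (void)0 : abort())  /* roughly: NDEBUG not defined */
#define ASSERT_OFF(c)  ((void)0)                  /* roughly: NDEBUG defined */

static int evaluations;

/* Condition with a visible side effect, for demonstration only. */
int positive(int x) { evaluations++; return x > 0; }

/* Counts how often the condition actually ran. */
int count_evaluations(void)
{
    evaluations = 0;
    ASSERT_ON(positive(1));    /* condition is evaluated */
    ASSERT_OFF(positive(1));   /* condition vanishes entirely */
    return evaluations;
}
```

Here `count_evaluations()` returns 1, not 2, since the second check compiles to nothing.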


Sorry, yeah, I'm just living in a world where asserts being misused to "handle errors" is unfortunately rather common. My argument is specifically against that kind of assertion, I should've been more clear.


  #define assert(c)  while (!(c)) __builtin_trap()
  #define assert(c)  while (!(c)) __builtin_unreachable()
  #define assert(c)  while (!(c)) *(volatile int *)0 = 0
> Each […] has the most important property: Immediately halt the program directly on the defect.

No, only 2 of these 3 have this property. __builtin_unreachable() tells the compiler that this code path can be assumed to never execute, with the intent to allow the compiler to optimize however it wants based on this assumption. Whether the program halts, goes into an infinite loop, executes "rm -rf /", or starts a nuclear war is up to the compiler.

(What will happen in reality is that the compiler deduces that the loop condition will never be true and therefore it can elide the entire loop.)

Sadly, other posts by this author have similar issues.

P.S.: https://sourceware.org/gdb/current/onlinedocs/gdb.html/Jumpi...


Probably it should be stated explicitly, but this is implied if you add `-fsanitize=undefined`, where you will get a diagnostic if the unreachable code is reached. With `-fsanitize-trap=undefined`, you get a trap instruction there, the compiler won't delete anything. In a "release build" without sanitizers, this also serves as an optimization hint. It's a sensible idea, this snarky comment is unwarranted.

Also, code after __builtin_trap() is treated as dead.


It's nice that it works if you adjust your compilation environment like that, but the thing with the "release build" is actually the wrong way around.

- in a debug build (assertions enabled) with UBSAN you get a diagnostic or trap, so far so good.

- in a debug build without UBSAN, you get… undefined behavior, possibly some optimizations. Optimizations would generally be set to a low level, but that doesn't preclude you from getting UB.

- in a release build without assertions, … the assertion would be disabled … so you don't get any optimizations; instead you get proper "DB" (defined behavior) for the case that should never happen.

It's essentially the wrong way around. If anything this is an argument for converting assertions into this definition in a release build, which might be reasonable in some code bases. (In my main code base, we have an "assume()" macro instead; it is by policy very sparingly used to avoid risks of running into UB.)

> It's a sensible idea, this snarky comment is unwarranted.

It's not intended to be snarky, just as noted I've run into similar issues with posts by this author. I haven't counted it up but it feels like a 90% rate of being just dangerously half-right about something. (Considering how subjective perception works, 90% subjective is probably 40% actual, but still.)

(I probably should've deleted the P.S.; I reworded my post and it was a leftover.)


Why would I ever have a debug build without sanitizers? Nothing stops you from removing the asserts in this case, anyway. I don't see what the problem is with using __builtin_unreachable in this context.

I think you didn't understand - what I was implying is to leave the asserts in for a release build, because they (theoretically) never happen, because they are free optimization hints.

So effectively, instead of a macro, your asserts become controlled by the presence of sanitizers.

I think it's nice because it lets you keep the same definition for both debug and release.


> I think you didn't understand - what I was implying is to leave the asserts in for a release build, because they (theoretically) never happen, because they are free optimization hints.

Remember we're not discussing our personal approaches, we're discussing what the linked post describes. It does mention UBSAN down one level of link but does not point out the pitfalls, which is the thing I'm taking issue with. I agree your approach is reasonable. It is, however, not what people reading the linked article would arrive at.

> I think it's nice because it lets you keep the same definition for both debug and release.

I would never consider turning blanket-all assert()s into assumptions for a release build. Every single assert would become a source of potential UB. This is why we have separate assert() and assume() in our codebase.
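A sketch of what such a split could look like with GCC/Clang builtins. This is a guess at the general shape, not the actual macros from that codebase; `ASSERT`, `ASSUME`, `first_or_default` and `sum_blocks` are hypothetical names:

```c
#ifdef NDEBUG
   /* Release: asserts vanish; only explicit assumptions become optimizer hints. */
#  define ASSERT(c)  ((void)0)
#  define ASSUME(c)  do { if (!(c)) __builtin_unreachable(); } while (0)
#else
   /* Debug: both are checked and trap right at the defect. */
#  define ASSERT(c)  do { if (!(c)) __builtin_trap(); } while (0)
#  define ASSUME(c)  ASSERT(c)
#endif

/* Ordinary invariant check: never turns into UB in a release build. */
int first_or_default(const int *v, int len, int dflt)
{
    ASSERT(len >= 0);
    return len > 0 ? v[0] : dflt;
}

/* Sparingly used hint: violating it in a release build is undefined behavior. */
int sum_blocks(const int *v, int n)
{
    ASSUME(n % 4 == 0);
    int total = 0;
    for (int i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

The point of keeping two macros is that each `ASSUME` is a deliberate, reviewable decision to accept UB if the condition is ever false.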


These actually are quite useful tricks, and simple too.

Usually articles with a title like this just find something in the documentation that the author hadn’t known, but I do often read them in the hope they’ll be like this.


C++26 is slated to include std::breakpoint and a couple related features. I'm expecting we'll see more portable and consistent tech for writing recoverable assertions going forward.

https://en.cppreference.com/w/cpp/utility/breakpoint


Instead of using labels (e.g. to avoid unused-label warnings, and because labels are most commonly used as the target of a goto), gdb also supports `break -probe foo:bar` when placing some `DTRACE_PROBE(foo, bar)` at the appropriate line. I think this is likely preferable to `break func:label` now.
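A small sketch of the probe approach, assuming a Linux system where `<sys/sdt.h>` (from the SystemTap SDT development package) is installed. The provider/probe names `demo` and `threshold` and the function `sum_to` are hypothetical; the fallback macro only exists so the example still compiles where the header is missing:

```c
#if defined(__linux__) && defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>
#  endif
#endif
#ifndef DTRACE_PROBE
#  define DTRACE_PROBE(provider, name) ((void)0)  /* fallback: no probe emitted */
#endif

/* Sums 1..n and marks the iterations where the running total exceeds 10. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
        if (total > 10) {
            DTRACE_PROBE(demo, threshold);  /* probe site, no label needed */
        }
    }
    return total;
}
```

Following the form given above, the matching gdb command would be `break -probe demo:threshold`.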


CosmopolitanC has this cool feature that launches gdb immediately and execs it.


> On x86 it inserts an int3 instruction, which fires an interrupt, trapping in the attached debugger, or otherwise abnormally terminating the program. Because it’s an interrupt, it’s expected that the program might continue.

In x86 lingo, it’s a trap, and calling it an interrupt misses the point. int3 is a “software interrupt,” but the vector 3 is universally configured by kernels as a “trap” (by setting the appropriate mode in the IDT). So the instruction completes, IP is incremented, and the kernel is notified.

FRED tidies this all up. The kernel is notified that the event occurred and is given the length of the offending instruction.

> However, regardless of how you get an int3 in your program, GDB does not currently understand it.

GDB can’t really do better. If you have:

    jnz 1f
    int3
    1: [non-error case here]
Then all gdb knows is that IP points to 1. The issue may not be as bad with the int3 out of line:

    jnz 2f
    1: [non-error case here]
    
    2: int3
    jmp 1b <-- GDB sees this
Then at least GDB knows what’s up. But that jmp may be omitted if gcc thinks that the instruction after int3 is unreachable.

Maybe a future FRED-requiring debug API could report the address of the offending int3 and GDB could use it. But ptrace is awful, gdb is awful, and I’m not convinced this will happen.


I sometimes used a placeholder function call to make it easier to place a breakpoint. Of course, it should never be inlined.


Same. I typically define a function "void bp1(){}" in a common utility file, put "bp1();" where I need a breakpoint, and "b bp1" in gdb. Hacky, sure, but really convenient.
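A sketch of that pattern for GCC/Clang, with the inlining concern handled explicitly. `find_negative` is a hypothetical caller; the `noinline` attribute plus the empty `asm` is one way to keep the optimizer from inlining or dropping the call:

```c
/* Empty hook to break on. The empty asm statement counts as a side effect,
 * so calls to this function are not optimized away. */
__attribute__((noinline)) void bp1(void) { __asm__ volatile (""); }

/* Returns the index of the first negative element, or -1 if there is none. */
int find_negative(const int *v, int n)
{
    for (int i = 0; i < n; i++) {
        if (v[i] < 0) {
            bp1();   /* "b bp1" stops inside bp1; "up" or "finish" gets back here */
            return i;
        }
    }
    return -1;
}
```

Unlike the trap-based approaches, this does nothing at all when no debugger is attached, so it is safe to leave in while iterating.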


What's the equivalent of int3 on an ARM machine? Is there any attempt to provide a header that "does the right thing" on a wide variety of platforms?


TFA says

> As far as I know there is no ARM equivalent compatible with GDB (or even LLDB). The closest instruction, brk #0x1, does not behave as needed.

but doesn't elaborate on why.


The reason is that the ARM `brk #0x1` instruction doesn't set the exception return address to the next instruction, so the debugger will just return to the breakpoint when it tries to resume, instead of running the next instruction. Recent versions of LLDB will recognize `brk #0xf000` and automatically step over it when returning, but I don't think GDB does this. With GDB you would have to run something manually like `jump *($pc + 4)` to resume at the next instruction. Clang's `__builtin_debugtrap()` will generate `brk #0xf000` automatically.


I've been using this macro with GCC / GDB for years without running into the issue you're describing:

    #define DEBUG_BREAK() do{__asm__("BKPT");} while(0)

I can continue just fine with it. Granted, this is on the various Cortex M0/M3/M4 chips, so I can't say for sure if it works on any of the bigger, fancier ARMs.


I think it's a difference between ARMv8 and ARMv6/7 (I believe BKPT on ARMv6/7 sets the exception return address to `addr(instruction)+4`).


Followup: I found such a header! https://github.com/scottt/debugbreak


> As far as I know there is no ARM equivalent compatible with GDB

Sad... This would've made debugging much more ergonomic on my phone.


Take a look at this write-up on hw/sw breakpoints for ARM. int3 is an exception provided by the x86 architecture, which coders leverage, hence it's not available on ARM. But there's other breakpoint/debug functionality, since ARM devs also need to debug stuff :> Happy hacking!

https://interrupt.memfault.com/blog/cortex-m-breakpoints


int 3 actually is a single-byte opcode, 0xCC. All other interrupts are two bytes: 0xCD <intr_num>.
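To make the two encodings concrete, here is a tiny encoder following that description. `encode_int` is a hypothetical helper written for this example, not part of any assembler:

```c
#include <stddef.h>
#include <stdint.h>

/* Writes the x86 encoding of a software interrupt into buf
 * and returns the number of bytes used (1 or 2). */
size_t encode_int(uint8_t vector, uint8_t buf[2])
{
    if (vector == 3) {
        buf[0] = 0xCC;      /* dedicated one-byte breakpoint opcode */
        return 1;
    }
    buf[0] = 0xCD;          /* generic form: opcode followed by the vector */
    buf[1] = vector;
    return 2;
}
```

The one-byte form is what lets a debugger plant a breakpoint by overwriting just the first byte of any instruction.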



