Consistently Making Wrong Decisions Whilst Writing Recreational C

nneonneo · 2024-08-25T07:03:39 1724569419

This is cute! It’s worth pointing out that strace ships a similar feature (-e fault) which works for any syscall, even if the binary is statically linked. It works using ptrace, which is lower level than LD_PRELOAD. Although -e fault doesn’t support probabilistic failure, it does provide a flexible way to target specific invocations of a syscall. For example, to fail every second fork() call: -e fault=fork,errno=ENOMEM,when=1+2.

nneonneo · 2024-08-25T16:07:23 1724602043

Of course, typing this on my phone, I didn't get the syntax exactly right. Sorry! The actual strace argument should be

    -e fault=clone:error=ENOMEM:when=1+2

First, modern glibc doesn't use the `fork` syscall directly; it uses clone() instead, so you have to fault clone to see anything happen on a glibc-based system. Second, I botched the syntax, oops!

For practical use, you probably want something like this:

    strace -e fault=clone:error=ENOMEM:when=2+2 -e fault=write:error=EFAULT:when=3+3 -c -o /dev/null bash

This faults both clone and write at different rates (try it, the results are sort of funny), sets strace to only count events (reducing its overhead), and then suppresses the count printout at the end. Might be fun to use as a prank login shell if you want someone to think their machine is broken :)

dmlorenzetti · 2024-08-24T23:32:14 1724542334

It seems like it would be a lot easier to just have students call, e.g., `ta_fork()` rather than `fork()`, and then provide an implementing file to be linked with their program. Then `ta_fork()` allows the TA to trigger errors, either probabilistically or deterministically (say, by setting environment variables).

This approach would also give students insight into testing strategies like mocking, plus it would work on more operating systems.
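A minimal sketch of what such a wrapper could look like, assuming a made-up environment variable `TA_FORK_FAIL_EVERY` (not from the article or any real tool) that makes every Nth call fail. The helper name `ta_should_fail` is hypothetical too, and the call counter is not thread-safe.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical policy: setting "N" means every Nth call fails.
 * A missing, malformed, or zero setting disables fault injection. */
static int ta_should_fail(const char *setting, unsigned long call_no)
{
    char *end;
    unsigned long every;

    if (setting == NULL)
        return 0;
    every = strtoul(setting, &end, 10);
    if (end == setting || *end != '\0' || every == 0)
        return 0;
    return call_no % every == 0;
}

/* Drop-in replacement for fork() that students would call instead. */
pid_t ta_fork(void)
{
    static unsigned long calls;   /* simple counter, not thread-safe */

    calls++;
    /* TA_FORK_FAIL_EVERY is an assumed variable name for this sketch. */
    if (ta_should_fail(getenv("TA_FORK_FAIL_EVERY"), calls)) {
        errno = EAGAIN;           /* one of the errors fork() documents */
        return -1;
    }
    return fork();
}
```

Running a student program with `TA_FORK_FAIL_EVERY=2` would then make the second, fourth, sixth, ... call to `ta_fork()` fail deterministically; a probabilistic mode could be added along the same lines.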

[Edit: Not to disparage this project. It seems like it would have lots of uses, and it was probably a lot of fun to develop.]

foobarbecue · 2024-08-25T12:30:25 1724589025

Seems pretty clear from the article that this person had no interest in doing this in any of the easy ways!

l33t7332273 · 2024-08-25T01:18:15 1724548695

My intro to systems course did something similar with shared memory calls if I recall correctly

woodruffw · 2024-08-25T04:32:22 1724560342

Nice writeup! One limitation to this approach is that the fault injections happen at the dynamic linkage to libc layer, meaning that an enterprising student who either statically links their binary or invokes syscalls directly will circumvent the interposed functions. But in a teaching setting I could imagine this isn’t a practical concern :-)
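To illustrate the "invokes syscalls directly" case, here is a small Linux-only sketch that asks the kernel for the process ID without calling libc's `getpid()`, so a shim that interposes only `getpid()` would never see it. The function name `raw_getpid` is made up for this example. Note that `syscall()` is itself a libc function, so a shim could in principle interpose that too; inline assembly would sidestep even that.

```c
#define _GNU_SOURCE          /* for the syscall() declaration */
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for the PID without going through libc's getpid(). */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

The same pattern applies to calls such as clone or write, which is why tools that work at the syscall layer (ptrace, seccomp) catch cases that LD_PRELOAD misses.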

(I built a similar tool[1] a few years ago, but at the syscall layer to ensure that statically linked binaries could also have faults injected into them reliably. My colleagues used it to find a handful of bugs on prominent Go codebases.)

[1]: https://blog.trailofbits.com/2019/01/17/how-to-write-a-rootk...

jcalvinowens · 2024-08-25T06:00:45 1724565645

You can use pthread_once() to simplify the initialization part: https://man.archlinux.org/man/pthread_once.3.en

I don't understand the desire not to link to pthread, it's about as ubiquitous as a library can be.

I doubt it's really a problem in this application... but naive userspace spinlocks are absolutely horrendous, see NOTES here: https://man.archlinux.org/man/pthread_spin_init.3.en

  User-space spin locks [...] are, by definition, prone to priority inversion and unbounded spin times. A programmer using spin locks must be exceptionally careful not only in the code, but also in terms of system configuration, thread placement, and priority assignment.
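A rough sketch of the pthread_once() suggestion, not the article's actual code: the setup function runs exactly once no matter how many threads or calls reach it first, with no hand-rolled spinlock. The names `init_config`, `get_fail_percent`, and the `FAIL_PERCENT` environment variable are invented for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_once_t init_once = PTHREAD_ONCE_INIT;
static int init_runs;       /* how many times setup actually ran */
static int fail_percent;    /* hypothetical configuration value */

/* One-time setup; pthread_once() guarantees a single execution. */
static void init_config(void)
{
    const char *s = getenv("FAIL_PERCENT");   /* assumed variable name */

    fail_percent = s ? atoi(s) : 0;
    init_runs++;
}

/* Any interposed function can call this first; later calls are cheap. */
int get_fail_percent(void)
{
    pthread_once(&init_once, init_config);
    return fail_percent;
}
```

On older glibc versions this needs `-pthread` at link time; newer ones ship the pthread functions inside libc itself.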

ashvardanian · 2024-08-24T23:30:53 1724542253

Very cool! I started a similar project around `LD_PRELOAD` a few months ago to profile the time different programs spend on LibC calls. Provoking failures was the next step :)

Logging nicely was also an issue. I decided to avoid linking to any other symbols and implemented it with inline Assembly for x86/64 and aarch64: https://github.com/ashvardanian/LibSee/blob/fdae92e71c449c91...

fuhsnn · 2024-08-25T00:50:30 1724547030

Perl, at build time, gets the system's errno numbers in a similar way[0]: it preprocesses errno.h with `$CC -E` and recursively scans all files named in # markers for macro defines.

The configure script even checks the existence of several system headers this way, so if your C compiler doesn't support # markers in -E output, you get missing includes everywhere.

[0] https://github.com/Perl/perl5/blob/blead/ext/Errno/Errno_pm....

jrpelkonen · 2024-08-25T00:11:17 1724544677

Great article. Although I’ve been a Linux user since the time when a stack of Slackware floppies was the prevailing installation media, I just recently learned that libc.so is also an executable.

kstrauser · 2024-08-25T04:04:21 1724558661

Fun trivia: so is /lib/ld-*.so*. I busted that out in an interview at a FAANG when the question was "how could you recover from accidentally running 'chmod a-x /bin/*'"? My answer: '/lib/ld.so /bin/chmod a+x /bin/*'. The interviewer paused to get out his laptop and confirm it because he had never heard of it. After a fun detour of geeking out over something new and interesting, the followup question was modified to "How else would you do it?"

It's spelled "/lib/ld-linux-aarch64.so.1" on my nearest Linux box but is still executable today.

mandarax8 · 2024-08-25T10:53:40 1724583220

What was the solution he was looking for?

t-3 · 2024-08-25T13:06:32 1724591192

In the unlikely case that you actually have a static sbin and not just a symlink you could hack together a one-liner using file or objdump to check headers and set correct perms... but considering that everything in the dir should be either an executable or a directory, chmod a+x would work just as well.

If /usr/bin isn't a symlink to /bin or vice-versa, then you should have tools there to do the same thing.

If you somehow still have a working C compiler (or access to another language that can do syscalls or has C bindings), it's pretty easy to write a wrapper for the syscall.
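As a sketch of the compiler route, assuming a working C toolchain is still reachable: a tiny helper that restores the execute bits through libc's chmod() wrapper, with no need for /bin/chmod. The function name `add_exec_bits` is invented for this example.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Add execute permission for user, group, and other to one path,
 * leaving the remaining permission bits untouched.
 * Returns 0 on success and -1 on failure (errno is set). */
int add_exec_bits(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        perror(path);
        return -1;
    }
    return chmod(path, (st.st_mode & 07777) | S_IXUSR | S_IXGRP | S_IXOTH);
}
```

Called from a short main() on /bin/chmod alone, this is enough to get the regular tool back and fix the rest of the directory with it.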

If there's an rsync daemon, nfs share, etc. running, you can copy over a static busybox and fix the system that way.

If you're allowed to take the system down, it's really easy - just boot up a live image and change the permissions.

Filligree · 2024-08-26T15:31:21 1724686281

> If you're allowed to take the system down, it's really easy - just boot up a live image and change the permissions.

Having held a couple interviews like this, if you suggest taking the system down, I'll tell you that would work -- now tell me another way to do it.

It's still a plus that you suggested it, even if it isn't enough of an answer on its own.

t-3 · 2024-08-26T18:19:03 1724696343

I thought of another thing: many desktop systems have automount daemons, so USB could also be used to transfer over a busybox in that case.

Filligree · 2024-08-25T12:26:30 1724588790

There’s a couple. One would be to run a Python interpreter and fix it with that.

zbentley · 2024-08-25T14:06:54 1724594814

How would you launch the python interpreter if its binary is missing execute permissions?

kstrauser · 2024-08-25T15:17:13 1724599033

It’s traditionally been in /usr/bin.

t-3 · 2024-08-25T20:07:18 1724616438

On quite a few systems I've seen /usr/bin is a symlink to /bin or /bin is a symlink to /usr/bin (and /sbin is a symlink to /bin).

kstrauser · 2024-08-25T20:35:33 1724618133

That’s a relatively new phenomenon, and not uncontroversial.

cxr · 2024-08-26T13:44:30 1724679870

It's possible to achieve a similar effect in the browser. The <script> element doesn't have any Content-Type restrictions; it won't reject something outright solely on the basis you serve it something other than application/javascript. So if you managed to point it at an HTML file that's also valid JS (and you can manage to do that, since the set of all possible HTML inputs and the set of all possible JS programs aren't disjoint)... You can use components and libraries in your app that are pages/programs unto themselves, and in particular ones that are self-describing.

Imagine having in front of you a webpage that doesn't work the way you want. You use View Source in your browser, find the URL of some component written in JS that probably contains the offending logic, navigate to that URL, and the source code for that library unfolds on your screen and is free to do whatever it wants to best present itself to you for your understanding. This could include doing syntax highlighting for you (rather than the default black-on-white text that most browsers use when showing you a JS file), putting helper widgets on the screen (e.g. object/configuration inspectors), other on-screen editor components e.g. for any DSLs used in the file, base64-encoded data, etc.

Or imagine a file that knows that it was compiled from TypeScript and knows where the TypeScript source that it was compiled from lives. When you open the URL for the compiled form in your browser, when it loads it dynamically fetches the original .ts sources from elsewhere on the server and then puts that on the screen—maybe even going so far as to put it inside an editor that forms a TypeScript playground.

jagged-chisel · 2024-08-25T00:39:21 1724546361

In what sense do you mean “executable?” Does it include a main() and can you launch it? Or do you mean because its code is invoked before your own executable’s code?

olalonde · 2024-08-25T00:57:12 1724547432

It's in the article, it includes a main():

    icterid$ /usr/lib/x86_64-linux-gnu/libc.so.6
    GNU C Library (Debian GLIBC 2.36-9+deb12u7) stable release version 2.36.
    Copyright (C) 2022 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.
    There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
    PARTICULAR PURPOSE.
    Compiled by GNU CC version 12.2.0.
    libc ABIs: UNIQUE IFUNC ABSOLUTE
    Minimum supported kernel: 3.2.0
    For bug reporting instructions, please see:
    <http://www.debian.org/Bugs/>.

rzzzt · 2024-08-25T08:34:48 1724574888

Both the Makefile and the source file were linked from the article; ld's "-e" flag is used to set a custom entry point to "__libc_main".

rapidlua · 2024-08-25T09:22:31 1724577751

It was fun to read, but it would’ve probably been easier to rely on seccomp filters instead.

petters · 2024-08-25T08:16:03 1724573763

> and parsing it out of the man pages is not something I’d like to imagine doing reliably. So I must satisfy myself by manually writing these facts down. And this turns out to be the bottleneck of the entire operation.

You can probably use an LLM for this.

lionkor · 2024-08-25T08:45:11 1724575511

If you do, you must unit-test the LLM stage. How do you do that without wasting a lot of time and resources? If the unit tests run through a few thousand times, would you bet your life on it never failing? I would if it was any other code.

samatman · 2024-08-25T14:24:10 1724595850

Not necessarily.

I find LLMs very helpful when the task is annoyingly underdefined / understructured, but the result I want is easy to eyeball-audit.

This seems like one of those. Boiling down manpages to a consistent structure which a program can consume is going to involve a lot of special-casing a script, because they aren't written to be scraped like that.

But opening the result in one window, then loading the manpages one at a time in the other, and sanity-checking the contents, is less effort than manually copy-pasting everything and getting it into a consistent data format by hand.

Feeding the result of an LLM-grep sight-unseen into another program is an insane thing to do, of course. But using it like the above could save a lot of time.

supriyo-biswas · 2024-08-25T14:06:42 1724594802

From what I understand, it would have been far easier to use a seccomp filter instead. Would have worked with statically linked binaries too.

Joker_vD · 2024-08-25T03:26:16 1724556376

> (cannot dynamically load position-independent executable)

...why though? I mean, it's position-independent, just load and relocate it wherever? Or does "PIE" mean something different in Linux from what it does in Windows?

pkal · 2024-08-25T07:17:28 1724570248

I am not sure what the rationale was (probably security), but this was a conscious upstream change: https://bugzilla.redhat.com/show_bug.cgi?id=1764223

d110af5ccf · 2024-08-25T08:09:49 1724573389

The upstream bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24323

An example of a problem provided in a different bug: https://sourceware.org/bugzilla/show_bug.cgi?id=11754#c15

But I don't really understand what's going on in that example.

Aside, it only took 9 years for that patch to get conclusively rejected.