ELF101 a Linux executable walkthrough (code.google.com)
102 points by ange4771_ on Nov 20, 2013 | 20 comments



A good intro but doesn't go in depth enough, and it glosses over the interface between user and kernel mode.

Particularly, in modern Linux binaries, syscalls are not hardcoded as int instructions, but are dispatched through the vDSO ("virtual dynamic shared object") mechanism, which picks the best instruction for the current CPU (int 0x80, or syscall/sysenter).
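The virtual DSO (vDSO) mentioned above is itself a small ELF image that the kernel maps into each process. As a rough sketch, assuming Linux with a libc that provides getauxval() (glibc 2.16 or later, or musl), a process can look for it through the auxiliary vector. The helper name vdso_is_elf is made up for this illustration.

```c
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>

/* Hypothetical helper: returns 1 if the kernel mapped a vDSO into this
 * process and the mapping starts with the ELF magic bytes, 0 otherwise. */
int vdso_is_elf(void)
{
    /* AT_SYSINFO_EHDR holds the address of the vDSO's ELF header,
     * or 0 when no vDSO is present. */
    unsigned long base = getauxval(AT_SYSINFO_EHDR);

    if (base == 0)
        return 0;
    return memcmp((const void *)base, ELFMAG, SELFMAG) == 0;
}
```

Whether a vDSO is present depends on the kernel and the environment, so a result of 0 is possible under some emulators and sandboxes.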

Also, in practice, the ELF would be dynamically linking to libc, and libc would be making the calls.
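To make the libc point concrete, here is a small sketch, assuming Linux and a libc that exposes syscall(2). It contrasts the usual libc wrapper with the generic syscall() entry point, which takes the syscall number explicitly. Both function names are invented for the example. Either way the program itself contains no int 0x80; the instruction that enters the kernel lives inside libc.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: goes through the libc write() wrapper, which a
 * dynamically linked binary reaches via the PLT. */
ssize_t write_via_libc(int fd, const char *msg)
{
    return write(fd, msg, strlen(msg));
}

/* Hypothetical helper: goes through syscall(2), passing the syscall
 * number explicitly. libc still issues the actual kernel entry. */
ssize_t write_via_syscall(int fd, const char *msg)
{
    return (ssize_t)syscall(SYS_write, fd, msg, strlen(msg));
}
```

Calling write_via_libc(STDOUT_FILENO, "hi\n") and write_via_syscall(STDOUT_FILENO, "hi\n") should each write the same three bytes.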


An introduction on a single page can't go deep!

"This is the whole file, however, most ELF files contain many more elements. Explanations are simplified, for conciseness."


It also glosses over relocation mechanisms. I was curious if relocation info is stored in a section (like .reloc in Windows).
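On the relocation question: ELF does keep relocation entries in sections, of type SHT_REL or SHT_RELA (typically named .rel.* or .rela.*). The sketch below counts such sections in a file. It assumes Linux with glibc's <link.h>, a file of the machine's native ELF class and byte order, and intact section headers; count_reloc_sections is a hypothetical name.

```c
#include <elf.h>
#include <fcntl.h>
#include <link.h>      /* ElfW() picks Elf32_* or Elf64_* for this machine */
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Counts section headers of type SHT_REL or SHT_RELA.
 * Returns -1 if the file cannot be read or is not a native-class ELF. */
int count_reloc_sections(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || (size_t)st.st_size < sizeof(ElfW(Ehdr))) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (map == MAP_FAILED)
        return -1;

    const ElfW(Ehdr) *eh = map;
    int count = -1;

    /* Check the magic, the ELF class, and that the section header table
     * lies inside the file before walking it. */
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
        eh->e_ident[EI_CLASS] == (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32) &&
        eh->e_shentsize == sizeof(ElfW(Shdr)) &&
        eh->e_shoff + (size_t)eh->e_shnum * sizeof(ElfW(Shdr)) <= (size_t)st.st_size) {
        const ElfW(Shdr) *sh =
            (const ElfW(Shdr) *)((const char *)map + eh->e_shoff);
        count = 0;
        for (int i = 0; i < eh->e_shnum; i++)
            if (sh[i].sh_type == SHT_REL || sh[i].sh_type == SHT_RELA)
                count++;
    }

    munmap(map, st.st_size);
    return count;
}
```

For example, count_reloc_sections("/proc/self/exe") inspects the running binary. Running readelf -r on the same file lists the individual entries.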


I think the point of it was not to explain how ELF works but to explain the file format itself.


United857 is right. This totally gives a false impression that regular C code I write will use syscalls directly instead of functions that exist in libc and are dynamically mapped in using the PLT. My C code will never say int 0x80.

Otherwise this is quite good. It helps people make the connection between Hello World in .rodata and how it gets used.


I just verified you are right: in most cases it won't use int 0x80.

Sample C code where you might expect to see int 0x80:

    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        char *name[2];

        name[0] = "/bin/sh";
        name[1] = NULL;
        execve(name[0], name, NULL);
        exit(0);
    }

When you build it and then use objdump -D, the relevant code is as follows:

    8048419: 8d 54 24 18          lea    0x18(%esp),%edx
    804841d: 89 54 24 04          mov    %edx,0x4(%esp)
    8048421: 89 04 24             mov    %eax,(%esp)
    8048424: e8 eb fe ff ff       call   8048314 <execve@plt>
    8048429: c7 04 24 00 00 00 00 movl   $0x0,(%esp)

But if you use something that executes a payload, it uses int 0x80.

Sample C code:

    #include <unistd.h>

    char shellcode[] =
        "\x31\xc0\xb0\x46\x31\xdb\x31\xc9\xcd\x80\xeb\x16"
        "\x5b\x31\xc0\x88\x43\x07\x89\x5b\x08\x89\x43\x0c"
        "\xb0\x0b\x8d\x4b\x08\x8d\x53\x0c\xcd\x80\xe8\xe5"
        "\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x4e\x41"
        "\x41\x41\x41\x42\x42\x42\x42";

    int main()
    {
        int (*func)();

        /* Treat the byte array as a function and jump into it. */
        func = (int (*)()) shellcode;
        (int)(*func)();
    }

objdump -D will give you this:

    0804a040 <shellcode>:
     804a040: 31 c0          xor    %eax,%eax
     804a042: b0 46          mov    $0x46,%al
     804a044: 31 db          xor    %ebx,%ebx
     804a046: 31 c9          xor    %ecx,%ecx
     804a048: cd 80          int    $0x80
     804a04a: eb 16          jmp    804a062 <shellcode+0x22>
     804a04c: 5b             pop    %ebx
     804a04d: 31 c0          xor    %eax,%eax
     804a04f: 88 43 07       mov    %al,0x7(%ebx)
     804a052: 89 5b 08       mov    %ebx,0x8(%ebx)
     804a055: 89 43 0c       mov    %eax,0xc(%ebx)
     804a058: b0 0b          mov    $0xb,%al
     804a05a: 8d 4b 08       lea    0x8(%ebx),%ecx
     804a05d: 8d 53 0c       lea    0xc(%ebx),%edx
     804a060: cd 80          int    $0x80
     804a062: e8 e5 ff ff ff call   804a04c <shellcode+0xc>

I guess you can force it, but in general the compiler will replace it with calls to libs.


Yeah, you can force it if you call shellcode as a function, but other than that, libc functions, if the ELF is dynamically linked, will be mapped in using the procedure linkage table before main runs.

but...this is because printf != the write syscall


Can you use pastebin or a github gist for the code?


Very nice, too bad there are a few typos. For example, the "write" function arguments are not ordered correctly and it's using the wrong line terminator. It should be:

    write(STDOUT, "Hello World!\n", len("Hello World!\n"));


Uh, if that's supposed to be C, then there's no STDOUT and the way to compute the length of a string is with strlen(), not len(). It should be STDOUT_FILENO as pointed out below.
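Putting the two corrections together (STDOUT_FILENO from <unistd.h> and strlen() from <string.h>), a valid C version of the call might look like the sketch below. The wrapper function say_hello is only there to keep the example self-contained and is not part of the original poster.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical wrapper: writes the greeting to fd and returns
 * whatever write() returns. */
ssize_t say_hello(int fd)
{
    const char *msg = "Hello World!\n";

    return write(fd, msg, strlen(msg));
}
```

Calling say_hello(STDOUT_FILENO) prints the greeting to standard output.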


Thanks, will fix it. Any other typos?


In the code example, the hexdump starts with B9 90 00 00 08, but the disassembly lists the operand as 80 00 00 90. I don't think endianness can change 08 to 80.
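For reference, little-endian order reverses whole bytes, never the digits inside a byte. A minimal sketch (read_le32 is an illustrative helper, not something from the poster) that assembles the 32-bit immediate following the B9 opcode byte (mov ecx, imm32):

```c
#include <stdint.h>

/* Assembles a 32-bit value from four consecutive little-endian bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```

Applied to the bytes 90 00 00 08 that follow B9, this yields 0x08000090, so the operand should read 08 00 00 90 rather than 80 00 00 90.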


not even German endianness ? ;)

thanks for pointing this out.


there is no such predefined thing as STDOUT.

you are probably thinking of STDOUT_FILENO. can you just put "1" there?

also, reordering the "mov"s (so they appear in actual call order: edx, ecx, ebx) will simplify the diagram somewhat.


FYI, it's now available in a professional-looking version http://i.imgur.com/m6kL4Lv.png and as a booklet https://speakerdeck.com/ange/elf101-a-linux-executable-walkt... .


What does the number 101 refer to? It doesn't seem to be a version number. The diagram nicely expands ELF to Executable and Linkable Format but no explanation of the number.



That mouse-over trick is really annoying.


Edit: Apologies, it appears to be my browser doing something unexpected. I haven't used this computer in a while.


where?



