Great work. So that's one more disassembler in the wild. Some of the great open-source, reversing-oriented ones that I have been able to test are:
Metasm: https://github.com/jjyg/metasm/
Radare: http://radare.org/
Capstone: http://www.capstone-engine.org/
Capstone is based on LLVM, which you cannot beat in terms of architectures and industrial quality, and it has great plugins, which makes it kind of my favorite.
Of course, for non-scripting needs, you need decent graphical interfaces, as provided by these:
IDA: https://www.hex-rays.com/products/ida/
Hopper: http://www.hopperapp.com/
IDA is clearly the best, but Hopper is a fair choice if you need one and can't afford an IDA license for personal use.
[Edit: added forgotten IDA and Hopper...]
FYI you most definitely can beat LLVM in terms of quality. Its disassembler was originally designed only to support the instructions that the compiler could generate. While they've since filled out most of the instruction set, there is still a whole set of subtleties it can't deal with well, simply because the compiler side of it was never designed to output them (e.g. see segment prefixes).
If you take a look at, e.g., McSema https://github.com/trailofbits/mcsema, these guys have been able to use LLVM to transform assembly back into LLVM intermediate representation (IR). The IR is an abstract language used by LLVM during compilation to optimize the generated code. Being abstract basically allows the same optimizers to be used across all LLVM-supported architectures.
On that front, I think LLVM has a clear advantage over its competitors.
Hi, I'm one of the mcsema authors, and the one originally responsible for mcsema using the LLVM instruction decoder. It was a mistake. If I had a choice between going back in time and assassinating Hitler, or convincing myself to use XED instead of MCInst...
What you did with McSema is really impressive, and I'm glad to be discussing it with one of the authors! I get that you regret using LLVM; it seemed to me to be the best choice, since LLVM has broad instruction-semantics implementations that more basic tools such as XED do not offer.
If I understand correctly… you would rather re-implement such semantics yourself? Or do you have some other tool / idea I'm not aware of?
I don't regret using LLVM at all; I regret using the instruction-decoding features in LLVM. If I could do it again (and someone younger, smarter, and better looking than me is), I would combine the LLVM IR with the XED instruction decoder: we would still emit the semantics of the instructions as LLVM, but we would use XED to figure out which instruction we were decoding.
A bit late to show up, and surely off-topic, but as alternative links were already posted, here's another one: ScratchABit https://github.com/pfalcon/ScratchABit . Features: it's written in a well-known interpreted language (Python), so hacking on ScratchABit is actually easy, and it implements a subset of the IDAPython API, so existing community modules for various processor support, etc. can be reused.
Been using it for a while and like it a lot; it has a great interface for reversing and diffing Windows binaries. Will be keeping an eye on Panopticon, looks promising :)
Let's assume you have reverse engineered some kind of boolean check and you now want to patch it to always return true or false. What does that process look like at a high level?
Not the platform you're asking about, but in Binary Ninja, you right-click on the conditional jump and choose "Always Branch", "Never Branch", or "Invert Branch".
Binary Ninja also supports inline editing of assembly and a custom compiler for dropping simple C replacements directly on top of existing functionality.
As long as all the instructions are the same size (or smaller, padded with no-operation instructions), then yes. If, however, you do change the size of the application, all relocation deltas need to be changed, and all relative jumps and calls need to be recalculated.
There are sometimes tricks that get you around this problem, too: you can sometimes patch in a trampoline, which gives you some flexibility in the instructions you get to use.
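To make the simple same-size case concrete, here's a rough Python sketch (the file name and offset are made up; you'd pull the real file offset of the conditional jump out of your disassembler first, since file offsets and virtual addresses differ):

    # Flip a 2-byte conditional jump (jz rel8, opcode 0x74) into an
    # unconditional jmp rel8 (0xEB) in place: same size, so nothing else moves.
    OFFSET = 0x1234  # hypothetical file offset of the jz instruction

    with open("target.bin", "rb") as f:
        data = bytearray(f.read())

    assert data[OFFSET] == 0x74, "expected a short jz here; check the offset"
    data[OFFSET] = 0xEB  # always branch
    # To never branch instead, overwrite both bytes with NOPs: 0x90 0x90.

    with open("target.patched.bin", "wb") as f:
        f.write(data)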
An alternate direction, rather than your standard jmp/displacement/nop'ing-to-align, would be to use Dyninst[1] and live-patch in memory[2]. Really though, your standard hex editor will have facilities to alter all the necessary relative offsets. If you have access to modify the binary, there's no need to put in a trampoline (though it's awfully handy when playing CTFs!)
Honestly, do we really need another static analysis tool? Hopper and radare2 have covered the open-source gap fairly well. I'd put r2 at near power-parity with IDA if you invest the time to learn and configure it, which admittedly is an expensive proposition for someone who already knows IDA: it'll cost more days of their salary to learn a new platform than to just pay the 5k and get them an IDA/HexRays license.
>Honestly, do we really need another static analysis tool?
Definitely. IDA Pro is proprietary software and the possibilities of IDAPython are limited. IDA Pro mostly ignores the last two decades of research done in the field of binary program analysis: it still relies on pattern-matching compiler output instead of using semantics-driven methods that have been around for >10 years. Tools like BAP, BitBlaze, Jakstab and Bindead do exist, but they are not really usable for people w/o a graduate-student-level understanding of program analysis. This is where Panopticon fits in.
I was trying to edit one byte in an ELF, with no change in file size, and it kept crashing. I read that each section of code is hashed, and obviously my byte edit changed the hash. I was pretty out of my depth, tbh.
Sometimes code will add additional checks such as hashes to verify that parts of the text section haven't been modified. Should be able to remove those checks, too. Just have to track them all down.
To detect hashes like that, use a debugger that supports memory breakpoints and set a read breakpoint on the instruction you changed. That usually makes it a lot easier to identify where the checksum is calculated.
Yes you can. An executable (using ELFs as an example, most formats are similar) is nothing more than some headers and bytes. The executable code can be modified in any way you please, as long as you're writing valid opcodes (the program will crash if it hits a bad opcode).
That said, it's usually much more complicated to change the size of the executable section (this requires modifying the headers and this tends to be a rather involved process), so usually if people are doing binary patches they are only modifying bytes, not adding or removing them.
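As a quick illustration that it really is just headers plus bytes, here's a small sketch that reads the fixed-size ELF64 header with nothing but Python's struct module (it assumes a little-endian 64-bit ELF, e.g. /bin/true on an x86-64 Linux box):

    import struct

    with open("/bin/true", "rb") as f:
        hdr = f.read(64)                 # the ELF64 header is exactly 64 bytes

    assert hdr[:4] == b"\x7fELF", "not an ELF file"
    # Skip the 16-byte e_ident and unpack the remaining header fields.
    (e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
     e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx) = \
        struct.unpack_from("<HHIQQQIHHHHHH", hdr, 16)

    print(f"entry point     : {e_entry:#x}")
    print(f"program headers : {e_phnum} x {e_phentsize} bytes at {e_phoff:#x}")
    print(f"section headers : {e_shnum} x {e_shentsize} bytes at {e_shoff:#x}")

These are exactly the kinds of fields (offsets, entry counts) that have to stay consistent if you grow or shrink anything, which is why in-place byte patches are so much simpler.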
The code that checks the signature. If you can identify all the places that check the signature and disable that code, then it no longer matters if you are unable to correctly update the signature itself.
I was talking about changing the signature/checksum to match the new code.
Corecoder pointed out that sometimes you just need to patch the checksum checking code, and not the checksum itself.
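If it turns out to be a plain checksum rather than a real cryptographic signature, recomputing it after the patch can also be an option. A rough sketch, assuming a purely hypothetical layout where a CRC32 of one region is stored little-endian at a fixed offset (real schemes vary, so you'd have to reverse the exact algorithm and location first):

    import zlib

    REGION_START, REGION_END = 0x1000, 0x5000  # hypothetical checksummed range
    CHECKSUM_OFFSET = 0x5000                   # hypothetical location of the stored CRC32

    with open("target.patched.bin", "rb") as f:
        data = bytearray(f.read())

    # Recompute the checksum over the (now patched) region and write it back.
    crc = zlib.crc32(data[REGION_START:REGION_END]) & 0xFFFFFFFF
    data[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 4] = crc.to_bytes(4, "little")

    with open("target.fixed.bin", "wb") as f:
        f.write(data)

With a signed hash (code signing), that's not possible without the private key, which is why disabling the check itself is usually the practical route.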
Yes, there's no recompiling that can be done anyway if you only have the binary and no source code. Writing the new machine code would overwrite some code at the start of the function you're modifying, but that doesn't matter if you just want the function to return true or false.
You could edit the binary manually with a hex editor, but some disassemblers like Hopper have a feature where you can type new instructions in assembly and it will assemble and insert them for you. I'm sure IDA Pro has something like that as well.
IDA Pro as of 6.9 wasn't designed to act as a hex editor, so it's not the "ideal" choice, but there are tons of scripts[1] that people use[2] to craft it into whatever you want. Likewise, it wasn't really a dynamic analysis tool, but the healthy ecosystem makes it feel sorta-kinda powerful with the proper tooling + WinDBG. I'm using a fairly old setup (old dog, new tricks and all that; I stuck with SoftICE as long as I could), so there are likely better solutions out there.
[1] https://github.com/iphelix/ida-patcher/blob/master/idapatche... is what I have in my scripts dir, but I'm sure there are dozens of others out there. That specific Python script has the added benefit of being really approachable for the average user. Check Woodmann or Github or wherever people post their scripts these days if it doesn't meet your needs.
[2] IDA's basically turned into Emacs, where the real power comes from all of the tooling you can configure into it. A stock 6.9 + HexRays license is worth it just for the free tooling you can find.
Hey I'm Kai Michaelis, I'm in an IT-Security Masters program in Bochum, Germany and work part-time for people who use the term Cybersecurity unironically.
Yeah, it's really interesting how many iterator invalidation bugs you find in real C++ codebases. It's so easy to think that they don't exist until you have a tool (or a human) actively searching for them…
Those do not remove any bugs; they just change how the bugs are handled (ignore and hope for the best vs. print a diagnostic and terminate the program, and the latter not in all cases).
I don't know of any large, widely used, security critical, heavily scrutinized C++ program that uses those flags in production. I also don't know of any such C++ program that hasn't had game-over use after free bugs. So the answer to your question is somewhere ranging from "we don't know, but unlikely" to "no".
A big part of what makes Rust great is that it catches those issues at compile time: no test coverage required, no extra run-time checks required (not that it makes tests moot, of course). In my experience so far this is a very powerful ally to have when writing software, as it's very useful to surface those issues as early as possible.
EDIT: Also:
> Assuming that tests exist, and have good coverage
That's a very big assumption. Granted my current job is often to clean up codebases that are the opposite of that, so maybe I've got a bias here.
For some definition of ‘good coverage’ and ‘most’: yes.
I do not think 100% coverage on the code you write alone is enough to get to 'all', though. For example, if your test declares a vector and calls a function that does

    v.erase(i, v.end());

with i == v.end(), you get 100% coverage, but in production, that part of the code might be called with i == v.begin(), and that invalidates all iterators on the vector.
"Panopticon is under heavy development and its feature set still very basic. It's not yet able replace programs like radare2 and IDA Pro. We are working on it, check back regularly!"
This has almost none of the important features of IDA Pro:
- advanced interactivity (function boundary change, switch table options...)
- stack pointer tracking and stack recovery
- structure and enum definition, use in disassembly
- compiler and static library recognition
- an industry-standard decompilation plugin (Hex-Rays Decompiler)
- support for quick scripting using a high-level interpreter language (IDC/IDAPython)
My friend's grandfather was an economist. When he was a little boy he saw a $20 bill on the sidewalk on a walk with his grandfather. He pointed to it and said, "Grandpa, look, someone dropped a twenty dollar bill! We will be rich!" His grandfather dismissed his enthusiasm and explained that if there was free money on the ground, somebody would have picked it up by now.
On a side note, the visual call graph is interesting. It'd be nice for other IDEs to implement something similar, with a historical listing of input/output values for the corresponding displayed methods.
"Panopticon is a disassembler that understands the semantics of opcodes. This way it's able to help the user by discovering and displaying invariants that would have to be discovered "by hand" in traditional disassemblers."
Doesn't every disassembler have to understand the opcode semantics in order to disassemble and make sense of them, or am I misinterpreting that statement?
Also, can anyone explain what the "invariants" are? When I hear the word I can only think of loop invariants, and I'm guessing that is not what the author means here.
You say "in order to disassemble and make sense of them", but the "sense" that a disassembler makes can just be the direct conversion from machine code to equivalent assembley. Machine code is just another syntax, directly converting that into assembly can happen instruction-by-instruction.
Making more sensible assembly that a human would write is another thing, if you want to follow jumps to figure out where instructions are and regenerate labels, then that's one step towards semantics.
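For instance, Capstone's Python bindings do exactly this kind of purely syntactic, instruction-by-instruction conversion; the byte string below is just a toy x86-64 fragment:

    from capstone import Cs, CS_ARCH_X86, CS_MODE_64

    code = b"\x55\x48\x89\xe5\x31\xc0\x5d\xc3"  # push rbp; mov rbp, rsp; xor eax, eax; pop rbp; ret
    md = Cs(CS_ARCH_X86, CS_MODE_64)
    for insn in md.disasm(code, 0x1000):        # 0x1000 is an arbitrary base address
        print(f"0x{insn.address:x}:\t{insn.mnemonic}\t{insn.op_str}")

No control flow is followed and no labels are regenerated; each instruction is decoded on its own, which is all that "making sense" has to mean at this level.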
I don't know what kinds of invariants panopticon is looking at, but there seems to be code for static analysis using abstract interpretation[1]. This could do the data-flow analysis, determine types and recognize persistent variables.
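As a toy example of what such an invariant can look like: a constant-propagation pass over a made-up three-address code (this is not Panopticon's IL, just an illustration) can prove, without running anything, that a register always holds a particular value at a given point:

    # Abstract state: register -> known constant, or TOP ("could be anything").
    TOP = object()

    def transfer(state, insn):
        """Apply the abstract effect of one toy instruction to the state."""
        op, dst, *src = insn
        if op == "const":                 # dst := literal
            state[dst] = src[0]
        elif op == "add":                 # dst := src0 + src1
            a, b = (state.get(s, TOP) for s in src)
            state[dst] = a + b if TOP not in (a, b) else TOP
        elif op == "load":                # dst := memory, unknown statically
            state[dst] = TOP
        return state

    program = [
        ("const", "r0", 3),
        ("const", "r1", 5),
        ("add",   "r2", "r0", "r1"),
        ("load",  "r3"),
        ("add",   "r4", "r2", "r3"),
    ]

    state = {}
    for insn in program:
        state = transfer(state, insn)

    # The analysis has discovered the invariant "r2 == 8" statically,
    # while r3 and r4 stay unknown.
    print({r: ("?" if v is TOP else v) for r, v in state.items()})

A real analysis does this over all paths with richer domains (intervals, types, pointer targets), which is roughly what the abstract-interpretation code referenced above is for.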
I noticed that this only supports AVR. What is your plan to support x86? Would you think about linking against some external code that provides semantics for x86?
How do you find adding new architectures to it? I've been meaning to add some architectures I want to Radare for a while now, but it's a bit daunting due to lack of docs and the build system is argh. And I've also been wanting to learn Rust, so this seems like an obvious fit for me...
Panopticon includes a simple disassembler framework. The basic idea is that you provide opcode patterns, akin to regexps, and write a function that emits mnemonics and code describing the opcode semantics. There isn't much documentation yet, but you can take a look at the AVR and MOS 6502 disassemblers in lib/src/avr and lib/src/mos, respectively.
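The details below are not Panopticon's actual API, just a rough Python sketch of the pattern-plus-callback idea, using a couple of AVR encodings as the patterns:

    import re

    # Each entry pairs a 16-bit opcode pattern ('.' marks operand bits, like a
    # regexp wildcard) with a callback that emits the mnemonic.
    TABLE = [
        ("000011..........", lambda bits: ("add", bits)),   # ADD Rd, Rr
        ("000111..........", lambda bits: ("adc", bits)),   # ADC Rd, Rr
        ("0000000000000000", lambda bits: ("nop", bits)),
    ]

    def decode(word):
        bits = f"{word:016b}"
        for pattern, emit in TABLE:
            if re.fullmatch(pattern, bits):
                return emit(bits)
        return ("unknown", bits)

    print(decode(0x0000))                  # ('nop', '0000000000000000')
    print(decode(0b0001110000000001))      # ('adc', '0001110000000001')

In the real framework the callback also emits the code describing what the instruction does, which is what makes the later analyses possible.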