Panopticon: A libre, cross platform disassembler for reverse engineering (panopticon.re)
353 points by adamnemecek on May 9, 2016 | 83 comments



Great work. That's one more disassembler in the wild. Some of the great open source, reversing-oriented ones that I have been able to test are:

Metasm: https://github.com/jjyg/metasm/

Radare: http://radare.org/

Capstone: http://www.capstone-engine.org/

Capstone is based on LLVM, which you cannot beat in terms of architecture coverage and industrial quality, and it has great plugins, which makes it kind of my favorite.

Of course, for non-scripting needs, you need decent graphical interfaces, as provided by these ones:

IDA: https://www.hex-rays.com/products/ida/

Hopper: http://www.hopperapp.com/

IDA is clearly the best, but Hopper is a fair choice if you need a GUI and can't afford an IDA license for personal use.

[Edit: add forgotten IDA and Hopper...]


FYI you most definitely can beat LLVM in terms of quality. Its disassembler was originally designed to support only the instructions that the compiler could generate. While they've since filled out most of the instruction set, there is still a whole set of subtleties it can't deal with well, simply because the compiler part of it was never designed to output them (e.g. see segment prefixes).


It depends on what you are looking to achieve.

If you take a look at, e.g., McSema (https://github.com/trailofbits/mcsema), these guys have been able to use LLVM to transform assembly back into LLVM intermediate representation (IR). The IR is an abstract language used by LLVM during compilation to optimize the generated code. Being abstract basically allows the optimizers to be used across all LLVM-supported architectures.

In these terms, I think LLVM has a clear advantage over competitors.

The following blog post gives a nice overview of what McSema has achieved: http://blog.trailofbits.com/2014/06/23/a-preview-of-mcsema/


hi, I'm one of the mcsema authors and the original author who was responsible for mcsema using the LLVM instruction decoder and it was a mistake. if I had a choice between going back in time and assassinating Hitler, or convincing myself to use XED instead of mcinst...


Hi Munin,

What you did with McSema was really impressive and I’m glad to be discussing it with one of the authors! I get that you regret using LLVM; it seemed to me to be the best choice, since LLVM has broad implementations of instruction semantics that more basic tools such as XED do not offer. If I understand correctly… would you rather re-implement those semantics yourself? Or do you have another tool or idea I’m not aware of?


I don't regret using LLVM at all. I regret using the instruction decoding features in LLVM. if I could do it again (and someone younger, smarter, and better looking than me is) I would combine the LLVM IR with the XED instruction decoder. we would still emit the semantics of the instructions as LLVM, but we would use XED to figure out which instruction we were decoding.



There's also Binary Ninja. Still in beta, but available to anyone who requests to join the beta before its public release: https://binary.ninja


The only mention of "beta" that I can find on the site is: http://binary.ninja/eula.txt

Where is the request form?


Email, slack, or the "Questions or Comments?" popup are all acceptable


A bit late to the show, and surely off-topic, but as alternative links were already posted, here's another one: ScratchABit https://github.com/pfalcon/ScratchABit . Features: written in a well-known interpreted language (Python), so hacking on ScratchABit is actually easy. Implements a subset of the IDAPython API, so existing community modules for various processors, etc. can be reused.



Another alternative disassembler to Hopper/IDA is Relyze: https://www.relyze.com/

Been using it for a while and like it a lot; it has a great interface for reversing and diffing Windows binaries. Will be keeping an eye on Panopticon, looks promising :)


Let's assume you have reverse engineered some kind of boolean check and you now want to patch it to always return true or false. What does that process look like at a high level?


Not the platform you're asking about, but in Binary Ninja, you right-click on the conditional jump and choose "Always Branch", "Never Branch", or "Invert Branch".

Binary Ninja also supports inline editing of assembly and a custom compiler for dropping simple C replacements directly on top of existing functionality.


You insert the equivalent of "mov eax, 0x1; ret" (or 0x0) in x86 for whatever architecture you're using as the first instructions of the function.
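As a rough sketch of what those bytes look like on 32-bit x86: `mov eax, imm32` encodes as 0xB8 followed by the little-endian immediate, and a near `ret` is 0xC3. Python is used here purely for illustration; the helper name `make_return_const` and the example buffer are made up.

```python
import struct

def make_return_const(value):
    """Machine code for `mov eax, value; ret` (x86, 32-bit immediate)."""
    # 0xB8 = mov eax, imm32; 0xC3 = near ret.
    return b"\xb8" + struct.pack("<I", value & 0xFFFFFFFF) + b"\xc3"

# Hypothetical function body: 8 bytes of original code starting at offset 0.
code = bytearray(b"\x55\x89\xe5\x83\xec\x10\x90\x90")
patch = make_return_const(1)

# Overwrite the first instructions of the function in place.
code[0:len(patch)] = patch
print(code.hex())
```

Whatever followed the patch in the original function is left behind as dead bytes, since execution returns before reaching it.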


I mean even more high level than this. If I open a binary, can I just write the new machine code to it directly and not be concerned with recompiling?


As long as all the instructions are the same size (or smaller padded with no-operation instructions) then yes. If, however, you do change the size of the application all relocation deltas need to be changed, and all relative jumps and calls need to be recalculated.
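A minimal sketch of the "same size or smaller, padded with NOPs" rule, assuming x86 (where 0x90 is the one-byte no-op). The function name `patch_in_place` and the sample bytes are hypothetical.

```python
NOP = b"\x90"  # single-byte x86 no-operation

def patch_in_place(image, offset, old_len, new_code):
    """Overwrite old_len bytes at offset with new_code, NOP-padded to the same size."""
    if offset < 0 or offset + old_len > len(image):
        raise ValueError("patch region lies outside the image")
    if len(new_code) > old_len:
        raise ValueError("replacement is larger than the code it replaces")
    padded = new_code + NOP * (old_len - len(new_code))
    image[offset:offset + old_len] = padded  # same length, so nothing shifts
    return image

# Hypothetical example: a 5-byte `call rel32` at offset 2 is replaced by
# the 2-byte `xor eax, eax` (31 C0) plus three NOPs.
image = bytearray(b"\x55\x53\xe8\x10\x00\x00\x00\x5b\x5d\xc3")
patch_in_place(image, 2, 5, b"\x31\xc0")
print(image.hex())
```

Because the image length never changes, no relocation or relative offset elsewhere in the file needs to be touched.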


There are sometimes tricks that get you around this problem, too: you can sometimes patch in a trampoline, which gives you some flexibility in the instructions you get to use.
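For illustration, the jump into a trampoline on x86 is typically a 5-byte `jmp rel32` (opcode 0xE9), whose displacement is measured from the end of the jump instruction. A small sketch, with a made-up helper name and made-up addresses:

```python
import struct

def encode_jmp_rel32(src, dst):
    """Encode `jmp dst` as a 5-byte x86 near jump placed at address src."""
    rel = dst - (src + 5)  # displacement is relative to the next instruction
    if not -2**31 <= rel < 2**31:
        raise ValueError("target is out of rel32 range")
    return b"\xe9" + struct.pack("<i", rel)

# Hypothetical addresses: jump from the patched function to spare space
# holding the trampoline, then back again.
to_trampoline = encode_jmp_rel32(0x1000, 0x2000)
back = encode_jmp_rel32(0x2000, 0x1000)
print(to_trampoline.hex(), back.hex())
```

The same displacement arithmetic is what has to be redone for every relative jump and call if code is moved.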


An alternate direction rather than your standard jmp/displacement/nop'ing-to-align would be to use Dyninst[1] and live patch in memory[2]. Really though, your standard hex-editor will have facilities to alter all the necessary relatives. If you have access to modify the binary, no need to put in a trampoline (though it's awfully handy when playing CTFs!)

Honestly, do we really need another static analysis tool? Hopper and radare2 have covered the open source gap fairly well. I'd put r2 at near power parity with IDA if you invest the time to learn and configure it, which admittedly is an expensive proposition in labor for someone who already knows IDA. It'll cost more in salary days for him to learn a new platform than to just pay the 5k and get him an IDA/HexRays license.

[1] U of Maryland holds the patent; information here, https://www.google.co.uk/patents/US8510723

[2] https://www.cs.umd.edu/class/fall2005/cmsc714/Lectures/byrd-... Though, I'm sure you've seen it already


>Honestly, do we really need another static analysis tool?

Definitely. IDA Pro is proprietary software and the possibilities of IDAPython are limited. IDA Pro mostly ignores the last two decades of research done in the field of binary program analysis. It still relies on pattern matching compilers instead of using semantics-driven methods that have been around for >10 years. While there exist tools like BAP, BitBlaze, Jakstab and Bindead, they are not really usable for people without a graduate-student-level understanding of program analysis. This is where Panopticon fits in.


IDA's licensing is also onerous.


i was trying to edit one byte in an ELF, no change in file size and it kept crashing. i read that each section of code is hashed and obviously my byte edit changed the hash. i was pretty out of my depth tbh.


Sometimes code will add additional checks such as hashes to verify that parts of the text section haven't been modified. Should be able to remove those checks, too. Just have to track them all down.


To detect hashes like that, use a debugger that supports memory break points and set a read breakpoint on the instruction you changed. It usually makes it a lot easier to identify where the checksum is calculated.


Yes you can. An executable (using ELFs as an example, most formats are similar) is nothing more than some headers and bytes. The executable code can be modified in any way you please, as long as you're writing valid opcodes (the program will crash if it hits a bad opcode).

That said, it's usually much more complicated to change the size of the executable section (this requires modifying the headers and this tends to be a rather involved process), so usually if people are doing binary patches they are only modifying bytes, not adding or removing them.


Usually. Just make sure the new machine code is the same size as the old.

That said, if the code is signed and there's a signature check - the check will fail if you modify the code.

If the signature is a simple crc/checksum, you could also update the checksum. If it's a cryptographic signature, it might be a lot more difficult.
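A sketch of the simple case, assuming the program stores a CRC-32 of its code region as a 4-byte little-endian value somewhere in the file. That layout, the offsets, and the function names are all assumptions for illustration; real programs vary.

```python
import struct
import zlib

def stored_crc_ok(image, start, end, crc_offset):
    """Check the stored CRC-32 against the bytes in image[start:end]."""
    (stored,) = struct.unpack_from("<I", image, crc_offset)
    return stored == zlib.crc32(bytes(image[start:end]))

def refresh_crc(image, start, end, crc_offset):
    """Recompute the CRC-32 of image[start:end] and store it at crc_offset."""
    struct.pack_into("<I", image, crc_offset, zlib.crc32(bytes(image[start:end])))

# Hypothetical layout: 8 bytes of code followed by a 4-byte checksum.
image = bytearray(b"\x55\x89\xe5\x31\xc0\x5d\xc3\x90" + b"\x00" * 4)
refresh_crc(image, 0, 8, 8)             # state as shipped: checksum matches
image[3:5] = b"\x90\x90"                # patch two bytes of code
print(stored_crc_ok(image, 0, 8, 8))    # the check would now fail
refresh_crc(image, 0, 8, 8)             # update the stored checksum
print(stored_crc_ok(image, 0, 8, 8))
```

A cryptographic signature cannot be refreshed this way without the signing key, which is why that case is much harder.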


Or, you also have to modify the check code.


Huh? Isn't that what he said - or did I miss something? How is your check code related comment different from his checksum related statement?


The code that checks the signature. If you can identify all the places that check the signature and disable that code, then it no longer matters if you are unable to correctly update the signature itself.


I was talking about changing the signature/checksum to match the new code. Corecoder pointed out that sometimes you just need to patch the checksum checking code, and not the checksum itself.


Yes, there's no recompiling that can be done anyway if you only have the binary and no source code. Writing the new machine code would overwrite some code at the start of the function you're modifying, but that doesn't matter if you just want the function to return true or false.

You could edit the binary manually with a hex editor, but some disassemblers like Hopper have a feature where you can type new instructions in assembly and it will assemble and insert them for you. I'm sure IDA pro has something like that as well.


IDA Pro as of 6.9 wasn't designed to act as a hex editor, so it's not "ideal", but there are tons of scripts[1] that people use[2] to craft it into whatever you want. Likewise, it wasn't really a dynamic analysis tool, but the healthy ecosystem makes it feel sorta-kinda powerful with the proper tooling + WinDBG. I'm using a fairly old setup (old dog, new tricks and all that - I stuck with SoftICE as long as I could) so there are likely better solutions out there.

[1] https://github.com/iphelix/ida-patcher/blob/master/idapatche... is what I have in my scripts dir, but I'm sure there are dozens of others out there. That specific Python script has the added benefit of being really approachable for the average user. Check Woodmann or Github or wherever people post their scripts these days if it doesn't meet your needs.

[2] IDA's basically turned into emacs, where the real power comes from all of the tooling you can conf into it. A stock 6.9 + HexRays license is worth it just for the free tooling you can find.


Yes, use a hex editor. Not a whole lot different from a gameshark, really.


I like the About me:

Hey I'm Kai Michaelis, I'm in an IT-Security Masters program in Bochum, Germany and work part-time for people who use the term Cybersecurity unironically.


Requisite Rust fanboy'ing: GitHub stats say 90% Rust code.


The author posted on /r/rust with some details: https://www.reddit.com/r/rust/comments/4ihtfa/panopticon_a_l...

A particularly interesting comment:

  > Also I found iterator invalidation bugs simply by translating
  > C++ to Rust, thanks Borrow Checker!


Yeah, it's really interesting how many iterator invalidation bugs you find in real C++ codebases. It's so easy to think that they don't exist until you have a tool (or a human) actively searching for them…


Isn't this whole category of bugs easily avoided by using

1. _SECURE_SCL on windows

2. _GLIBCXX_DEBUG on glibc based toolchains


Those do not remove any bugs; they just change how the bugs are handled (ignore and hope for the best vs. print a diagnostic and terminate the program, and the latter not in all cases).


Assuming that tests exist, and have good coverage, won't the checked iterator implementations mentioned above catch most issues with invalid iterators?


I don't know of any large, widely used, security critical, heavily scrutinized C++ program that uses those flags in production. I also don't know of any such C++ program that hasn't had game-over use after free bugs. So the answer to your question is somewhere ranging from "we don't know, but unlikely" to "no".


A big part of what makes Rust great is that it catches those issues at compile-time, no test coverage required, no extra run-time checks required (not that it moots tests, of course). In my experience so far this is a very powerful ally to have when writing software, as it's very useful to surface those issues as early as possible.

EDIT: Also:

> Assuming that tests exist, and have good coverage

That's a very big assumption. Granted my current job is often to clean up codebases that are the opposite of that, so maybe I've got a bias here.


For some definition of ‘good coverage’ and ‘most’: yes.

I do not think 100% coverage on the code you write alone is enough to get to ‘all’, though. For example, if your test declares a vector and calls a function that does

    v.erase( i, v.end());

with i == v.end(), you get 100% coverage, but in production, that part of the code might be called with i == v.begin(), and that invalidates all iterators on the vector.


No. You can dereference those "safe" iterators into dangling references.


Does somebody know how it compares to radare2?


From their GitHub repository[1]:

"Panopticon is under heavy development and its feature set still very basic. It's not yet able replace programs like radare2 and IDA Pro. We are working on it, check back regularly!"

[1] https://github.com/das-labor/panopticon/blob/master/doc/feat...


Anyone know how this compares to IDA Pro ?


This has almost none of the important features of IDA Pro:

  - advanced interactivity (function boundary change, switch table options...)
  - stack pointer tracking and stack recovery
  - structure and enum definition, use in disassembly
  - compiler and static library recognition
  - an industry-standard decompilation plugin (Hex-Rays Decompiler)
  - support for quick scripting using a high-level interpreter language (IDC/IDAPython)


Panopticon is libre software. You can improve it and use it as basis for your own research on program analysis.


If you can afford an IDA Pro license and cost of plugins, does it matter?


Yeah, it matters. If this is better, I'll use this; if IDA is better, I'll use IDA. I'm not 100% sure I understand your comment.

Btw, most companies can afford an IDA license. It's pretty cheap compared to the salary of a developer, at least in the coastal US.


If your job can justify it, great, but mine can't. I want it for personal use and I can't justify it.


If you have more time than money, here's a possible option:

  - Buy IDA Pro using a credit card
  - Find bug or two, submit to bug bounty [1]
  - Pay back credit using bug bounty money

[1] - https://www.hex-rays.com/bugbounty.shtml


If life were so simple, everyone would be doing it don't you think?


My friend's grandfather was an economist. When my friend was a little boy, he saw a $20 bill on the sidewalk on a walk with his grandfather. He pointed to it and said "Grandpa look someone dropped a twenty dollar bill! We will be rich!" His grandfather dismissed his enthusiasm and explained that if there was free money on the ground somebody would have picked it up by now.


There's an evaluation version of IDA that's an older version and doesn't include 64-bit disassembly (just x86 and ARM IIRC), but it's free.


And is not available for my operating system.


What's your operating system? I usually run it under Wine on Linux without a problem.


OK, sure, but I still don't see how this is relevant to whether IDA is better than Panopticon.


On a side note, the visual callgraph is interesting. It'd be interesting for other IDEs to implement something similar with a historical listing of input/output values for the corresponding displayed methods.


What's the visual callgraph? Is it something different than the CFG graph in the screenshot? I feel like most disassemblers do that.


Never worked with a disassembler. Didn't realize that this is commonplace. Could be beneficial for other languages as well..


The project states:

"Panopticon is a disassembler that understands the semantics of opcodes. This way it's able to help the user by discovering and displaying invariants that would have to be discovered "by hand" in traditional disassemblers."

Doesn't every disassembler have to understand the opcode semantics in order to disassemble and make sense of them or am I misinterpreting that statement?

Also can anyone explain what the "invariants" are? When I hear the word I can only think of loop invariants and I'm guessing that is not what the author means here.


You say "in order to disassemble and make sense of them", but the "sense" that a disassembler makes can just be the direct conversion from machine code to equivalent assembly. Machine code is just another syntax; converting it directly into assembly can happen instruction by instruction.
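To make the "just another syntax" point concrete, here is a toy linear-sweep disassembler for a handful of x86 opcodes. It is only a sketch: the table covers four encodings, the function name is made up, and anything it does not recognize is emitted as a raw byte.

```python
import struct

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
ONE_BYTE = {0x90: "nop", 0xC3: "ret", 0xCC: "int3"}

def disassemble(code):
    """Translate bytes to mnemonics one instruction at a time, with no semantics."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        if op in ONE_BYTE:
            out.append(ONE_BYTE[op])
            i += 1
        elif 0xB8 <= op <= 0xBF and i + 5 <= len(code):
            # B8+r: mov r32, imm32 (little-endian immediate follows the opcode)
            (imm,) = struct.unpack_from("<I", code, i + 1)
            out.append(f"mov {REG32[op - 0xB8]}, {imm:#x}")
            i += 5
        else:
            out.append(f"db {op:#04x}")  # unknown byte, emitted as data
            i += 1
    return out

print(disassemble(b"\xb8\x01\x00\x00\x00\xc3"))
```

Nothing in this loop knows that `ret` transfers control or that `mov` writes a register; it is a lookup from one notation to another.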

Producing more sensible assembly, like a human would write, is another thing. If you want to follow jumps to figure out where instructions are and regenerate labels, then that's one step towards semantics.

I don't know what kinds of invariants panopticon is looking at, but there seems to be code for static analysis using abstract interpretation[1]. This could do the data-flow analysis, determine types and recognize persistent variables.

[1] https://en.wikipedia.org/wiki/Abstract_interpretation


Now I see. Thanks for the explanation. This is pretty neat then. Seems like a great learning tool.


I noticed that this only supports AVR. what is your plan to support x86? would you think about linking against some external code that provides semantics for x86?


The README is a bit out of date. It supports MOS 6502 and parts of x86_64 too. I'm currently writing the lifting code that translates x86 opcodes to a simpler intermediate language: https://github.com/flanfly/panopticon/blob/feature/rreil/lib...


How do you find adding new architectures to it? I've been meaning to add some architectures I want to Radare for a while now, but it's a bit daunting due to lack of docs and the build system is argh. And I've also been wanting to learn Rust, so this seems like an obvious fit for me...


Panopticon includes a simple disassembler framework. The basic idea is that you provide opcode patterns akin to regexps and write a function that emits mnemonics and code describing opcode semantics. There isn't much documentation, but you can take a look at the AVR and MOS 6502 disassemblers in lib/src/avr and lib/src/mos, respectively.
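To give a feel for the general idea (this is not Panopticon's actual API, just a hypothetical Python sketch of regexp-like opcode patterns), here is a bit-pattern matcher applied to the 16-bit AVR `ADD Rd, Rr` encoding `0000 11rd dddd rrrr`. Literal bits must match, and letters collect operand bits.

```python
def match_pattern(pattern, word):
    """Match a 16-bit word against a bit pattern; return operand fields or None."""
    bits = pattern.replace(" ", "")
    if len(bits) != 16:
        raise ValueError("pattern must describe exactly 16 bits")
    fields = {}
    for i, ch in enumerate(bits):
        bit = (word >> (15 - i)) & 1          # walk from the most significant bit
        if ch in "01":
            if bit != int(ch):
                return None                   # a fixed bit does not match
        else:
            fields[ch] = (fields.get(ch, 0) << 1) | bit
    return fields

def decode_add(word):
    """Emit a mnemonic for AVR ADD, or None if the word is something else."""
    f = match_pattern("0000 11rd dddd rrrr", word)
    return None if f is None else f"add r{f['d']}, r{f['r']}"

print(decode_add(0x0C12))
```

In a real framework the handler would also emit a description of what the instruction does (here, roughly "Rd := Rd + Rr" plus flag updates), which is the part that goes beyond plain disassembly.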


godspeed! that's hard work, especially if you want to deal with MMX/SSE and the FPU.


What is the best guide to learning how to reverse engineer programs?


I suggest "Reverse Engineering For Beginners"[1] and "Hacking the Xbox"[2].

[1]: http://beginners.re

[2]: http://libgen.io/book/index.php?md5=64E13DD5E86FE633A48C2261...


Slightly OT, but are there any good libre tools for reverse engineering binary file formats rather than executables?


Start with binwalk.


GPL means I can't use it as a library unless I also GPL.

No thanks.



I like "ht editor".


[flagged]


Btw, that guy isn't me.


Weird. And now banned.


>Panopticon

Good lord what a cliche name.


Please don't post snarky dismissals. If you think there's an important point about the name, you're welcome to make it substantively.


> Good lord what a cliche name.

This is the name that Google should have used instead :)



