Pin - A Dynamic Binary Instrumentation Tool (pintool.org)
59 points by nkurz on April 14, 2013 | 32 comments



Here is an actual hacker tool that allows all sorts of insane close-to-the-CPU performance work. No comments all day. This is heartbreaking.


First, it's Sunday night.

Second, PIN gives you very low-level access, sure, but it also requires quite a bit of work to get that data. You can get a lot of the same data (as seen in the examples) with libpfm4 (http://perfmon2.sourceforge.net/). CPU performance counters can give you a lot of this data with much lower run-time overhead, and a lot less work.

Also, PIN's a little hairy. If you just want to generate code for run-time execution, LLVM's your best bet. If you want to diddle with a running executable, you can always use libelf(3) and ptrace(2) to read and diddle with the running process. It may be useful for specific sorts of analyses you want to run on an executable, but it's messy. If you're doing performance instrumentation, dynamically modifying the code is going to alter your results in ways that can be hard to compensate for.


You can't effectively do things like instrumenting every write instruction in a program using ptrace. Also, the techniques Pin uses sound hairy, but they're the same things software virtualization does.


True, but what do you do with that data? Transfer it out of process? Analyze it? And how many systems can withstand that sort of slowdown without timeouts?

Performance counters can tell you quite a bit, and they cost very little to set up. Snapshots and a little differential analysis can get you more comprehensible data without transfer/storage problems.


I think it's all about using the right tool for the right job. Sure enough, there are some things that dynamic binary modification (DBM) does that can also be implemented another way (e.g. generic application level profiling), but even some of those things are sometimes done faster/better by DBM tools: Want to log the system calls to RANDOM_SYSCALL? Sure, use ptrace and do these context switches: app>kernel>ptrace_app>kernel(actual syscall)>ptrace_app>kernel>app every time any system call executes... Or use PIN and take the usually small performance hit. ptrace also messes up signal delivery.

Want to profile dynamically generated code? libelf won't be much help there. Or do you want to run memory accesses through a cache simulator? ptrace won't be much help. There are plenty of uses when DBM is the right (or only) tool.


You can get a lot of the same data (as seen in the examples) with libpfm4 (http://perfmon2.sourceforge.net/). CPU Performance counters can give you a lot of this data with much lower run-time overhead, and a lot less work.

What's the preferred way of using libpfm4? I've ended up using its sample program as a way to convert from readable counter names to hex to put into a perf command line. I've found several defunct patches to give perf this functionality directly, and am confused why this seemingly essential functionality is left out.

What I'd like is the ability to measure just sections of code, and access to all available counters without needing to copy-and-paste hex. I'm getting the sense that perf is not the tool for this:

http://lwn.net/Articles/441209/

http://www.mail-archive.com/linux-perf-users@vger.kernel.org...

Is Likwid a viable option? https://code.google.com/p/likwid/


I wrapped the perf_event syscall with something easier to use, and just call it directly around the code I care about.


I came across it as a platform for running and debugging code for a not-yet-released CPU: http://software.intel.com/en-us/articles/intel-software-deve...

A couple of other uses I thought it would be good for were marking instructions with unaligned loads and stores, and flagging switches to and from 256-bit VEX code, though you can probably do these with other tools as well.

I posted this here because it seemed potentially useful, and I was surprised I'd never heard of this project. I'm mostly familiar with the ones Lally mentioned, but Thomas brought up DynamoRIO which is also new to me. Are there other niche optimization tools I should know about?

A specific question would be whether there is anything more useful than IACA and gut instinct for determining optimal instruction order. Even just something that would generate an easy-to-parse data-dependency graph for a short section of code?


So... what do you think about it? Or, do you have questions? I'm not very familiar with Pin, but am somewhat familiar with DynamoRIO, which is a competing project.


An interesting application of Pin for malware analysis / visualization is Danny Quist's Vera[1] and de-obfuscation framework[2]. It's also used in MIT's Architecture course to benchmark different architecture designs.

[1] http://www.offensivecomputing.net/?q=node/1687 [2] http://www.offensivecomputing.net/?q=node/492


Here's a paper about Valgrind that includes some details on how it differs from Pin: http://www.valgrind.org/docs/valgrind2007.pdf.


Discussed at greater depth in the paper at http://goo.gl/YDTwu , if anyone's interested.


Thanks! Direct link to the paper here: http://ursuletz.com/people/faculty/pdfs/p190-luk.pdf


Thank you for this.


This is definitely something I won't forget about the next time I'm trying to figure out what a binary is exactly doing. Also, I really need something like this for OSX right now.


I believe people are working on a port of DynamoRIO to OSX.


I am working on a DBT framework that has some user space support. I do my main development in OS X and Linux, and so I have done some testing of it on OS X.

The main focus of the DBT tool is Linux kernel modules, but let me know the kinds of stuff you need it for and I can a) figure out if my tool is applicable, and b) perhaps share the code.


/Applications/Xcode.app/Contents/Developer/usr/bin/instruments

I need to know everything about that binary. How it works, what ports it opens (unix domain & network sockets), what files it opens on the hard drive, what libraries it's linked to, how it decides what to do. Anything & Everything there is to know about it. ^_^


There are plenty of built-in performance/introspection tools in OS X that you can try first before resorting to a third-party solution:

1) What ports it opens:

- netstat shows you what ports a program has running

- DTrace shows you all syscalls a process makes (among other things); dtruss is a convenient wrapper script included in OS X that makes this easy (including opening sockets).

2) What files it opens

- Again, DTrace's syscall provider lets you introspect all syscalls, including open(). There's even a handy wrapper script included with OS X called opensnoop.

- Alternately, you can use the fs_usage command line tool to tap into the xnu kernel's trace mechanism. This shows all sorts of filesystem events, including what files are opened.

3) What libraries a binary is linked to

OS X binaries use the Mach-O format, not ELF like most other Unixes. So you have to use OS X's binary introspection tools to understand that format rather than the standard GNU binutils. What you're looking for here is otool, which lets you introspect Mach-O binaries. Specifically, "otool -L /Applications/Mail.app/Mail" for instance shows you which libraries Mail links to. Run this recursively to get the transitive closure of all dependencies a binary links against. Another way to do this is to run "vmmap -v <pid>" to show you the vm layout of a process, which includes the __TEXT/__DATA segments of all libraries the process links against.

And of course, gdb/lldb is included with the developer tools; you can just attach to whatever process you care about and set breakpoints, type "info sharedlib" to see what libraries are in the address space, etc. Also, for better or worse, Objective-C is an extremely dynamic language, so you can even do things like write a shared library with code you want to inject into a process (potentially monkey-patching existing methods using ObjC categories) and dlopen it from gdb to insert it into the target process's address space.


Nice, never heard of otool or vmmap. I'll definitely try 'em out. thanks.


So, this might be naive, but for static information like the shared libs it's using, you might want to check out otool, and then pick up a good disassembler (I'm a fan of Hopper, which is relatively cheap). For dynamic analysis you might want to check out the standard OS X tools like netstat, and instrumentation tools like valgrind or gdb. gdb + breakpoints on choice system calls works pretty well on non-obfuscated binaries!


I'll have to try that out. So far I've tried iosnoop, lsof and dtrace to get an idea of what the program is up to. I did get a bit of info from those tools.


I will look into this a bit tomorrow. Can you supply an example command-line invocation?


http://blog.manbolo.com/2012/04/08/ios-automated-tests-with-...

This blog post talks about using the tool for automated testing. It's kinda "complicated" to set up. In the post, he eventually gets to a command line:

instruments -t /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/Library/Instruments/PlugIns/AutomationInstrument.bundle/Contents/Resources/Automation.tracetemplate "/Users/jc/Library/Application Support/iPhone Simulator/5.1/Applications/C28DDC1B-810E-43BD-A0E7-C16A680D8E15/TestAutomation.app" -e UIASCRIPT /Users/jc/Documents/Dev/TestAutomation/TestAutomation/TestUI/Test-2.js

But there's a lot of setup, so =/


The user guide gives a good flavour of the kind of things you can do with this ( http://software.intel.com/sites/landingpage/pintool/docs/584... ).
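For flavour, the user guide's first example is a Pintool that counts every instruction executed. The sketch below is a from-memory rendition of that tool, not a copy of it; it needs the Pin kit to build (against pin.H) and runs under the pin launcher rather than standalone:

```cpp
#include "pin.H"
#include <iostream>

static UINT64 icount = 0;

// Analysis routine: executed before every instruction in the program.
static VOID docount() { icount++; }

// Instrumentation routine: Pin calls this once per instruction it
// translates, and we ask it to insert a call to docount() before each.
static VOID Instruction(INS ins, VOID* v) {
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}

static VOID Fini(INT32 code, VOID* v) {
    std::cerr << "Count " << icount << std::endl;
}

int main(int argc, char* argv[]) {
    if (PIN_Init(argc, argv)) return 1;
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();  // never returns
    return 0;
}
```

Invocation looks roughly like `pin -t inscount.so -- /bin/ls`: Pin injects itself into the target, translates its code on the fly, and your tool's callbacks decide what extra code to weave in.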


> Pin is proprietary software developed and supported by Intel and is supplied free of charge for non-commercial use.

Huh. What are the licensing conditions and price for commercial use then?


If this is a concern, feel free to use the open source, BSD-licensed main competitor: DynamoRIO: http://www.dynamorio.org/


Isn't valgrind a more popular competitor?


PIN/DynamoRIO and Valgrind have slightly different design aims. In short:

* Valgrind was designed to support rich analysis plugins (like Memcheck, which keeps a shadow copy of every bit of data) and performance was a secondary concern (on Valgrind, applications run on average about 4x slower, threads are serialized, etc).

* DynamoRIO and PIN are designed not to make much of an impact on performance (usually a few percent) and are more suitable for running in production, but it's somewhat more complicated to write plugins for them.

Both DynamoRIO[0] and Valgrind[1] maintain lists of publications which go into much more detail.

[0] http://www.dynamorio.org/pubs.html [1] http://valgrind.org/docs/pubs.html


How does this compare to dtrace?


They differ quite significantly. dtrace uses probes - points where instrumentation can be installed to inspect the process as it runs. Since it requires these probes to be defined, probes only come "for free" in kernel-space: e.g. tracing syscalls, pageins, that kind of thing. SystemTap offers similar functionality - userspace probes can be defined for it too.

Pin, on the other hand, dynamically rewrites the binary to inject instrumentation. This allows it to inject instrumentation code at a finer granularity (individual instructions). It's useful where the application might not define dtrace probes, or, for example on OSX, where the application has "opted out" of dtrace to keep itself from being inspected (a la iTunes).

Pin is more like a scripted debugger than an instrumentation tool.


Ah, very clear. Thank you.



