Linux kernel performance: Flame Graphs (dtrace.org)
148 points by brendangregg on March 17, 2012 | 4 comments



You know you've written a great article when it has tons of points on HN with zero comments...

Having used a few profiling tools in the past (such as gprof, valgrind's callgrind/cachegrind through kcachegrind, and of course clock_gettime()), I think flame graphs are the best visualization of that kind of data I've seen so far. I wonder what other performance visualizations exist that I haven't seen. I'd love to see a 3D flame graph that takes multiple samples over time, probably with overlapping windows.


Very cool. Exec Summary:

Conclusion

With the Flame Graph visualization, CPU time in the Linux kernel can be quickly understood and inspected. In this post, I showed Flame Graphs for different workloads: networking, file system I/O, and process execution. As a SVG in the browser, they can be navigated with the mouse to inspect element details, revealing percentages so that performance issues or tuning efforts can be quantified.

I used perf_events and SystemTap to sample stack traces, one task out of many that these powerful tools can do. It shouldn’t be too hard to use oprofile to provide the data for Flame Graphs as well.

https://github.com/brendangregg/FlameGraph


If you haven't read anything else from Brendan Gregg before, do yourself a favor and bookmark his blog. I've learned more from his blog than anyone else's.


Interesting visualization, but it may be a tiny bit misleading. My first thought was that the larger the horizontal bar, the more time spent in that function, but that's not the case. Rather, it captures the number of times the function was seen in some execution path.

So if you have:

    f():
        do_less_work()
        do_large_work()

f()'s width is inflated even though it's do_large_work() that does most of the work.

The time actually spent in a function's own code (its self time) is its x-width less the sum of all its children's x-widths.
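
To make the width arithmetic concrete, here is a rough sketch in Python. It assumes input in the "folded" one-line-per-stack form that the FlameGraph stackcollapse scripts emit (frames joined by `;`, then a space and a sample count). The sample counts and the helper name `inclusive_and_self` are invented for illustration and are not part of the FlameGraph tools.

```python
from collections import defaultdict


def inclusive_and_self(folded_lines):
    """Return (inclusive, self) sample counts per function name.

    Each input line is assumed to look like "root;child;leaf 42".
    """
    inclusive = defaultdict(int)
    self_time = defaultdict(int)
    for line in folded_lines:
        stack, _, count = line.rpartition(" ")
        frames = stack.split(";")
        n = int(count)
        # Every function present on the stack gets inclusive credit
        # (set() avoids double counting recursive frames).
        for fn in set(frames):
            inclusive[fn] += n
        # Only the leaf frame was running when the sample was taken.
        self_time[frames[-1]] += n
    return dict(inclusive), dict(self_time)


# Hypothetical samples matching the f() example above.
samples = [
    "f 5",
    "f;do_less_work 10",
    "f;do_large_work 85",
]

inclusive, self_time = inclusive_and_self(samples)
for fn in sorted(inclusive):
    print(fn, "inclusive:", inclusive[fn], "self:", self_time.get(fn, 0))
```

With these made-up numbers, f() is as wide as the whole graph (100 samples) but has only 5 samples of its own, which is its width minus the widths of its two children. Note that this sketch totals per function name, whereas a flame graph draws one frame per distinct call path.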



