How to find per-process I/O statistics on Linux (xaprb.com)
49 points by jawngee on Aug 23, 2009 | 17 comments



Does anyone know of a way to distinguish random from sequential IO?

In my experience sequential IO is never the problem. Instead, it seems to me that random seeks are really the only performance problem nowadays. In the most extreme case throughput is only ~100 bits/s instead of ~100 MB/s. Unfortunately, random seeks are hidden behind abstraction layers and thus largely invisible to programmers (until the system freezes). Maybe we just need to wait for SSDs to become cheap.


Take a look at blktrace on Linux. It logs all block I/O layer activity from when an I/O request is sent to the disk to when it's serviced. It also outputs which sectors on the disk were written to service the request. People have written a few tools to read these logs and show meaningful numbers/pretty graphs -- btt is the official one, and here's another one that I like: http://oss.oracle.com/~mason/seekwatcher/.

Among other things, you can distinguish random from sequential I/O using blktrace and its accompanying set of tools by looking at the distance between subsequent I/O requests sent to the disk.
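To make the "distance between subsequent requests" idea concrete, here is a minimal sketch. It does not parse real blktrace or btt output; it assumes you have already extracted (start sector, sector count) pairs in dispatch order. The function name `classify_requests`, the `max_gap` parameter, and the sample trace are all made up for illustration.

```python
def classify_requests(requests, max_gap=0):
    """Count sequential vs. random transitions in a request stream.

    requests: iterable of (start_sector, num_sectors) in dispatch order.
    max_gap:  largest distance (in sectors) between the end of one request
              and the start of the next that still counts as sequential.
              Hypothetical tuning knob, not a blktrace concept.
    """
    sequential = 0
    random_count = 0
    prev_end = None
    for start, count in requests:
        if prev_end is not None:
            # Seek distance: how far the head has to move from where the
            # previous request ended to where this one begins.
            if abs(start - prev_end) <= max_gap:
                sequential += 1
            else:
                random_count += 1
        prev_end = start + count
    return sequential, random_count


# Made-up trace: three contiguous requests, a far jump, one contiguous
# request, then another far jump.
trace = [(0, 8), (8, 8), (16, 8), (500000, 8), (500008, 8), (1200, 8)]
seq, rnd = classify_requests(trace)
print(seq, rnd)  # 3 2
```

A high share of random transitions on a spinning disk would be the pattern described above; what threshold counts as "too random" depends on the workload and the device.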


SSDs are cheap.


...and cheap SSDs are pathologically bad at random writes -- at least an order of magnitude worse than a mediocre hard drive on average, with regular latency spikes of several seconds!


When measured in $/IOPS, even good SSDs are cheap. I think the real problem is the lack of caching software (but I would think that).


Would you care to quantify that?


I picked up my Intel (the good controller, MLC version) for $400, best upgrade ever, that's fucking cheap.


Shorter method: run 'iotop'.

This tool is packaged in every current Linux OS.

If you want to change IO priority, run ionice.


The article acknowledges iotop and explains that this method is for older kernels that don't support whatever syscalls iotop uses.


Yes, but the headline is misleading - this is a terrible way to 'find per-process I/O statistics on Linux'.

'Linux with older kernels' would be more appropriate.


I call your bluff. 'iotop' may be included in many Linux distros, but don't say every. My Arch box doesn't have it, not by default anyway.

edit: What I'm trying to say is that your statement is overly broad. Not all Linux distros have 'iotop' installed. It is not a guarantee, particularly on specialized distros.


He did say "packaged", which is rather different from "installed".


For most distros, "not by default" means it is just an

apt-get install iotop

or

yum install iotop

away. That took about 6 seconds on the two machines I tried it on. So even where it is not installed by default, that is hardly a hindrance.


iotop is a very thin ncurses frontend to taskstats in the Linux kernel, and is totally dependent on TASK_DELAY_ACCT and TASK_IO_ACCOUNTING being enabled when the kernel is built.

They were introduced in 2.6.20, so the stupidly conservative distros that have been frozen for years on 2.6.18 or 2.6.19 don't get to play.

On nice distros that don't force-feed you their kernel and litter their repos with broken-out kernel modules, installing something like iotop is not necessarily just a call to the package manager.
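For reference, on a kernel built with I/O accounting the same per-process counters are also exposed as plain text in /proc/<pid>/io, so you can check whether your kernel supports this without installing anything. A rough sketch, assuming that file exists on your system; the helper names `parse_proc_io` and `read_proc_io` and the sample numbers are made up for illustration.

```python
def parse_proc_io(text):
    """Parse the 'name: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        # Skip anything that is not a simple 'name: <integer>' line.
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


def read_proc_io(pid="self"):
    """Read counters for one process.

    Raises an OSError (e.g. FileNotFoundError) if the kernel was built
    without I/O accounting or the process is not readable by this user.
    """
    with open(f"/proc/{pid}/io") as f:
        return parse_proc_io(f.read())


# Made-up sample in the file's format, so the parser can be tried anywhere.
sample = (
    "rchar: 4096\n"
    "wchar: 512\n"
    "syscr: 10\n"
    "syscw: 2\n"
    "read_bytes: 8192\n"
    "write_bytes: 0\n"
    "cancelled_write_bytes: 0\n"
)
stats = parse_proc_io(sample)
print(stats["read_bytes"], stats["write_bytes"])  # 8192 0
```

If `read_proc_io()` fails with a missing file, the kernel was most likely built without the accounting option, and iotop will not work there either.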


'litter their repos with broken-out kernel modules'

You don't seem to understand why modules exist. It doesn't matter how many modules are available; they won't be loaded unless they're needed. E.g., for PCI hardware, a driver module isn't loaded unless you have hardware that modules.pcimap matches to it. There is no overhead from having modules available to load, just extra convenience the next time you, say, add a NIC or somesuch.

Also, no distro force-feeds you a kernel. You're always able to build your own in the rare event that you need to, or the more likely event that you just feel interested in doing so.

For a business, most people can understand the benefit of using the same software that a few million others do.


When I said 'broken-out kernel modules', I was referring to modules that are distributed as independent packages in the repository, instead of just sitting in /lib/modules/$(uname -r).

I know exactly why modules exist, having written my own several times. The real benefit for stuff distributed with the mainline kernel is not runtime loading (you could just build them all statically), but unloading and reloading.


Just tried it on my Ubuntu 8.04 server.

   > sudo apt-get install iotop
   Reading package lists... Done
   Building dependency tree       
   Reading state information... Done
   E: Couldn't find package iotop



