The problem with htop on Linux is that once there are 200+ processes running on the system, htop takes up a significant portion of CPU time. This is because htop has to go through each process entry in procfs (open, read, close) every second and parse the text content, instead of just calling an appropriate syscall, as on OpenBSD and friends, to retrieve that information.
It would help if the kernel provided process information in binary form instead of serializing it into text. Or, even better, provided specific syscalls for it, as macOS, Windows, OpenBSD, Solaris and others do.
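To make that concrete, here's a rough sketch (not htop's actual code) of the kind of per-process text parsing a Linux monitor has to repeat on every refresh:

    # one open/read/close of /proc/<pid>/stat per process, every refresh;
    # fields 14 and 15 are utime/stime in clock ticks. This naive parse breaks
    # if a process name (field 2) contains spaces -- real tools handle that.
    for stat in /proc/[0-9]*/stat; do
      awk '{printf "%-7s %-20s utime=%s stime=%s\n", $1, $2, $14, $15}' "$stat" 2>/dev/null
    done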
Significant in what way? I created 400 processes + 328 threads on a 10-year-old CPU and htop is not using more than 1.3% CPU on a machine with 800% available CPU power (quad-core, 8-thread)[0]. That means 0.16% total CPU used. While I agree that it is _less_ efficient than some other ways, in what way is that _significant_?
On my system there are 240 tasks running with ~1700 threads. htop is using 6% of a single core with the cgroup column disabled and 9% with it enabled. It spends most of its time in kernel space, so it's not htop's fault.
> While I agree that it is _less_ efficient than some other ways, in what way is that _significant_?
This gets noticeable when every connected user/session runs htop.
Try many small or tiny processes, and htop's overhead soon becomes significant (up to 5% total utilization in my case); that's not exactly efficient.
I remember way back in the early years of the Third Age, I had to write a process accounting system that supported all kinds of Unices (HP-UX, AIX, Solaris, FreeBSD, Linux ...). And you're right, there's a plethora of options other than procfs, although IIRC Linux wasn't the only one I had to support with that variant. Wasn't Solaris one of them?
I will say, htop is a lot more efficient than GNU top, despite all its extra functionality. I do not use it in the (default?) mode where it lists all the threads, because that is nuts.
Traditional Unix implementations of ps and similar tools work by directly reading the appropriate data structures from the kernel (through /dev/kmem or something to that effect). Modern BSD implementations have libkvm.a, which abstracts that somewhat, but they still read stuff directly from kernel space.
I don't know, but I don't see doing random reads from kernel memory as a particularly sane API for getting a list of processes; procfs is a several-orders-of-magnitude cleaner solution.
> Traditional Unix implementation of ps and similar tools work by directly reading the appropriate data structures from kernel (through /dev/kmem or something to that effect).
This is not correct - /dev/kmem and similar are typically only readable by root. If what you say were correct, ps and friends wouldn't work for unprivileged users (unless they were setuid root, which they're not).
Some versions of the *BSDs probably fixed that with a sysctl interface, but on older Unixes you would read kernel memory to get that system information.
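You can check both claims on a modern Linux box (paths and output vary by distro, and many current kernels drop /dev/kmem entirely):

    ls -l /dev/kmem           # root/kmem-only, if it exists at all
    ls -l "$(command -v ps)"  # no setuid bit -- ps reads /proc instead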
It's not meant to be used 24/7; it's just used as a guide/diagnostic for how the machine and its processes are performing! I use it for maybe about 10 seconds.
Some people use tmux when connecting to their virtual machines on the server and forget to quit the htop instances. This adds to the total CPU utilization really quickly.
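A quick way to spot the stragglers, if anyone wants it (the [h] keeps grep from matching itself):

    ps -eo user,pid,etime,cmd | grep '[h]top'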
Well, you are right (and top has the same problem, if not worse), but I'm always surprised by what these tools are doing that takes significant CPU time. Parsing 200 virtual text files every second surely should not.
One of my favorite things about htop is the set of projects that have been created modeled after it but focused on information other than system resources.
Cointop [0] is one of these projects that comes to mind.
intel_gpu_top helped me solve a mysterious performance issue on a MacBook after countless hours of fruitless investigation. Overheating and throttling were an issue, but even after I fixed that the system would lag hard - instantly when I used the external 4k display, and after a while on the internal 1440p screen. It turned out cool-retro-term was maxing out the integrated Intel GPU, which caused the entire system to stutter and lag.
Unfortunately both the MBP and my current XPS 15 are unable to drive cool-retro-term on a 4k display with their integrated graphics, and they both overheat and throttle if I use the nvidia graphics card :/
Laptops have very poor cooling. I have a Clevo laptop with a great processor but it will sometimes throttle itself to cool down. Great for small bursts of activity such as compilation but I don't understand how they could market these laptops as gaming machines. Running ffmpeg stabilizes the temperature at a healthy 96 degrees.
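If you want to watch the throttling happen, the sysfs thermal zones (or lm-sensors) are enough - a rough sketch, and zone numbering varies per machine:

    watch -n1 'cat /sys/class/thermal/thermal_zone*/temp'   # millidegrees Celsius
    # or, with lm-sensors installed:
    watch -n1 sensors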
Modern powerful hardware has a hard time emulating a glass of water with good fidelity. Reproducing physical effects like ghosting is often harder than it looks.
There are a lot of different usage patterns that may not heat the GPU as much. Also, Windows might have better thermal management in the drivers.
CPU-wise, Intel defines their TDP as the average heat dissipation, but the CPU can boost higher than this. From what I understand, though, they tell manufacturers to design to the TDP.
To me a browser for this seems like overkill, but I can understand the argument that "everyone already has a browser open", even if I don't think that it leads to good places.
htop is an excellent tool. I appreciate his valiant effort to explain what load average is; it's confused Unix users forever. His explanation is more or less right but I think misses a bit of context about the old days of Unix performance.
It used to be in the early 90s that Unix systems were quite frequently I/O bound; disks were slow and typically bottlenecked through one bus and lots of processes wanted disk access all the time. Swapping was also way more common. Load average is more or less a quick count of all processes that are either waiting on CPU or waiting on I/O. Either was likely in an old Unix system, so adding them all up was sensible.
In modern systems your workload is probably very specifically disk bound, or CPU bound, or network bound. It's much better to actually look at all three kinds of load separately. htop is a great tool for doing that! but the summary load average number is not so useful anymore.
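Since the Linux load average counts both runnable tasks and tasks in uninterruptible (usually I/O) sleep, a rough way to see which kind of load you have is something like:

    cat /proc/loadavg    # 1/5/15-minute load averages
    ps -eo stat | awk '/^R/ {r++} /^D/ {d++} END {print r+0 " runnable, " d+0 " in uninterruptible (I/O) sleep"}'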
> In modern systems your workload is probably very specifically disk bound, or CPU bound, or network bound. It's much better to actually look at all three kinds of load separately.
Linux recently got an interface called Pressure Stall Information that lets you collect accurate measures of CPU, I/O and memory pressure.
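On kernels that have it (4.20+), PSI shows up as three files under /proc/pressure; each reports the share of wall-clock time in which some (or all) tasks were stalled on that resource:

    cat /proc/pressure/cpu /proc/pressure/io /proc/pressure/memory
    # lines look like:  some avg10=0.00 avg60=0.00 avg300=0.00 total=0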
- Horribly under-specced machine (8GB RAM) with way too many (~150) tabs open -> continuous swap stalls that last for minutes at a time and freeze the mouse cursor: 5-second load average of 12-20
- 6-year old Android phone doing tasks in the background, generally feeling more than adequately performant, and lukewarm enough that you aren't sure if it's your hand or the phone generating the warmth: load average of 12-13
- Building Linux with -j64 to see what would happen: 5-second load average of 0.97... 0.99... 49.40... 127.21... ^C^C^C^C^C^C^C^C^C 180.66... ^C^C^C^C^C^CCCCCCC^^^CCcccCCcCC 251.22... ^C^C^C^C^C^C ^C^C^C^C ^C^C^C ^C^C^C^C^C^C ^^^CCCCC^C^C^C (mouse finally moves a few pixels, terminal notices first ^C) 245.55... 220.00... 205.42... 198.94...
- Resuming from hibernation with 20GB of data already in swap: 0.21... 0.15... 0.10... 251.50... 280.12... 301.69... 362.22... 389.91... 402.40... 308.56... 297.21... 260.66... 254.99... (etc; this one takes a while)
These are two different commands. "strace | grep foo" looks for any line containing foo. It will find "foobar" and "food" system calls. It will find the word "foo" in the (abridged) string data that it prints out (read(..., "foo...", ...)).
Meanwhile, strace's -e filter will find exact matches of syscalls that are named on the command line.
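Side by side, with the open calls from the article (strace writes to stderr, hence the 2>&1):

    strace htop 2>&1 | grep open          # text match: also hits openat, "open" inside string data, etc.
    strace -e trace=open,openat htop      # exact filter on the syscall names themselves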
Obviously the author wants the second one, but it is hardly useless to grep. And, once you know how to use grep, you know how to do this sort of search for every program ever written. It is nice that strace has its own budget string matching built in... but knowing how strace works only tells you how strace works. Knowing how grep works lets you do this sort of thing to any program that outputs text.
(A lot of work has been done to try and make something as good as "strace -e" but generically; for example, Powershell's Select-Object cmdlet. I have never managed to get that to do anything useful, but they did manage to break grep, erm ... Select-String, in the process. Great work!)
As I have mentioned in another reply, just because you know grep does not mean you should stop there. Especially when teaching others, you should find the optimal way and mention that you _could_ also use grep if you were in a rush.
You could always do things the quick-and-dirty way, but does that help you grow as a programmer? You could write Python code that looks like C, like many people do when they come from a C background, or you could learn how to write Pythonic code by reading the documentation and examples.
> Knowing how grep works lets you do this sort of thing to any program that outputs text.
It's worth noting that I could say the same about strace. Once you know strace, you can run it against any program that uses system calls - which, by the way, is a lot of them. :-)
I disagree in this case; I think it's more Unix-style to use grep when appropriate. You should learn a lot of generally applicable tools, not the intricate details of a few specialized tools. The "useless cat" case works the same way: instead of grep foo filename, you can write grep foo < filename.
That's like saying you can just use `find . | grep ... | wc -l` and then learning the hard way that you can have newlines in filenames. While I agree you should learn lots of general tools, you should not stop there. If you have a particular need you should consult the manpage; it is one of the ways you become better. In the htop example it might be fine as a quick-and-dirty method, but when teaching others like through this particular blog post you should do so the right way.
If your pipeline is meant to count files then the bigger problem is that it counts directories too.
If you have filenames with newlines you may have other - less nasty - stuff too.
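If you do want a count that skips directories and survives newlines in names, GNU find can do it without grep at all - a sketch, and -printf is GNU-specific:

    # prints one byte per matching regular file, so odd filenames can't skew the count
    find . -type f -name '*foo*' -printf x | wc -c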
So you either get a reasonable answer in 1s (or 2s if you have a short look at the output before counting), or you spend 1h+ discussing requirements and carefully writing a program that gives a precise answer.
Things to consider:
- how to treat symlinks
- how to treat hardlinks
- what if files are added/removed to/from the directory tree while you scan
- how to react to missing read permissions
- is your regex on the whole path or just on the basename
> learning the hard way that you can have newlines in filenames
Speaking as someone who has been using Unix-like systems since the days of Santa Cruz Operation and Linux on a 50 MB disk partition: not everything that is permissible is auspicious.
And most often that's because the permissions were too lax to begin with. That's what you get when a bunch of pot-smoking hippies (X) draw up your OS specifications.
I avoid using even spaces in file names, for this specific reason.
---
(X) - and I say that in the best way possible, though some may be inclined to disagree.
"...he and brethren were attempting to make a small fortune for themselves by secreting away an item - a printed book - that was easy enough to get in 1640 but near impossible to get in Tristan's age, making it of great worth."
"So you're thieves and chancers," said I approvingly.
"No," he objected.
-- The Rise And Fall Of D.O.D.O. by Neal Stephenson and Nicole Galland
Agreed, but you should also keep in mind that this means extra work. Often the difference is negligible, but if you e.g. strace a very syscall-heavy process and pipe that into grep, all the output strace formats only to be piped over to grep and then discarded can actually hurt performance quite a bit, whereas strace -e avoids that extra work.
This is such a great article - I’ve worked with Linux for more than a decade and never really understood what “setuid root” actually meant, or that “kill” is a builtin in Bash.
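Two quick ways to see both for yourself (paths differ by distro):

    type -a kill                   # bash usually lists both the builtin and /usr/bin/kill
    ls -l "$(command -v passwd)"   # the 's' in -rwsr-xr-x is the setuid-root bit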
> I decided to look everything up and document it here.
You can be a hero, too! I find this inspiring. It's nice to see such an accessible and pragmatic way of making a contribution to the community. My very first thought on seeing that was "I could do that!"
I regret that neither top nor htop shows you estimates of "IO Activity time" like the Windows 10 task manager does - I need to use a separate iostat to observe that.
I found the "IO Activity time", the percentage of time during which IO is in use, to be a really good indicator of IO load at the machine level - neither IO operations per second nor bandwidth tells you much if you're already using up all available IO. Load does not help here either, as the number of processes doing IO influences "load" more.
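The closest thing I know of on Linux is the %util column in iostat's extended output (from sysstat), which is the share of elapsed time the device had I/O in flight:

    iostat -x 1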
To be fair it’s a far worse problem on Windows (although I guess a lot of my Linux machines are running from ram now so maybe I wouldn’t notice if it weren’t.)
Yeah, you need root or sudo to run iotop. But some information is available without elevation (at least on a fresh Ubuntu), like the per-disk output from iostat.
It's either on the Performance tab (per disk) - the big chart is "Active time" - or per process on the Processes tab as a "Disk" column; the color relates to time usage, while the bandwidth is shown as text.
The one thing I don't get about htop is the progress bars... they never seem to behave the way I'd expect them to based on the percentages, and they've got some colour coding I'm not clear on either... surely there is something I'm missing.
>There will be a lot of output. We can grep for the open system call. But that will not really work since strace outputs everything to the standard error (stderr) stream. We can redirect the stderr to the standard output (stdout) stream with 2>&1.
Besides being a great explanation of htop, I like the way this article captures the way I - far from a shell guru - tend to think when putting together a few steps in the terminal. And even then it shows that it pays to read the man page too!
I am convinced that load average on a machine is one of the most misleading statistics you can view. It never means what you think it means and half the time it doesn't even indicate a real problem when it's high.
> One process runs for a bit of time, then it is suspended while the other processes waiting to run take turns running for a while. The time slice is usually a few milliseconds so you don't really notice it that much when your system is not under high load. (It'd be really interesting to find out how long time slices usually are in Linux.)
Isn't this the famous kernel HZ? It was originally 100 (interrupts/second), but nowadays often 250 or 1000:
It’s much more complex than that these days. With the CFS scheduler, a process will run for somewhere between the target latency (basically the size of the slice that N processes waiting to be scheduled are competing for, I think defaulting to 20ms) as the upper bound and the minimum granularity (the smallest slice that may be granted to a process being scheduled, I think defaulting to 4ms) as the lower bound.
This is made more complex by the ability to configure the scheduler with high granularity, including the ability to schedule different processors and process groups with different schedulers (and the rules that govern how the schedulers can then preempt each other).
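On kernels of roughly that era (with scheduler debugging enabled, which most distro kernels have) you can read those defaults directly; values are in nanoseconds, and newer kernels moved or renamed these under /sys/kernel/debug/sched/:

    sysctl kernel.sched_latency_ns kernel.sched_min_granularity_ns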
Awesome article - this HN thread indicates it was originally published in 2016, as mentioned on the front page: "#1 on Hacker News on December 2, 2016".
This is a great explanation of everything in the output of htop and related, but I suggest the author clean up the prose a little bit to make it a bit less conversational and easier to read.