Brian Kernighan: sometimes the old ways are best (computer.org)
42 points by hernan7 on Jan 22, 2009 | 17 comments



I learnt to program in an IDE, and was born and raised in a Microsoft environment. My entire computing experience has been dominated by Microsoft.

But now I'm learning Unix-commands just for the fun and romance of it.

It may sound shallow, but one reason I want to learn Unix is because it's somewhat "mysterious" and all the cool hackers I know use Unix.

And many people I know in the IT world who can get things done often turn to Unix.

Thus, this article reinforces my motivation to learn Unix -- the operating system of our ancestors.

I should note that it doesn't mean I hate Windows. In fact, as I learn Unix commands I am seeing more and more that Windows was born out of Unix and is a "distorted" version of Unix. But Windows feels like home to me. So I'll likely always use both: Windows and Unix.


That does not seem shallow. That seems a very good algorithm for discovering good new things.


That reminds me of this article about why Unix is inherently made for a programmer's style of thought: http://igor-nav.livejournal.com/12843.html


So is this a grep/diff/sort/awk/etc man saying that IDEs tend to suck?

It's nice to hear it from Kernighan, but it's hardly surprising. IDEs do suck.


I think he should have attributed more importance to regular expressions. They don't even get mentioned, beyond being implied by grep.


You can grep for regular expressions? I think my mind just got blown. For the last ten years I've been finding them with awk.

awk '(/putithere/) {print $0}'


http://en.wikipedia.org/wiki/Grep

grep is a command line text search utility originally written for Unix. The name is taken from the first letters in global / regular expression / print, a series of instructions for the ed text editor.

grep -rHnE 'my [r]eg+ex' /path/to/folder/ : searches recursively for this regular expression and prints each matching file's name and line number, as well as the matching line itself.

Awk might do it, or you can -exec it with find, but I've read several times that grep is generally faster, even faster than a simple search with a custom and non-optimized C program.
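A minimal sketch of both approaches mentioned above (the directory layout and search pattern are invented for illustration):

```shell
# Demo setup: a tiny tree with one match.
mkdir -p demo/sub
printf 'alpha\nTODO: fix this\n' > demo/sub/f.txt

# Recursive search with grep itself: file name, line number, matching line.
grep -rHn 'TODO' demo/
# demo/sub/f.txt:2:TODO: fix this

# The same search driven by find, which invokes grep on batches of files;
# reportedly slower in general than letting grep recurse on its own.
find demo/ -type f -exec grep -Hn 'TODO' {} +

rm -rf demo
```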


The re in grep means "regular expression".


The contents as text instead of pdf:

As I write this column, I'm in the middle of two summer projects; with luck, they'll both be finished by the time you read it. One involves a forensic analysis of over 100,000 lines of old C and assembly code from about 1990, and I have to work on Windows XP. The other is a hack to translate code written in weird language L1 into weird language L2 with a program written in scripting language L3, where none of the L's even existed in 1990; this one uses Linux. Thus it's perhaps a bit surprising that I find myself relying on much the same toolset for these very different tasks.

What's Changed

Bill Plauger and I wrote Software Tools in 1975, nine years before IEEE Software began publication. Our title was certainly not the first use of the phrase, but the book did help to popularize the idea of tools and show how a small set of simple text-based tools could make programmers more productive. Our toolset was stolen quite explicitly from Unix models. At the time, Unix was barely known outside a tiny community, and even the idea of thinking consciously about software tools seemed new. In fact, we even wrote our programs in a dialect of Fortran because C was barely three years old at the time and we thought we'd sell more copies if we aimed at Fortran programmers. A lot has changed over the past 25 or 30 years.

Computers are enormously faster and have vastly more memory and disk space, and we can write much bigger and more interesting programs with all that horsepower. Although C is still widely used, programmers today often prefer languages such as Java and Python that spend memory and time to gain expressiveness and safety, which is almost always a good trade.

We develop code differently as well, with powerful integrated development environments (IDEs) such as Visual Studio and Eclipse--complex tools that manage the whole process, showing us all the facets of the code and replacing manuals with online help and syntax completion. Sophisticated frameworks generate boatloads of code for us and glue it all together at the click of a mouse. In principle, we're far better off than we used to be.

But when I program, the tools that I use most often, or that I miss the most when they aren't available, are not the fancy IDEs. They're the old stalwarts from the Software Tools and early Unix era, such as grep, diff, sort, wc, and shells. For example, my forensics work requires comparing two versions of the program. How better to compare them than with diff? There are many hundreds of files, so I use find to walk the directory hierarchy and generate lists of files to work with. I want to repeat some sequence of operations--time for a shell script. And of course there's endless grepping to find all the places where some variable is defined or used. The combination of grep and sort brings together things that should be the same but might not be--for instance, a variable that's declared differently in two files, or all the potentially risky #defines. The language translation project uses much the same core set: diff to compare program outputs, grep to find things, the shell to automate regression testing.
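The workflow described in that passage can be sketched roughly like this (the file trees, file names, and variable are invented stand-ins for the two versions of the 1990 codebase):

```shell
# Toy stand-ins for two versions of a source tree.
mkdir -p v1 v2
printf 'int limit = 10;\n' > v1/a.c
printf 'int limit = 20;\n' > v2/a.c

# diff to compare the two versions, file by file.
diff -r v1 v2 || true

# find to walk the hierarchy and generate lists of files to work with.
find v1 -name '*.c'

# grep + sort to bring together every place a variable is mentioned,
# so declarations that should match but don't sit next to each other.
grep -rn 'limit' v1 v2 | sort

rm -rf v1 v2
```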

What We Want

What do we want from our tools? First and foremost is mechanical advantage: the tool must do some task better than people can, augmenting or replacing our own effort. Grep, which finds patterns of text, is the quintessential example of a good tool: it's dead easy to use, and it searches faster and better than we can. Grep is actually an improvement on many of its successors. I've never figured out how to get Visual Studio or Eclipse to produce a compact list of all the places where a particular string occurs throughout a program. I'm sure experts will be happy to teach me, but that's not much help when the experts are far away or the IDE isn't installed.

That leads to the second criterion for a good tool: it should be available everywhere. It's no help if SuperWhatever for Windows offers some wonderful feature but I'm working on Unix. The other direction is better because Unix command-line tools are readily available everywhere. One of the first things I do on a new Windows machine is install Cygwin so that I can get some work done. The universality of the old faithfuls makes them more useful than more powerful systems that are tied to a specific environment or that are so big and complicated that it just takes too long to get started.

The third criterion for good tools is that they can be used in unexpected ways, the way we use a screwdriver to pry open a paint can and a hammer to close it up again. One of the most compelling advantages of the old Unix collection is that each one does some generic but focused task (searching, sorting, counting, comparing) but can be endlessly combined with others to perform complicated ad hoc operations. The early Unix literature is full of examples of novel shell programs. Of course, the shell itself is a great example of a generic but focused tool: it concentrates on running programs and encapsulating frequent operations in scripts.

It's hard to mix and match programs unless they share some uniform representation of information. In the good old days, that was plain ASCII text, not proprietary binary formats. Naturally, there are also tools to convert nontext representations into text. For my forensics work, one of the most useful is strings, which finds the ASCII text within a binary file. The combination of strings and grep often gives real insight into the contents of some otherwise-inscrutable file, and, if all else fails, od produces a readable view of the raw bits that can even be grepped.
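A small illustration of that strings/grep/od combination, using an invented binary file:

```shell
# Make a small file mixing raw bytes and a run of readable text.
printf '\000\001\002version=2.4\000\377' > blob.bin

# strings pulls out the runs of printable text...
strings blob.bin
# version=2.4

# ...and grep narrows them down to the part you're after.
strings blob.bin | grep 'version'

# When all else fails, od shows the raw bytes in a readable form.
od -c blob.bin

rm blob.bin
```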

A fourth criterion for a good tool is that it not be too specialized--put another way, that it not know too much. IDEs know that you're writing a program in a specific language, so they won't help if you're not; indeed, it can be a real chore to force some nonstandard component into one, like a Yacc grammar as part of a C program. Lest it seem like I'm only complaining about big environments here, even old tools can be messed up. Consider wc, which counts lines, words, and characters. It does a fine job on vanilla text, and it's valuable for a quick assessment of any arbitrary file (an unplanned-for use). But the Linux version of wc has been "improved": by default it thinks it's really counting words in Unicode. So if the input is a nontext file, Linux wc complains about every byte that looks like a broken Unicode character, and it runs like a turtle as a result. A great tool has been damaged because it thinks it knows what you're doing. You can remedy that behavior with the right incantation, but only if a wizard is nearby.
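He doesn't name the incantation; one common remedy (an assumption on my part, not something the article states) is to force the C locale so wc stops interpreting multibyte characters:

```shell
# Make a nontext file to feed to wc (contents are random bytes).
head -c 1024 /dev/urandom > junk.bin

# In a UTF-8 locale, wc may try to interpret the bytes as multibyte
# characters, which is slower and can provoke the complaints he describes.
wc junk.bin

# Forcing the C locale makes wc treat the input as plain bytes again.
LC_ALL=C wc junk.bin

rm junk.bin
```

(Using wc -c, which only counts bytes, sidesteps the issue as well.)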

There has surely been much progress in tools over the 25 years that IEEE Software has been around, and I wouldn't want to go back in time. But the tools I use today are mostly the same old ones--grep, diff, sort, awk, and friends. This might well mean that I'm a dinosaur stuck in the past. On the other hand, when it comes to doing simple things quickly, I can often have the job done while experts are still waiting for their IDE to start up. Sometimes the old ways are best, and they're certainly worth knowing well.


It really does sound like he doesn't want to learn new things.

"I've never figured out how to get Visual Studio or Eclipse to produce a compact list of all the places where a particular string occurs throughout a program."?

Come on. [in eclipse/variants] CTRL-H? Search menu for the blind?

The rest of the points are pretty general: utility, availability, generality. This covers most good IDEs if you ask me.

What's needed is a study rather than an opinion. Compare two or more groups of experts, one in emacs/vi + shell and the other in Eclipse/Visual Studio, using one common language. Look at the speed of completing basic tasks common to programming.

I would hypothesize that while differences among groups exist, no difference between means would be larger than a Cohen effect size of 0.5.


Question: What about when you need to do something that the IDE doesn't have built in?

(I use Emacs. Many years ago (mid 90s) I used the IDE that came with Turbo C++, but haven't used Visual Studio or Eclipse, and am genuinely curious how they handle extensibility.)


Eclipse has an insane amount of plugins to do extra things, and writing your own is pretty simple if you really need to do it.

People tend to forget though that you have shell access as well. You aren't somehow limited to just the ide!


They don't handle it well. I haven't used Visual Studio in a few years, but Eclipse has plugins that you can get off eclipse-plugins.org. Unfortunately, they are not all free :-( Emacs is certainly superior for most everything.


Honestly, I don't see it. It's been a while since I used emacs for development, but the entire reason I moved from emacs to eclipse was because I had at least a 10x improvement in productivity working with large [java] code bases.

Has jdee come a long way since? I just checked and it doesn't look like it.


As far as Java on emacs, I can't exactly comment. Java is my one main exception to emacs. It's not that I dislike it on emacs, I just have never tried it there. I learned on Eclipse and it worked for what I needed (although Eclipse is horribly slow...).


Somehow I expected a bigger beard.


He had, but his Beard Mass Index (BMI) was heading towards "clinically obsolete", so he's trying to cut down.

;)



