IMO, the format of this book makes sense for the context it was intended for: Stanford's course on optimizations and program analysis, where the problem sets and reading provide theoretical knowledge and the projects -- hacking on the joeq VM/compiler-infrastructure -- provide implementation experience. These distinct experiences make for a well-rounded education in advanced compiling techniques. (Disclaimer: I have not taken this course and am basing these comments on a quick look at their course Web page.)
So reading the Dragon Book is more akin to reading mathematics than reading a handbook on compiler implementation. Its treatment of data-flow analysis is another example of this: While the Dragon Book gives top-notch treatment of the mathematical underpinnings of data-flow analysis (semi-lattices, partial orderings, monotonicity, greatest lower-bounds, etc.), it does not go into how to implement an efficient worklist algorithm; it does not even mention du-chains AFAICT. At most, it suggests using a bit-set to represent the reaching definitions that enter and exit a basic block.
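Since the book stops at bit-sets and doesn't show the worklist machinery, here's a hedged sketch of what it leaves out — a worklist algorithm for reaching definitions using Python ints as bit-sets. The CFG representation (`blocks`, `succ`, `gen`, `kill`) is my own illustrative assumption, not the book's notation:

```python
def reaching_definitions(blocks, succ, gen, kill):
    """Worklist sketch for reaching definitions.
    blocks: list of block ids; succ: id -> list of successor ids;
    gen/kill: id -> bit-set of definition ids, encoded as ints."""
    # Derive predecessors from the successor map.
    pred = {b: [] for b in blocks}
    for b in blocks:
        for s in succ[b]:
            pred[s].append(b)
    IN = {b: 0 for b in blocks}
    OUT = {b: gen[b] for b in blocks}
    worklist = list(blocks)
    while worklist:
        b = worklist.pop()
        IN[b] = 0
        for p in pred[b]:
            IN[b] |= OUT[p]            # meet is union for reaching defs
        new_out = gen[b] | (IN[b] & ~kill[b])
        if new_out != OUT[b]:          # only requeue on change => termination
            OUT[b] = new_out
            worklist.extend(succ[b])
    return IN, OUT
```

The monotonicity the book proves is exactly what guarantees the `new_out != OUT[b]` check eventually stops firing.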
So I too am in agreement. But maybe this is what you want, mapleoin. If you just want to know what a reaching def is or how compiler writers know that their data-flow analysis is going to terminate, this is a book that will not bog you down with too many implementation details.
I think there are more obvious problems with the Dragon Book: in particular, some 300 pages are spent on techniques for lexing and parsing.
Now if I want to write a baby yacc [sic], this might be useful. But in this modern era, parsing is a very well understood problem, with lots of easy-to-use tools. Yes, in a production compiler helpful syntax and type error messages are key, but when you're learning the compiling part, you want a book that doesn't spend half its volume on that topic. Also, the 2nd edition doesn't seem to have a single level of reader in mind.
The intro book people should look at (as mentioned elsewhere) is Appel's intro-to-compilers-in-ML book; for advanced stuff, folks should look at things like Appel's Compiling with Continuations, the Muchnick book, and one or two others.
I think the point is that most folks' exposure to the Dragon Book predates the 2nd edition, and that in your experience most of the learning sounds like it came from the lecture notes and problem sets rather than the text (presumably used as a reference supplement in practice?).
Even if you are writing a production compiler, you will likely not be using a tool, nor writing a tool, to generate the parser. Most commercial production compilers use a hand-written recursive descent parser with a hand-written lexer, which itself potentially uses feedback from the parser (very handy for parsing C or C++). Recursive descent usually gives context for error recovery that matches the way most programmers think about language syntax; and since it is written by hand, the recovery can be tweaked to be as smart as necessary.
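For concreteness, here's a minimal hand-written recursive descent sketch. The grammar (`expr -> term (('+'|'-') term)*`, `term -> NUMBER | '(' expr ')'`) and the error messages are my own illustrative assumptions, not any particular production compiler's:

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr -> term (('+'|'-') term)*
        value = self.term()
        while self.peek() in ('+', '-'):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == '+' else value - rhs
        return value

    def term(self):
        # term -> NUMBER | '(' expr ')'
        tok = self.take()
        if isinstance(tok, int):
            return tok
        if tok == '(':
            value = self.expr()
            if self.take() != ')':
                # Each parse function knows exactly where it is in the
                # grammar, so errors can be phrased in the user's terms.
                raise ParseError("expected ')' to close parenthesized expression")
            return value
        raise ParseError(f"expected a number or '(', got {tok!r}")
```

So `Parser([1, '+', '(', 2, '-', 3, ')']).expr()` evaluates as it parses; the point is that the context for error recovery falls out of the call structure for free.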
Also, IDE / editor support like intellisense can greatly benefit from integration with a recursive descent parser. If you encode the cursor's position as a special token, the parser can handle that token and do a deep return (throw an exception, longjmp, whatever) with relevant context, active scopes, etc.
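The cursor-as-token idea can be sketched roughly like this; the sentinel, the exception-based deep return, and the toy `let`-declaration grammar are all illustrative assumptions:

```python
CURSOR = object()  # sentinel token marking the editor's caret

class CompletionHit(Exception):
    """Deep return carrying the context active at the cursor."""
    def __init__(self, scopes):
        self.scopes = scopes  # names visible where the caret sits

def parse_block(tokens, scopes):
    """Parse a flat block of 'let NAME' declarations; on reaching the
    cursor token, bail out with whatever is currently in scope."""
    it = iter(tokens)
    for tok in it:
        if tok is CURSOR:
            raise CompletionHit(list(scopes))
        if tok == 'let':
            scopes.append(next(it))  # declared name enters scope

def completions(tokens):
    try:
        parse_block(tokens, [])
    except CompletionHit as hit:
        return hit.scopes
    return []
```

In a real IDE the payload would be richer (types, active scopes, expected token classes), but the control flow is the same: the parser does the bookkeeping, and the deep return hands it to the completion engine.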
> Now if I want to write a baby yacc [sic], this might be useful. But in this modern era, parsing is a very well understood problem, with lots of easy-to-use tools.
I certainly don't begrudge you using the existing tools, but speaking as someone writing a "baby yacc", I don't think parsing is quite the solved problem you make it out to be.
Yes, there is TONS of literature on the subject, but new techniques and algorithms are being discovered all the time. ANTLR's LL(*) parsing hasn't been published yet (though I believe he's working on it), and only three years ago Frost, Hafiz and Callaghan published an algorithm for generalized top-down parsing in polynomial time. There's also the idea of PEGs, published by Bryan Ford in 2004, and a guy at UPenn who is carving out a set of languages between regular and push-down languages (http://www.cis.upenn.edu/~alur/nw.html).
All of this is to say: we're still discovering things about parsing. It's not a settled subject.
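For instance, the ordered choice that distinguishes PEGs from CFGs can be sketched in a few lines of toy combinators (my own sketch, not Ford's formulation — parsers here just return the new input position, or None on failure):

```python
def literal(s):
    """Match the exact string s at position i."""
    def parse(text, i):
        return i + len(s) if text.startswith(s, i) else None
    return parse

def choice(*alts):
    """PEG ordered choice: try alternatives in order; first success
    wins and later alternatives are never consulted, so there is no
    ambiguity to resolve (unlike a CFG's unordered alternation)."""
    def parse(text, i):
        for alt in alts:
            j = alt(text, i)
            if j is not None:
                return j
        return None
    return parse

def seq(*parts):
    """Match the parts one after another, threading the position."""
    def parse(text, i):
        for p in parts:
            i = p(text, i)
            if i is None:
                return None
        return i
    return parse
```

Note that `choice(literal('a'), literal('ab'))` matches only `'a'` against the input `'ab'` — ordering is part of the grammar's meaning, which is exactly what makes PEGs a different object of study.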
Ierusalimschy's LPEG (a PEG parser for Lua[1]) also has some new developments. IIRC, he found a way to greatly improve the space performance of PEGs. I've used it a lot (it's a nice middle ground between REs and a full parsing framework, and Lua is one of my favorite languages), but I'm not familiar enough with Ford's PEG implementation to be more specific.
Also, here's a good blog post[2] in which the author discovered how using parsing tools to syntax-highlight text as it's being modified quickly led him to the frontiers of parsing research.
You make a good point; it does spend a lot of time on the front-end. As pointed out elsewhere, Muchnick is more thorough on back-end topics, which it appears we both agree are far more interesting. I've definitely heard of those other books before, and in particular, Appel's _Compiling with Continuations_ will be especially interesting when I look into compiling functional languages.
Actually, I don't have any personal experience with the Stanford course. I am reading the book for fun really. (For all its faults, it is quite engrossing!) I just thought it would be relevant to investigate under what context this book is generally used.