IMO, the format of this book makes sense for the context it was intended for: Stanford's course on optimizations and program analysis, where the problem sets and reading provide theoretical knowledge and the projects -- hacking on the joeq VM/compiler-infrastructure -- provide implementation experience. These distinct experiences make for a well-rounded education in advanced compiling techniques. (Disclaimer: I have not taken this course and am basing these comments on a quick look at their course Web page.)
So reading the Dragon Book is more akin to reading mathematics than reading a handbook on compiler implementation. Its treatment of data-flow analysis is another example of this: While the Dragon Book gives top-notch treatment of the mathematical underpinnings of data-flow analysis (semi-lattices, partial orderings, monotonicity, greatest lower-bounds, etc.), it does not go into how to implement an efficient worklist algorithm; it does not even mention du-chains AFAICT. At most, it suggests using a bit-set to represent the reaching definitions that enter and exit a basic block.
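Since the book stops at bit-sets and doesn't show the worklist machinery, here's a hedged sketch of what it leaves out — a worklist algorithm for reaching definitions using Python ints as bit-sets. The CFG representation (`blocks`, `succ`, `gen`, `kill`) is my own illustrative assumption, not the book's notation:

```python
def reaching_definitions(blocks, succ, gen, kill):
    """Worklist sketch for reaching definitions.
    blocks: list of block ids; succ: id -> list of successor ids;
    gen/kill: id -> bit-set of definition ids, encoded as ints."""
    # Derive predecessors from the successor map.
    pred = {b: [] for b in blocks}
    for b in blocks:
        for s in succ[b]:
            pred[s].append(b)
    IN = {b: 0 for b in blocks}
    OUT = {b: gen[b] for b in blocks}
    worklist = list(blocks)
    while worklist:
        b = worklist.pop()
        IN[b] = 0
        for p in pred[b]:
            IN[b] |= OUT[p]            # meet is union for reaching defs
        new_out = gen[b] | (IN[b] & ~kill[b])
        if new_out != OUT[b]:          # only requeue on change => termination
            OUT[b] = new_out
            worklist.extend(succ[b])
    return IN, OUT
```

The monotonicity the book proves is exactly what guarantees the `new_out != OUT[b]` check eventually stops firing.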
So I too am in agreement. But maybe this is what you want, mapleoin. If you just want to know what a reaching def is or how compiler writers know that their data-flow analysis is going to terminate, this is a book that will not bog you down with too many implementation details.
I think there are more obvious problems with the Dragon Book: in particular, some 300 pages are spent on techniques for lexing and parsing.
Now if I want to write a baby yacc [sic], this might be useful. But in this modern era, parsing is a very well understood problem, with lots of easy-to-use tools. Yes, in a production compiler helpful syntax and type error messages are key, but when you're learning the compiling part, you want a book that doesn't spend half its volume on that topic. Also, the 2nd edition doesn't seem to have a single level of reader in mind.
The intro book people should look at (as mentioned elsewhere) is Appel's intro-to-compilers-in-ML book; for advanced stuff, folks should look at things like Appel's Compiling with Continuations, the Muchnick book, and one or two others.
I think the point is that most folks' exposure to the Dragon Book predates the 2nd edition, and that in your experience most of the learning sounds like it came from the lecture notes and problem sets rather than the text (presumably used as a reference supplement in practice?).
Even if you are writing a production compiler, you will likely not be using a tool, nor writing a tool, to generate the parser. Most commercial production compilers use a hand-written recursive descent parser with a hand-written lexer, which itself potentially uses feedback from the parser (very handy for parsing C or C++). Recursive descent usually gives context for error recovery that matches the way most programmers think about language syntax; and since it is written by hand, the recovery can be tweaked to be as smart as necessary.
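For concreteness, here's a minimal hand-written recursive descent sketch. The grammar (`expr -> term (('+'|'-') term)*`, `term -> NUMBER | '(' expr ')'`) and the error messages are my own illustrative assumptions, not any particular production compiler's:

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr -> term (('+'|'-') term)*
        value = self.term()
        while self.peek() in ('+', '-'):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == '+' else value - rhs
        return value

    def term(self):
        # term -> NUMBER | '(' expr ')'
        tok = self.take()
        if isinstance(tok, int):
            return tok
        if tok == '(':
            value = self.expr()
            if self.take() != ')':
                # Each parse function knows exactly where it is in the
                # grammar, so errors can be phrased in the user's terms.
                raise ParseError("expected ')' to close parenthesized expression")
            return value
        raise ParseError(f"expected a number or '(', got {tok!r}")
```

So `Parser([1, '+', '(', 2, '-', 3, ')']).expr()` evaluates as it parses; the point is that the context for error recovery falls out of the call structure for free.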
Also, IDE / editor support like intellisense can greatly benefit from integration with a recursive descent parser. If you encode the cursor's position as a special token, the parser can handle that token and do a deep return (throw an exception, longjmp, whatever) with relevant context, active scopes, etc.
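The cursor-as-token idea can be sketched roughly like this; the sentinel, the exception-based deep return, and the toy `let`-declaration grammar are all illustrative assumptions:

```python
CURSOR = object()  # sentinel token marking the editor's caret

class CompletionHit(Exception):
    """Deep return carrying the context active at the cursor."""
    def __init__(self, scopes):
        self.scopes = scopes  # names visible where the caret sits

def parse_block(tokens, scopes):
    """Parse a flat block of 'let NAME' declarations; on reaching the
    cursor token, bail out with whatever is currently in scope."""
    it = iter(tokens)
    for tok in it:
        if tok is CURSOR:
            raise CompletionHit(list(scopes))
        if tok == 'let':
            scopes.append(next(it))  # declared name enters scope

def completions(tokens):
    try:
        parse_block(tokens, [])
    except CompletionHit as hit:
        return hit.scopes
    return []
```

In a real IDE the payload would be richer (types, active scopes, expected token classes), but the control flow is the same: the parser does the bookkeeping, and the deep return hands it to the completion engine.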
> Now if I want to write a baby yacc [sic], this might be useful. But in this modern era, parsing is a very well understood problem, with lots of easy-to-use tools.
I certainly don't begrudge you using the existing tools, but speaking as someone writing a "baby yacc", I don't think parsing is quite the solved problem you make it out to be.
Yes, there is TONS of literature on the subject, but new techniques and algorithms are being discovered all the time. ANTLR's LL(*) parsing hasn't been published yet (though I believe he's working on it), and only three years ago Frost, Hafiz and Callaghan published an algorithm for generalized top-down parsing in polynomial time. There's also the idea of PEGs, published by Bryan Ford in 2004, and a guy at UPenn who is carving out a set of languages between regular and push-down languages (http://www.cis.upenn.edu/~alur/nw.html).
All of this is to say: we're still discovering things about parsing. It's not a settled subject.
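For instance, the ordered choice that distinguishes PEGs from CFGs can be sketched in a few lines of toy combinators (my own sketch, not Ford's formulation — parsers here just return the new input position, or None on failure):

```python
def literal(s):
    """Match the exact string s at position i."""
    def parse(text, i):
        return i + len(s) if text.startswith(s, i) else None
    return parse

def choice(*alts):
    """PEG ordered choice: try alternatives in order; first success
    wins and later alternatives are never consulted, so there is no
    ambiguity to resolve (unlike a CFG's unordered alternation)."""
    def parse(text, i):
        for alt in alts:
            j = alt(text, i)
            if j is not None:
                return j
        return None
    return parse

def seq(*parts):
    """Match the parts one after another, threading the position."""
    def parse(text, i):
        for p in parts:
            i = p(text, i)
            if i is None:
                return None
        return i
    return parse
```

Note that `choice(literal('a'), literal('ab'))` matches only `'a'` against the input `'ab'` — ordering is part of the grammar's meaning, which is exactly what makes PEGs a different object of study.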
Ierusalimschy's LPEG (a PEG parser for Lua[1]) also has some new developments. IIRC, he found a way to greatly improve the space performance of PEGs. I've used it a lot (it's a nice middle ground between REs and a full parsing framework, and Lua is one of my favorite languages), but I'm not familiar enough with Ford's PEG implementation to be more specific.
Also, here's a good blog post[2] in which the author discovered how using parsing tools to syntax-highlight text as it's being modified quickly led him to the frontiers of parsing research.
You make a good point; it does spend a lot of time on the front-end. As pointed out elsewhere, Muchnick is more thorough on back-end topics, which it appears we both agree are far more interesting. I've definitely heard of those other books before, and in particular, Appel's _Compiling with Continuations_ will be especially interesting when I look into compiling functional languages.
Actually, I don't have any personal experience with the Stanford course. I am reading the book for fun really. (For all its faults, it is quite engrossing!) I just thought it would be relevant to investigate under what context this book is generally used.