
I think more programmers should learn how to write parsers by hand. You need them everywhere: for protocols, file formats, user input, little languages. Programmers often reach for regular expressions or other crutches when a parser would be the better choice.



Sometimes, when the problem space is simple enough, you can get the best of both worlds by using regexes for tokenization in the lexer and then passing the tokens to a traditional stack-based parser. This approach can perform badly, but that depends on the use case.
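To make the regex-lexer-plus-stack-parser idea concrete, here is a minimal sketch in Python. The arithmetic grammar, the names `tokenize` and `evaluate`, and the choice of an operator-precedence (shunting-yard style) parser are all assumptions for illustration; it handles only numbers, `+ - * /`, and parentheses, with no unary minus.

```python
import operator
import re

# One regex does all the lexing: optional whitespace, then a number or an operator.
TOKEN_RE = re.compile(r"\s*(?:(?P<num>\d+(?:\.\d+)?)|(?P<op>[-+*/()]))")
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}


def tokenize(text):
    """Yield (kind, value) pairs; raise on input the regex cannot match."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = match.end()
        if match.lastgroup == "num":
            yield "num", float(match.group("num"))
        else:
            yield "op", match.group("op")


def apply_top(values, ops):
    """Pop one operator and two operands, push the result."""
    if len(values) < 2:
        raise SyntaxError("operator is missing an operand")
    op = ops.pop()
    right = values.pop()
    left = values.pop()
    values.append(APPLY[op](left, right))


def evaluate(text):
    """Evaluate an arithmetic expression using two explicit stacks."""
    values, ops = [], []
    for kind, tok in tokenize(text):
        if kind == "num":
            values.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops and ops[-1] != "(":
                apply_top(values, ops)
            if not ops:
                raise SyntaxError("unbalanced ')'")
            ops.pop()  # discard the matching "("
        else:
            # Reduce while the stacked operator binds at least as tightly
            # (this gives left associativity for equal precedence).
            while ops and ops[-1] != "(" and PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]:
                apply_top(values, ops)
            ops.append(tok)
    while ops:
        if ops[-1] == "(":
            raise SyntaxError("unbalanced '('")
        apply_top(values, ops)
    if len(values) != 1:
        raise SyntaxError("malformed expression")
    return values[0]
```

Calling `evaluate("(1 + 2) * 3")` tokenizes with the regex and reduces on the two stacks. The split keeps the regex responsible only for recognizing flat tokens, while nesting and precedence live in the parser.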

Lex, the lexer generator traditionally paired with yacc, is indeed regex based and, as I also recently discovered, was originally co-written by a little known developer by the name of Eric Schmidt [1].

[1] https://en.m.wikipedia.org/wiki/Lex_(software)


What would be good resources for this?


Start by parsing a simple language like JSON or S-expressions, and do it in a language you already know. There are a ton of tutorials online along these lines!
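As a rough idea of how small such a first project can be, here is a hand-written recursive descent parser for S-expressions in Python. The function names and the decision to represent lists as Python lists, integers as `int`, and everything else as strings are assumptions made for this sketch; it has no string literals, quoting, or comments.

```python
def parse_sexpr(text):
    """Parse exactly one S-expression into nested lists, ints, and strings."""
    value, pos = _parse_at(text, _skip_ws(text, 0))
    if _skip_ws(text, pos) != len(text):
        raise SyntaxError(f"trailing input after offset {pos}")
    return value


def _skip_ws(text, pos):
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def _parse_at(text, pos):
    """Parse one expression starting at pos; return (value, next_pos)."""
    if pos >= len(text):
        raise SyntaxError("unexpected end of input")
    if text[pos] == "(":
        items = []
        pos = _skip_ws(text, pos + 1)
        # A list is a sequence of expressions up to the closing parenthesis.
        while pos < len(text) and text[pos] != ")":
            item, pos = _parse_at(text, pos)
            items.append(item)
            pos = _skip_ws(text, pos)
        if pos >= len(text):
            raise SyntaxError("missing ')'")
        return items, pos + 1
    if text[pos] == ")":
        raise SyntaxError(f"unexpected ')' at offset {pos}")
    # Atom: a run of characters up to whitespace or a parenthesis.
    start = pos
    while pos < len(text) and not text[pos].isspace() and text[pos] not in "()":
        pos += 1
    atom = text[start:pos]
    try:
        return int(atom), pos
    except ValueError:
        return atom, pos
```

For example, `parse_sexpr("(define (square x) (* x x))")` gives a nested list whose structure mirrors the parentheses. The recursion in `_parse_at` is what handles arbitrary nesting.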


Course notes for University of Waterloo CS241 "Foundations of sequential programs"

Part of the course is understanding exactly when regexps are good enough and when you need something more powerful.

http://anthony-zhang.me/University-Notes/CS241/CS241.html
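One standard illustration of that boundary (my own example, not taken from the course notes) is nesting: a classical regular expression can describe a fixed depth of parentheses, but checking arbitrarily deep balanced parentheses needs at least a counter. A small Python sketch, with the hypothetical names `FLAT_CALL` and `is_balanced`:

```python
import re

# Regular: matches a call with exactly one level of parentheses.
FLAT_CALL = re.compile(r"\w+\([^()]*\)")


def is_balanced(text):
    """Check parenthesis balance with a depth counter (beyond a plain regex)."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ")" appeared before its "("
                return False
    return depth == 0
```

`FLAT_CALL.fullmatch("f(x, y)")` succeeds, while `FLAT_CALL.fullmatch("f(g(x))")` does not. `is_balanced` accepts any nesting depth because the counter plays the role of a (very small) stack.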


As @rofrol already said, definitely Crafting Interpreters.

I'd just like to add a link: https://craftinginterpreters.com/contents.html

It explains parsing and other topics in a really clear and accessible way.


Crafting Interpreters?


Do you know Haskell? If not, I suggest you get accustomed to the language and then read about monadic parsing [1] through Graham Hutton's work. Graham is a famous CS professor at the University of Nottingham, appears often on Computerphile [3,4], and wrote a book on Haskell [2].

I had to write an interpreter, optimizer and engine for a declarative language plus a bottom-up knowledge base in Haskell as part of an assignment and an exam in a graduate course on advanced programming. Haskell made the problem significantly easier compared to languages I am much more comfortable with, like Python or C.
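For readers who don't know Haskell yet, the flavor of monadic parsing can be sketched loosely in Python. This is an illustration under assumptions, not a translation of the paper in [1]: a parser here is a function from a string to either `(value, rest)` or `None` on failure, and all names (`unit`, `bind`, `sat`, `many`, and so on) are chosen for this sketch.

```python
DIGITS = "0123456789"


def unit(value):
    """Parser that succeeds with value and consumes nothing."""
    return lambda text: (value, text)


def fail(text):
    """Parser that always fails."""
    return None


def item(text):
    """Parser that consumes one character, failing on empty input."""
    return (text[0], text[1:]) if text else None


def bind(parser, func):
    """Run parser, then feed its result to func to get the next parser."""
    def run(text):
        result = parser(text)
        if result is None:
            return None
        value, rest = result
        return func(value)(rest)
    return run


def sat(pred):
    """Parser for one character satisfying pred."""
    return bind(item, lambda ch: unit(ch) if pred(ch) else fail)


def many(parser):
    """Zero or more repetitions (parser must consume input when it succeeds)."""
    def run(text):
        values = []
        while True:
            result = parser(text)
            if result is None:
                return values, text
            value, text = result
            values.append(value)
    return run


def many1(parser):
    """One or more repetitions."""
    return bind(parser, lambda first:
                bind(many(parser), lambda rest: unit([first] + rest)))


def char(c):
    return sat(lambda ch: ch == c)


# number ::= digit+        total ::= number ("+" number)*
number = bind(many1(sat(lambda ch: ch in DIGITS)),
              lambda digits: unit(int("".join(digits))))
plus_number = bind(char("+"), lambda _: number)
total = bind(number, lambda first:
             bind(many(plus_number), lambda rest: unit(first + sum(rest))))
```

Here `total("1+2+3!")` returns the sum together with the unconsumed `"!"`. The point of `bind` is that sequencing and failure handling are written once, so each grammar rule reads almost like the grammar itself; Haskell's `do` notation makes the nested lambdas disappear.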

[1] www.cs.nott.ac.uk/~pszgmh/pearl.pdf

[2] https://www.amazon.com/Programming-Haskell-Graham-Hutton/dp/...

[3] https://www.youtube.com/channel/UC9-y-6csu5WGm29I7JiwpnA

[4] https://www.youtube.com/watch?v=eis11j_iGMs


Okay, this is really perplexing: this reply was at +5 and is now at -1. I don't understand why, and I'd appreciate any explanation.



