Hacker News new | past | comments | ask | show | jobs | submit login

Bottom up parsing - "If not, it's necessary to backtrack and try combining tokens in different ways" I feel the way it is put along with shift reduce parsing is misleading. Backtracking is essentially an aspect avoided (more like solved) by shift-reduce parsing. They don't go together in bottom up parsing. Shift reduce parsers posses the potential to predict the handle to use by looking at the contents on top of the stack.

Good job BTW, there are very few people who write about compiler/language theory :)




For unrestricted CFG parsing, shift-reduce parsing does require backtracking. A modern parsing algorithm like GLR uses LR tables but has to resort to backtracking when it encounters shift-reduce conflicts (it avoids the exponential blow-up of backtracking by using dynamic programming to share results for subproblems).


Like in my other comment, I agree with this. I missed mentioning mentioning my assumption of a forceful shift or reduce action in case of an ambiguous grammar.


Not a fan of handling it that way. Shift-reduce conflicts don't correspond to people's intuitive understanding of the grammar because good shift-reduce parser tables are highly non-local as they're constructed from fixed-point dataflow iteration over the whole grammar. It's not like top-down parsing where it's easy to understand the compromise of making a non-backtrackable decision based on k-token lookahead. If you force them to manually resolve shift-reduce conflicts in the grammar, it's likely they will be surprised by what happens, and not in a good way.


Yes, it definitely isn't the right way to handle ambiguous grammars. My assumption was caused by extensive usage of YACC for writing parsers in the past. To avoid it's default action, generally ambiguous grammars are avoided (since it is used only to write programming language parsers anyway)


You need backtracking if you have an ambiguous grammar. This comes up in natural language parsing; I'd guess that it is avoided in programming languages.


Actually, many languages which started with hand written parsers do have ambiguous grammars, or have an unambiguous grammar so hideously unwieldy it is best ignored.

This sort of thing can be fixed up with predicates added to the grammar but it always feels like a bodge.


I totally agree with this. I was assuming the case of a default behavior like shift or reduce force fully for unambiguous grammars, which I should have mentioned. Thanks.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: