In my experience, the parser definitely needs to be a library, but I'm not sure ...

ddevault · on Dec 29, 2018

Thanks for sharing your insights!

>There are a few places in POSIX where the parser has to be invoked recursively

In your examples, we have the runtime invoke the parser as necessary. The parser doesn't know about the runtime. For alias resolution, we have a callback function, which hooks into the runtime but is pretty thin and abstract.
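
To make the callback idea concrete, here is a minimal sketch in Python rather than mrsh's C. It is not mrsh's actual API: `parse_simple_command`, `AliasLookup`, and the whitespace-only word splitting are all assumptions for illustration. The point is only that the parser sees a function, not the runtime's alias table.

```python
from typing import Callable, List, Optional

# Hypothetical callback type: given a word, return its alias text or None.
AliasLookup = Callable[[str], Optional[str]]


def parse_simple_command(line: str, lookup_alias: AliasLookup) -> List[str]:
    """Split a line into words, expanding aliases in command-name position.

    The parser only knows about the callback; where the aliases are
    stored is entirely the caller's business.
    """
    words = line.split()  # assumption: plain whitespace splitting, no quoting
    seen = set()  # names already expanded, so alias ls='ls --color' terminates
    while words and words[0] not in seen:
        replacement = lookup_alias(words[0])
        if replacement is None:
            break
        seen.add(words[0])
        words = replacement.split() + words[1:]
    return words


# The "runtime" side: a plain dict stands in for the shell's alias table.
aliases = {"ll": "ls -l", "ls": "ls --color"}
print(parse_simple_command("ll /tmp", aliases.get))
# → ['ls', '--color', '-l', '/tmp']
```

Passing `aliases.get` keeps the hook thin: a caller with no aliases at all can pass `lambda name: None` and the parser behaves the same.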

>For history expansion

Thankfully, this is non-POSIX so mrsh doesn't have to worry about it.

>one problem is that the shell inherently modifies global process state [...] i.e. my definition of library is that you can instantiate multiple instances of it with different parameters

My definition doesn't line up with yours. My definition is a shared object or static archive and a bunch of headers with an API you can link to instead of implementing something yourself.

chubot · on Dec 29, 2018

Are you parsing command subs at runtime too? Bash does that [1], but I believe it's a bad idea. dash, mksh, and zsh seem to do it "the right way", although none of them statically parses as much as OSH.

IIRC a case that really seals the deal is:

    $ echo $(case x in x) echo foo;; esac)
    foo

How do you find the closing paren? You basically have to parse shell, so you might as well do that at parse time rather than runtime. There's a section in the aosabook bash chapter that talks about that.
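
To illustrate why counting parentheses is not enough, here is a small Python sketch. `naive_close_paren` is a hypothetical helper written for this example; it is not how bash or any other shell is implemented. It balances `(` against `)` and knows nothing about shell grammar.

```python
def naive_close_paren(src: str, start: int) -> int:
    """Return the index of the ')' that balances the '(' at src[start].

    Deliberately too simple: it only counts parentheses and does not
    know that ')' can also end a pattern inside a case clause.
    """
    depth = 0
    for i in range(start, len(src)):
        if src[i] == "(":
            depth += 1
        elif src[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1  # no balancing ')' found


src = "echo $(case x in x) echo foo;; esac)"
open_idx = src.index("$(") + 1  # position of the '(' that opens the command sub
close_idx = naive_close_paren(src, open_idx)

# What the naive scan believes is the body of the command substitution:
print(src[open_idx + 1:close_idx])
# → case x in x
```

The scan stops at the `)` that ends the `x)` pattern, so the rest of the case clause is cut off. Telling that paren apart from the real closing one requires knowing you are inside a `case` item, which is shell parsing.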

In other words, bash has had parsing bugs with PAREN MATCHING for 20 years (I have a case in my suite that was fixed between bash 4.3 and 4.4). If you just parse statically, you can get it right on the first try.

It can get arbitrarily complicated. You can add a subshell and nested command subs in there too, etc.:

    $ echo $( ( case x in $(echo x)) echo foo;; esac) )
    foo

Bash syntax makes it worse, but this problem appears in POSIX sh too.

[1] http://www.oilshell.org/blog/2016/10/13.html

ddevault · on Dec 29, 2018

    ~/s/m/build > cat test.sh
    #!/bin/sh
    echo $(case x in x) echo foo;; esac)
    ~/s/m/build > mrsh -n test.sh
    program
    program
    └─command_list ─ pipeline
      └─simple_command
        ├─name ─ word_string [2:1 → 2:5] echo
        └─argument 1 ─ word_command ─ program
          └─command_list ─ pipeline
            └─case_clause
              ├─word ─ word_string [2:13 → 2:14] x
              └─items
                └─case_item
                  ├─patterns
                  │ └─word_string [2:18 → 2:19] x
                  └─body
                    └─command_list ─ pipeline
                      └─simple_command
                        ├─name ─ word_string [2:21 → 2:25] echo
                        └─argument 1 ─ word_string [2:26 → 2:29] foo

chubot · on Dec 29, 2018

OK, it looks like mrsh is parsing command subs at parse time, which is good! bash doesn't do that.