Szl, a Tcl-inspired embeddable language

no_protocol · on Oct 8, 2016

This 3-letter name is already taken [0]; it might be worth at least considering a different one.

It looks like this probably qualifies as a Show HN as well, if you can edit the title.

What are your future plans for the project? Are you using it regularly?

thesmallestcat · on Oct 8, 2016

If Google can reuse names like "Go" and "Dart," then their "szl" is fair game for reuse.

spotman · on Oct 8, 2016

Fair game sure, but unfortunately confusing. If there was another dart that was an actual programming language ( Is there ? ) you would have to qualify every time which dart programming language you meant. I imagine searching for things online would be tricky.

munificent · on Oct 9, 2016

> If there was another dart that was an actual programming language ( Is there ? )

I don't believe so. We changed the name to "Dart" from "Dash" because the lawyers did a trademark search and felt the former was a safer bet.

thesmallestcat · on Oct 10, 2016

szl is the name of the compiler, not the language, but point taken.

qznc · on Oct 8, 2016

The language manual: http://dimakrasner.com/szl

no_protocol · on Oct 8, 2016

The writing in here looks pretty clean. My one suggestion would be to get some more examples in there. Most of the descriptions are one line and as a total outsider I just wanted to see some in action.

This document is much more appealing to me than the project's README. Since people often browse just the README when visiting a repository on GitHub, I wonder if you would be better off getting some of the language from sections 1 and 2 into the top of your README. The current README Overview just didn't do it for me. As soon as I saw the first two sections of the user manual, I was more interested because it showed what problem you were solving and how.

> Warning - This manual may reflect a future state of szl.

Got a chuckle out of me.

ufo · on Oct 8, 2016

Why does every command start with a "$"?

dimkr1 · on Oct 8, 2016

Because everything is an object, including built-in functions

nowayyeah · on Oct 8, 2016

Then why the $, you already know that it's an object.

dimkr1 · on Oct 8, 2016

Nope. The first token in a statement is treated the same way as those that come after it. This way, you can do stuff like this (e.g. imagine a dictionary that maps HTTP methods to handler functions):

    >>> $local functions [$list.new $puts $sleep]
    puts sleep
    >>> [$list.index $functions 0] hello
    hello

ufo · on Oct 8, 2016

I think it might be better to special case the first word in the command, just like sh and tcl do. The list.index case you mentioned is much rarer than just calling a command directly.

qznc · on Oct 8, 2016

I cannot find the actual differences to Tcl?

dimkr1 · on Oct 8, 2016

It isn't Tcl, but a language inspired by Tcl. I think the manual tells the difference, once you see the more restricted syntax.

catuskoti · on Oct 8, 2016

Fundamentally it is more like tcl (function chaining) than like bash (process chaining), where process pipelining is the central operation, and where functions and programs are equivalent. In that sense, it has mostly the advantages but also the inconveniences of tcl, javascript and similar non-shell scripting languages. My opinion is that they have quite missed the ball in bash on non-nestable arrays when they introduced them quite recently, but besides that, bash is still pretty much my favourite tool.

chubot · on Oct 8, 2016

Yes, in writing my bash parser/interpreter (see below) I explored the non-nested arrays and assoc arrays issue.

I think it mainly has to do with the fact that bash has no references and garbage collection. When you create an array like [1, 2, [a, b]] in Python/Ruby/Perl, you are introducing the concept of references, as well as introducing the potential for reference cycles.

In contrast, an array in bash is just a value, and it took me a while to figure out how to even copy it. None of these work:

    a=('x y' z)

    b=${a}
    b=${a[*]}
    b=${a[@]}

    b="${a[*]}"
    b="${a[@]}"

    b=( ${a} )
    b=( ${a[*]} )
    b=( ${a[@]} )

    b=( "${a}" )
    b=( "${a[*]}" )

This works:

    b=( "${a[@]}" )

In other words, it takes 11 characters to properly copy an array! This syntax is horrible.
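To make the difference concrete, here is a small bash sketch (it needs bash, not plain sh) that counts the elements produced by three of the forms above. The variable names b, c and d are just for illustration.

```shell
# Source array: the first element contains a space.
a=('x y' z)

b=( ${a[@]} )      # unquoted: 'x y' is word-split, giving x, y, z
c=( "${a[*]}" )    # quoted [*]: everything is joined into one string
d=( "${a[@]}" )    # quoted [@]: one word per original element

echo "unquoted [@]: ${#b[@]}"   # prints 3
echo "quoted [*]:   ${#c[@]}"   # prints 1
echo "quoted [@]:   ${#d[@]}"   # prints 2, same as the original
```

Only the last form gives back two elements with "x y" intact.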

jstimpfle · on Oct 9, 2016

What's more horrible about it is that the simpler sh constructs are somehow not respected in bash culture.

In sh, each function has a local array, and you use that array with "$@". And typically you can get along with only that array. Not so bad.

    my_b() {
        somefunc foo "$@" bar
    }
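As a rough sketch of how far the positional parameters go as an array in plain sh: `set --` rebuilds the list in place, and "$@" passes it on without re-splitting. The function names wrap and count_args are made up for this example.

```shell
# Print how many arguments were received.
count_args() {
    echo "$#"
}

wrap() {
    # Rebuild the function's local argument list: prepend foo, append bar.
    # "$@" keeps each original word intact, spaces included.
    set -- foo "$@" bar
    count_args "$@"
}

n=$(wrap 'x y' z)
echo "$n"   # prints 4: foo, 'x y', z, bar
```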

_qc3o · on Oct 8, 2016

I think someone needs to write "Bash: The Good Parts". It is easy to get into a mess because bash wasn't really designed with modular programming in mind. I very often fall back to using files and exported environment variables to pass information between scripts, which isn't that great. I still prefer it to Ruby or Python or some other scripting language because, as you said, pipes are very convenient.

chubot · on Oct 8, 2016

I'm actually working on that... My viewpoint is that sh is semantically a nice and useful language, but it has horrible syntax and bad/old implementations (I've been looking at bash, dash, mksh, and to some degree zsh).

I'm polishing up a very complete shell parser and a less complete interpreter in 10,000 lines of Python. Most of the problem is parsing -- executing is easy once you've done that. It's actually closer to the bash superset than POSIX sh, so it will be able to run many real programs.

I started with 3,000 lines of C++, but felt like I would never finish with that iteration speed. Now that I know the language inside and out, I can port it back to C++. There are 3,000 lines of tests in a custom test framework which can run against any sh implementation.

I have a few blog posts planned ... if anyone is interested in a link send me an e-mail with subject line "new shell: me@example.com" (address in profile)

I discovered a lot of interesting things about the language and its implementations, like:

- It's actually 4 languages/parsers interleaved, which I call: command, word, arithmetic, boolean (boolean is [[ which is the "compile time" version of [). Then there are some tiny sublanguages like glob and X{a,b}Y brace expansion.

- It can be parsed pretty completely up front, in one pass. Current implementations tend to defer parsing of the stuff inside ${a} $( ) and $(( )) etc. for subsequent passes. They interleave parsing and execution.

But there's only one situation where parsing really depends on execution -- indexing of assoc arrays vs arrays (${a[X]} vs ${A[X]}). This is in contrast to Make and Perl, which both interleave parsing and execution.

- places where the POSIX spec is stricter than implementations. All implementations accept 'echo 1 >out.txt 2 3' in addition to 'echo 1 2 3 >out.txt', but the POSIX grammar doesn't allow this.

- places where the POSIX grammar is more lenient than bash. The POSIX grammar allows function defs with a single statement, without {}, like:

    $ dash -c 'f() echo hi; f'
    hi

But bash doesn't allow this.
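The redirect placement from the list above is easy to check in whatever shell you have. This is only a sketch: the temp directory and the file names mid.txt and end.txt are arbitrary, and it assumes mktemp is available.

```shell
# Scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d)

echo 1 >"$tmp/mid.txt" 2 3     # redirect in the middle of the arguments
echo 1 2 3 >"$tmp/end.txt"     # redirect at the end

mid=$(cat "$tmp/mid.txt")
end=$(cat "$tmp/end.txt")
echo "mid: $mid"               # prints "mid: 1 2 3"
echo "end: $end"               # prints "end: 1 2 3"

rm -rf "$tmp"
```

Both files end up with the same line, since the redirect is removed from the word list before echo sees its arguments.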

There are a whole bunch of other things but I will save them for the forthcoming blog.