Shell-conduit: Write Shell Scripts in Haskell with Conduit (chrisdone.com)
80 points by dbaupp on Sept 21, 2014 | 34 comments



I still don't see why this is better (by any reasonable metric) than just plain Bash. Is there really any type safety we gain here? Is it more convenient than Bash? The only thing I see is the argument about quoting arguments.


Forgive me for assuming you aren't a Haskeller and therefore aren't part of the target audience (maybe you are), but I'll try to make up for the assumption by elaborating some general benefits.

The process arguments aren't type-safe, no. But the code that uses them can be. I can also use existing Haskell libraries mixed with this (which are type-safe). For example:

tail' "-f" "foo.txt" $= withRights (intoCSV def $= CL.map (\[a,b,c] -> b <> "\n")) $= grep "--line-buffered" "^4"

This line streams lines from foo.txt, parses each one with the CSV conduit parser[1], takes the second value, and then feeds that into grep, which prints only the values matching the regex ^4. Or if I wanted to match only valid email addresses, I could import Text.Email.Validate and use this conduit:

… $= CL.filter isValid $= …

Which, again, comes from a normal Haskell package[2] which validates RFC 5322 emails with a full parser. I can also just extract the domain part of valid emails:

… $= CL.mapMaybe emailAddress $= CL.map domainPart $= …

I'm operating on structured, well-typed data in between normal UNIX pipes, in a streaming manner. I think that's pretty sexy.
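
For concreteness, here's roughly what a complete script along those lines might look like. This is a hedged sketch only: it just composes the snippets quoted above, assuming the 2014-era shell-conduit API (run, tail', grep, withRights) together with csv-conduit's intoCSV and email-validate's parser. The file name emails.csv and the two-column layout are made up for illustration, and the exact operator and type plumbing is untested:

    {-# LANGUAGE OverloadedStrings #-}

    import Data.Conduit                      -- $= fuses conduit stages
    import qualified Data.Conduit.List as CL
    import Data.Conduit.Shell                -- run, tail', grep, withRights (assumed API)
    import Data.CSV.Conduit                  -- intoCSV
    import Data.Default (def)
    import Data.Monoid ((<>))
    import Text.Email.Validate (emailAddress, domainPart)

    -- Stream the (made-up) emails.csv, take the second column of each
    -- row, keep only values that parse as RFC 5322 addresses, project
    -- out the domain, and hand the result to grep.
    main :: IO ()
    main =
      run (tail' "-f" "emails.csv"
            $= withRights
                 (intoCSV def
                   $= CL.map (\(_:b:_) -> b)            -- second CSV column (assumes >= 2 columns)
                   $= CL.mapMaybe emailAddress          -- valid addresses only
                   $= CL.map ((<> "\n") . domainPart))  -- just the domain
            $= grep "--line-buffered" "\\.org$")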

Finally, when I decide this script is getting more complex than a mere script, I don't have to worry about maintaining it or refactoring it. I'm already using Haskell!

There's also a personal benefit to me: my editor support for Haskell is very good. Editor support for Bash is comparatively lacking.

I don't think it's more convenient than Bash if you take a myopic view of scripts that you write.

[1]: https://hackage.haskell.org/package/csv-conduit-0.6.3/docs/D...

[2]: http://hackage.haskell.org/package/email-validate-2.0.1/docs...


Hey, nice examples. However, those examples only show that you are already operating/programming outside of the target use case of a Bash script -- and that's why a general-purpose language like Haskell comes in handy. IMO, if one fully operates at the scripting level and doesn't worry about more complicated processing, Bash is easier. So I'm saying this is not really an apples-to-apples comparison.

(FWIW I did program in Haskell and I loved it.)


Fair. That's partly what motivated my desire to be able to script within Haskell in the first place; I often find myself treading that gray area, getting halfway through a Bash script (or a pipe chain) before realising I can't express what I want to express. A bunch of times I've given up and started writing a Haskell file. That's my personal experience, anyway. So I'd rather avoid bumping my head on the ceiling altogether.

Although I just saw this comic which seems somewhat germane :) http://www.catacrac.net/crac/w-16


So is conduit the recommended streaming IO library now? I saw there were a bunch of implementations of iteratees and was waiting for the community to coalesce around one.


Conduit and Pipes are popular, as far as community support, libraries and implementations go. Secondarily there are also io-streams and machines. I don't think the community has settled or will settle on a single one.


Yup! Chris is absolutely correct. There are several different choices of libraries for streaming computation; they make different tradeoffs, behave differently, and are tuned for different workloads.


Are there any summaries of those tradeoffs?



"Its syntax is insane"

=>

    (do source
        source $= conduit)
    $$ sink
vs.

    source
    source | conduit
=> hmm.



Oh man, I just spent a lot of time building a project that executes a bunch of shell scripts and feeds them into haskell pipes. I most definitely could've used this. Perhaps I will still port it since it looks so clean.


You mean the "pipes" library? I have a bunch of helper functions for pipes & process here: http://hackage.haskell.org/package/process-streaming


If it's a shortish script and public I'd appreciate a link to it. I'm looking for real use-cases to study and put in the README. =)


From the article: "Its syntax [Bash] is insane, incredibly error prone, its defaults are awful, and it’s not a real big person programming language."

A "real big person programming language"? Wow, that guy's got some issues :D Seems like his definitions seems to be a functional programming language? I'd assume that there is way more "real world big importance" stuff written in Bash than there is in Haskell.


Every so often, after a particularly painful bout with interactive Bash, I find myself mentally designing some shell replacement based on some other language, sometimes custom, sometimes something like Python. And I always end up reminding myself of the same conclusion... nobody will use the resulting interactive shell, because the bash defaults and error handling and stuff mostly make sense interactively (I mean, I can quibble, but let's admit that most of us are pretty comfortable with them in interactive mode), and making everything more explicit in some hypothetical replacement will also make everything more annoying to use. It's hard for me to make anything that's more terse than Bash, and for an interactive shell that's a big deal.

However, the interactive use case makes one big assumption, which is that you, a human being, are sitting there, statement by statement, and looking at all the output. It assumes the "return codes" of commands are advisory, and not something you literally want to see on every command. It assumes that if you're having trouble with some escape sequence you can interactively work out what you're doing with "echo" or "ls" or something. It's fundamentally built around expecting to be interactive.

It is unsurprising that that assumption doesn't work out well when trying to program in it. Where it makes sense for interactive Bash to be somewhat sloppy and expect the human to pick up the pieces (and to be clear, that is 100% true, not sarcasm), that's a terrible thing in a programming language. It takes a bare minimum of set -e (stop on errors) and set -u (stop if an unset var is used) just to turn it into a semblance of a safe language, and that's still just putting lipstick on a pig.

I'm not terribly convinced that a shell can be optimized for both interactive and programmatic use... certainly the two modes will be fighting with each other in the design even if you pull it off, and the whole will be more complicated than an interactive-optimized language + "just use Perl/Python/etc". Of course it's too late to remove shell scripting from bash or UNIX, but the more serious the task you're trying to do, the less you should be reaching for shell scripting to do it with.

Unless you're doing something in which you really don't care that the script hit an error halfway through and it's just fine for the entire script to keep blundering along despite having no idea what's going on at that point.


Regarding interactivity, here's a short demo of using shell-conduit as part of my Haskell shell called Hell:

https://www.youtube.com/watch?v=eaNr03yI4vE&feature=youtu.be

It's a REPL that will automagically run conduits if it sees them, and it provides function name and file completion via tab. So you can write ls S<TAB> and get ls "Scripts/", like I do in the video. It also tracks when the directory changes and updates the prompt, has a history, etc.

I don't use it as my main shell (yet), but it's available to me to experiment with.


There is some truth to what you say, but bash has functions, which largely mitigate it. The easiest way to make bash sane is to put everything in functions. Then everything can be tested interactively. I don't know why people put 300+ lines of code at the top level -- that makes changes very difficult to test.

A good trick for this is to put:

  "$@"
at the bottom when you are testing. Then you can do:

  ./myscript.sh func-name arg1 arg2 ..
If you really want, you can change it to

  main "$@"
before deploying so it only runs the main function. Those lines, and constants, should be the ONLY things at the top level of a bash script IMO (even a 10 line one).

Don't stare at bash code without running it -- that's crazy. I have taught Python and also admonish people to not "stare" at their code; just form a hypothesis and run it. I suspect this is where Haskellers have problems, because they think that not running your code is a good thing.

I love shell; it's one of my favorite languages now. I don't think there are really that many warts once you know it. You're right that set -e and set -u are generally good ideas, along with set -o pipefail.


"Don't stare at bash code without running it -- that's crazy. I have taught Python and also admonish people to not "stare" at their code; just form a hypothesis and run it. I suspect this is where Haskellers have problems, because they think that not running your code is a good thing."

This was awesome. And, I agree, probably close to the mark.

It is somewhat odd, in this day and age when computers run so bloody fast, that so much effort seems to be spent on not just running the application.

Of course, this is clearly easy to take too far. I feel that a lot has been lost by folks that did not first draft out their intentions on paper. (Well, at least I know I have wasted a fair bit of time in that regard.)


"Of course it's too late to remove shell scripting for bash or UNIX"

I don't think we should. As you say, bash is primarily a UI. But I think what is wonderful about shell scripts is the ability to capture and replay an action in that UI with zero cognitive overhead - even a somewhat complex action. It is absolutely the case that once you move beyond "I would type this into my shell without inspecting intermediate states and not be worried I missed something", you should be reaching for something other than bash. But that space is able to provide a lot of value.


'It assumes the "return codes" of commands are advisory, and not something you literally want to see on every command.'

If you do (and I do) you can add $? (or better - ${PIPESTATUS[*]}) to your prompt.


>Every so often, after a particularly painful bout with interactive Bash, I find myself mentally designing some shell replacement based on some other language, sometimes custom, sometimes something like Python.

I would suggest trying Tcl for this use case. I found it to be just the right language with which to replace shell scripts. The syntax and the built-in command set are easily mapped from those of sh (Tcl has "cd"!), yet it has saner and more straightforward semantics than the POSIX shell (like how variable expansion and quoting are handled [1]) as well as more features (e.g., instead of 'set -e' you get real exceptions on errors that you can catch). Tcl also pushes the idea of all values being strings to the point of arguably being homoiconic. A big boon is that external *nix commands are easy to mix with the built-in commands thanks to 'exec' and 'open |command'. For piping external commands, 'exec' understands the shell pipeline syntax, but generic pipelines can be implemented as well [2].

If you need to ship scripts, Jim Tcl, a smaller implementation of the language, is worth looking into. It is portable and self-contained enough to be used as the basis for a build system [3] and mostly fixes one of Tcl's biggest flaws, the lack of closures. Despite the small size it's a real programming language with very reasonable performance and great UTF-8 support, including in regular expressions; I'm making a toy embedded web microframework for it, and I was able to get it to service up to 50 requests per second running on an old Palm Pre Plus, which is a pretty slow device.

One of the best things about Tcl is eltclsh [4], which allows you to use Tcl as an interactive shell with readline editing and tab completion for commands, variables and filesystem paths. It removes the need to use 'exec' for external commands; when a command is not recognized by eltclsh as a procedure or a built-in, it looks for it under $PATH. Admittedly, I don't run eltclsh as my login shell, but for complex file system manipulation I find it highly useful. Another great feature is that with TclVFS installed and mounted you can 'cd' to FTP paths and copy files from HTTP paths, e.g.,

    cd ftp://www.kernel.org/pub/linux/kernel
As for other alternatives to shell suitable for both scripting and interactive use, Racket's XREPL mode could be a candidate if you added path completion to it and maybe ParEdit-like functionality or some kind of autosuggestion for placing parentheses as you type. (I know some people swear by scsh [5] as a shell replacement for scripting but it lacks readline and completion.)

[1] http://www.tcl.tk/man/tcl8.5/TclCmd/Tcl.htm

[2] http://wiki.tcl.tk/17419

[3] https://msteveb.github.io/autosetup/

[4] https://www.openrobots.org/wiki/eltclsh

[5] http://scsh.net/


The number of things written in a language isn't really a good measure of its quality. There was a time when most things were written in COBOL or assembly. I think the main reason people write so much Bash is because it's there by default.


"Real big person programming language" was just a jokingly childish indulgence, don't read too much into it. ;-)


My view is that Bash is not a "real big person programming language".

I say this as someone who spent a year maintaining and extending a system composed of 90k lines of bash, who lives at the bash prompt, and who frequently reaches first for bash when I need a small bit of logic.

What bash is is a real big person UI (though I make no particular claim about its uniqueness as such).


It also ignores the established practice system admins have had for literally decades now of rewriting hard or complex scripts from bash in more advanced languages like Perl and Python.


If you read the article you will discover that the very next paragraph is called "Perl/Python/Ruby are also evil".


For what it's worth, before someone takes it too seriously: the target audience is Haskellers. I'm preaching to the choir but pretending to give other languages due diligence. Using Haskell for this is taken as a given a priori.


And this is why Haskell is developing into the next "brogrammer" problem, except this time it's not as easy to give a catchy name to. Haskell programmers are developing a reputation (at least it seems so from the people I work with regularly as a system administrator) as arrogant obsessive purists.

Would it be that hard to give Python and Perl programmers a reason to consider Haskell for their next shell script? We aren't stupid; we will use what's good and worth it. Many programmers start out writing simple imperative scripts, which are wonderful places to learn new skills, so why be dismissive toward non-Haskell programmers like this?

Not saying you are trying to be an asshole, just saying that the lack of reasons other than "Haskell is better, just accept it" pisses off a lot of people, and the more we see that message, the more grating it becomes.


How does it ignore that? That supports what he is saying. sh is a bad language, so people use perl and python instead. This is simply making it easier to use haskell rather than perl or python.


One important advantage of Python and Perl is that most systems have them pre-installed anyway, usually because some included system administration tools require them. If I write a script with #!/usr/bin/python as the interpreter, I can be fairly certain that it will work (though this has gotten worse since Python 3).

I do agree that Haskell provides much nicer and safer abstractions. At a high level, conduit is like UNIX pipes, but with typing.
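
As a minimal illustration of that "UNIX pipes, but with typing" idea, here's a sketch using only the conduit package's list combinators and the conduit 1.x operators seen elsewhere in this thread; each stage has a stream type, so plugging incompatible stages together is a compile-time error rather than garbage bytes at runtime:

    import Data.Conduit                      -- $= fuses stages, $$ runs into a sink
    import qualified Data.Conduit.List as CL

    -- The source yields Ints, the middle stages are Int -> Int and
    -- Int -> String, and the sink consumes Strings; swapping stages
    -- that don't line up won't typecheck.
    main :: IO ()
    main = do
      strings <- CL.sourceList [1 .. 10 :: Int]
                   $= CL.filter even     -- keep even Ints
                   $= CL.map show        -- Int -> String
                   $$ CL.consume         -- collect the results into a list
      print strings                      -- ["2","4","6","8","10"]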


>If I write a script with #!/usr/bin/python as the interpreter, I can be fairly certain that it will work

No you can't. Only some (poorly thought out) Linux distros do that. Sane Linux distros (back when those existed) and every other Unix-like system in existence won't have a /usr/bin/python. This is why /usr/bin/env exists and is a POSIX standard.


Which is fine, because that is merely a small part of the point. Using better languages is "normal", and we already have a 20-year history of doing it with Perl, and a nearly twenty-year history of doing it with Python; the larger point being, "anyone writing these scripts knows how the hash bang works". In my mind, adding to the complexity of a system by requiring Haskell and this library is counterproductive when the goal is to improve maintenance efficiency.


That makes absolutely no sense. You are complaining that a haskell library, written for people who use haskell, requires haskell. Python code requires python, perl code requires perl, ruby code requires ruby. So?



