I think one of the bigger issues with `curl | sh` is what could happen if there's a network outage or the connection is terminated early.
For example, suppose you're downloading a script that has a line like this:
rm -rf ~/.tmp/foo/bar
But the HTTP connection was lost before the entire file was downloaded and `rm -rf ~` was the end of one packet and `/.tmp/foo/bar` was the contents of the other (lost) packet, you're screwed. The incomplete script will still be piped to sh and it's game over.
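To make the failure mode concrete, here's a small sketch (demonstration only -- it echoes the truncated command rather than running it) showing how cutting the stream at byte 8 changes the command's meaning:

```shell
#!/bin/sh
# Demonstration only: we print the truncated command instead of executing it.
full='rm -rf ~/.tmp/foo/bar'

# Simulate the connection dropping after 8 bytes of this line.
truncated=$(printf '%s' "$full" | head -c 8)

echo "full:      $full"       # removes only ~/.tmp/foo/bar
echo "truncated: $truncated"  # "rm -rf ~" -- piped to sh, this deletes $HOME
```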
I had something like that happen once, but it worked out in my favor. I had misconfigured BIND, which I had intended to run just as a caching name server for my network, and it was listening for outside connections too. Some variant of the Lion worm found it and used a BIND bug to get onto my system.
It sent my password file to someone in China, started a scanner to look for other systems to infect, downloaded a .tar.gz file that contained a root kit to hide itself, unpacked the .tar.gz, and ran the install script contained therein to install the root kit.
Or rather, it tried to. I had ISDN at the time, and had noticed the modem lights blinking heavily even though I wasn't doing any internet activity. This confused me, and I pulled the plug on the modem. Turns out I pulled the plug while it was downloading the .tar.gz. It got most of the file, but not quite all of it. It lost the last file inside the archive--which happened to be the install script!
Without the install script, it could not install the root kit, and that made getting rid of the worm a heck of a lot easier.
OK, but isn't detecting a partial vs. complete script harder than just checking 'does the last line end with an EOL'?
There are various shell constructs that require an end token, like case/esac, if/fi, etc. Do these behave safely when the input is truncated at an arbitrary line?
Those control structures do work properly - the shell reads ahead until it finds the end token, and fails if it's absent.
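That read-ahead behavior is easy to check with any POSIX sh: feed it a truncated compound command and nothing inside the construct runs -- the shell rejects the whole thing as a syntax error.

```shell
#!/bin/sh
# An 'if' with no matching 'fi' is a parse error, not a partial execution:
# sh reads the whole compound command before running any of it.
printf 'if true; then\n  echo ran\nfi\n' | sh          # prints "ran"
printf 'if true; then\n  echo ran\n'    | sh 2>/dev/null
echo "truncated script exit status: $?"                # non-zero; "ran" never printed
```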
It is true that there remains the problem of potential truncation exactly at the end of a top-level line, but I contend that "it stops running here" is a much easier thing to reason about (and, strictly, could always happen if hit with a SIGKILL anyway) than "does the meaning of this line change if we cut it off in a weird place".
HTTP pipelining is about reusing a TCP connection for multiple requests. It doesn't influence when curl outputs data and wouldn't apply here anyway. I don't think there's any mode which would cause curl to buffer the entire response before writing any of it.
Yes, I'm aware of how HTTP pipelining works. It was a poor choice of terminology. My point is that by default curl does buffer some of the response. And if the connection was terminated before the first buffer was output, then I would expect this to result in an error which would abort the shell pipeline.
Yes, in some cases curl won't produce any output, like if the web server is down, or the connection fails before anything is returned. And yes, it would also happen if curl buffers some of the response and then dies. I don't really see why that's interesting.
The default buffer size is typically the page size, which is typically 4096 bytes. I would expect a large number of these scripts to be less than 4096 bytes, meaning curl would output nothing before producing an error and the partial script would never be evaluated.
That's the default buffer size for pipes, which won't matter here. When curl terminates, whatever's buffered in the pipe will be flushed. The only thing that could prevent downloaded data from being received by the shell would be internal buffering in curl, if it does any.
Good point. curl doesn't do any internal buffering. I was thinking that the pipeline should be aborted if the curl exits with a non-zero status, but of course this is not the case.
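A quick way to see this (the subshell stands in for a curl that emits partial output and then dies):

```shell
#!/bin/sh
# The producer exits non-zero after writing partial output, but the
# consumer still runs whatever it received, and the pipeline's exit
# status is that of the LAST command (sh), not the failed producer.
(echo 'echo partial output'; exit 1) | sh
echo "pipeline status: $?"   # 0 -- the producer's failure is invisible
```

bash's `set -o pipefail` changes the reported status, but only after the fact -- sh has already executed whatever arrived on the pipe.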
Yeah, it would be nice if there were a way for a part of the pipeline to signal that something bad happened and everything should stop. Ideally, some sort of transaction system so the script is guaranteed to run fully or not at all. But instead we have this crazy thing.
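One mitigation some install scripts use to approximate "fully or not at all" (a sketch with a hypothetical layout, not something from this thread): define everything inside a function and invoke it on the last line. A truncated download then either fails to parse (unclosed brace) or stops before the call, so no partial command ever executes.

```shell
#!/bin/sh
# Hypothetical install script layout. No command runs until the final
# line; if the download is cut anywhere above it, the unclosed function
# body is a syntax error and the shell executes nothing.
main() {
    echo "step 1: fetch"
    echo "step 2: install"
    echo "step 3: clean up"
}
main "$@"
```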