The number of projects that do this is absurd. People have been saying this for ages and nobody seems to listen. You could have all sorts of fun based on the user agent, as well: if it looks like a normal browser, send the harmless script. If it's curl or wget, prepend "echo 'I hope you are not going to pipe this into /bin/sh'; exit 1".
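A minimal CGI-style sketch of that trick (the path and setup are hypothetical; HTTP_USER_AGENT is the standard CGI variable for the client's user agent):

#!/bin/sh
# Serve a warning to curl/wget, and the real script to everything else.
printf 'Content-Type: text/plain\r\n\r\n'
case "$HTTP_USER_AGENT" in
    *curl*|*[Ww]get*)
        echo "echo 'I hope you are not going to pipe this into /bin/sh'; exit 1"
        ;;
    *)
        cat /var/www/install.sh    # hypothetical path to the harmless script
        ;;
esac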
Which exhibits exactly the failure scenario outlined in the article; partway through the script RVM cleans up after itself by running "rm -rf ${rvm_src_path}"
You're right, I'm completely wrong - the worst thing that can happen here is that the whole RVM directory is removed. Hardly catastrophic, and definitely not the extreme failure case the article is talking about.
> The number of projects that do this is absurd. People have been saying this for ages and nobody seems to listen. You could have all sorts of fun based on the user agent, as well: if it looks like a normal browser, send the harmless script. If it's curl or wget, prepend "echo 'I hope you are not going to pipe this into /bin/sh'; exit 1".
It's not really absurd. In truth, this is roughly equivalent to what people would do if you had them download the code, unpack it, and go to work.
What the author doesn't seem to know is that if the web server returns Content-Length (which the project really ought to ensure, and all the ones I've seen do), then wget will by default keep retrying until the entire content stream has been successfully downloaded. If you Ctrl-C out of it, both wget and the shell process will terminate without processing any partially downloaded statements. The one-liner is just automating the tedious manual steps that people would otherwise do, and arguably doing a better job of it than most.
You can and should do better (which is why package managers exist), but this is better than what a lot of people do if you don't provide the one liner. If you look at the commands in most one-liner shell scripts, there usually aren't a lot of destructive things like "rm" in them either.
Now, it'd still be better to do these one-liner shell scripts as a shell archive whose last line extracts and runs the actual script, and you could use gzip encoding to provide some extra sanity checking, but for the most part one-liner scripts are not the problem.
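A rough sketch of that shar-style layout (payload and names hypothetical): the real installer travels inside a quoted heredoc, so if the stream is cut off, the terminator never arrives and the line that actually runs it is never even read.

#!/bin/sh
tmp=$(mktemp) || exit 1
cat > "$tmp" <<'END_OF_INSTALLER'
echo "installing..."
# ...the real installer body goes here...
END_OF_INSTALLER
# A truncated stream ends before this point, so the partial script is never executed.
sh "$tmp"
rm -f "$tmp"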
Yes it is. If I'm piping to a shell as an unprivileged user, an attacker needs an additional round of exploits to do nastiness to anything other than that user.
For those whose security model looks like http://xkcd.com/1200 that's not a big difference; but depending on context it might be.
I raised that point with the comic. Frequently that's the case, in which case, yes, there's not much difference (though maybe it's less likely to do hardware damage, maybe...). It's not always the case.
Why is it "ultimate bravery"? Unless you compile your own kernel, the Linux kernel many of you are using right now is "arbitrary". That is, someone else compiled it and you are trusting that what it will do (or initiate, via init, for example) when you boot it is acceptable. My guess is most Linux users do not compile their own kernels.
Perhaps the problem here is not that these project "developers" are suggesting that others pipe unread scripts into /bin/sh, it's that too often these "developers", or the others on the receiving end of their scripts, have a poor understanding of shell scripting.
I find that in the majority of cases where someone wants me to run their shell script, I could do what needs to be done without using the script, or rewrite the script myself using a minimal POSIX-like shell instead of bash or ksh, reducing the script size by 90% or some similarly large percentage. Then there's the minority of cases where I actually admire the care that the script author took to make the script efficient and portable, or where I actually learn something new by reading the script. That happens all too rarely.
While I'm on the topic of shell scripting, I'll add that I do a lot of shell scripting for my own personal use. Since no one else ever has to read my scripts, I take the luxury of ignoring the Shift key and write all my scripts in lowercase. I have no need for uppercase variable names; they only slow me down.
I note that djb also writes his shell scripts in lowercase. For whatever it's worth, I consider him to be one of the world's best open source programmers, and a competent shell scripter.
That's not really intended as a serious application though - but being able to boot over long-distance HTTP links is a nice advantage for a lot of applications. Plus you can use HTTPS with that.
What's the difference between piping to a shell and running a random executable or installing a random ppa? All 3 could do massive harm to your system.
One difference is that piping to a shell is a lot easier to inspect. So you could easily argue that piping to a shell is safer.
Heck, even downloading a source tarball and compiling it yourself isn't any safer unless you actually inspect that source. And who does that?
The issue isn't executing untrusted code, it's connections terminating early and causing your shell to execute incomplete code.
The article ends with this example code:
TMP_DIR=`mktemp`
rm -rf $TMP_DIR
And the stream ending prematurely may cause the second line to end up as rm -rf / and then be executed. While this wouldn't do anything anyway without --no-preserve-root added, it still brings up a good point about interrupted connections executing incomplete code which would otherwise be safe if the command had arrived in full.
The fact that the change is so obvious and so simple and yet so many developers keep telling users to pipe the output into a shell is precisely why this is a "big deal".
You are assuming wget returns error codes reliably. It does not. Also, your example assumes write permissions to the current working directory and no risk of someone nefarious being able to write to the current working directory.
No, it doesn't turn into that. It turns into: wget https://install.sh && ls && less install.sh && less lala2 && echo "ok looks good enough" && ./install.sh # no sudo, it's a user-local install, it's just a package or program, ffs.
When you want packages installed for all users, or system-wide, then you use the default package manager of your distribution.
Then take that to its logical conclusion and read the source code for all the stuff you are installing, and the description of all the packages you are installing. Your argument was the strawman.
Would it make sense to mitigate this by creating an (even smaller) bootstrap script that itself downloads the "real" script and checks e.g. the SHA256 hash of the downloaded file before executing?
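Something along those lines, as a sketch (the URL and hash are placeholders, and sha256sum assumes GNU coreutils):

#!/bin/sh
# Bootstrap: download the real script, refuse to run it unless the hash matches.
url=https://example.com/install.sh      # hypothetical URL
expected=PUT_KNOWN_SHA256_HERE          # known-good hash, published out of band
tmp=$(mktemp) || exit 1
curl -fsSL "$url" -o "$tmp" || { echo "download failed" >&2; exit 1; }
echo "$expected  $tmp" | sha256sum -c - || { echo "checksum mismatch" >&2; rm -f "$tmp"; exit 1; }
sh "$tmp"
rm -f "$tmp"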
I realize that I tend to do that by default in my shell scripts. After a few years of Python, I always make a main() function which gets the command line parameters. Is this weird?
It just seems to me that main() should worry about interfacing with the outside world, and the rest of your code should really just be written as a library.
I think I do that to avoid global variables as much as possible, as well. Declaring things as "local" tends to keep me honest, as opposed to having a bunch of junk cluttering the global namespace.
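A skeleton of what I mean (function names are just illustrative, and 'local' is not strictly POSIX, though dash and bash both accept it):

#!/bin/sh
install_files() {
    local prefix="$1"
    mkdir -p "$prefix/bin"
    # ...copy files into "$prefix"...
}

main() {
    local prefix="${1:-$HOME/.local}"
    install_files "$prefix"
}

main "$@"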
Why that? Have we never heard of defaced web pages?
Granted, using shacat is much better than piping into sh. But the basic lesson from security breaches is that nothing is safe; you can only find ways to do things in a less catastrophic manner than others.
> Why that? Have we never heard of defaced web pages?
Well, if you can't trust the website, you're screwed anyway. If the website is compromised, then absolutely any way they have of installing software is broken.
Good, but in some cases this won't work with the latest versions (the content will change); plus, you need to add instructions for installing shacat first, which defeats the point of having a single-line install. It's always been convenience over security.
Or possibly a self-checking script that only executes if it is complete... i.e. the script is escaped and must run unescape(escaped_script) to be lethal, but by then you can confirm that the script is in fact whole and as the creator intended it to be.
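A toy version of that idea (the payload here is just the base64 of `echo "installing..."`, and base64 -d is the GNU coreutils spelling):

#!/bin/sh
# Nothing runs until the decode-and-execute line at the very end; a truncated
# download simply never reaches it.
script_b64='ZWNobyAiaW5zdGFsbGluZy4uLiIK'
printf '%s' "$script_b64" | base64 -d | sh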
Those scripts are typically very short. They'll either download completely or not at all. And if they do abort short then what are the chances that it's going to do it in exactly the wrong place? And then you have to manage to kill the wrong process to start the aborted script.
Statistically a cosmic ray flips a few bits in your RAM every year. Theoretically those bit flips could transform a benign program into a malicious one, and I'm sure that's happened to somebody somewhere sometime.
But it's very unlikely to happen to you tomorrow. You could use ECC RAM to prevent this from happening, but how many people do that?
Counterpoint to the author: have you ever heard of anyone ever having anything catastrophic happen because of this, ever? Have you ever heard of anyone even having it fail with garbage?
Catastrophic? I can't claim that. But I can tell you that I had a server drop out on me when installing software this way. Here's a quick quote of the bug report I sent to Linode when I was installing Longview.
> Bug: Update the nginx (?) error pages to be bash friendly
>
> root@marius ~]# curl -s https://lv.linode.com/nGiB | sudo bash
> bash: line 1: html: No such file or directory
> bash: line 2: syntax error near unexpected token `<'
> 'ash: line 2: `<head><title>502 Bad Gateway</title></head>
>
> This could be done by just dumping out echo statements since its already
> being parsed by the shell. Additional meta about the request (headers, etc)
> could be dumped into comments too…
>
> # 502 Bad Gateway
> echo
> echo "Uh oh: Something went sideways during the install!"
> echo "("
> echo "Re-running the installer should clear things up."
> echo
> echo " sudo !!"
> echo
Yep, I agree this is generally bad practice and is prone to error conditions, but I am mainly addressing the article since it is claiming that this should never be done because of the possible catastrophic consequences, and I just don't think that scenario is likely enough to mount a campaign to abolish this practice :)
Other than what others already said about early connection termination, there is one major reason why you should not pipe to a shell.
To avoid users complaining about how dangerous it is.
This is why I distribute a tarball with an installer script in it. It's functionally almost the same, not that much harder, but it avoids all the backlash and makes my software more likely to be used.
Tend to agree. There is still a trust relationship in downloading source and using it, although I guess with a source bundle you can at least verify the hash and get the same (potential malware) as everyone else.
Also, no one has yet mentioned the fact that not doing it over HTTPS with a client that checks certificates (you would be surprised at how many tools get this wrong, sometimes or always) is a complete code-execution MitM vulnerability.
It is like giving away all the security built everywhere else and yelling "YOLO".
It's exactly as insecure as running an unsigned binary installer downloaded over HTTP (without checking checksums), which no one should ever do.
(Because the authors should offer it over HTTPS, sign it, provide some checksum or ship it through some package manager that knows better, if they can't secure the process themselves)
Well yes, that's the problem. Proper security is onerous, there's no use crying about it. Either do it right or accept that you're insecure. :P
Fix all the problems you can, of course, defense in depth is always a good idea. But hackers managing to alter a mirrored file without getting access to the website itself happens more often than you'd like to admit, and even hackers getting access to both a website and a tarball can be mitigated by signing tarballs instead of just listing sha sums.
If you trust your website won't be hacked to serve malicious code then you might do away with signing and cryptographic checksums, but then that only makes using TLS that much more important to avoid having your tarball zoinked by a MitM during transit.
This keeps coming up on Hacker News, and while I'm sure the people on Hacker News know this is bad, they probably still do it anyways because it's never had an adverse effect for them.
Speaking for myself, this has never caused a problem for me, and I'll probably keep doing it because it's convenient, and that convenience is worth more when weighed against the potential bad things that could happen. The most likely failure is that the package just doesn't execute. The probability that a truncation lands on rm or something destructive is very low, and if someone is actively trying to MITM you, they will find a way even if you are smart enough not to run scripts from wget. Most people aren't the target of that kind of very specific attack.
Like Apple's TouchID – it may not really be secure, but it's very convenient, and that will often be enough to make it mainstream.
It's not a discrete either-or choice here. I think it's very unrealistic to believe that for the average person this is a dangerous security risk in practical terms.
The number of times people might do this is probably well below 100 and there are much more risky day to day security faux pas than this.
The alternative most people are advocating is to download the script completely, and then run it if the download was successful. That can still be accomplished with a single line of shell script.
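Something like this, say (URL hypothetical): the script only runs if curl exits successfully, i.e. the download actually completed.

curl -fsSL https://example.com/install.sh -o install.sh && sh install.sh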
I would rather people not pipe to shells at all. It doesn't sound very secure. But if you have to do it, there are ways to avoid half-executed scripts:
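One such way (a sketch, not from the article): wrap the whole thing in a function and call it only on the final line, so a truncated download defines half a function but never runs anything.

#!/bin/sh
do_install() {
    set -e
    echo "installing..."
    # ...the real work goes here...
}
# If the stream is cut anywhere above, this call never arrives and nothing executes.
do_install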
Piping commands from the network may be a bad idea, but I pipe to my shell locally all the time. The idea is that you write a shell loop that prints a bunch of commands using `echo` and then once the commands look right, you pipe them to `sh -v`. It's great for the exact opposite reason piping from the network is awful: you can see what you're about to execute before you execute anything – I don't trust myself to write complicated shell loops without seeing what I'm about to run first.
Piping commands to a shell will subject your (probably already completely specified) command names and argument to all sorts of modifications: brace expansion, tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, word splitting, and pathname expansion.
I never do that; if I need a trial loop, I do the loop with all the relevant commands prepended by "echo" (and every ";", "&", "&&", and "||" properly escaped).
Yes, that's the point. You print the commands as you would type them manually, make sure they look right and perhaps test a single one out. Then you can pipe that whole thing into another shell. It's even reasonable to generate shell commands from, say, perl. Rather than having perl call system, just have perl print each command and then pipe the output of the whole thing to `sh -v`.
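A tiny example of that workflow (filenames hypothetical):

# Print the commands first and eyeball them...
for f in *.log; do echo mv "$f" "archive/$f"; done
# ...then, once they look right, actually run them:
for f in *.log; do echo mv "$f" "archive/$f"; done | sh -v

(Filenames with spaces will get re-split when sh re-parses the output, which is exactly the expansion hazard mentioned above.)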
Unfortunately he's got this wrong. As long as the server returns a content length (which is up to the project to set up correctly), wget will retry until it gets the full length of the script. So the partial execution can't happen.
That's really about as well as you can do, because HTTP doesn't do a good job of reporting errors. You could try to get the content length in advance and then check against it after the download (which is basically what wget is doing), but that won't buy you much. Most servers won't do Content-MD5, so that's out. One smart thing to do would be to use "Accept-Encoding" to download a compressed version of the script and then do a decompression test before running. Alternatively, you can make the download script into a shell archive style script, such that it doesn't do anything until you get to the last byte, at which point it extracts out the real script and runs it (which wouldn't change what your install command is).
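A sketch of the gzip variant (URL hypothetical, and it assumes the server honors Accept-Encoding): gzip's trailing CRC means a truncated download fails the integrity test instead of being fed to the shell.

curl -fsS -H 'Accept-Encoding: gzip' https://example.com/install.sh -o install.sh.gz \
  && gzip -t install.sh.gz \
  && gzip -dc install.sh.gz | sh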
The whining about disabling the certificate check is also spurious. Most of the time these are scripts pointing to a non-https URL but which redirect to an HTTPS URL. You are already vulnerable when you do the HTTP request. On top of that, almost nobody is doing DNSSec, so you are already vulnerable at the DNS level. Even ignoring that, Salt offers it as a solution if you can't get the certificate check to work. The alternative would be to provide you with instructions on how to install a CA certificate, which someone is far more likely to screw up and unless you've established trust of the instructions themselves, could be just as vulnerable to a man-in-the-middle attack. Offering instructions on how to disable the check is a perfectly reasonable solution.
I'm pretty sure wget won't buffer an entire document before writing anything through the pipe to sh. And since it's possible for retries to fail, you have to assume sh could be sent a partial document, followed by EOF, no matter if http/https were used, or chunked/content-length, etc.
It is surprising to me that the shell will honor a command line without a newline[1], but it does:
$ (echo -n "echo rm -rf /tmp/mydir"; exit 1) | sh
OUTPUT: rm -rf /tmp/mydir
Obviously, if /tmp/mydir got truncated to /tmp, or even /, bad things would happen.
[1] Not so surprising when you consider sh -c "command" works without a trailing newline, I suppose.
> I'm pretty sure wget won't buffer an entire document before writing anything through the pipe to sh. And since it's possible for retries to fail, you have to assume sh could be sent a partial document, followed by EOF, no matter if http/https were used, or chunked/content-length, etc.
Nope. By default wget will literally try forever (you can even shut down your network stack, wait for 10 minutes, and then turn it back on, and wget will proceed from there), and until it closes the stream, the shell interpreter will act as though data is still coming. Any normal way of telling wget to stop will also stop the shell interpreter. Worst case is you end up in some "in the middle" state of the script, but you won't be misinterpreting the script due to early termination of the stream.
And actually, wget will write to stdout with full buffering, which means that small files will actually be completely buffered before writing anything out to sh.
That doesn't matter, though, because the real problem is NOT the buffering. Sure, that can mean scripts execute partially and then exit, but they will never do something that wasn't intended. The problem the blog is talking about is when you get an EOF in the middle of a script.
And it's not at all surprising that it honours a command without a newline. Commands are terminated by semicolons, newlines OR EOF. That's pretty much how all shells work. You really wouldn't want the last line of a shell script to be ignored simply because someone didn't put a newline at the end of it.
Does anyone remember that project that was downloading a script directly from the most recent devel branch in a repository, and, in order to demonstrate how insecure that practice was, someone actually included an rm -rf /home/
The maintainer didn't check the commit and included it in develop, which was consequently downloaded and... etc.
Wasn't it some stupid valentine's day thing? I looked for it earlier but couldn't find it; either my search-fu is failing or engines are getting worse.
The piping is for beginners. Rather than saying "Execute this script" or download and compile this tarball. It just works, magic. Advanced users will obviously wget and quickly read it. But hey, sometimes I like to live dangerously too, and I pipe things to my shell.
I think this merely highlights the bigger underlying issue: the lack of transactionality. If the install script were wrapped in a transaction, then premature exit or end of input (for whatever reason) would cause no harm because the transaction would not have been committed.
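In shell terms that might look something like this (paths hypothetical): do all the work in a staging directory and make the final mv the "commit", so an interrupted run never touches the real install location.

#!/bin/sh
set -e
stage=$(mktemp -d)
trap 'rm -rf "$stage"' EXIT
# ...download, unpack and build everything under "$stage"...
mkdir -p "$HOME/.local/opt"
mv "$stage/myapp" "$HOME/.local/opt/myapp"   # interrupted runs never reach this line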
Well, it's risky, but you are installing the software from those guys anyway, aren't you? Do you audit every single source of every single app? No, you don't... everything is a matter of balance between security and convenience.
It's a really disturbing craze to install server software MS-DOS style and not use package managers! At the end of the day, building native packages with FPM, let's say, isn't such a big deal. At least do an installer package.
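For example, a hypothetical FPM one-liner (name, version and paths made up) that turns a directory of built files into a .deb the system package manager can track:

fpm -s dir -t deb -n myapp -v 1.0.0 ./build/=/opt/myapp/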
Personally I prefer to save to a file, then inspect (with less or vim) -- but some scripts are a little too clever. Lein is one great example of something that works, but at the same time is pretty scary (read: time-consuming to verify by hand).
Another favourite is those tiny installers that wrap a nice "curl http://yeah.not.ssl.com/boom.sh | sudo -s" inside the "installer". Great for building confidence in a project.
Maybe the best thing to do is to just distribute the shell scripts signed with the project's private key, thereby forcing users to run them through gpg. Or use a gpg archive -- both PGP and GPG have support for this, see gpg-zip(1).
That way you can at least establish a chain of trust that goes straight back to the author, and links the installer(s) directly with a gpg-key. It's not really possible to get anything better than that. And you avoid having a separate .sig-file.
Apparently there's also a GNU project for distributing signed archives -- but it doesn't appear to have wide support.
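A sketch of that flow (filenames hypothetical, and it assumes users have already imported and chosen to trust the project's key):

# Author side: compress-and-sign the installer into a single file, no separate .sig:
gpg --output install.sh.gpg --sign install.sh
# User side: gpg verifies the signature while unpacking and should exit non-zero if it's bad:
gpg --output install.sh --decrypt install.sh.gpg && sh install.sh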
What command can you use in a pipe chain to fully read the input until EOF before passing it on? I'm not a big fan of piping to the shell but such a command could be useful for network applications in general.
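If moreutils is available, sponge is one answer: it soaks up all of stdin before writing anything out, though it obviously can't tell a truncated stream from a complete one (URL hypothetical):

curl -fsSL https://example.com/install.sh | sponge | sh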
There is a --user-agent option supported by wget and curl. Maybe that could help increase confidence that something will work as it should. Or at least help investigate weirdness.
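For example (UA string and URL arbitrary):

curl --user-agent 'install-check/1.0' -fsSL https://example.com/install.sh -o install.sh
wget --user-agent='install-check/1.0' -O install.sh https://example.com/install.sh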
And slowly a humble `wget -O - <url>|sh` grows into a foot long command with user agent settings, save first/execute later, etc, etc... Why not use a package manager then? Added bonus - version control.
While I agree with the general folly of piping to a shell, have you ever actually tried to do a rm -rf / ? Most modern POSIX systems will catch you, even if you sudo it.
Also, barring that example I can't come up with many other horrible scenarios. Unfortunate ones, sure. But not catastrophic. And as someone else said, adding random PPAs would allow much worse things, and people do that all the time.
Incidentally, I think the last example in the OP ("rm -rf /") is wrong. The "/" would never be transmitted; it is part of the variable $TMP_DIR, which is expanded on the local system, not the remote server. But the idea and the other test with echo seem correct.
Assuming the probability that you drop the connection is evenly distributed amongst all characters, then even if there is no payload and this is all that is executed, there's barely a 1% chance of the truncation happening in the way you describe.
Considering that there is usually a sizable payload and the probability of a dropped connection is not evenly distributed and is probably very low, the scenario gets even less likely.
Yes, it's possible, but it's also possible to rm -rf / because you typed a path wrong and I bet the probability of human error is much higher than the probability of this shell trick screwing you over. People have rm -rf /'d their systems, but even this isn't a good reason to advocate for say, removing rm entirely or not allowing people to type into the shell :P
BTW: If you want ULTIMATE BRAVERY, you have to boot an arbitrary kernel over the internet: http://ipxe.org/ (scroll to the bottom, where it says `iPXE> chain http://boot.ipxe.org/demo/boot.php`)