This is very cool, but a little opaque at first read... For a quicker, more digestible intro, watch the video introduction (linked on the page): http://www.youtube.com/watch?v=OpaiGYxkSuQ
Well, the example (at least the first one) in that video is a bit skewed. He runs gzip and then immediately runs 'parallel gzip' without dropping the disk caches, so in the latter case the bottleneck is CPU rather than disk I/O (everything is read from the disk cache in RAM). IMO, for work that is I/O bound we won't see any significant improvement from parallel or anything similar.
Here's an ImageMagick example: over six minutes with xargs, under 20 seconds with parallel.
$ ls *.png |wc -l
3580
$ time ls|sed 's/\(.*\)\..*/\1/'|parallel convert {}.png {}.ppm
ls --color 0.00s user 0.01s system 63% cpu 0.016 total
sed 's/\(.*\)\..*/\1/' 0.01s user 0.00s system 39% cpu 0.025 total
parallel convert {}.png {}.ppm 97.39s user 61.87s system 890% cpu 17.883 total
$ time ls|sed 's/\(.*\)\..*/\1/'|xargs -I {} convert {}.png {}.ppm
ls --color 0.01s user 0.00s system 63% cpu 0.016 total
sed 's/\(.*\)\..*/\1/' 0.01s user 0.00s system 39% cpu 0.025 total
xargs -I {} convert {}.png {}.ppm 93.08s user 47.88s system 38% cpu 6:10.88 total
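A note on the numbers above: the xargs line runs one convert at a time (hence the 38% cpu), so part of the gap is serial versus parallel rather than xargs versus parallel. Here is a rough sketch of a closer comparison, assuming an xargs that supports -P (GNU and BSD do; it is not in POSIX). The scratch directory, the dummy file names, and cp standing in for convert are my assumptions, there only so the sketch runs without ImageMagick:

```shell
# Scratch directory with a few dummy inputs (hypothetical file names).
workdir=$(mktemp -d)
touch "$workdir/a.png" "$workdir/b.png" "$workdir/c.png"

(
  cd "$workdir" || exit 1
  # NUL-separated names survive spaces and quotes in file names.
  # -P 4: keep up to four jobs running; -n 1: one file per job.
  # ${1%.*} strips the extension in the child shell, replacing the sed step.
  # cp is a stand-in; for the real thing: convert "$1" "${1%.*}.ppm"
  printf '%s\0' *.png |
    xargs -0 -P 4 -n 1 sh -c 'cp "$1" "${1%.*}.ppm"' _
)

ls "$workdir"
```

On the parallel side, the {.} replacement string (the input minus its extension) should make the sed step unnecessary in reasonably recent versions: parallel convert {} {.}.ppm ::: *.png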
I wrote a simpler one a couple of years ago (http://code.google.com/p/spawntool/) myself. All it does is read commands from stdin, one per line, and keep a desired number of processes running until all command lines are exhausted. Simple.
I wrote my own because I got tired of all kinds of substitution and quoting issues with xargs. With spawn I only need to generate the shell commands, and instead of piping them to bash I pipe them to spawn. This also means I can easily review the generated command lines with less (to check that the quoting etc. is right) before I eventually switch to sh or spawn.
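For anyone who wants to try the generate, review, then run workflow without installing anything, here is a minimal sketch using stock tools. This is not spawntool itself (I am not assuming anything about its options), just the same idea; the quote and gen_commands helpers, the file names, and cp as the stand-in command are all made up for illustration:

```shell
# Scratch directory with dummy inputs, one with a space in its name
# (hypothetical file names, for illustration only).
workdir=$(mktemp -d)
touch "$workdir/a b.png" "$workdir/c.png"

# Wrap a string in single quotes, rewriting each embedded ' as '\''.
quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Step 1: emit one complete shell command per line.
# cp stands in for the real command (e.g. convert).
gen_commands() {
  for f in "$1"/*.png; do
    printf 'cp %s %s\n' "$(quote "$f")" "$(quote "${f%.png}.ppm")"
  done
}
gen_commands "$workdir" > "$workdir/commands.txt"

# Step 2: review the generated lines (interactively: less commands.txt).
cat "$workdir/commands.txt"

# Step 3: run them. Serially this is just: sh < "$workdir/commands.txt"
# Here each line goes to its own `sh -c`, up to four at a time.
tr '\n' '\0' < "$workdir/commands.txt" | xargs -0 -P 4 -n 1 sh -c
```

As far as I can tell from the man page, GNU parallel run without a command also treats each input line as a command to execute, so parallel -j 4 < commands.txt should slot into step 3 as well.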
As I understand it, xargs only runs on the local machine; GNU parallel can run on remote machines as well. So parallel is the cluster-friendly version of xargs's -P.
Probably, although it doesn't seem to be the focus of xargs.
And the version of xargs that is included with Solaris 10 doesn't have the -P option, in which case installing GNU parallel is a slightly easier option than installing a different version of xargs.
Because it can be run on the command line, ad hoc. Make -j is great for pre-existing command lists and dependencies. But as the man page describes, parallel is like xargs, which I use all the time on the command line for ad hoc actions (frees me from having to write a bash loop).
make requires a Makefile, whereas one can pass parameters directly to parallel.
There also seem to be a few more options revolving around job success/failure and how to react -- a) ignore failed jobs and report how many at the end, b) cleanly exit as soon as a job fails, and c) stop all jobs as soon as one fails.
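A hedged sketch of how those three behaviours look on the command line, assuming a current GNU parallel where the option is spelled --halt with never / soon,fail=1 / now,fail=1 (older releases, as far as I know, used numeric values 0, 1 and 2 instead). The jobs are toy 'exit N' commands, not anything real:

```shell
# Run only if GNU parallel (not the moreutils tool of the same name) is installed.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  # a) Run every job regardless; the exit status is the number of failed jobs.
  status_never=0
  parallel --halt never -j 2 'exit {}' ::: 0 1 0 1 || status_never=$?
  echo "never: exit status $status_never"

  # b) After the first failure start no new jobs, but let running ones finish.
  status_soon=0
  parallel --halt soon,fail=1 -j 2 'exit {}' ::: 0 1 0 1 || status_soon=$?
  echo "soon:  exit status $status_soon"

  # c) After the first failure kill whatever is still running.
  status_now=0
  parallel --halt now,fail=1 -j 2 'exit {}' ::: 0 1 0 1 || status_now=$?
  echo "now:   exit status $status_now"
fi
```

With toy jobs this short, b) and c) are hard to tell apart; the difference should only show up when other jobs are still running at the moment one fails.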
It can, but that was not its intended purpose. That is, you can figure out a way to map your task to a dependency hierarchy and save it to a Makefile, but why do that when you could use something designed for that?
Examples should be at the top!
(10 years of frustration in that one line message :p)