
Nice article. Really easy to follow introduction.

I only discovered process substitution a few months ago but it's already become a frequently used tool in my kit.

One thing that I find a little annoying about unix commands sometimes is how hard it can be to google for them. '<()', nope, "command as file argument to other command unix," nope. The first couple of times I tried to use it, I knew it existed but struggled to find any documentation. "Damnit, I know it's something like that, how does it work again?..."

Unless you know to look for "Process Substitution" it can be hard to find information on these things. And that's once you even know these things exist....

Anyone know a good resource I should be using when I find myself in a situation like that?




Be aware that process substitution (and named pipes) can bite you in the arse in some situations --- for example, if the program expects to be able to seek in the file. Pipes don't support this and the program will see it as an I/O error. This'd be fine if programs just errored out cleanly but they frequently don't check that seeking succeeds. unzip treats a named pipe as a corrupt zipfile, for example:

  $ unzip <(cat z)
  Archive:  /dev/fd/63
    End-of-central-directory signature not found.  Either this file is not
    a zipfile, or it constitutes one disk of a multi-part archive.  In the
    latter case the central directory and zipfile comment will be found on
    the last disk(s) of this archive.
  unzip:  cannot find zipfile directory in one of /dev/fd/63 or
        /dev/fd/63.zip, and cannot find /dev/fd/63.ZIP, period.
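One possible workaround, as a minimal sketch: spool the stream into a temporary regular file and hand that path to the program that needs to seek. The helper name `with_spooled` is made up for illustration, and it assumes `mktemp` is available.

```shell
# Hypothetical helper: copy stdin into a temporary regular file, then run
# the given command with that file's path appended as the last argument.
# The body runs in a subshell so its variables do not leak to the caller.
with_spooled() (
    tmp=$(mktemp) || exit 1
    cat > "$tmp"          # drain the pipe into a seekable file
    "$@" "$tmp"           # run the command against the file
    status=$?
    rm -f "$tmp"
    exit "$status"
)

# Demo with cat standing in for a program that needs a real file.
printf 'hello\n' | with_spooled cat

# With a real archive this would look something like:
#   with_spooled unzip -l < z
```

The cost is an extra copy on disk, which is exactly what process substitution was meant to avoid, so this only makes sense for programs that really do seek.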


Useful. I'm assuming those are cases where normal pipes would fail too? So you can't do:

    cat z | unzip   # I know, uuoc, demo only

Is it just that with process substitution you have more flexibility to shoot yourself in the foot?


Aside: If I recall correctly, with the zip file format, the index is at the end of the file. A (named) pipe works fine with, for instance, a bzipped tarball.
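A rough sketch of the tarball case: tar archives are read front to back, so the whole round trip can run through pipes without a seekable file. gzip stands in for bzip2 here only because it is more likely to be installed; the file name `a.txt` is a placeholder.

```shell
# Build a tiny compressed tarball and list it again, all through pipes.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/a.txt"

# tar writes to stdout (-f -), gzip compresses and decompresses the stream,
# and the second tar lists the archive it reads from stdin.
listing=$(tar -cf - -C "$workdir" a.txt | gzip -c | gzip -dc | tar -tf -)
echo "$listing"

rm -rf "$workdir"
```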

I wouldn't be surprised if the ZIP file format has its origins outside the Unix world, given its pipe-unfriendliness.


stdin and stderr are assumed to always be streams, so anything which accepts them won't seek on them.

However, if you give a program a filename on a command line, the program's likely to assume it's an actual file.
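A program that wants to be careful can check what kind of path it was handed before assuming it can seek. A small sketch using a named pipe; the function name `describe` and the file names are made up for the demo.

```shell
# Report whether a path is a pipe (FIFO), a regular file, or something else.
describe() {
    if [ -p "$1" ]; then
        echo "pipe"
    elif [ -f "$1" ]; then
        echo "regular file"
    else
        echo "other"
    fi
}

demo_dir=$(mktemp -d)
mkfifo "$demo_dir/fifo"      # named pipe: not seekable
: > "$demo_dir/plain"        # empty regular file: seekable

describe "$demo_dir/fifo"    # prints: pipe
describe "$demo_dir/plain"   # prints: regular file

rm -rf "$demo_dir"
```

In bash on Linux, `describe <(echo hi)` should report a pipe as well, since the /dev/fd path points at one, though I have not checked that on other platforms.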


I often use SymbolHound in these cases - it's a search engine with support for special characters; for example: http://symbolhound.com/?q=%3C%28%29


Thanks for that. I've already made use of it since you pointed it out.


For this in particular, try the Advanced Bash Scripting doc, http://www.tldp.org/LDP/abs/html/

There's a bunch of interesting constructs there. Most of them also apply to improved shells such as zsh, though some are just pointless there.


I've seen the ABS guide criticized for being obsolete and recommending wrong or obsolete best practices. A recommended replacement is The Bash Hacker's Wiki: http://wiki.bash-hackers.org/doku.php

Which itself recommends as best (current) alternative: Greg's Bash Guide: http://mywiki.wooledge.org/BashGuide


man pages!

  $ man bash
  /<\(
  
Drops you right into the Process Substitution section.


    /<\(

For those wondering, man, which uses less as a pager, has vi-like key bindings.

"/<\(" starts a regex-based search for "<(" (you must escape the open paren).

This is the origin of the regex literal syntax in most programming languages that have them. It was first introduced by Ken Thompson in the "ed" text editor.
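The escaping rule is easy to try outside the pager. A quick sketch with grep, on the assumption that the pager's regex flavor treats an unescaped paren as a group, the way POSIX extended regexes do; the sample line is made up.

```shell
sample='diff <(sort a) <(sort b)'

# Extended regex (-E): "(" opens a group, so it must be escaped to match literally.
printf '%s\n' "$sample" | grep -E -c '<\('

# Basic regex (grep's default): "(" is already a literal character.
printf '%s\n' "$sample" | grep -c '<('
```

Both commands print 1, the number of matching lines.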


Ahhhh..haaa...ha....DOH! I've never even thought of looking at the manpage for bash before. Thanks, you've just made my life better.


I try to read it about once a year. I always find something new before my eyes glaze over and I'm done until next year.


If you're interested in stock POSIX shell rather than bashish, the dash man page is a whole lot shorter and easier to follow, and it makes a great concise reference.


Nitpick: that should read "POSIX-compliant shell"; there isn't a "stock POSIX shell".


That's weird, isn't it? You wanted to know how to use a feature of bash and didn't check the manual?


Do you expect `man python` to output the full reference for the Python programming language?


I would, especially considering that "The Python Language Reference" is probably shorter than bash's manpage.


I wish that it did. The man pages really are supposed to be the manual.


I agree, though it does tell you where to look for it.


A bit OT but I don't understand why Google doesn't supply a way to do strict searches where everything you input is interpreted literally.


They have. I complained loudly about this[1] and never heard anything back (this is SOP, I understand), but I have seen improvements over the last year.

Double quotes around part of a query mean "make sure this part is actually matched in the index." (I think they still annoy me by including sites that are merely linked to using this phrase[2], but that is understandable.)

Then there is the "verbatim" setting that you can activate under search tools > "All results" dropdown.

[1]: And the reason they annoyed me was that they would still fuzz my queries despite me double-quoting and choosing verbatim.

[2]: To verify this you could open the cached version and on top of the page you'd see something along the lines of: "the following terms exist only in links pointing to this page."


They still ignore any special characters: https://www.google.com/search?q=%22%3C()%22&tbs=li:1


Because if you want to ignore punctuation and case in normal situations, you leave them out of the search index. And then you can't query the same search index for punctuation and/or case-sensitive queries.

So they'd have to create a second index for probably less than 0.01% of their queries, and that second index would be larger and harder to compress.

As much as I'd love to see a strict search, from a business perspective I don't think it makes sense to provide one.
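To make the indexing point concrete, here is a toy normalizer. It is only an illustration of the idea, not a claim about how any real search engine tokenizes; the function name `tokenize` is made up.

```shell
# Toy tokenizer: lowercase everything, split on anything that is not a
# letter or digit, and drop empty lines.
tokenize() {
    tr '[:upper:]' '[:lower:]' | tr -cs '[:alnum:]' '\n' | sed '/^$/d'
}

# Words survive, punctuation does not.
printf 'diff <(sort a) <(sort b)\n' | tokenize

# A query made only of punctuation produces no tokens at all.
printf '<()\n' | tokenize | wc -l
```

Once the index has been built from tokens like these, a query for `<()` has nothing left to match against, which is why a literal search would need a separate index.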


I wish they'd supply that too, but they do seem to have gotten better at interpreting literally when it makes sense in context. I've been learning C# and have found, for example, that searches with the term "C#" return the appropriate resources when in the past I'd have probably seen results for C.


Google handles some constructs with punctuation as atomic tokens as special cases. C# and C++ are examples. A# through G# also return appropriate results, for the musical notes. H# and onward through the alphabet do not.

.NET is another example. Google will ignore a prepended dot on most words, but .NET is handled specially as an atomic token. I would bet this is a product of human curation, not of algorithms that have somehow identified .NET as a semantic token.

Searching for punctuation in a general case is hard, though. You wouldn't want a search for Lisp to fail to match pages with (Lisp). We often forget that the pages are tokenized and indexed, that Google and the other search engines aren't a byte-for-byte scan across the entire web.

I was recently trying to understand the difference between the <%# and <%= server tags in ASP.NET. Google couldn't even interpret those as tokens to search for. It took me a long time to figure out the former's true name as the data-bind operator in order to search for that and find the MS docs.


Occasionally it's useful to spell out the names of the characters, both when searching and when writing documentation, blog posts, and SO Q&A. That way, searching for "asp.net less than percent hash" might tell you it's the data-bind operator.


#bash on freenode, and http://mywiki.wooledge.org/BashFAQ


books on UNIX and shells



