Here's an interesting bit of...abuse :) Since Bash is effectively stringly-typed, it can be used as a functional programming language, with pipes similar to function composition.

e.g.: wtfp.sh

    #!/usr/bin/env bash

    map() {
      local fn="$1"

      local input
      while read -r input; do
        "${fn}" "${input}"
      done
    }

    reduce() {
      local fn="$1"
      local init="$2"

      local input
      local output="${init}"

      while read -r input; do
        output="$("${fn}" "${output}" "${input}")"
      done

      echo "${output}"
    }

    filter() {
      local fn="$1"

      local input
      while read -r input; do
        "${fn}" "${input}" && echo "${input}"
      done
    }

    add() {
      echo $(( $1 + $2 ))
    }

    increment() {
      echo $(( $1 + 1 ))
    }

    square() {
      echo $(( $1 * $1 ))
    }

    even() {
      return $(( $1 % 2 ))
    }

    sum() {
      reduce add 0
    }

    map increment | map square | filter even | sum

...then:

    $ printf "%s\n" 1 2 3 4 5 | ./wtfp.sh
    56
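
One caveat with the helpers above: a plain read -r trims leading and trailing whitespace, and the loop silently drops a final line that has no trailing newline. A minimal sketch of a stricter map, assuming you want each line passed through untouched. The bracket helper is hypothetical and only there to make the whitespace visible.

```shell
#!/usr/bin/env bash

# Variant of map that keeps surrounding whitespace and still handles
# a last line that is missing its trailing newline.
map() {
  local fn="$1"

  local input
  # IFS= stops read from trimming whitespace; the || [[ -n ... ]] clause
  # runs the body once more when read hits EOF with a partial line.
  while IFS= read -r input || [[ -n "${input}" ]]; do
    "${fn}" "${input}"
  done
}

# Hypothetical helper for illustration: wrap the argument in brackets.
bracket() {
  printf '[%s]\n' "$1"
}

# Prints "[  hello]" and then "[world]".
printf '  hello\nworld' | map bracket
```

The same two changes should apply to the read loops in filter and reduce.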



This is the pipe mill[1] pattern. As you say, you can use it to mimic function composition from the functional paradigm. It can make for some elegant ways of processing streams of data.

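The issue is that pipe mills are very slow. When stdin is a pipe, bash's read builtin has to fetch input a byte at a time so that it never consumes data past the newline, and the loop body then runs once per line.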

    $ mill() { while read -r line; do echo "$line"; done; }
    $ export FIVE_MEGS=$(( 5 * 1024 ** 2 ))

    $ time yes | mill | pv -S -s "$FIVE_MEGS" > /dev/null
    5.00MiB 0:00:23 [ 221KiB/s] [============>] 100%            
    real    0m23.084s
    user    0m14.121s
    sys     0m26.780s

    $ time yes | pv -S -s "$FIVE_MEGS" > /dev/null
    5.00MiB 0:00:00 [6.55GiB/s] [============>] 100%            
    real    0m0.005s
    user    0m0.000s
    sys     0m0.006s
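
A likely contributor to the gap, sketched here rather than measured: read must leave everything after the first newline in the pipe for whatever runs next, so it cannot pull input in large blocks the way yes and pv do. The function name take_first_then_rest is made up for the example.

```shell
#!/usr/bin/env bash

# Read exactly one line, then hand the rest of the pipe to cat.
take_first_then_rest() {
  local first
  read -r first
  echo "read got: ${first}"
  cat   # prints whatever read left unread in the pipe
}

# Prints "read got: a", then "b", then "c".
printf '%s\n' a b c | take_first_then_rest
```

If read had buffered ahead, cat would see nothing. Preserving that guarantee on a pipe is what forces the small reads.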

Even a Python loop is much faster, at roughly 300 times the throughput:

    $ export PYLOOP="from sys import stdin
    for line in stdin:
      print(line)"

    $ time yes | python3 -c "$PYLOOP" | pv -S -s "$FIVE_MEGS" > /dev/null
    5.00MiB 0:00:00 [67.1MiB/s] [============>] 100%            
    real    0m0.082s
    user    0m0.071s
    sys     0m0.019s

[1] https://en.wikipedia.org/wiki/Pipeline_(Unix)#Pipemill


Did you try running the same benchmark in dash or ksh? Looping in bash is criminally slow.


I did the comparison if anyone is curious.

dash is faster, but not by as much as I'd expected, at roughly 3x bash (ksh93 manages roughly 2x):

    $ time yes | bash mill | pv -S -s "$FIVE_MEGS" > /dev/null
       5MiB 0:00:24 [ 208KiB/s] [====================================================>] 100%            

    real 0m24.614s
    user 0m21.338s
    sys 0m11.749s

    $ time yes | ksh93 mill | pv -S -s "$FIVE_MEGS" > /dev/null
       5MiB 0:00:11 [ 447KiB/s] [====================================================>] 100%            

    real 0m11.451s
    user 0m6.480s
    sys 0m12.441s

    $ time yes | dash mill | pv -S -s "$FIVE_MEGS" > /dev/null
       5MiB 0:00:07 [ 663KiB/s] [====================================================>] 100%            

    real 0m7.720s
    user 0m4.687s
    sys 0m8.934s
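
For anyone who wants to repeat this without pv, here is a rough sketch. The contents of the mill script are not shown above, so mill_src is an assumption, as are the bench_shells name and the 20000-line input size. Shells that are not installed are skipped.

```shell
#!/usr/bin/env bash

# Assumed body of the "mill" script: a plain POSIX read/echo loop.
mill_src='while read -r line; do echo "$line"; done'

# Time the same loop under each shell named on the command line.
bench_shells() {
  local sh
  for sh in "$@"; do
    if ! command -v "$sh" >/dev/null 2>&1; then
      echo "$sh: not installed, skipping"
      continue
    fi
    echo "== $sh"
    # head bounds the input, so the run ends without needing pv.
    time (yes | head -n 20000 | "$sh" -c "$mill_src" >/dev/null)
  done
}

bench_shells bash ksh93 dash
```

The timings go to stderr, and the absolute numbers will differ from the ones above since the input is much smaller.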

awk is super fast, as expected, finishing in under half a second:

    $ time yes | awk '{ print $0 }' | pv -S -s "$FIVE_MEGS" > /dev/null
       5MiB 0:00:00 [11.5MiB/s] [====================================================>] 100%            

    real 0m0.439s
    user 0m0.444s
    sys 0m0.011s
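
You can keep the compositional style and still avoid the shell loop by making each stage a single awk process. A minimal sketch of the increment, square, even, sum pipeline from the top of the thread. The stage names are invented for the example.

```shell
#!/usr/bin/env bash

# Each stage is one awk process that streams every input line,
# instead of a shell loop that runs a function per line.
increment_all() { awk '{ print $1 + 1 }'; }
square_all()    { awk '{ print $1 * $1 }'; }
keep_even()     { awk '$1 % 2 == 0'; }
sum_all()       { awk '{ total += $1 } END { print total + 0 }'; }

# Prints 56, matching the wtfp.sh run above.
printf '%s\n' 1 2 3 4 5 | increment_all | square_all | keep_even | sum_all
```

The trade-off is that the per-item functions now live inside awk programs, so they are no longer plain shell functions you can pass around by name.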




