The point here is that the same functionality could be implemented far more efficiently, even with shell scripts.
The idea of using standard UNIX tools for the showcase is a good one. Basically, it tells you that a modern FS is very good at storing chunks of read-only data (one doesn't need Java for that), with efficient caching and in-kernel procedures. It also tells you that using pthreads for jobs is a waste, because context switching has its costs, etc.
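To make the "the FS already does the hard part" point concrete, here is a minimal sketch of a word count over read-only chunk files. It is only an illustration, not a benchmark and not a Hadoop replacement. Python is used purely for readability; the names `count_words`, `word_count` and `demo`, and the `chunk0`/`chunk1` files, are made up for this example. The chunks are memory-mapped read-only, so repeated reads are served from the kernel's page cache rather than from any cache in the program itself.

```python
import mmap
import os
import tempfile
from collections import Counter


def count_words(path):
    """Map step: count whitespace-separated words in one read-only chunk."""
    if os.path.getsize(path) == 0:
        return Counter()  # mmap cannot map an empty file
    with open(path, "rb") as f:
        # Read-only mapping: the pages come from the OS page cache.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return Counter(view[:].split())


def word_count(paths):
    """Reduce step: merge the per-chunk counts into one total."""
    total = Counter()
    for path in paths:
        total.update(count_words(path))
    return total


def demo():
    """Write two tiny hypothetical chunks to a temp dir and count them."""
    with tempfile.TemporaryDirectory() as d:
        paths = []
        for i, text in enumerate([b"a b a\n", b"b c a\n"]):
            path = os.path.join(d, f"chunk{i}")
            with open(path, "wb") as f:
                f.write(text)
            paths.append(path)
        return word_count(paths)
```

Calling `demo()` returns a `Counter` keyed by byte strings (`b"a"`, `b"b"`, `b"c"`). The chunks are processed one after another here; whether to fan the map step out over processes or threads is exactly the trade-off the comment is arguing about, so the sketch leaves it out.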
To put it simply: by merely rewriting the basic functionality in, say, Erlang, one could get an implementation that is orders of magnitude more efficient.
The only selling point of Hadoop is that it exists (mature, stable, blah-blah). It also has one problem: Java. But as long as hardware is cheap and credit is easy, who cares?