Spark requires Hadoop to run, so this whole Spark vs Hadoop debate makes no sense whatsoever.
There is a place for arguing how effective Map/Reduce is, but it's been known for years that M/R is neither the only nor the best general-purpose algorithm for solving all problems. More and more tools these days do not use M/R, Spark included, and Spark certainly is not the first tool to provide an alternative to M/R. AFAIK Google abandoned M/R years ago.
I just don't understand this constant boasting about Spark, it seems very suspicious to me.
This is not correct. Spark uses the Hadoop Input/Output API, but you don't need any Hadoop component installed to run Spark, not even HDFS.
You can -- and many companies do -- run Spark on Mesos or on Spark's standalone cluster manager, and use S3 as their storage layer.
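For what it's worth, a setup like that might look roughly like the sketch below. The host name is a made-up placeholder, and I'm assuming the settings live in `conf/spark-defaults.conf`; adjust for your own cluster.

```
# conf/spark-defaults.conf (sketch; master.example.com is a hypothetical host)

# Use Spark's standalone cluster manager; no YARN or HDFS daemons involved.
spark.master    spark://master.example.com:7077

# On Mesos the master URL would instead take this form:
# spark.master  mesos://master.example.com:5050
```

Jobs would then read their input from an S3 URI (e.g. an `s3a://` path) instead of an `hdfs://` one. The S3 connector is itself a Hadoop client library pulled in as a jar, which fits the point above: Spark uses Hadoop's I/O API without needing a Hadoop cluster installed.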
> this whole Spark vs Hadoop debate makes no sense whatsoever
If we talk about Hadoop as an ecosystem of tools, then yes, it doesn't make sense to frame Spark as a competitor. Spark is part of that ecosystem.
But if we talk about Hadoop as Hadoop 1 MapReduce or as Hadoop 2 Tez, both of which are execution engines, then it very much makes sense to pit Spark against them as an alternative execution engine.
Granted, Hadoop 1 MapReduce is pretty old compared to Spark, and Tez is still under heavy development, but these are alternatives and not complements to Spark.
(Note: In Hadoop 2, MapReduce is just one framework running on top of YARN. Tez is a separate execution engine, also running on YARN, that tools like Hive and Pig can use in place of MapReduce.)
> I just don't understand this constant boasting about Spark, it seems very suspicious to me.
Suspicious how?
I think Spark's elegant API, unified data processing model, and performance -- all of which are documented very well in demos and benchmarks online -- merit the excitement that you see in the "Big Data" community.
2. Yes, Spark was/is buggy.
3. For me Spark is really a paradigm shift, a next-generation framework compared to M/R.