DataFusion outperforms Spark by a large margin. In my experience it is on par with Photon; see the benchmarks at https://github.com/blaze-init/blaze.
Ah nice, thank you for sharing that. I hadn’t seen it before. Congrats on beating Spark that hard, and I hope it continues to improve!
As an aside, maybe it would make sense to publish a new blog post somewhere so that the top hit on Google for “DataFusion benchmark” isn’t that post I linked.
Haha, yeah, we should definitely put a little more effort into SEO :) Everyone is so focused on the hard-core engineering right now. I think Matthew from the community is actually working on a new comprehensive benchmark for us, which I hope will be published soon.