Hacker News new | past | comments | ask | show | jobs | submit login

It's not the same. Data fusion comes with ballista, with the goal of replacing spark for many usages. It also supports JSON and Avro



Yes, it's not the same, but they serve the same purpose. It's, honestly, not important if Datafusion or DuckDB are using the arrow memory layout or not. What matters is their ability to run SQL queries (or Map-Reduce workloads) on CSV/Parquet files _WITHOUT COPYING THEM_.

If you start comparing them to solutions that copy datasets, you haven't understood what problem they are solving. For that problem, use postgresql or bigquery.


What have I not understood ? Please enlighten me.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: