Hacker News new | past | comments | ask | show | jobs | submit login

I'm using both Spark and polars, to me the appeal of polars is additionally it is also much faster and easier to set up.

Spark is great if you have large datasets since you can easily scale as you said. But if the dataset is small-ish (<50 million rows) you hit a lower bound in Spark in terms of how fast the job can run. Even if the job is super simple it take 1-2 minutes. Polars on the other hand is almost instantaneous (< 1 second). Doesn't sound like much but to me makes a huge difference when iterating on solutions.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: