
I'm not including parsing time; both the pandas and polars versions started from an in-memory data structure parsed from two XML files (low-GB range). This is on my workstation with a single Xeon 4210 (10 cores, 20 threads @ 2.20-3.20 GHz).

Perhaps I can focus on a subset of this processing and write it up, since there seems to be at least some interest in real examples. As pointed out in a reply to a sibling comment, I don't guarantee that my starting code is the best that pandas can do -- to be honest, the runtime of the original code did not line up with my intuition of how long these operations should take. Maybe someone will school me, but either way, switching to polars was a relatively easy win that came with other benefits and feels right to me in a way that pandas never did.




Is polars not parallelizing some ops on the GPU?


It has zero GPU support for now.


Important point.

Nowadays, we write a pure pandas version, and when the data needs to be 100X bigger and faster, we change almost nothing and run it on the GPU via cudf, a GPU runtime that closely follows the pandas API. Most recently, we ported GFQL (Cypher graph queries on dataframes) to GPU execution over the holiday weekend, and it already beats most Cypher implementations: think billions of edges traversed per second on a cheap five-year-old GPU.
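The "change almost nothing" workflow looks roughly like this sketch. The edge-list data is hypothetical; the script itself is plain pandas, and on a CUDA-capable machine you can run the same code on GPU either by swapping the import for cudf or by launching it under RAPIDS' cudf.pandas accelerator (`python -m cudf.pandas script.py`):

```python
import pandas as pd  # on a GPU box: `import cudf as pd`, or run via `python -m cudf.pandas`

# Hypothetical toy edge list; cudf implements the same DataFrame/groupby API.
df = pd.DataFrame({
    "src":    [0, 0, 1],
    "dst":    [1, 2, 2],
    "weight": [1.0, 2.0, 0.5],
})

# Weighted out-degree per source node -- unchanged between CPU and GPU runs.
out_degree = df.groupby("src", as_index=False)["weight"].sum()
```

The CPU version is what you develop and test against; the GPU swap is a deployment decision, not a rewrite.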

We're planning the bigger-than-memory and multi-node versions next, for both CPU and GPU. While cudf leans towards dask_cudf, plans are still TBD; Polars, Ray, and Dask all have sweet spots here.



