Exploiting column chunks for faster ingestion and lower memory use (rerun.io)
15 points by Tycho87 7 months ago | 2 comments



(Read this really quickly and bookmarked for a deeper read later.)

This part really caught my attention:

    > That's an improvement of 100x for write and ingestion speed, and 35x for memory overhead!
Where'd you guys get the idea for this approach? Did you know you could get this kind of improvement?


One of the founders of Rerun here. I don't remember exactly where the idea came from, but I think it was basically two things. First, creating chunks of columns to store or pass around is a pretty common approach in data systems; Parquet files have the concept of row groups, for instance, which is pretty similar (the main difference is that chunks don't have to include all columns). Second, it was just quite obvious that we needed to amortize the fixed costs better for small data somehow.
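To illustrate the amortization point, here's a minimal toy sketch (not Rerun's actual API; all names here are made up). Each ingest call pays a fixed overhead regardless of how many rows it carries, so batching rows into a column chunk pays that overhead once per chunk instead of once per row:

```python
class ColumnChunk:
    """A batch of rows stored column-wise; need not include every column."""
    def __init__(self, columns):
        self.columns = columns  # dict: column name -> list of values
        lengths = {len(v) for v in columns.values()}
        assert len(lengths) == 1, "all columns in a chunk share one row count"
        self.num_rows = lengths.pop()

class ToyStore:
    def __init__(self):
        self.data = {}             # column name -> accumulated values
        self.fixed_cost_paid = 0   # counts fixed-overhead events (locking,
                                   # index updates, allocation, etc.)

    def ingest(self, chunk):
        self.fixed_cost_paid += 1  # fixed cost: paid once per ingest call
        for name, values in chunk.columns.items():
            self.data.setdefault(name, []).extend(values)

# Row-at-a-time: 1000 ingest calls -> fixed cost paid 1000 times.
row_store = ToyStore()
for i in range(1000):
    row_store.ingest(ColumnChunk({"time": [i], "value": [i * 0.5]}))

# Chunked: one ingest call for the same 1000 rows -> fixed cost paid once.
chunk_store = ToyStore()
chunk_store.ingest(ColumnChunk({
    "time": list(range(1000)),
    "value": [i * 0.5 for i in range(1000)],
}))

assert row_store.data == chunk_store.data  # same logical contents
print(row_store.fixed_cost_paid, chunk_store.fixed_cost_paid)  # 1000 1
```

The same idea shows up in Parquet's row groups; the chunk variant is just looser in that a chunk can carry any subset of columns.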



