
The app was a C# frontend (https://github.com/kg/HeapProfiler) that drove Windows OS services to take heap snapshots and capture stack traces, so I ended up writing a custom key/value store in C# to avoid cross-language interop, marshaling, etc. (the cost of sending blobs to SQLite and running queries was adding up). It's hard to beat the best-in-class optimized databases on their own turf, but if you just need a spot to dump your data into, you end up being a lot faster.

By the end it ran fast enough to saturate the kernel's paging infrastructure and make my mouse cursor stutter. I was able to take 1-2 snapshots per second of a full running Firefox process with real webpages in it, so it was satisfactory. SQLite couldn't process the amount of data I was pumping in at that rate (but it still performed pretty well - maybe a few snapshots per minute).

At the time I did investigate other data stores and the only good candidates I ran across used incompatible open source licenses, so I was stuck doing it myself. Fun excuse to learn how to write and optimize btrees for throughput :-)




Yeah, most databases will probably show pathological behavior under your requirements (especially on tail latency, which you would care about). Many implementations of similar tools put lightweight compression on top, just dump the snapshots to disk, and then run a post-processing pass for queries. Dumping snapshots is also preferred because you can insert checksums and checkpoints for partial data recovery if there are failures.
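
To make the dump-with-checksums idea concrete, here is a rough Python sketch, not anyone's actual implementation. It assumes a made-up record layout (8-byte header holding the compressed length and a CRC32, followed by a zlib-compressed payload), and the names `append_snapshot` and `recover_snapshots` are hypothetical. Recovery stops at the first truncated or corrupted record, so everything written before a failure is still readable.

```python
import struct
import zlib
from typing import List

# Assumed record layout (illustrative only): little-endian
# [compressed length: u32][CRC32 of compressed bytes: u32][compressed bytes]
HEADER = struct.Struct("<II")


def append_snapshot(log: bytearray, payload: bytes) -> None:
    """Append one snapshot to the log with fast, lightweight compression."""
    compressed = zlib.compress(payload, 1)  # level 1 favors speed over ratio
    log += HEADER.pack(len(compressed), zlib.crc32(compressed))
    log += compressed


def recover_snapshots(data: bytes) -> List[bytes]:
    """Return every intact snapshot before the first damaged record."""
    snapshots = []
    offset = 0
    while offset + HEADER.size <= len(data):
        length, checksum = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(data):
            break  # record was cut off, e.g. by a crash mid-write
        chunk = data[start:end]
        if zlib.crc32(chunk) != checksum:
            break  # checksum mismatch: stop at the corrupted record
        snapshots.append(zlib.decompress(chunk))
        offset = end
    return snapshots
```

In a real tool the `bytearray` would be a file opened in append mode, and queries would run in a separate post-processing pass over the recovered snapshots.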



