Blosc – A high performance compressor optimized for binary data (blosc.org)
175 points by tonteldoos on June 11, 2020 | 23 comments



It's basically a byte- or bit-shuffling filter (very fast, SIMD-optimized) in front of several modern compressors (LZ4, Zstd, their own BloscLZ), with a self-describing header. So if you have an array of 100 8-byte values, the result of shuffling is the 100 first bytes, followed by the 100 second bytes, and so on.
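A minimal sketch of the byte-shuffle idea in numpy (the real filter is SIMD-optimized C inside Blosc; this only shows the transposition):

    import numpy as np

    def byte_shuffle(values: np.ndarray) -> bytes:
        """Gather byte 0 of every value, then byte 1, etc."""
        typesize = values.dtype.itemsize
        # View the buffer as an (N, typesize) byte matrix and transpose it.
        matrix = np.frombuffer(values.tobytes(), dtype=np.uint8).reshape(-1, typesize)
        return matrix.T.tobytes()

    # 100 small positive 8-byte ints: on a little-endian machine, the 7
    # high-byte planes become one long run of zeros after shuffling.
    data = np.arange(100, dtype=np.int64)
    assert byte_shuffle(data)[100:] == bytes(700)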

It shines when values are of fixed size with lots of similar bits, e.g. positive integers of the same magnitude. It's not so good for doubles, where the bits change a lot. Also, if storing diffs, it helps to take the diff from the initial value in a chunk rather than from the previous value, so that the deltas change sign less often (a sign change flips most of the high bits in two's complement).
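A toy comparison of the two delta schemes (illustrative only; this kind of delta filtering happens before the bytes are handed to Blosc):

    import numpy as np

    chunk = np.array([1000, 1005, 1003, 1010, 1018], dtype=np.int64)
    prev_deltas   = np.diff(chunk)         # [ 5, -2,  7,  8] -- sign flips
    anchor_deltas = chunk[1:] - chunk[0]   # [ 5,  3, 10, 18] -- all positive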

From my own usage: for the same data, C# decimal (a 16-byte struct) compressed much better than doubles (in final absolute blob size), even though decimal takes 2x more memory uncompressed.

If the data items have few similar bits/bytes, then it's the underlying compressor that matters.


Back when I did HPC work, I used Blosc to compress information about atoms for molecular dynamics simulations before transferring the data over InfiniBand interconnects. Despite the high speed of the interconnects, it was actually faster to compress, transmit, and decompress using Blosc than to transmit the raw data.


Your job back then sounded wonderfully interesting! :)


BTW, I'm currently tasked with Kolmogorov complexity estimation, so could someone recommend the best compressors (from a compression-ratio point of view)?



Blosc is an outstanding project. I have used it with great success in finance and general data science, in production, with very large total datasets (one custom binary format and one leveraging protobufs).

It really shines first and foremost as a meta-compressor, giving the developer a clean block-based API. Once integrated (which really is quite easy), you can experiment easily with different compressors and preconditioners to see what works best with your dataset. These things can be changed at runtime, giving you great flexibility.
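For instance, with the python-blosc wrapper (a sketch using that wrapper's `compress` parameters), swapping codec and filter is just an argument change:

    import blosc
    import numpy as np

    data = np.linspace(0.0, 1.0, 1_000_000).tobytes()

    # Same call, different codec/filter each run -- nothing else changes.
    for cname in ("blosclz", "lz4", "zstd"):
        for shuffle in (blosc.NOSHUFFLE, blosc.SHUFFLE, blosc.BITSHUFFLE):
            packed = blosc.compress(data, typesize=8, cname=cname, shuffle=shuffle)
            print(f"{cname:8s} shuffle={shuffle}: {len(packed):>9,d} bytes")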

Francesc has been advancing blosc consistently with a steady vision for years and years. It is one of the most underrated tools around IMO.


Apparently they have several benchmarks where they claim that decompression is faster than memcpy (!).

However, this is only the case because on several Intel x86_64 benchmarks they report memcpy performance of 5-10 GB/s, while even a basic dual-channel DDR3 setup has 20 GB/s of memory bandwidth, and a modern quad-channel DDR4 system can reach 76.8 GB/s. There is no reason for a properly implemented memcpy to be substantially slower than memory bandwidth (AVX can read two and write one 256-bit operand per cycle, i.e. a 128 GB/s memcpy at 4 GHz).

Am I missing something or is this another case of "implausible claims = they screwed the benchmark = they are incompetent/malicious"?


The absolute numbers don't seem far-fetched. An AVX-optimized memcpy on my high-end machine (DDR4) has a throughput of 30 GB/s.

As long as they are using the same memcpy routine in both the decompression case and the 'only memcpy' case, that seems reasonable. Obviously, the quicker memcpy becomes, the faster the decompression has to become to maintain the same performance ratios, but things like faster clock speeds or multi-threading can make that issue moot.
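A quick way to sanity-check this on your own hardware, sketched with the python-blosc wrapper (a plain buffer copy stands in for memcpy; absolute numbers will vary with codec, data, and thread count):

    import time
    import blosc
    import numpy as np

    data = np.zeros(50_000_000, dtype=np.int64).tobytes()  # compresses extremely well
    packed = blosc.compress(data, typesize=8, cname="lz4", shuffle=blosc.SHUFFLE)

    t0 = time.perf_counter()
    blosc.decompress(packed)
    t1 = time.perf_counter()

    t2 = time.perf_counter()
    bytearray(data)  # stand-in for memcpy: one full copy of the buffer
    t3 = time.perf_counter()

    gb = len(data) / 1e9
    print(f"decompress: {gb / (t1 - t0):.1f} GB/s, copy: {gb / (t3 - t2):.1f} GB/s")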


It's very good! I have used Blosc in developing JDF.jl, a serialization format for DataFrames.

https://github.com/xiaodaigh/JDF.jl


Could you tell us more? Is this meant to be an alternative to Parquet?

In fact, now that I think about it, Parquet supports compression. Shouldn't this just be an option when saving to the Parquet format?


Parquet's Snappy and Brotli compressors are quite OK. Not sure if Blosc is even faster, though.


It is an "alternative" to Parquet, but it only works in the Julia ecosystem at the moment.


Would be cool to see this in ZFS to make compressing binaries even more efficient.


Might the shuffle techniques used before compression be useful for squashfs? We play around with a mesh network (freifunk.net), and there are tons of cheap 4 MB flash devices that need every kB of storage :)


Blosc is an excellent choice if speed is what you are after. Give or take five years ago, I had to use compression to transport a lot of data over ZeroMQ, and Blosc ran circles around all the other compressors.


Yes, it's apparently so fast that in some scenarios it's even usable for compressing RAM. A framework I'm using does that to process much bigger data sets than would otherwise fit in RAM.


Can you be more specific about the framework, the data types, and the access patterns?


It's a framework called OpenVDB [1], which we use to represent and manipulate volumetric data (level sets). It stores the data as a sparse hierarchical grid with (from a practical perspective) infinite dimensions, and allows very efficient iteration and local manipulations of the grid.

I'm not an expert on how it is implemented exactly, but I believe it uses Blosc by saving the leaves of the VDB grids in Blosc-compressed chunks, which are loaded into memory directly and only decompressed on demand when the data is accessed, then re-compressed after the leaves are processed.

[1] https://www.openvdb.org
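A hypothetical sketch of that decompress-on-access pattern (class and method names are made up; this is not OpenVDB's actual implementation):

    import blosc

    class CompressedLeaf:
        """Leaf data lives in memory as a blosc-compressed blob (illustrative only)."""

        def __init__(self, raw: bytes, typesize: int = 4):
            self._typesize = typesize
            self._blob = blosc.compress(raw, typesize=typesize)

        def read(self) -> bytes:
            # Expand only when the data is actually accessed.
            return blosc.decompress(self._blob)

        def write(self, raw: bytes) -> None:
            # Re-compress once processing of the leaf is done.
            self._blob = blosc.compress(raw, typesize=self._typesize)

    leaf = CompressedLeaf(bytes(8 * 8 * 8 * 4))  # an 8x8x8 grid of float32 zeros
    assert leaf.read() == bytes(2048)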


Can someone with Blosc 2 experience tell me the proper conditions for using superchunks vs. frames? When does it become advantageous to use one over the other?

This is a really interesting library.


This would be an excellent candidate to put on an FPGA directly next to the CPU. (Assuming such a thing existed and were open enough to be usable by the general public.)


This looks amazing, and the applications look so diverse! Does anyone know if it can be applied to msgpack?


Not generally, no. blosc is geared towards “rectangular” data — that is, a C-style array of int, double, or some struct type.
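For example, with the python-blosc wrapper, a C-style struct maps to a fixed record size passed as `typesize` (a sketch; the field layout here is made up):

    import blosc
    import numpy as np

    # A made-up record layout: two doubles and an int32, 20 bytes per record.
    records = np.zeros(10_000, dtype=[("x", "<f8"), ("y", "<f8"), ("id", "<i4")])
    packed = blosc.compress(records.tobytes(), typesize=records.dtype.itemsize)
    restored = np.frombuffer(blosc.decompress(packed), dtype=records.dtype)
    assert restored.tobytes() == records.tobytes()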


Can blosc be used to compress/decompress regular zlib streams?



