tl;dr:

  00xxxxxx          - copy (x+1)-th last EXPLICITLY ENCODED pixel (i.e. ignoring repeats)
  010xxxxx          - repeat the last pixel x+1 times
  011xxxxx xxxxxxxx - repeat the last pixel x+33 times
  10rrggbb          - copy the last pixel and adjust RGB by (r-1, g-1, b-1)
  110rrrrr ggggbbbb - copy the last pixel and adjust RGB by (r-15, g-7, b-7)
  1110rrrr rgggggbb bbbaaaaa
                    - copy the last pixel and adjust RGBA by (r-15, g-15, b-15, a-15)
  1111RGBA [rrrrrrrr] [gggggggg] [bbbbbbbb] [aaaaaaaa]
                    - copy the last pixel and replace RGBA if each bit flag is set
So it is essentially PNG filtering limited to the Sub filter, but with better delta coding and a larger history window. My guess is that it might not work well on images with strong vertical gradients (which would need vertical context), but it's nevertheless pretty impressive that such a simple coding is not all that inefficient.
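
For concreteness, here is a minimal dispatch sketch for the run and diff tags from the table above (hypothetical C, not the actual qoi.h code; the index and full-RGBA cases are omitted):

  #include <stdint.h>

  typedef struct { uint8_t r, g, b, a; } px_t;

  /* Decode one run/diff chunk per the table above; returns bytes consumed.
     *px is the previous pixel, updated in place; *run is the repeat count. */
  static int decode_chunk(const uint8_t *in, px_t *px, int *run) {
      uint8_t b0 = in[0];
      *run = 1;
      if ((b0 & 0xE0) == 0x40) {              /* 010xxxxx: short run */
          *run = (b0 & 0x1F) + 1;
          return 1;
      } else if ((b0 & 0xE0) == 0x60) {       /* 011xxxxx xxxxxxxx: long run */
          *run = (((b0 & 0x1F) << 8) | in[1]) + 33;
          return 2;
      } else if ((b0 & 0xC0) == 0x80) {       /* 10rrggbb: small diff */
          px->r += ((b0 >> 4) & 3) - 1;
          px->g += ((b0 >> 2) & 3) - 1;
          px->b += (b0 & 3) - 1;
          return 1;
      } else if ((b0 & 0xE0) == 0xC0) {       /* 110rrrrr ggggbbbb: medium diff */
          px->r += (b0 & 0x1F) - 15;
          px->g += (in[1] >> 4) - 7;
          px->b += (in[1] & 0x0F) - 7;
          return 2;
      }
      return 1;                               /* other tags omitted here */
  }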



You might be able to get better compression by traversing the pixels in Hilbert [1] or Z [2] order, so that vertical neighbours are taken into account as well as horizontal ones. It might cost time or space though, as you couldn't stream the pixels through it anymore - you'd need them all in memory. (A sketch of the Z-order index math follows the links.)

[1] https://en.wikipedia.org/wiki/Hilbert_curve

[2] https://en.wikipedia.org/wiki/Z-order_curve
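
The Z-order index is just bit interleaving; a minimal sketch in C, assuming 16-bit coordinates:

  #include <stdint.h>

  /* Spread the 16 low bits of v so each lands in an even bit position. */
  static uint32_t part1by1(uint32_t v) {
      v &= 0x0000FFFF;
      v = (v | (v << 8)) & 0x00FF00FF;
      v = (v | (v << 4)) & 0x0F0F0F0F;
      v = (v | (v << 2)) & 0x33333333;
      v = (v | (v << 1)) & 0x55555555;
      return v;
  }

  /* Z-order (Morton) index of pixel (x, y): nearby pixels in both x and y
     map to nearby indices, so the "last pixel" context sees both axes. */
  static uint32_t morton2d(uint32_t x, uint32_t y) {
      return part1by1(x) | (part1by1(y) << 1);
  }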


You could still stream the pixels, you'd just have to stream them in Z order. This may seem pedantic, but this applies to any mismatch between the order in which you store pixels and the order in which the format stores pixels.

E.g. some file formats may be row-major, others column-major, some block-based (like JPEG) or a fancy interleaving scheme (like Adam7). Some might store the channels interleaved, others separate the channels. If any of these choices doesn't match the desired output format, it breaks streaming.


And formats like BMP are stored upside down...


The nice thing is that if M >= B^2 (i.e., total memory is large enough to fit a square region of the image, where each row/column of the square fills a full block/page of memory), you can transform from row/column order to Hilbert/Z-order without needing to do asymptotically more I/Os.

So you can't do such a conversion in a streaming fashion, but there is no need to load all data in memory either.
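
A minimal sketch of that idea in C (hypothetical; assumes width and height are multiples of the tile side and 4 bytes per pixel; tiles are emitted in plain raster order here, and visiting them along the curve is just a change to the outer loops):

  #include <stdio.h>
  #include <stdint.h>

  enum { T = 64 };                    /* tile side: only T*T*4 bytes resident */

  /* Rewrite a w x h row-major image as a sequence of contiguous T x T
     tiles, holding only one tile in memory at a time. Each tile row costs
     one seek+read, i.e. one block I/O for T pixels. */
  static void tile_reorder(FILE *in, FILE *out, int w, int h) {
      static uint8_t tile[T * T * 4];
      for (int ty = 0; ty < h; ty += T)
          for (int tx = 0; tx < w; tx += T) {
              for (int y = 0; y < T; y++) {
                  fseek(in, ((long)(ty + y) * w + tx) * 4, SEEK_SET);
                  fread(tile + (size_t)y * T * 4, 4, T, in);
              }
              fwrite(tile, 4, (size_t)T * T, out);
          }
  }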


See also: Intel's guide to looping over smaller chunks of 2D arrays for better cache utilization:

https://www.intel.com/content/www/us/en/developer/articles/t...
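
The core trick from that guide, in a nutshell - a generic cache-blocked transpose, not code from the article:

  enum { BLK = 32 };   /* tune to cache; ~BLK*BLK*2 elements in flight */

  /* Cache-blocked transpose: work on BLK x BLK chunks so that both the
     source rows and the destination columns stay cache-resident. */
  static void transpose(int n, const float *src, float *dst) {
      for (int ii = 0; ii < n; ii += BLK)
          for (int jj = 0; jj < n; jj += BLK)
              for (int i = ii; i < ii + BLK && i < n; i++)
                  for (int j = jj; j < jj + BLK && j < n; j++)
                      dst[(size_t)j * n + i] = src[(size_t)i * n + j];
  }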


Yeah, those images explain it pretty well. There is one slight difference, though.

Because the Hilbert/Z-order curves are defined recursively, algorithms for converting between curve order and row/column order can be "cache-oblivious", in that they don't need to take an explicit block size. You write them once, and they perform well on any system regardless of the CPU cache size and/or page size.
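
For comparison with the blocked version above, the classic cache-oblivious example is a divide-and-conquer transpose; note there is no block-size constant anywhere (an untuned sketch):

  /* Transpose the rows x cols region at (r0, c0) of an n x n matrix.
     Recursion halves the larger side, so every cache level eventually
     gets subproblems that fit it, without knowing its size.
     Top-level call: co_transpose(n, n, n, 0, 0, src, dst). */
  static void co_transpose(int n, int rows, int cols, int r0, int c0,
                           const float *src, float *dst) {
      if (rows <= 8 && cols <= 8) {
          for (int i = 0; i < rows; i++)
              for (int j = 0; j < cols; j++)
                  dst[(size_t)(c0 + j) * n + (r0 + i)] =
                      src[(size_t)(r0 + i) * n + (c0 + j)];
      } else if (rows >= cols) {
          co_transpose(n, rows / 2, cols, r0, c0, src, dst);
          co_transpose(n, rows - rows / 2, cols, r0 + rows / 2, c0, src, dst);
      } else {
          co_transpose(n, rows, cols / 2, r0, c0, src, dst);
          co_transpose(n, rows, cols - cols / 2, r0, c0 + cols / 2, src, dst);
      }
  }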


I wonder: if the pixel data were transformed to a different order before compression and back after decompression, would that improve the compression without introducing too much slowdown of its own?


Iteration order through arrays has a big impact on performance due to cache effects. I'd guess you'd lose a lot of performance to a complicated iteration scheme.


The current encoding is complete (every tag bit pattern is assigned), so no future extensions are possible.

A fix would be to change:

  110rrrrr ggggbbbb - copy the last pixel and adjust RGB by (r-15, g-7, b-7)
to:

  1100rrrr ggggbbbb - copy the last pixel and adjust RGB by (r-7, g-7, b-7)
This leaves:

  1101???? ...
available for future extensions.
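
In decoder terms the split would just be one extra tag check (hypothetical sketch):

  /* inside the decoder's tag dispatch: */
  if ((b0 & 0xF0) == 0xC0) {
      /* 1100rrrr ggggbbbb: adjust RGB by (r-7, g-7, b-7) */
  } else if ((b0 & 0xF0) == 0xD0) {
      /* 1101????: reserved for future extensions */
  }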


So I implemented this change and tested it out. It turns out that, in my simple test suite, this actually changes the size of the images. Not drastically, often just a 5-10 KB difference on a 500-600 KB file, but that's still more than I expected for changing r-15 to r-7.


That suggests the DIFF16 delta-color mode is responsible for quite a bit of the savings. Maybe it would be worth experimenting with the exact encoding.

One idea would be to start from a predicted color (calculate the delta between the previous two pixels, apply it, then encode a delta against that prediction). Another would be to encode the delta in YUV or some other color space, and then experiment with the best balance of bits between those channels.
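
A minimal sketch of the first idea (hypothetical; deltas wrap mod 256, as QOI's do):

  #include <stdint.h>

  typedef struct { uint8_t r, g, b, a; } px_t;

  /* Linear extrapolation from the last two pixels; the chunk would then
     only need to encode the (hopefully tiny) residual vs. this guess. */
  static px_t predict(px_t prev, px_t prev2) {
      px_t p;
      p.r = (uint8_t)(2 * prev.r - prev2.r);
      p.g = (uint8_t)(2 * prev.g - prev2.g);
      p.b = (uint8_t)(2 * prev.b - prev2.b);
      p.a = prev.a;
      return p;
  }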

Perhaps it would be better to steal bits from RUN16 instead; I somewhat doubt its usefulness.


I really like this.


> 00xxxxxx - copy (x+1)-th last EXPLICITLY ENCODED pixel (i.e. ignoring repeats)

Not exactly. It's "copy the last pixel color for which (r^g^b^a)&63 == x".
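
I.e. it is a 64-slot index keyed by a hash, roughly (sketch):

  #include <stdint.h>

  typedef struct { uint8_t r, g, b, a; } px_t;
  static px_t seen[64];                 /* the 64-entry color index */

  static int qoi_hash(px_t p) { return (p.r ^ p.g ^ p.b ^ p.a) & 63; }

  /* called on every decoded pixel */
  static void index_put(px_t p) { seen[qoi_hash(p)] = p; }

  /* 00xxxxxx: replay whatever color currently occupies slot x; a newer
     pixel with the same hash overwrites the slot, so it is not a strict
     history position. */
  static px_t index_get(int x) { return seen[x]; }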


Oh, you are right. I only took a cursory look at the header file and thought the hashing was only used to speed up the lookup.


I see it as a variant of LZ specialised to images, so its performance should not be too surprising. Running an uncompressed bitmap through a general-purpose compression algorithm tends to yield similar results to PNG.


This is so neat. It reminds me of the recent LZ4 post: by manually hand-tweaking the encoding, one can still get a decent compression boost after all these decades.



