Absolutely. The point of an autoencoder is dimensionality reduction: boil a big set of data down to a vector of a few hundred or thousand numbers which summarizes it. You could treat that either as lossy compression, storing just the encoding, or as a hybrid format in which the autoencoder's lossy encoding is corrected to lossless by additional bits in the stream.
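Roughly, the hybrid version looks like this (a minimal PyTorch sketch with made-up layer sizes and an untrained model, just to show the shape of the idea; in a real scheme the residual would then be entropy-coded):

    import torch
    import torch.nn as nn

    class TinyAutoencoder(nn.Module):
        def __init__(self, input_dim=4096, latent_dim=256):
            super().__init__()
            # encoder boils a block of data down to a few hundred numbers
            self.encoder = nn.Sequential(
                nn.Linear(input_dim, 1024), nn.ReLU(),
                nn.Linear(1024, latent_dim),
            )
            # decoder reconstructs an approximation from the latent vector
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 1024), nn.ReLU(),
                nn.Linear(1024, input_dim),
            )

    model = TinyAutoencoder()          # would need training on representative data
    x = torch.randn(1, 4096)           # one block of data as a float vector

    with torch.no_grad():
        code = model.encoder(x)        # lossy: store only these 256 numbers
        recon = model.decoder(code)
        residual = x - recon           # the extra correction that makes it lossless

    restored = recon + residual        # recovers the original (up to float rounding)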

In practice, even the hyper-efficient compression algorithms used in something like zpaq tend to use only very small, shallow predictive neural networks, because no one wants to wait days for their data to compress or to ship big neural nets around as part of their archives; so it's more of an information-theoretic curiosity. Few enough people even use 'xz' as it is.




You could, but I don't think this would be competitive on compression ratio at all, even allowing for an order of magnitude more time.


The technique is very effective; after all, a variant of PAQ holds a compression record: http://prize.hutter1.net/ You can try PAQ yourself in PeaZip: http://filehippo.com/download_peazip_64/


PAQ does not use autoencoders; that's the difference. Neural network models for compression are extremely slow and can use a crapton of memory (one top benchmark result uses 32 GB of RAM): https://www.quora.com/What-is-the-potential-of-neural-networ...


Last I checked, PAQ only uses a shallow (two-layer) neural network as a final step to weight the predictions from the multiple hand-made next-bit prediction models it contains.
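Conceptually it's something like this (a rough NumPy sketch of logistic mixing with invented numbers, not the actual PAQ code): each hand-made model emits a probability for the next bit, the mixer combines them in the logit ("stretch") domain with learned weights, and the weights get nudged once the true bit is seen.

    import numpy as np

    def stretch(p):                    # probability -> logit domain
        return np.log(p / (1.0 - p))

    def squash(x):                     # logit domain -> probability
        return 1.0 / (1.0 + np.exp(-x))

    weights = np.zeros(3)              # one weight per sub-model, adapted online
    lr = 0.01                          # learning rate for the online update

    def mix_and_update(model_probs, actual_bit):
        """Combine the sub-models' next-bit probabilities, then adapt the weights."""
        global weights
        t = stretch(np.asarray(model_probs, dtype=float))
        p = squash(np.dot(weights, t))           # mixed P(next bit == 1)
        weights += lr * (actual_bit - p) * t     # gradient step toward the observed bit
        return p

    # three hand-made models disagree about the next bit:
    p = mix_and_update([0.9, 0.6, 0.2], actual_bit=1)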


So, you think the additional accuracy of the larger neural network isn't enough to overcome the storage size of the network itself?
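To put rough numbers on it (all invented), the network's own size dominates pretty quickly:

    data_chars    = 10**9              # e.g. a 1 GB text corpus
    baseline_bpc  = 1.2                # bits per character from the existing models (made up)
    big_net_bpc   = 1.0                # hoped-for bits per character with a large network (made up)
    net_size_bits = 500 * 10**6 * 8    # a 500 MB network shipped inside the archive

    bits_saved = (baseline_bpc - big_net_bpc) * data_chars   # 2e8 bits, about 25 MB
    print(bits_saved > net_size_bits)  # False: 25 MB saved vs. 500 MB of weights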



