42.zip: A single 42,374-byte zip file that uncompresses to one million 4.5GB files (4.5PB) (unforgettable.dk)
33 points by soundsop on July 28, 2008 | 17 comments



For anyone who doesn't have the time to dig into this, it basically breaks down like this:

42.zip -> lib {0-f}.zip -> book {0-f}.zip -> chapter {0-f}.zip -> doc {0-f}.zip -> page {0-f}.zip -> 0.dll

where {0-f} should be expanded to 0 1 2 3 4 5 6 7 8 9 a b c d e f

and 0.dll is a 4GB file

    $ head -c 1000 0.dll | od -c
    0000000 252 252 252 252 252 252 252 252 252 252 252 252 252 252 252 252
    *
    0001740 252 252 252 252 252 252 252 252
    0001750

or loads of <AA>'s when viewed in less.
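
A quick sanity check of the numbers, as a rough sketch rather than anything authoritative. It assumes five nested layers (lib, book, chapter, doc, page) of 16 archives each and a bottom file of exactly 4 GiB; the constant names are made up for illustration.

```python
# Assumptions (not measured from the real archive): 5 nested layers,
# 16 archives per layer ({0-f}), and a 4 GiB file at the bottom.
FANOUT = 16
LAYERS = 5
LEAF_BYTES = 4 * 2**30

leaf_count = FANOUT ** LAYERS          # number of 0.dll files after full expansion
total_bytes = leaf_count * LEAF_BYTES  # total size once everything is unpacked

print(f"{leaf_count:,} files")
print(f"{total_bytes:,} bytes, about {total_bytes / 10**15:.1f} PB")

# The octal 252 shown by od is the byte 0xAA (bit pattern 10101010).
print(hex(0o252), bin(0o252))
```

Under those assumptions this gives a bit over a million files and roughly 4.5 PB, which matches the headline.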


Oh, so you have to recursively unzip for the full effect. I don't know of any tools that do that, although I wouldn't be surprised if virus scanners do it.
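
For what it's worth, a recursive walk is not hard to sketch with Python's standard `zipfile` module. The version below only adds up the sizes declared in the zip headers instead of extracting anything, and it treats any member whose name ends in `.zip` as a nested archive. The function names, the depth limit and the size cap are my own assumptions, and declared sizes can lie, so this is an illustration and not a safe scanner.

```python
import io
import zipfile


def declared_size(data: bytes, max_depth: int = 8, max_inner: int = 32 * 2**20) -> int:
    """Sum the declared uncompressed sizes of all non-zip members, following nested zips."""
    total = 0
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if not info.filename.lower().endswith(".zip"):
                total += info.file_size  # header value only; nothing is extracted
                continue
            # Nested archive: refuse to go too deep or to load a huge inner zip.
            if max_depth == 0 or info.file_size > max_inner:
                raise ValueError(f"refusing to open nested archive {info.filename!r}")
            total += declared_size(zf.read(info), max_depth - 1, max_inner)
    return total


def make_zip(members: dict) -> bytes:
    """Build a small in-memory zip from {name: payload}; used only for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, payload in members.items():
            zf.writestr(name, payload)
    return buf.getvalue()


# Tiny two-layer stand-in for the real thing (16 x 16 leaves of 100,000 bytes).
page = make_zip({"0.dll": b"\xaa" * 100_000})
doc = make_zip({f"page{d:x}.zip": page for d in range(16)})
top = make_zip({f"doc{d:x}.zip": doc for d in range(16)})
print(len(top), "bytes on disk,", declared_size(top), "bytes declared")
```

The depth and size limits are the interesting part: anything that unpacks recursively without them is exactly what a file like this is aimed at.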


So ... email it? Maybe through GMail ?

Edit: tried it. GMail refuses the attachment because the file contains an "executable" file. They took the easy way out. Can't blame them, though.


Bad compression, I know a better one:

    import sys
    while True:  # never terminates: writes zero bytes forever
        sys.stdout.buffer.write(b"\x00")

Just a few bytes for infinite effect :)

Given the context, it's a shame I can't really share this article of mine, http://antirez.com/post/87, because it's in Italian, but maybe more people here can read Italian than I expect.



Yeah, disappointing that it's a recursive unzip. Can't prank someone with this...yet.


One's disappointment is another's relief.


You can on OS X Leopard, I think. It automatically unzips files once you download them. No clue if it unzips recursively and I really don't want to find out.


On a related note, there's a gzip quine (a file that decompresses to a copy of itself, much like a program that prints its own source) floating about, which I thought was quite impressive if you're into this sort of thing.



(Non-dead) link to the file: http://www.maximumcompression.com/selfgz.gz


and it contains at most 42,374 bytes of information, no matter how verbosely it is spelled out once unpacked. compression doesn't hide information, it just displays it in the most concise way possible.

but call me up when you can archive information into thin air, such that the data can be smaller than the smallest self-contained equivalent. there are some ways to do this. for example, "the first million digits of pi" is a very concise way of referring to a number with a million digits. and pi doesn't have to be stored anywhere, it can always be calculated. so the information is compressed to pi[0..10^6] plus the overhead of the algorithm for calculating digits of pi. since the algorithm would be of constant size, there would be a threshold (likely less than 10^6) after which this approach would be more space-efficient than a self-contained zip.
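
to make the "constant-size algorithm" point concrete, here is a minimal sketch using Gibbons' unbounded spigot algorithm, which is just one of several ways to stream digits of pi and is my choice for illustration. the function name `pi_digits` is made up. the program text stays the same size no matter how many digits you ask for.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi one at a time (Gibbons' unbounded spigot)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough precision yet: fold in one more term of the series.
            q, r, t, k, n, l = (
                q * k,
                (2 * q + r) * l,
                t * l,
                k + 1,
                (q * (7 * k + 2) + r * l) // (t * l),
                l + 2,
            )


print("".join(str(d) for d in islice(pi_digits(), 20)))
```

it gets slow for a million digits because the big integers keep growing, but the description length is what matters for the argument, not the running time.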

too bad there isn't an easily recognizable off-the-shelf constant for every million digit number one would want to compress. pi is conjectured (not proven) to be normal, so EVERYTHING would theoretically be in it, but it's tough to find anything in particular and, if you do, it's going to be REALLY deep. and referencing something at a random location deep in pi would probably take no less space than the original data we're compressing.
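
a rough illustration of why the offset costs as much as the data. this uses a seeded pseudo-random digit string as a stand-in for the digits of pi, on the assumption that the digits behave like uniformly random ones (which is what normality would suggest, and it is only conjectured for pi). all names and sizes here are arbitrary choices for the demo.

```python
import random

rng = random.Random(0)
# Stand-in for "digits of pi": 200,000 uniformly random decimal digits.
digits = "".join(rng.choice("0123456789") for _ in range(200_000))

K = 3  # length of the strings we try to locate
targets = [f"{rng.randrange(10**K):0{K}d}" for _ in range(200)]

# Position of the first occurrence of each target (-1 if it never shows up).
positions = [digits.find(t) for t in targets]
found = [p for p in positions if p >= 0]

mean_pos = sum(found) / len(found)
print(f"found {len(found)} of {len(targets)} targets")
print(f"mean first position of a {K}-digit string: {mean_pos:.0f} (10**{K} = {10**K})")
```

a K-digit string typically first turns up somewhere around position 10^K, and writing down a number of that magnitude takes about K digits, so on average nothing is saved.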

i wish there was something like that, though. it wouldn't be practical, but... for theory's sake.


I guess the only way this file could have been created is recursively unless I am missing something.


so what? fill a 20GB file with zeros or, for instance in this case, with the actual content of this 40KB file over and over, zip it, and you will get the same result on unzip.

DEFLATE (LZ77 plus Huffman coding) acts like a much-improved run-length encoding on input like this. Nothing productive here, though... :)
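
easy enough to try at a smaller scale with Python's `zlib`, which implements DEFLATE. a minimal sketch, with 10 MB of zeros standing in for the 20GB file; the exact byte counts will depend on the zlib version, so none are claimed here.

```python
import zlib

original = bytes(10_000_000)  # 10 MB of zero bytes
compressed = zlib.compress(original, 9)
ratio = len(original) / len(compressed)
print(f"{len(original):,} -> {len(compressed):,} bytes (about {ratio:.0f}:1)")

# The compressed stream of a long run is itself very repetitive,
# so a second pass shrinks it again.
twice = zlib.compress(compressed, 9)
print(f"second pass: {len(compressed):,} -> {len(twice):,} bytes")

# Round trip: two decompressions give back the original zeros.
assert zlib.decompress(zlib.decompress(twice)) == original
```

a single DEFLATE pass tops out at a factor of roughly 1000 (the zlib documentation gives 1032:1 as the limit), so one layer alone can't get from petabytes down to 42KB. the second pass shows why nesting helps.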


Reminiscent of the 'Billion Laughs' XML file.


It's a Virus ...


This is a Trojan Horse.

YC admin should remove this post immediately.



