
Let's say the string contains 100 0s and 101 1s. How would you encode the positions of the 0s without using more than 201 bits in total?

Nearly all random strings of sufficient length have different numbers of 0s and 1s. So if your argument were true, your compression scheme would reduce the average length of completely random strings, which is impossible.




Use 200 bits to encode the message except the last bit, which is determined by whether those 200 bits contain 100 or 101 1s. QED: knowing the number of 1s buys you at least one bit of information.

Edit: You can do better than this by counting the number of 201-bit messages with 100 0s, which is well below 2^201, and then encoding which of those corresponds to the original message. E.g. if you had an 8-bit message with one zero, 11111101, there are only 8 possibilities in total, which means you can encode it in 3 bits.
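
For the 8-bit case, a brute-force sketch of the idea might look like this. The helper name `strings_with_zeros` and the choice of lexicographic order for the table are my own assumptions; any fixed ordering that both sides agree on would work.

```python
from itertools import combinations
from math import ceil, log2


def strings_with_zeros(n, k):
    """Return all n-bit strings with exactly k zeros, sorted lexicographically."""
    out = []
    for zero_positions in combinations(range(n), k):
        bits = ["1"] * n
        for p in zero_positions:
            bits[p] = "0"
        out.append("".join(bits))
    return sorted(out)


# Table of every 8-bit message containing exactly one zero.
table = strings_with_zeros(8, 1)

# Bits needed to name one entry of the table.
bits_needed = ceil(log2(len(table)))

# Encode: the position of the message in the table, written in binary.
index = table.index("11111101")
code = format(index, f"0{bits_needed}b")

# Decode: look the index back up in the same table.
decoded = table[int(code, 2)]
```

Building the whole table only works for tiny inputs; it is there to make the counting argument concrete, not as a practical encoder.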


Even knowing the exact number of 1s and 0s buys you a few bits of information.

log(201 choose 100) / log(2) = 196.843
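
If you want to check that number yourself, a quick sketch using only the Python standard library (the variable names are mine):

```python
from math import ceil, comb, log2

# Number of 201-bit strings with exactly 100 zeros.
count = comb(201, 100)

# Information content of "which of those strings is it", in bits.
bits = log2(count)

# Whole bits needed to store an index into that set.
whole_bits = ceil(bits)

print(f"{bits:.3f} bits, so {whole_bits} whole bits instead of 201")
```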

Order the (201 choose 100) strings of 201 bits with exactly 100 0s lexicographically, then write down the index of your particular message in that list as an ordinary 197-bit number ;P.
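
You don't have to materialise the list to find the index. Here is a rough sketch that computes it by counting how many valid strings sort before the message; the function names `rank` and `unrank`, and the convention that '0' sorts before '1', are my assumptions rather than anything fixed by the comment above.

```python
from math import comb


def rank(s):
    """Index of s among all strings of the same length and zero count, in lexicographic order."""
    n = len(s)
    zeros_left = s.count("0")
    r = 0
    for i, ch in enumerate(s):
        if ch == "1":
            # Every string sharing this prefix but with '0' here sorts earlier.
            if zeros_left > 0:
                r += comb(n - i - 1, zeros_left - 1)
        else:
            zeros_left -= 1
    return r


def unrank(r, n, k):
    """Inverse of rank: the r-th n-bit string with exactly k zeros."""
    zeros_left = k
    out = []
    for i in range(n):
        # How many remaining strings place a '0' at position i.
        c = comb(n - i - 1, zeros_left - 1) if zeros_left > 0 else 0
        if r < c:
            out.append("0")
            zeros_left -= 1
        else:
            out.append("1")
            r -= c
    return "".join(out)


# Example message: 201 bits with 100 zeros and 101 ones.
message = "01" * 100 + "1"
index = rank(message)
code = format(index, "0197b")  # always fits, since comb(201, 100) < 2**197
restored = unrank(int(code, 2), 201, 100)
```

The decoder needs to know the length and the zero count (201 and 100 here), which is exactly the side information the thread is arguing about.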

But to be clear, I don't disagree with what you mean at all.



