
Here's an analogy that may help when thinking about probability tables: many standard codes like Morse code work like Huffman coding, in that each letter gets a variable-length code. In Morse, E is a single dot and T is a single dash, but X and Z each need four symbols.

You can view this Huffman table or Morse codebook as a probability distribution over the letters likely to come next: a symbol encoded with a small number of bits == high probability, a symbol needing a large number of bits == low probability.
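Here's a rough Python sketch of that correspondence. The letter frequencies are made up (not real English statistics), but it builds an actual Huffman code and you can see the most probable letter get a 1-bit code while the rare ones get 4 bits, just like E/T vs X/Z in Morse:

    # Minimal Huffman sketch with made-up frequencies.
    import heapq, math

    freqs = {"E": 0.40, "T": 0.25, "A": 0.20, "X": 0.10, "Z": 0.05}

    # Heap entries: (probability, tiebreak, {letter: code_so_far})
    heap = [(p, i, {c: ""}) for i, (c, p) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # two least likely subtrees
        p2, _, codes2 = heapq.heappop(heap)
        merged = {c: "0" + code for c, code in codes1.items()}
        merged.update({c: "1" + code for c, code in codes2.items()})
        heapq.heappush(heap, (p1 + p2, tiebreak, merged))
        tiebreak += 1

    codes = heap[0][2]
    for c in sorted(codes, key=lambda c: -freqs[c]):
        print(c, codes[c], f"len={len(codes[c])}  -log2(p)={-math.log2(freqs[c]):.2f}")

Note how the code length for each letter lands near -log2(p), rounded up to a whole number of bits.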

The key innovation in arithmetic coding and ANS is that they allow for "fractional bits": you don't need a whole dot for a very likely next letter. But the idea is the same: you take your understanding of the probability table and use it to assign codes to the letters that follow.
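A quick sketch of why fractional bits matter, with made-up numbers. A Huffman-style code has to spend at least one whole bit on each symbol, even when the symbol is almost certain, whereas arithmetic coding and ANS can get close to the ideal -log2(p) cost per symbol:

    import math

    probs = {"likely": 0.95, "rare": 0.05}
    message = ["likely"] * 95 + ["rare"] * 5        # 100 symbols matching the model

    huffman_bits = len(message) * 1                 # two symbols -> 1 whole bit each
    ideal_bits = sum(-math.log2(probs[s]) for s in message)

    print(f"whole-bit (Huffman-style) cost: {huffman_bits} bits")
    print(f"ideal fractional cost:          {ideal_bits:.1f} bits")   # ~28.6 bits

The likely symbol carries only about 0.074 bits of information, so paying a full bit for it is the main waste that fractional-bit coders remove.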

One caveat: instead of agreeing on a fixed code upfront, the encoder and decoder agree on an algorithm that dynamically estimates the probability tables as they read through the file.
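A minimal sketch of that adaptive idea (alphabet and input are made up): both sides start from the same flat counts and update them identically after every symbol, so the table never has to be transmitted.

    def adaptive_probabilities(data, alphabet="abc"):
        counts = {c: 1 for c in alphabet}        # start flat: everything equally likely
        for symbol in data:
            total = sum(counts.values())
            p = counts[symbol] / total           # probability the coder would use right now
            print(f"symbol={symbol!r}  estimated p={p:.3f}")
            counts[symbol] += 1                  # update exactly as the decoder will

    adaptive_probabilities("aababc")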

Makes sense. Thanks!
