Zipf's Law appears to be a generalization of Benford's Law, which explains the interesting observation that the first digit of real-world data is 1 about 30% of the time.
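For reference, Benford's predicted first-digit probabilities are P(d) = log10(1 + 1/d); a quick sketch in plain Python (no external dependencies) shows where the ~30% figure comes from:

```python
import math

# Benford's law: P(d) = log10(1 + 1/d) for leading digit d = 1..9
for d in range(1, 10):
    p = math.log10(1 + 1 / d)
    print(f"P(first digit = {d}) = {p:.3f}")
# d = 1 comes out to about 0.301, i.e. roughly 30% of values lead with a 1
```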
That's only partially correct. While you can derive Benford's law as a consequence of Zipf's law, there are many datasets where Benford's law applies but Zipf's law doesn't, such as the values of physical constants.
Much better (IMO) generic explanations for Benford's law are scale invariance and mixtures of probability distributions, which are discussed in the Wikipedia article. Unlike proofs of Zipf's law, they make no assumptions about the relationships between the entities.
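A minimal simulation of the scale-invariance point, assuming nothing beyond NumPy: data that is log-uniform over several decades (a crude stand-in for data with no preferred scale) matches Benford's first-digit frequencies, and multiplying everything by an arbitrary constant leaves them essentially unchanged.

```python
import numpy as np

def first_digit(x):
    """Leading decimal digit of each positive value in x."""
    # Shift each value into [1, 10) by dividing out its power of ten.
    mantissa = x / 10 ** np.floor(np.log10(x))
    return mantissa.astype(int)

def digit_freqs(x):
    d = first_digit(x)
    return np.array([(d == k).mean() for k in range(1, 10)])

rng = np.random.default_rng(0)
# Log-uniform over 6 decades: no scale is preferred over any other.
x = 10 ** rng.uniform(0, 6, size=100_000)

benford = np.log10(1 + 1 / np.arange(1, 10))
print("Benford:   ", np.round(benford, 3))
print("Simulated: ", np.round(digit_freqs(x), 3))
# Rescaling by an arbitrary constant (e.g. a change of units) leaves the
# first-digit frequencies essentially unchanged -- the scale-invariance point.
print("Rescaled:  ", np.round(digit_freqs(x * 7.3), 3))
```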
Man, this never gets old. Every time I see it work on a different dataset, it's as surprising as the first time I saw it. Recently I plotted supermodel incomes. It was an amazingly good fit. http://arvindn.livejournal.com/98510.html
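I don't have the income figures, but the generic recipe for eyeballing a Zipf fit is just a rank-size plot on log-log axes; a sketch with synthetic heavy-tailed values standing in for the real data:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in data: a heavy-tailed sample, sorted largest-first
# so that index = rank. Replace with the actual values you want to test.
rng = np.random.default_rng(1)
values = np.sort(rng.pareto(1.0, size=200) + 1)[::-1]
ranks = np.arange(1, len(values) + 1)

# Zipf's law predicts value ~ C / rank**s, i.e. a straight line on log-log axes.
slope, intercept = np.polyfit(np.log(ranks), np.log(values), 1)
print(f"fitted exponent s = {-slope:.2f}")

plt.loglog(ranks, values, "o", markersize=3, label="data")
plt.loglog(ranks, np.exp(intercept) * ranks ** slope, label=f"fit, s = {-slope:.2f}")
plt.xlabel("rank")
plt.ylabel("value")
plt.legend()
plt.show()
```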
http://en.wikipedia.org/wiki/Benford%27s_law
http://www.cut-the-knot.org/do_you_know/zipfLaw.shtml