As pi never repeats itself, that also means that every piece of conceivable information (music, movies, texts) is in there, encoded. So as we have so many pieces of pi now, we could create a file sharing system that's not based on sharing the data, but the position of a piece of the file in pi. That would be kinda funny
> As pi never repeats itself, that also means that every piece of conceivable information (music, movies, texts) is in there, encoded.
This is true for normal numbers [1], but is definitely not true for all non-repeating (irrational) numbers. Pi has not been proven to be normal. There are many non-repeating numbers that are not normal, for example 0.101001000100001...
Storing the index into pi for a file would typically take about as much space as storing the file itself, and storing or calculating enough digits to use that index would be impossible with today's technology (and probably with the next century's).
It's conjectured to be normal isn't it? I know it hasn't been proven yet, and I cannot seem to find where I read this, but I thought there was at least statistical evidence indicating that it's probably normal.
The rational numbers make up "zero percent" of the real numbers. It's a little hard to properly explain without assuming a degree in math, since the proper way to treat this requires measure theoretic probability (formally, the rationals have measure zero in the reals for the "standard" measure).
The short version is that the size of the reals is a "bigger infinity" than the size of the rationals, so they effectively have 'zero weight'.
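A minimal sketch of the covering argument usually given for "measure zero", restricted here to rationals in [0, 1] with small denominators. The function names and the value of eps are illustrative assumptions, not anything from the thread. The idea: list the rationals, wrap the k-th one in an interval of length eps / 2^(k+1), and the total length stays below eps no matter how many you list.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_denominator):
    """List the distinct rationals p/q in [0, 1] with q <= max_denominator."""
    seen, ordered = set(), []
    for q in range(1, max_denominator + 1):
        for p in range(q + 1):
            x = Fraction(p, q)
            if x not in seen:
                seen.add(x)
                ordered.append(x)
    return ordered

def total_cover_length(count, eps):
    """Total length of intervals of length eps / 2**(k + 1), k = 0 .. count - 1."""
    return sum(eps / 2 ** (k + 1) for k in range(count))

eps = Fraction(1, 1000)          # arbitrary small budget (assumption)
points = rationals_in_unit_interval(30)
# However many rationals are listed, the cover never uses up the budget.
print(len(points), float(total_cover_length(len(points), eps)))
```

Since eps can be chosen as small as you like, the rationals fit inside a set of arbitrarily small total length. Note this relies on being able to list them (countability), which is a slightly different thing from "bigger infinity".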
But then the original implication, "100% of real numbers are normal, so that's pretty strong statistical evidence", still doesn't make any sense, as it's essentially using "100%" to imply "strong statistical evidence" that the rationals don't exist, which obviously doesn't follow.
I got the impression that the comment was a bit tongue-in-cheek.
The joke lies in the fact that saying "100% of real numbers" isn't *technically* the same thing as saying "all real numbers", because there's not really a good way to define a meaning for "100%" that lets you exclude rational numbers (or any other countable subset of the reals) and get something other than 100%.
It was about half a joke. Statistical evidence doesn't really exist for this type of problem, since polynomially computable numbers are countably infinite, so you can't define a uniform distribution over them.
The thinking is inspired by the Infinite Monkeys Theorem, which does have an easy-to-understand mathematical proof (and the criticisms of said proof are more difficult to grasp).
Isn't it a property of infinity? If pi goes on infinitely without repeating itself, every possible combination of numbers appears somewhere in pi.
It's sort of like the idea that if the universe is infinitely big and mass and energy are randomly distributed throughout the universe, then an exact copy of you on an exact copy of Earth is out there somewhere.
This property of infinity has always fascinated me, so I'm very curious for where the logical fallacy might be.
Not necessarily. The number 1.01001000100001000001... never repeats itself, yet most other numbers can never be found in it.
A number that contains all other numbers infinitely many times (uniformly) would be called normal, but no one has managed to prove this for pi yet. In fact, no one even managed to prove that pi doesn't contain only 0s and 1s like the above after the X-th digit.
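To make the counterexample concrete, here is a small sketch that generates the digit string 101001000100001... (ignoring the decimal point) and checks a long prefix for strings that can never occur in it. The generator name and the prefix length are my own choices for illustration.

```python
from itertools import count, islice

def sparse_ones_digits():
    """Yield the digits 1, 0, 1, 0, 0, 1, 0, 0, 0, ...: a 1 followed by k zeros, k = 1, 2, 3, ..."""
    for k in count(1):
        yield "1"
        for _ in range(k):
            yield "0"

def prefix(n):
    """First n digits of the sequence as a string."""
    return "".join(islice(sparse_ones_digits(), n))

digits = prefix(10_000)
print(digits[:20])
print("11" in digits)   # two adjacent ones never occur
print("2" in digits)    # no digit other than 0 and 1 ever occurs
```

The gaps between ones keep growing, so the expansion never settles into a repeating block, yet "11" (or any digit from 2 to 9) is missing forever.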
The fact you can't encode arbitrary data in a structured-but-irrational number doesn't mean you can't encode data in a 'random' irrational number.
The question is really 'Does every series of numbers of arbitrary finite length appear in pi?' I can't answer that because I'm not a mathematician, but I also can't dismiss it, because I'm not a mathematician. It sounds like a fair question to me.
>I can't answer that because I'm not a mathematician
So what? Mathematicians can't answer it either. It is an open question and because it is an open question claiming it is or isn't true makes no sense.
>The fact you can't encode arbitrary data in a structured-but-irrational number doesn't mean you can't encode data in a 'random' irrational number.
You can not encode data in a random number. If it is random you can not encode data in it, because it is random. I am not sure what you are saying here.
I demonstrated that there exist numbers whose digits go on forever and never repeat, yet which don't contain every possible substring of digits. Therefore pi is either such a number or it is not, and the answer to that is not known. It is definitely not a consequence of pi being infinitely long and never repeating.
That's why I put random in quotes. Pi is not a random number. You can encode data in it eg find a place that matches your data and give people the offset. That's not very helpful for most things though.
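A rough sketch of that "give people the offset" scheme, assuming a pure-Python digit generator (Gibbons' unbounded spigot) and only the first 2000 digits of pi. The `encode`/`decode` names are made up for illustration, and anything not found in those 2000 digits simply fails.

```python
from itertools import islice

def pi_digit_stream():
    """Yield decimal digits of pi (3, 1, 4, 1, 5, ...) using Gibbons' unbounded spigot."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

PI = "".join(str(d) for d in islice(pi_digit_stream(), 2000))

def encode(data, haystack=PI):
    """Return (offset, length) of the first match of data, or raise if absent."""
    offset = haystack.find(data)
    if offset == -1:
        raise ValueError("not found in the digits computed so far")
    return offset, len(data)

def decode(offset, length, haystack=PI):
    """Recover the data from its (offset, length) pair."""
    return haystack[offset:offset + length]

offset, length = encode("2653")
print(offset, length, decode(offset, length))
```

Short strings are easy to find, but longer ones quickly fall outside any prefix you can afford to compute, and the offset tends to need about as many digits as the data it points to.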
just index on the number of ones. Ex 0.10110 there are two ones in a row, so reference those two ones to be the number two. For 00, flip it and refer to the same pair of ones.
That is totally missing the point. Of course for every number there is an encoding that contains all pieces of information.
That obviously applies to 0.00... = 0 as well: it contains 0, then 00, then 000 and so on. So every number, and therefore every piece of information, is contained in 0 as well, given the right encoding. Obviously, if you can choose the encoding after choosing the number, all numbers "contain" all information. That is very uninteresting though and totally misses the point.
Most physicists don't believe that infinity can actually exist in the universe.
Put another way, the program which searches those works of art in the digits of pi will never finish (for a sufficiently complex work of art). And if it never finishes, does it actually exist?
That's a completely different issue. Using math to solve physics problems deals with physical models. Models are imperfect and what kinds of math they use is completely separate from asking "does infinity exist in our actual universe".
To answer that question, you would have to dismiss with experimental evidence all models people can come up with that try to explain the universe without "infinities". It's neither completely clear what that would mean, nor whether it's even in principle possible to determine experimentally (it's also most likely completely irrelevant to any practical purpose).
It's not that shocking to me - you should try tutoring a class of mathematics undergrads! They make this class of error all the time. It's a "this sounds like it's obviously true, so the obvious reason must be right" kind of thing. Rigorous logic takes a lot of time to click for people.
Feel free to prove me wrong. I never said it's efficient; the point is just that the information is out there. If pi has the subnumbers 00, 01, 10, 11 in there, we can construct any conceivable data that can be encoded as binary. Even with just 0 and 1. So we can construct a file from pointers to these four numbers. The bigger the substrings we can match, the better the compression ratio. The set of pointers might even be way bigger than the file itself. It's nowhere near efficient or clever, but just entertaining.
I don't think you can argue against IP because the way you arrange the pointers is IP itself, but still a funny thought experiment anyway
I'm not saying, that every piece of information is in there end to end, but that there are parts in there which can be used to construct it. I think I should've made the "encoded" part a bit more transparent haha. But I love the discussion that I kicked off!
There are many ways in which a number might never repeat itself but still not contain all sequences (e.g. never use a specific digit). What you want is normal numbers, and pi is not proven to be one (though it probably is).
you might find this to be pretty cool. It's similar to what you're describing. Whoever made it has an algorithm where you can look up "real" strings of text and it'll show you where in the library it exists. you can also just browse at random, but that doesn't really show you anything interesting (as you would expect given it's all random).
> every piece of conceivable information (music, movies, texts) is in there, encoded
Borges wrote a famous short story, “The Library of Babel,” about a library where:
“... each book contains four hundred ten pages; each page, forty lines; each line, approximately eighty black letters. There are also letters on the front cover of each book; these letters neither indicate nor prefigure what the pages inside will say.
“There are twenty-five orthographic symbols. That discovery enabled mankind, three hundred years ago, to formulate a general theory of the Library and thereby satisfactorily resolve the riddle that no conjecture had been able to divine—the formless and chaotic nature of virtually all books. . .
“Some five hundred years ago, the chief of one of the upper hexagons came across a book as jumbled as all the others, but containing almost two pages of homogeneous lines. He showed his find to a traveling decipherer, who told him the lines were written in Portuguese; others said it was Yiddish. Within the century experts had determined what the language actually was: a Samoyed-Lithuanian dialect of Guaraní, with inflections from classical Arabic. The content was also determined: the rudiments of combinatory analysis, illustrated with examples of endlessly repeating variations. These examples allowed a librarian of genius to discover the fundamental law of the Library. This philosopher observed that all books, however different from one another they might be, consist of identical elements: the space, the period, the comma, and the twenty-two letters of the alphabet. He also posited a fact which all travelers have since confirmed: In all the Library, there are no two identical books. From those incontrovertible premises, the librarian deduced that the Library is “total”—perfect, complete, and whole—and that its bookshelves contain all possible combinations of the twenty-two orthographic symbols (a number which, though unimaginably vast, is not infinite)—that is, all that is able to be expressed, in every language.”
I've done the (simple) math on this -- in fact I'm writing a short book on the philosophy of mathematics where it's of passing importance -- and the library contains some 25^1312000 books (25 symbols in each of 410 × 40 × 80 = 1,312,000 character positions), which makes 202T look like a very small number.
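For anyone who wants to check the arithmetic, a quick sketch using the figures from the quoted passage (410 pages, 40 lines, 80 characters, 25 symbols). Two assumptions of mine: "about eighty" is taken as exactly 80, and 202T is read as 202 × 10^12. The count itself is too large to print comfortably, so this only counts its decimal digits.

```python
from math import log10

PAGES, LINES, CHARS_PER_LINE = 410, 40, 80

def decimal_digits_of_book_count(symbols=25):
    """Number of decimal digits in symbols ** (characters per book)."""
    chars_per_book = PAGES * LINES * CHARS_PER_LINE
    return int(chars_per_book * log10(symbols)) + 1

print(PAGES * LINES * CHARS_PER_LINE)      # characters per book: 1312000
print(decimal_digits_of_book_count())      # digits in the number of books: 1834098
print(len(str(202 * 10**12)))              # digits in 202 trillion: 15
```

So the number of books has roughly 1.8 million decimal digits, against 15 digits for 202 trillion.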
So though everything you describe is encoded in Pi (assuming Pi is infinite and normal) we're a long, long way away from having useful things encoded therein...
Also, an infinite and normal Pi absolutely repeats itself, and in fact repeats itself infinitely many times.
I just submitted a sub-page of that site, which has some discussion that touches more on the layout of the library as described by Borges:
https://news.ycombinator.com/item?id=40970841
This is not necessarily true. Pi might not repeat but it could at some point - for example - not contain the digit 3 anymore (or something like that). It would never repeat, but still not have all conceivable information.
But the digit 3 is there just because we decided to write the digits in base 10. We could write pi in binary instead, and since it doesn't repeat, there can never be a point after which there is never another 1 or never another 0, right?
That's true - you can quite easily prove that an eventually constant sequence of decimals codes for a rational number.
But it's also true that pi may not contain every _possible_ sequence of decimals, no matter what base you pick. Like the Riemann hypothesis, it seems very likely and people have checked a lot of statistics, but nobody has proven it beyond a (mathematical) shadow of doubt.
Obviously, it was just an example to illustrate what a non-periodic number could look like that doesn’t contain all possible permutations. If the number never contains the digit 3 in base 10 it will also not contain all possible permutations in all other bases.
It would average the same size as the actual data. Treating the pi bit sequence as random bits, and ignoring overlap effects, the probability that a given n bit sequence is the one you want is 1/2^n, so you need to try on average 2^n sequences to find the one you want, so the index to find it is typically of length n, up to some second order effects having to do with expectation of a log not being the log of an expectation.
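A small simulation of this estimate, using pseudo-random bits as a stand-in for the bits of pi (that substitution is exactly the "treat it as random" assumption above, not a statement about pi). Function names, trial count and seed are arbitrary choices for the sketch.

```python
import random
from math import log2

def first_index(target, rng):
    """0-based offset of the first occurrence of target in a pseudo-random bit stream."""
    stream = ""
    while True:
        # Extend the stream by 4096 fresh bits and search again.
        stream += format(rng.getrandbits(4096), "04096b")
        pos = stream.find(target)
        if pos != -1:
            return pos

def average_index_bits(n, trials=200, seed=0):
    """Average of log2(offset + 1) over random n-bit targets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        target = format(rng.getrandbits(n), f"0{n}b")
        total += log2(first_index(target, rng) + 1)
    return total / trials

for n in (4, 8, 12):
    print(n, round(average_index_bits(n), 2))
```

The averages should land close to n, typically a little under it, which is the "expectation of a log is not the log of an expectation" effect mentioned above.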
You need both index and length, I guess. If concatenating both values doesn't shrink the data enough, you can always prefix a "number of times still needed to recursively de-index (repeat, start-point-index, size) concatenated triplets" field, and repeat until you reach the desired size or lower.
I don’t know if there would be any logical issue with this approach. The only logistical difficulty I can see is computing enough decimals and searching for the pattern in them, but I guess that such a voluminous pre-computed approximation can greatly help.
No invertible function can map every non-negative integer to a lower or equal non-negative integer (no perfect compression), but you can have functions that compress everything we care about at the cost of increasing the size of things we don't care about. So the recursive de-indexing strategy has to sometimes fail and increase the cost (once you account for storing the prefix).
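The counting behind this is easy to check by brute force for small sizes. A sketch with hypothetical helper names: there are 2^n bit strings of length n but only 2^n - 1 strictly shorter ones (including the empty string), so no invertible scheme can shorten all of them.

```python
from itertools import product

def bit_strings(length):
    """All bit strings of exactly the given length."""
    return ["".join(bits) for bits in product("01", repeat=length)]

def shorter_strings(n):
    """All bit strings of length 0 .. n - 1 (the empty string included)."""
    return [s for k in range(n) for s in bit_strings(k)]

for n in range(1, 6):
    inputs, outputs = len(bit_strings(n)), len(shorter_strings(n))
    # There is always one more input than available shorter outputs.
    print(n, inputs, outputs, inputs - outputs)
```

Whatever the de-indexing scheme looks like, at least one n-bit input has to map to something of length n or more, and the bookkeeping prefix only makes that worse.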
> every piece of conceivable information (music, movies, texts) is in there, encoded.
So that means that if we give a roomful of infinite monkeys an infinite number of hand-cranked calculators and an infinite amount of time, they will, as they calculate an infinite number of digits of pi, also reproduce the complete works of Shakespeare et al.
Isn't 202TB (for comparison) way too small to contain every permutation of information? That filesize wouldn't even be able to store a film enthusiast's collection?
Well it all comes down to encoding, doesn't it. We can represent almost everything with just 0 and 1 as well, can't we? The description of that data is way bigger than the elements used to describe it of course.
The sad thing is that the index would take just as much space as the data itself, because on average you can expect to find an n-bit string at around position 2^n.
Assuming we are only interested in base 10, and that "pi contains e" means that at some point in the sequence of decimal digits of pi (3, 1, 4, 1, 5, 9, 2, ...) the sequence of decimal digits of e (2, 7, 1, 8, 2, 8, ...) begins, then I believe that question is currently unanswered.
Pi would contain e if and only if there are positive integers n and m such that 10^n pi - m = e, or equivalently 10^n pi - e = m.
We generally don't know if combinations of e and pi of the form a pi + b e where a and b are algebraic are rational or not.
Even the simple pi + e is beyond current mathematics. All we've got there is that at least one of pi + e and pi e must be irrational. We know that because both pi and e are zeros of the polynomial (x-pi)(x-e) = x^2 - (pi+e)x + pi e. If both pi+e and pi e were rational then that polynomial would have rational coefficients, and the roots of a non-zero polynomial with rational coefficients are algebraic (that is in fact the definition of an algebraic number) and both pi and e are known to not be algebraic.