Proofs based on diagonalization help reveal the limits of algorithms (quantamagazine.org)
85 points by nsoonhui on Sept 6, 2023 | 69 comments



Here's a nice video that digs into the general structure of diagonalization arguments: https://www.youtube.com/watch?v=dwNxVpbEVcc

It's a really neat way to see a shared core between seemingly-separate topics:

    - Gödel's Incompleteness Theorems,
    - Cantor's Theorem,
    - Russell's Paradox,
    - The Halting Problem,
    - The Y Combinator!,
    - etc.


I have a problem with the diagonalization, but I'm more of a math fan so please forgive my ignorance.

The proof of uncountability of the real numbers starts with a process of enumerating this infinite list of infinitely long binary numbers. Then it says that we construct a number that can't be on the list by flipping bits down the diagonal.

But hasn't this concept already kind of mixed domains? The enumeration was describing a process, but the diagonal flipping move seems to assume or require that the enumeration is complete, or at least paused. How could the diagonal construction ever finish to allow a "check" of whether the result of that process is actually on the list?

Obviously at any point in time it won't be, but aren't we just kind of proving that we can construct a number that isn't on the list yet? And the first process would certainly eventually get to our constructed number?

The diagonal just seems like a shortcut to a number that we know the first process hasn't gotten to yet?

I just don't get how we conclude that Integers are countable and Reals are uncountable from this proof, even though it's obvious in an intuitive sense that Reals have to be much "larger".


It's not a process. You think of it as a process because you can only read one line at a time, but the proof is looking at the whole.

It's like saying "There are infinitely many primes because if not we could multiply them all together and add one to generate another one". You don't actually have to find all the prime numbers and work out a product. You are saying that's actually not possible because the idea contradicts itself.


I don't quite understand your main concern, but it seems others did and have answered. To elaborate on why Cantor's argument doesn't go through for the integers:

Assume you had a list of every integer. To begin with, you couldn't start flipping bits from the left since integers don't start from the left. You could however imagine them having a sequence of 0s extending infinitely to the left, e.g. 101 = ...00000101. If you did that, the diagonal argument is possible to perform only starting from the rightmost digit, and you do end up with some string of numbers.

The problem is this: If you do walk right-to-left flipping bits in such a list, the integer case has an "out". You do end up with a string of 0s and 1s, but by assumption it can't have 0s extending infinitely to the left because then it would have been on the list. In other words, the thing you end up with never has its "final" 1, and so what you end up with isn't an integer.

This is really the crux of the argument: Not every string of 0s and 1s actually represents an integer. However, put any string of 0s and 1s after the radix point ("decimal point") and that does represent a real number. So what you construct in the real version of the argument really is a real number not on the list (i.e. a contradiction).
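To make the "no final 1" point concrete, here's a small sketch (plain Python, chosen purely for illustration; the identity enumeration n -> n of the naturals is my own stand-in, not something from the comment): flip bit n, counted from the right, of the n-th listed number. Since n < 2^n, bit n of the number n is always 0, so every flipped diagonal bit comes out 1.

```python
# Sketch of the integer "out": list the naturals in binary (identity
# enumeration, zero-padded to the left) and flip the n-th bit, counted
# from the right, of the n-th number.
def flipped_diagonal_bit(n):
    return 1 - ((n >> n) & 1)  # bit n of the number n, flipped

bits = [flipped_diagonal_bit(n) for n in range(30)]
print(bits)  # all 1s: the diagonal string ...111 never has a final 1,
             # so it is not the binary expansion of any integer
```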

It's instructive to do the same thought experiment with an imaginary list of rational numbers written out as decimals and see why the diagonal argument fails in that case too.

Now I described the above as a kind of procedure ("walk right-to-left") but that's really just for convenience. There isn't any actual iteration going on.


> Not every string of 0s and 1s actually represents an integer.

Is that to say that only the finite ones are? Since an infinitely long string that starts with 1 is infinite in size?

Otherwise I can’t make sense of this statement.


Yes, for instance ...101010 (the pattern 10 repeated infinitely) isn't an integer.

It's maybe easier to think of the argument as working on sequences: From a sequence (x_0, x_1, x_2, ...) with each x_n a binary digit (i.e. 0 or 1) you can always construct a real number x = Sum(x_n / 2^n, n = 0, 1, 2, ...).

The integer you'd want to construct from such a sequence would be Sum(x_n * 2^n, n = 0, 1, 2, ...), but this only works if there's some N such that x_n = 0 for all n > N. Otherwise you would have an "infinite integer" (whatever that means). Valid positive integers are all on the form Sum(x_n * 2^n, n = 0, 1, 2, ..., N), i.e. the binary expansion is finite.

The subtlety of the diagonal argument (why it works for reals but not for integers/rationals) is that while it always produces _some_ string of digits, only the real numbers are defined such that the string of digits is itself always a real number. In the integer/rational case, you can construct a number "outside" the set you started from. (In fact you always do if the original list really was of all integers/rationals. Given any enumeration of the rationals, the diagonal argument can actually be used constructively to produce an irrational number.)
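A tiny illustration of the two sums above, using finite prefixes only (a sketch; for an infinite sequence the real-number sum stays bounded while the integer sum only defines an integer if the sequence is eventually all zeros):

```python
# From a prefix (x_0, x_1, ..., x_N) of a 0/1 sequence, form both sums
# from the comment above.
def as_real(prefix):
    return sum(x / 2**n for n, x in enumerate(prefix))   # Sum x_n / 2^n

def as_integer(prefix):
    return sum(x * 2**n for n, x in enumerate(prefix))   # Sum x_n * 2^n

print(as_real([1, 0, 1, 0]))     # 1 + 1/4 = 1.25
print(as_integer([1, 0, 1, 0]))  # 1 + 4 = 5
```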


> The proof of uncountability of real numbers starts with a process of enumerating this infinite list of infinitely long binary numbers.

> How could the diagonal construction ever finish to allow a "check" of whether the result of that process is actually on the list.

> The diagonal just seems like a shortcut to a number that we know the first process hasn't gotten to yet?

I think you are mixing up the enumeration of the (hypothetically countable) reals with the existence of said set of reals, and viewing the construction of the counterexample as an actual process that involves programmatically iterating over the list instead of a description of how the digits of said new real number are created.

No pause or step-by-step execution of this iteration is required, since it's only telling you that the i'th digit of the counterexample will be distinct from the i'th digit of the i'th number (thereby creating a real not in your countable set, since you have created a number outside of your 1-1 mapping onto the natural numbers).


You are confused about the concept of countability and finiteness. The enumeration that you mention requires that a set be countable, not that it is finite. So it's wrong to somehow require the enumeration to finish.

Countable sets can be finite or infinite. Most interesting countable sets are infinite. The enumeration then means a bijection with the natural numbers. Your definition of enumeration seems to require it to "complete" which is incorrect.

(I put "complete" in quotes because that's the word you used, but I don't want you to further confuse that with the completeness of the real numbers, which is a whole other thing.)


It's a bit like saying 0.111... isn't a real number because you couldn't type all the 1s.


Actually, I think you bring up some really good points.

If you believe the reals are countable, then your task is to come up with one, just one, scheme for enumerating them (i.e. putting in 1-to-1 correspondence with the naturals). If, on the other hand, you believe the reals are uncountable, then your task is decidedly harder: You need to come up with some scheme that, given any proposed enumeration, you can produce a number definitively not on that list.

Cantor's argument does the latter. Given any scheme---say a computer program---that claims to enumerate the reals, Cantor's argument shows us how to build a corresponding computer program to generate the digits of a number that are never produced by your original program. I.e. if you let the original program list out N reals, Cantor's program can give you an N-digit number that it has definitely not listed. The original program certainly might list it later, but by that point it's listed N+M digits and Cantor's program can give an N+M-digit number that's missing. The point being that Cantor's program always works, no matter how big N gets.
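The "always one step ahead" game can be sketched like this (the `listed` enumeration below is a made-up stand-in, just to have something to run against):

```python
# Given any rule list_digits(i, j) -> digit j of the i-th listed real,
# produce the first n digits of a number that differs from each of the
# first n listed reals at a known position.
def diagonal_prefix(list_digits, n):
    return [(list_digits(i, i) + 1) % 10 for i in range(n)]

# A stand-in enumeration (obviously incomplete): real i repeats the digit i % 10.
def listed(i, j):
    return i % 10

print(diagonal_prefix(listed, 5))  # [1, 2, 3, 4, 5]: differs from real i at digit i
```

No matter how many digits the original enumeration has produced, `diagonal_prefix` can always hand back a prefix that provably disagrees with every listed number so far.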

On the other hand, as another poster has pointed out, the Cantor scheme fails to work for an actually countable set, like the even numbers or primes. Cantor's argument will still work in a sense. It will give you something, but the thing it gives you will always violate some property that the original set of things must satisfy, like having only finite digits or whatnot.


Well, we can't really "count to infinity" in real life, and most of mathematics (unless you're a finitist or something, which is rather rare) assumes that we can somehow do it and establishes some base rules about how that would have to work without leading to contradictions.

With that said, assume that you have some list of real numbers. In a sense, what we're asking for is a function f, such that f(i, j) returns the j-th digit of the i-th number on that list. [0][1]

Then you construct a function g as follows:

  def g(j)
    (f(j, j) + 1) % 10
  end
Now you convince yourself of the fact that there is no number i, such that f(i, j) = g(j) for all j - or in other words, g encodes a different number than f(i, .) for any i. So, for every list of real numbers, there must be a number not on that list.

[0]: I'm brushing over the fact that some different sequences of integers encode the same real number, e.g. 0.999... and 1.000... - it doesn't affect the argument substantially.

[1]: This function doesn't necessarily have to implementable in a programming language - mathematicians are fine with talking about "uncomputable functions" and claiming that they exist. The argument shows that such a function can't exist even non-constructively. You can avoid all that talk about uncomputable functions and numbers by getting into constructive mathematics, but that's another can of worms and as far as I know it comes with its own unintuitive results. For example, exactly the same diagonalisation argument shows that there is no computable function encoding a list of all computable numbers.


A set is countable if, and only if, there is a bijection between it and the natural numbers — ie if there can be an enumeration.

Cantor is doing a proof by contradiction. He's assuming that the complete enumeration exists precisely so that he can show that assumption leads to a contradiction.

If you object to making that assumption, you’re just agreeing with Cantor’s conclusion before he’s made the argument.


You've gotten a lot of thoughtful answers here already. Here's mine. I'll come at it from a constructivist perspective.

There is a way to completely define a countable list. I do not say that you can completely write the list out in finite time. But you can specify it fully. Write a function f : Nat -> X, where X is the type of the things in the list, such that f is total, i.e., provably halts on every input. Your favorite typed lambda calculus or other theorem proving language will give you this property basically for free. The entire list is encoded as f. It doesn't matter that you can only extract elements one at a time by processing this function. They're all there, well-defined.

Now, we use this technique twice in Cantor's argument. First, we suppose the existence of f : Nat -> Real with the property that f covers all the reals. What we mean by that is you cannot define a real, x, such that f i != x for all i. (But we will do so.)

Second, each real number is itself encoded as a countable list of digits. So, Real := Nat -> Digit. In some sense, every real number is itself "a process" of digit generation, but you don't need to carry it out. Simply holding the function will suffice. Now, it's kind of tricky to compare reals to see which are equal and unequal, but ignoring the 0.99999... = 1 issue, which can be resolved, a proof that x != y involves identifying just one place i : Nat where the digits disagree: x i != y i.

Here we go: define x as x i := (f i i) + 1 mod 10. We claim x witnesses that f doesn't cover the reals. We agreed this means we must show x != f i for all i. We agreed that means finding j such that x j != f i j. The proof we seek: Let j := i.

We're just pushing functions around in a suitably typed language. Everything halts, nothing is running forever, no infinite storage space is required to hold the supposed list of reals nor the digits of each real. You can whip this up pretty fast in Lean or Coq etc.


Ironically, Cantor's argument falls apart in constructivism.


Cantor's argument shows that there exists no surjective map from the natural numbers to the reals and that proof is valid constructively. So the proof still works, it just has a different interpretation.


My point is that it DOES have a different interpretation. In particular, we don't interpret it as saying that there are somehow fundamentally more reals.

Incidentally the "exists no surjective map" interpretation may or may not be correct, depending on the exact choice of definitions.

Here is how it can be false.

Suppose that we define a function f in some programming language to be a real if it can be proven from a specific set of axioms that f(n) is always a rational, and that |f(n) - f(m)| is always less than or equal to 1/n + 1/m. Let's call those axioms, ASSUME.

It is now straightforward to write a program that maps the natural numbers onto the reals. Just do a breadth-first search for all proofs from ASSUME that some function f represents a real. Given n, we return the program that is the subject of the n'th such proof. It is also straightforward to do Cantor's diagonalization argument. But proving that the resulting function produces an output for every n requires proving that ASSUME is consistent. Which, as Gödel pointed out, hopefully doesn't follow from ASSUME. And therefore the construction does not result in a real by the chosen definition.

(When we start thinking carefully about reasoning, we find that there is an important difference between "X proves" and "X proves that X proves".)


> define a function f in some programming language to be a real if it can be proven from a specific set of axioms that...

Fine, but these programs only comprise constructive reals. Similarly, with my choice of Real := Nat -> Digit, there are only countably many functions you could actually (create a process to) write down which inhabit that type, but that doesn't make the type countable.

There are at most countably many programs one can access in any type. This follows from what programs are: finite (though arbitrarily long) strings over a finite alphabet. That doesn't refute other interesting properties about the "sizes" of these things, as given by the existence or non-existence of certain maps into and out of other types.


And the age-old debate.

The classical mathematician says, "...but those are only the constructive reals. What about the rest?"

The constructivist says, "What rest? The ones that don't exist?"


The diagonalisation argument shows that if f is a total function i: N -> j: N -> [0..9] (which calculates the j-th digit of the i-th real number), then g j = (f j j + 1) % 10 is a total function N -> [0..9] and has the property that for all i there exists a j, such that f i j /= g j.

To me, that's an entirely constructive proof that there is no (computable) surjective map from the integers to the (computable) reals, but if you disagree, then I guess that's just a debate about definitions.

I have no idea what Gödel has to do with any of this or why you think we would have to prove some axioms consistent. We don't have to prove our axioms consistent to do mathematics, thankfully, because that would be impossible.


The problem is that you are cavalierly throwing around terms like "f is a total function" instead of the more careful, "this set of axioms proves that f is a total function". Diagonalization adds a level of self-reference which may require that you use stronger axioms to prove that it is a total function.

That is not due to a question of definitions or constructivism. This is a point of mathematical logic that most pay little attention to. Namely proving that you prove something is not the same as proving it. And the difference between those two is the question of whether you are consistent.

Suppose that I have a list of axioms, which I'm calling ASSUME. If you want, pick ZFC as the axioms, doesn't matter.

Suppose that I have a program P that takes any positive integer as input and produces an output in finite time. So I get a list P(1), P(2), P(3) and so on. You can model this mathematically in various ways. Gregory Chaitin's version of LISP is particularly nice for this. But any Turing-complete programming language will work.

Suppose that by the construction of my program, each P(i) comes with a proof from ASSUME showing that it will take a positive integer as input, and produce a rational number as output. For any given i, we can find that associated proof. Then using ASSUME we can verify that proof and prove that P(i)(1), P(i)(2), P(i)(3), ... is a sequence of rational numbers.

The surprise from mathematical logic is that ASSUME generally does NOT prove that the following is a sequence of rational numbers:

P(1)(1), P(2)(2), P(3)(3), ...

The reason is that we have shown:

> ASSUME proves that for each i, there is a proof from ASSUME that P(i)(i) is well-defined.

To prove that P(i)(i) is well-defined, we need to inspect the i'th proof. But proving it for every i requires something strong, like that ASSUME is consistent.

And therefore proving that the function f produced by diagonalization is total generally requires a stronger set of axioms than the ones needed to prove things about each item on the list.


> Diagonalization adds a level of self-reference which may require that you use stronger axioms to prove that if is a total function.

Sorry, but this is nonsense.

Here's a proof of my argument in Coq (without any extra axioms) which is based on constructive logic (and yes, I'm ignoring the fact that some sequences of digits are the same real number and everything before the decimal point etc. pp. for simplicity's sake):

  Require Import Coq.Init.Nat.

  Inductive Digit :=
    | Zero | One | Two | Three | Four | Five | Six | Seven | Eight | Nine.

  Definition succ (d : Digit) : Digit := match d with
    | Zero => One | One => Two | Two => Three | Three => Four | Four => Five | Five => Six
    | Six => Seven | Seven => Eight | Eight => Nine | Nine => Zero
    end.

  Theorem d_succd : forall (d : Digit), d <> succ d.
  Proof.
  intros d.
  destruct d; discriminate.
  Qed.

  Theorem feq_each : forall {A} {B} (f g : A -> B), f = g -> forall (x : A), f x = g x.
  Proof.
  intros A B f g hyp x. rewrite hyp. reflexivity. Qed.

  Definition Real := nat -> Digit.

  Definition Seq (A : Type) := nat -> A.

  Definition not_includes {A} (s : Seq A) (e : A) := forall (i : nat), s i <> e.

  Theorem diag : forall (l : Seq Real), exists (g : Real), not_includes l g.
  Proof.
  intros l.
  remember (fun x => succ (l x x)) as g.
  exists g.
  intros i.
  intros contra.
  specialize ((feq_each (l i) g) contra).
  intros contra'.
  specialize (contra' i).
  rewrite Heqg in contra'.
  revert contra'.
  apply d_succd.
  Qed.
If you're going to argue with Coq, then whatever, but this is as constructivist as it gets.

Edited to add that a proof of essentially the same theorem can be found in Bishop's "Foundations of Constructive Analysis" (in my edition, page 25), with explicit reference to Cantor.


I have no argument with Coq. I do argue that your chosen definition entirely misses the point of what I was saying. Specifically this definition.

  Definition Real := nat -> Digit
My definition of a Real is NOT a function from a nat to a Digit. It is a program that is SUPPOSED TO return a Digit when it gets a nat. Where "supposed to" means that there is a proof in the axiom system saying that it will.

This does not mean it actually will return a digit. Our axiom system might be broken. But we can claim to have a proof from our axiom system that it will.

Given an enumeration of such programs, a program representing your diagonalized function can be created. Your proof demonstrates that IF it always returns a Digit, THEN it will produce output that is different from anything on the list.

But does it always return a Digit? Just as before, maybe, maybe not. The new program waits on another program, and the other program might not return a Digit.

Before we could claim to have a proof in the axiom system that our program would return a digit. We now have less than that. What we now have is a proof that there is a proof in the axiom system that it will return a Digit. A proof of a proof is not actually a proof from the axiom system.

What we have does represent a proof in (axiom system + consistency of axiom system) that it will return a Digit. But this is a strictly weaker claim than we had before. And it is weaker by the fact that we now need a new axiom. Namely the axiom of the consistency of our previous axiom system. And consistency is necessary because we have to deal with the reference to the axiom system introduced through diagonalization.

If you still doubt, please don't bother arguing with me. Go ask your neighborhood mathematical logician. Then argue with them. You've clearly decided that I don't know what I'm talking about. But you haven't decided that of them, so they still have a chance.


> You've clearly decided that I don't know what I'm talking about.

Yes, by now I'm pretty certain of this.


And I am likewise convinced that you are missing basic connections between computing, Gödel, the halting problem and so on.

Last try.

Suppose that X is an axiom system that models computation. Since Turing machines can be described in arithmetic, and arithmetic in computation, that's actually equivalent to being able to model arithmetic. But we'll stick with computation.

I'll use Python syntax for computation. Both because it is readable, and because it has https://docs.python.org/3/tutorial/classes.html#generators, closures, and eval. Which are really convenient for what I'm going to do.

Let's look at code that generates functions that take natural numbers and return Booleans. Some are simple.

    def is_even (n):
        return n % 2 == 0
Obviously this always returns a Boolean. Some are more complex.

    def is_even (n):
        return n % 2 == 0

    def collatz (n):
        i = 0
        while n != 1:
            i += 1
            if n % 2 == 0:
                n = n // 2
            else:
                n = 3 * n + 1
        return i

    def collatz_is_even (n):
        return is_even(collatz(n))
We don't know that collatz_is_even always returns. It returns a Boolean if it does. But if the Collatz conjecture is false, then some inputs will never return and this isn't a Boolean sequence.

Now the following function can obviously be written, but will be somewhat complicated:

    def proofs (X):
        # Does a breadth-first search through proofs in first order logic
        # from axiom system X. Will yield proofs in order.
        ...
Here is another function that can clearly be written, and is also somewhat complicated.

    def found_boolean_seq (proof):
        # If proof proves that some piece of code returns a function seq
        # which always returns a Boolean when passed a natural number,
        # will return that code as a string.
        #
        # Otherwise it returns None.
        ....
Based on these we can write the following.

    def boolean_seqs (X):
        # A list that will include every function that X proves defines a
        # boolean sequence. All functions that X can prove this about will
        # be somewhere in the sequence.
        #
        # This returns them as closures.
        # 
        for proof in proofs(X):
            code = found_boolean_seq(proof)
            if code is None:
                yield lambda n: False
            else:
                yield eval(code, {})

    def boolean_seq (X, n):
        # Returns the n'th boolean sequence from boolean_seqs
        # (0-indexed, so boolean_seq(X, 0) is the first one).
        #
        # This returns it as a closure.
        for i, f in enumerate(boolean_seqs(X)):
            if i == n:
                return f
And now let's diagonalize to produce a new sequence.

    def diagonalize (X):
        # Returns a sequence that disagrees with every other one on our list.
        def inner (n):
            return not boolean_seq(X, n)(n)
        return inner

    # Do some work to set up the axiom system X
    ...

    # And return the diagonalized sequence.
    diagonalize(X)
OK. If proofs and found_boolean_seq are both correctly written, we can prove from how Python works that every proof from X will show up in proofs(X). And every piece of code that X can prove defines a Boolean sequence will show up in boolean_seqs(X).

Now a question. Can the diagonalized sequence show up in boolean_seqs(X)? If X is inconsistent, then it certainly does. X proves anything. Including that that code always returns a Boolean.

Next question. If the diagonalized sequence shows up in the n'th position, what will happen if we call seq(n) on it? The answer is that the copy we call will find its own code in the sequence and will eval it. It will then call seq(n) on that copy, which will repeat. By induction we can prove that it will create an unlimited number of copies of itself, which takes forever, so it never returns.

But note, If it is on the list, that's because axiom system X proved that it WOULD return. So if it is on the list, then axiom system X is inconsistent.

A version of Gödel's theorem follows. If the axiom system X is consistent, then the diagonalized sequence is not on the list. This means that X cannot prove whether or not the diagonalized sequence always returns. Which in turn means that X is incomplete.

The fact that I keep pointing out is that X cannot prove whether or not the diagonalized sequence always returns. If the axiom system X is consistent, then you need a strictly stronger axiom system than X to prove that the diagonalized function always returns a Boolean. It has to be stronger exactly because proving things about the diagonalized function means that you're proving things about the axiom system X.

And that is true of diagonalized computation in general. Diagonalizing creates a layer of self-reference. Which means that you can't always prove things about the diagonalized function, even though you can prove the same thing about each function in the sequence.


This is all true. It shows that no effectively computable set of axioms can prove all "true" computable functions computable (or in less metaphysical language, that there are always functions we can show to be computable from "outside" our logical system, i.e. metalogically, while not being able to show it within that system). It's basically just a different variant of the incompleteness theorem. And another way to put it is that the set of computable functions is not recursively enumerable, and by extension, neither is the set of real numbers.

But just because this set is not recursively enumerable, it doesn't mean that we can't talk about it and prove theorems about it, and not even a constructivist such as Bishop would go so far. He wouldn't even be able to define the real numbers if he did because there is no way of enumerating all the sequences that satisfy his definition of a real number.

Your proposed definition of the set of real numbers (that kind of encode a proof of their own existence) is IMHO at odds with how constructivists themselves define them and IMHO also a totally unworkable definition for real analysis.

Maybe there is some oddball mathematical framework that does what you want, but I'd certainly never want to work in a framework which doesn't even allow me to write down theorems of the form "for every decidable language, ..." just because I can't enumerate all the decidable languages, and similar.


I agree that most constructivists define things differently than this. I brought up this example in https://news.ycombinator.com/item?id=37425642 to demonstrate that you were wrong in https://news.ycombinator.com/item?id=37423933 to claim that Cantor's argument necessarily must work constructively. It works in some constructive systems, and not in others.

I'm not arguing for attempting to work in such constructive systems. Merely pointing out that they exist. And their existence is a limitation on what diagonalization can prove a priori.

Also look at what you said in https://news.ycombinator.com/item?id=37427973. Now can you see that my point is not nonsense? Diagonalization really does add a layer of self-reference. And the fact that it does is why it doesn't work in the system that I described.

This also explains the point that I was making in https://news.ycombinator.com/item?id=37428468. Your Coq proof was fine for what it proved. But it was not looking at the key piece of my example. Which is the fact that the functions are not necessarily total functions. They are merely functions which we have a specific reason for believing that they might be total. Namely a proof from a particular set of axioms. And my whole point was that the diagonalized function in general doesn't have equivalent evidence of being total. Which is a point of complexity that your Coq proof did not address.

Are you still so certain that I don't know what I'm talking about?


Please give a reference to any textbook or paper that defines real numbers in such a purposefully useless way as you do it, and I'll reconsider my position. Otherwise I think you're just playing semantic games, trying to redefine well-known terms - in which case you should not be surprised that people don't understand you.


That's called moving the goalposts.

Now that you personally know how to trace the reasoning to see that my counterexample really was a counterexample, you'll only accept it as a counterexample if you see it in published research?

This is something that I worked through myself back in grad school back in the last millennium. https://www.researchgate.net/scientific-contributions/Marcia... gave me a number of leading hints, and confirmed my reasoning to me. In particular she was the one who commented, "A lot of people miss the importance of the fact that an axiom system proving that it proves something doesn't mean that it proves something." From her reactions at the time, I'm pretty sure she thought it was trivially obvious. And may well have known of prior research.

So no, I don't have published research using this formulation. Nor is it reasonable for you to demand published research to verify something learned through oral communication decades ago, which should be obvious to anyone in the field.

Doubly so given that my first description of the counterexample came with enough specific details that you SHOULD have been able to figure it out yourself. I not only told you what definition would create the problem. I also said why, and gave you an outline of the construction from which you could produce the proof!


I recently learned that the computable numbers are subcountable but not enumerable.

Proof sketch: Turing Machines are enumerable by lexicographically sorting them. The Turing Machines that compute computable numbers are a subset of them. Therefore the set of computable numbers is subcountable.

Now assume that the set of computable numbers is enumerable by some Turing Machine M. Then I can construct a new computable number whose i-th digit is a swap of the i-th digit of the i-th number given by M (i.e. a diagonalization). This contradiction shows that the set of computable numbers is not enumerable.

Another way of looking at this is that determining if some Turing Machine computes a computable number requires deciding the halting problem, which cannot be done. TMs computing computable numbers are required to terminate given an input precision.


Thinking like that, no infinite set can be countable, because "at any one point in time" you haven't completely enumerated it yet. That's not how thinking about infinite sets works.


>The proof of uncountability of real numbers starts with a process of enumerating this infinite list of infinitely long binary numbers. Then says that we construct a number that can't be on them by flipping bits the down the diagonal.

I don't know what you mean by "process". What he says is that any list of numerals like this will be missing (at least) one numeral.

Maybe let's take a step back.

It's a proof by contraposition. The "normal" position is "S -> T" (so if S then T), which is equivalent to its contraposition "(not T) -> (not S)".

Cantor's idea is to have a list of (possibly infinite length) binary literals representing real numbers between 0 and 1 (and if those are uncountable, so are ALL the real numbers :P).

The important part is that first consider the set of all such literals, whether countable or not. Let's call those T (for "total"). But then consider the subset of those that can be enumerated. Let's call those S (for "subset").

The countability means that you have a function that literally enumerates those (i.e. a function "natural number -> literal", 1:1)

We enumerate the literals, and within each literal we enumerate its digits by position.

Let a_n be the nth literal. The a_n need to be unique (in order for those to be an enumeration).

Let d(n, m) be the nth literal's digit at position m.

Now construct a new literal q that differs from the first string in the first position, from the second string in the second position and so on.

This q is NOT IN S. (it differs from any of the elements of S in at least one position each)

Thereby, we proved that if S is countable, then it's not all of T. (because we found some literal that's not on the list; the list of all needs to be bigger!)

So the "position" proof is: "Assume S is countable. Then show it's not all of T."

But that, by contraposition, means, if it's all of T then it's not countable. Which is what we wanted.
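To underline that the diagonal literal q is a single definition rather than a step-by-step process, here's a minimal Python sketch: the enumeration `a` is a made-up stand-in, and each digit of q is computed on demand.

```python
# q is defined all at once from the enumeration; its digits are
# computed on demand, not built up by a process over time.
def q(enumeration):
    # enumeration(n) returns the nth literal as a digit function d(m).
    return lambda n: 1 - enumeration(n)(n)  # flip digit n of literal n

# Hypothetical enumeration: literal n has every digit equal to n % 2.
a = lambda n: (lambda m: n % 2)
diag = q(a)
print([diag(n) for n in range(6)])  # [1, 0, 1, 0, 1, 0]
```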


This is something I’ve pondered, and for what it’s worth, I don’t think any of the responses so far clearly explain the flaw in your rationale.

That’s not to say Cantor was wrong; much more likely we’re missing something. However, I think the issue is more subtle than people realise.


That is absolutely true. See https://news.ycombinator.com/item?id=37421201 for what that subtlety is.


Your ignorance hides wisdom and an unexamined assumption in classical mathematics.

The problem is what does existence mean?

Classical math includes "procedures" with processes based on absolute truth. They avoid paradox only because you cannot reason about the reasoning process itself. The result of this is simpler arguments, but we must accept the existence of things that we can never, even in principle, write down.

Now let's change existence to mean that mathematical things only exist when we can write them down. (Classical mathematics calls this the constructible universe.) So procedures must be computer programs. Real numbers are represented by computer programs that produce a sequence of rational numbers converging at a known rate. An enumeration of real numbers is a program that produces a list of programs that themselves represent real numbers. And so on.

Now try Cantor's argument. We start with an enumeration of the real numbers. A program that creates a list of programs. We then construct a new real number as per Cantor. So far so good. And, as Cantor said, the new real number cannot be on the enumeration. Cantor now concludes that there are more real numbers.

BUT his conclusion is now false! I can easily write down a program that lists every possible program. Ones that crash. Ones that run forever and never stop. And, yes, ones that represent real numbers. It is a countable list. And our new real is on the list!

The problem isn't that there are too many real numbers. It is that you can't decide whether or not a given program represents a real number without solving the halting problem. So every enumeration has a choice - it either includes things that aren't real numbers, or it misses some real numbers. But all the things that can be real numbers are on a countable list!

The only sense in which there can be "more" real numbers in classical mathematics is that we accept the "existence" of numbers whose claim to exist is extremely tentative. And we do so with decision "procedures" that depend on decisions made based on unverifiable absolute truths.

If you want a sense of how different mathematics looks if you don't do that, look up Constructivism.


> They avoid paradox only because you cannot reason about the reasoning process itself.

All of ZFC can be formulated with an effectively computable set of first-order axioms and every valid theorem has a proof which is nothing more than a finite string of symbols which can be checked in finite time.

It's really odd to claim that "you cannot reason about the reasoning process of classical mathematics".


The procedure used within Cantor's diagonalization proof depends on a sequence of choices, each of which is based on whether something is absolutely true, even though that truth cannot necessarily be established by any finite set of reasoning from a finite set of axioms.

Actually doing this requires making decisions based on absolute truth. The procedure could never be done in practice because there is no reasoning process available in first order logic that can always determine absolute truth. In fact even being able to determine the absolute truth of whether our current axioms are consistent would imply a contradiction. (Thanks, Gödel.)

Whether it makes any sense to talk about such constructions at all was the heart of the Formalist vs Constructivist debate about "the law of the excluded middle".

The formal first order proofs from ZFC that you mention depend on the axioms of ZFC. Which in turn are assumed. What is actually proven is, "If these are true, then that is the conclusion." But the axioms must be assumed. What is proven is that the axioms lead to the conclusion. And not that the axioms are true or the conclusion matches our naive notions of truth or existence.

As to that, a Constructivist can't even get through the first axiom of ZFC before finding problems to question. The axiom of extensionality says that two sets are equal if and only if they have the same elements. In classical mathematics, elements either are or are not in the set. And two sets either do or don't have the same elements. And we can build procedures based on that truth. But such statements based on absolute truth might not be able to be established through any reasoning process. Procedures requiring decisions made on such absolute truths cannot actually be performed. In what way can we establish existence by constructions involving impossible procedures based on information that no reasoning process can ever verify.

I will make this problem concrete.

Let's let A be the empty set. And B be the set of non-trivial zeros of the Riemann zeta function. According to classical mathematics, either ZFC may prove that A = B, it may prove that A is not B, or the question may be undecidable. If the question is undecidable, then through https://en.wikipedia.org/wiki/Second-order_arithmetic you can prove that A = B. (The principle of the proof is the same as how Gödel proved that if ZF is consistent, then so is ZFC.) Classical mathematicians happily accept that claim, even though it rests on a statement that might be true without any possibility of proof. And, furthermore, classical mathematicians will happily accept proofs based on constructions that require an infinite number of decisions that are similarly impossible to make through any form of first-order logic.

Constructivism collapses all of that. In that philosophy, we can't talk about the truth of any statement unless a proof has been found for or against it. Procedures may not depend on the impossible task of deciding the undecidable. Nor can we sneak such assumptions in through things like the axiom of extensionality. Which makes the classical proof just pointless symbol manipulation that implies nothing about what does or doesn't exist.

Here is the Constructivist view. If we assume things that make no sense, we conclude things that also make no sense. Rather than trying to understand the nonsensical conclusion, we should avoid the nonsensical assumption. There aren't an uncountable number of real numbers who can never be defined, even in principle, in any finite way. Mathematical things that can't be defined, simply don't exist.

This view only seems strange to us because it is impossible to study higher mathematics without being indoctrinated into the classical mathematical approach. It is only after learning to think outside of that box that you start to realize how arbitrary the implicit assumptions are.


> The formal first order proofs from ZFC that you mention depend on the axioms of ZFC. Which in turn are assumed. What is actually proven is, "If these are true, then that is the conclusion." But the axioms must be assumed. What is proven is that the axioms lead to the conclusion. And not that the axioms are true or the conclusion matches our naive notions of truth or existence.

That is true of every kind of mathematics. Ex nihilo nihil. There is no argument that can be made without any sort of axiom (you can derive tautologies, but only if you agree on at least the inference rules you're going to use - if you don't even agree on that you're out of luck).

You don't like the axioms of ZFC, or LEM, or whatever - fine. But you can't argue that whatever you want to use instead doesn't suffer from exactly the same problems. Maybe you find your axioms "more intuitive". Okay, and you're free to do mathematics in whichever way you prefer. But I guarantee that someone else will come along and tell you that your own axioms don't look all that intuitive to them. In fact, it seems like you assume that the integers exist. But an ultrafinitist would look at you in disbelief because they don't think that infinite sets exist at all (after all, where are the infinite sets in nature?). So which axioms are the "true" ones?

Personally, I'm not interested in metaphysical questions about what is "mathematically true". It's satisfying enough that a) we can constructively prove that classical proofs are valid chains of inferences from their axioms to their conclusions, b) nobody has found a contradiction in ZFC for over 100 years, c) the results we get from applying mathematics to the real world are astonishingly useful in all sorts of domains. Many mathematicians consider that a win and then move on without pondering about metaphysical things that we can never have an answer to.

> This view only seems strange to us because it is impossible to study higher mathematics without being indoctrinated into the classical mathematical approach.

Calling other people "indoctrinated" doesn't make your argument stronger. For the record, there's enough constructivist advocates hanging around programming-related communities, so it's not like these voices aren't heard here.

edit: And in any case, you did massively shift the goalposts here, because first you claimed that "we can't reason about the reasoning used in classical mathematics" (wrong, we can do all sorts of reasoning about classical reasoning without even having to resort to infinities and other weird things), now you're claiming that classical mathematics isn't "true" (or not provably so) - but that's an entirely different discussion.


No goalpost shifting.

Procedures in classical mathematics require a series of decisions made based on absolute truths accessible by no process of reasoning. Mathematics that includes entities created through such procedures avoid contradiction exactly because the truths are true, whether or not any reasoning exists that could show it.

Computability limits itself to procedures involving decision processes that can themselves be reasoned about. Attempts to create computable analogs of classical decision procedures reveal a whole world of logical subtleties thanks to the unavoidable possibility of self-reference.

The difference is this. In classical mathematics, diagonalization results in uncountable infinities. In computable mathematics, diagonalization results in undecidability. The reason why classical mathematics avoids undecidability is because decisions are made by appeals to absolute truth. Which represents a lack of process that therefore cannot be diagonalized.


Your argument is basically "my truths are true because they are true". That to me isn't mathematics anymore, it's metaphysics.


Thank you and all the very thoughtful and helpful replies!

Definitely going to check out Constructivism, which is a new term to me and was mentioned in several replies.


TIL that "Y Combinator" is a mathematical concept and not just a VC firm.


Yep. If you ever take a course on theory of programming languages that's based on the lambda calculus, you'll find it's one of the ways to model recursion in a programming language that would otherwise be Turing-incomplete. You get recursion at the cost of being subject to the Halting Problem.
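A minimal Python sketch, for the curious: in a strict language like Python you actually need the Z combinator (the eta-expanded variant of the Y combinator), which ties the recursive knot for an anonymous function.

```python
# Z combinator: the strict-language variant of the Y combinator.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Factorial written with no self-reference; Z supplies the recursion.
fact = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
print(fact(5))  # 120
```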


Great video, thanks for sharing


Can someone explain this part of Turing's proof:

> ... we can define an uncomputable problem like the one in Turing’s proof: “Given an input string representing the code of an algorithm, output 1 if that algorithm outputs 0 when its own code is the input; otherwise, output 0.” Every algorithm that tries to solve this problem will produce the wrong output on at least one input — namely, the input corresponding to its own code.

Why is this impossible?

As I see it, there are two algorithms intermixed in that statement:

1. Code that emits a 0 if the input is its own code

2. Code that emits a 1 if it receives as input code that behaves according to Algo 1.

Feel like I'm missing something basic or obvious!


Imagine you have code which does #2. Now give it itself as an argument. What is the result? If it emits 0, then it's behaving according to #1, so, because we assumed it satisfies #2, it must emit 1: a contradiction. Similarly it can't emit 1.

Thus it is impossible to have an algorithm which solves #2 for all inputs, because it must fail on at least one input (itself)


    def unc(algstr):
        if algstr(algstr) == 0:
            return 1
        else:
            return 0

unc(unc) -> what does this return?


Stack overflow exception, since it calls unc(unc) from itself.


Gödel also had a short appearance in the Oppenheimer movie, in the scene where Einstein talks about Kurt still being afraid even though they are in the States now.


The document's initial example of diagonalization doesn't explain it properly. It shows how you can find a binary string that's not in the list; fine. But it's conveniently a list of 5 strings of length 5. It doesn't show how you'd find a new string of length 5 that's not in a list of 6 such strings, or 31 such strings.

I suspect the "conveniently" may drop out of some conversion of a real problem to an abstraction, but that's not explained.


> The document's initial example of diagonalization doesn't explain it properly.

> It shows how you can find a binary string that's not in the list; fine. But it's conveniently a list of 5 strings of length 5. It doesn't show how you'd find a new string of length 5 that's not in a list of 6 such strings

This is actually a mistake on your part. It's true that it's possible to find a string of length 5 that isn't already contained in a list of 6 such strings.

But it isn't possible to do that by using diagonalization. The concept of diagonalization is that you differ from the first string in the list at index 1, from the second string in the list at index 2, from the third at index 3, and so on. A list of six strings, to illustrate diagonalization, would require all of the strings to be six digits long.


> would require all of the strings to be six digits long

The OP's point (presumably, because it bugged me too) is that the article doesn't make this clear.


No, that can't be the point, because Slow_Dog specifically states that the article should show how to find a binary string of length 5 that isn't contained in a list of six such strings:

> It shows how you can find a binary string that's not in the list; fine. But it's conveniently a list of 5 strings of length 5. It doesn't show how you'd find a new string of length 5 that's not in a list of 6 such strings, or 31 such strings.

The reason it's not showing that is that it's showing an example of diagonalization. Slow_Dog would apparently prefer that the article about diagonalization discuss some other technique than diagonalization.


I think the missing piece is just to note that, for actual proof applications, the length of the individual "strings" and the total number of them are both countably infinite, so the sizes do match.


It's not necessary that they match. All you need is that the length of each string is not less than its position in the list.
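A small Python sketch of that weaker requirement (the list is made up; each string just needs a digit at its own index):

```python
# Diagonalization over strings of varying length: it works as long as
# string i has a digit at index i (length >= position + 1, 0-based).
def diagonal(strings):
    return "".join("1" if s[i] == "0" else "0" for i, s in enumerate(strings))

strings = ["0", "11", "010", "1010", "00000"]
missing = diagonal(strings)
print(missing)  # "10111", which differs from string i at index i
assert missing not in strings
```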


I would probably refer to this as "Cantor's diagonalization" to avoid the ambiguity with "matrix diagonalization" using the matrix eigendecomposition.


Hrm... Ambiguity only arises out of context. IMHO, there is ample enough context here to disambiguate, just as you seem to have successfully done.

Matrix diagonalization and what you're calling Cantor's diagonalization can both be seen as instantiations of a more general diagonalization process. This latter process seems to be what the article is obliquely pointing at, cf my top-level comment for a video that introduces those details.


I’d rather not call it diagonalization at all.

Nothing about Turing's construction enumerates many machines and builds a new machine to flip some aspect of each machine on the list.

It’s more like the liar paradox, akin to Gödel's first incompleteness theorem.


It turns out that these are all kind of the same: the liar paradox, Russell's Paradox, the Halting Problem, etc. And indeed they rely on a "diagonal map", i.e., a map x -> (x, x), in the construction.

This was a great read, if you want a mathematical take on it, and some generalization too. https://arxiv.org/abs/math/0305282v1


Sure, if you create a larger object it can contain two smaller objects.

But that construction is so general that it contains all proofs of the form "There exists X that does not have property Y" that proceed by constructing something that lacks the property, and stuffs the proof into the diagram.


It's not fully general. Many different types of undecidability proofs are basically proofs by diagonalization, but not all are. See some counterexamples:

https://mathoverflow.net/q/454105


“I genuinely don’t have an operational definition for what it means to use diagonalization” - Terry Tao


So you are listing all TMs + inputs and creating a machine which does the opposite (from a halting perspective), which is like when Cantor lists the reals and creates a new one that has the opposite digit from the other reals


No, it's not like that at all. It's just an algorithm that contradicts itself, like Russell's Paradox.


Broaden your view of the Halting undecidability proof.

Let T_i enumerate all Turing machines. Let E be a function that encodes a Turing machine as a tape. Suppose Halting is decidable, and let H_i,j be the table of bits such that H_i,j is 1 iff T_i halts on input tape E(T_j). There is some x such that H_x,j = !H_j,j for all j (we can construct/find such a machine by hypothesis).

Therein lies the diagonalization, you see? We just identified a row such that the bit in column j differs from the jth bit on the diagonal. The paradox you reference is what happens at bit H_x,x. But, in the bigger context, this argument proceeded by diagonalization.
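A finite toy version of that table argument in Python (H is an arbitrary made-up table; the real H would be infinite and, as the proof shows, cannot exist):

```python
# H[i][j] = 1 iff machine i "halts" on input j (made-up values).
H = [
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
]

# The hypothetical row x: complement of the diagonal, H_x,j = !H_j,j.
x = [1 - H[j][j] for j in range(len(H))]
print(x)  # [0, 1, 0, 0]

# x cannot equal any existing row: it differs from row j at column j.
assert all(x != row for row in H)
```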


I tried to comment that on the article. But they use Disqus which is hostile to people making comments.


The term I've seen most often is "diagonal argument".


A relevant link, Lawvere's fixed point theorem: https://ncatlab.org/nlab/show/Lawvere%27s+fixed+point+theore...


Unrelated to this, but one type of mathematical argument/technique I've never seen explained "for dummies" is proving the independence of an axiom from a set of axioms.


I thought for a second the title was missing a (1936).



