Here's the Boyer-Moore theorem prover[1] from around 1980 doing that.[2] The proofs are fully automatic but you have to ask it to prove the theorems in an order so that the theorems you need for each step are already done. Otherwise it gets lost trying different paths. This is runnable; 8 years ago I ported it to GNU Common LISP. Takes under a second now. Took 45 minutes in 1980.
This is pure constructive mathematics. No existence proofs. Everything is computable.
The first theorem: X + 0 = X
In Boyer-Moore notation:
(PROVE-LEMMA PLUS-0 (REWRITE)
(EQUAL
(PLUS X 0)
(FIX X)))
FIX is just if it's a number, then the number, else 0. That makes the function total.
The second theorem: X + (Y + 1) = (1 + (X + Y)
(PROVE-LEMMA PLUS-ADD1 (REWRITE)
(EQUAL
(PLUS X (ADD1 Y))
(IF (NUMBERP Y) (ADD1 (PLUS X Y)) (ADD1 X))))
ADD1 is a primitive of Boyer-Moore theory. ADD is actually recursive ADD1.
The user just lists the proof goals. Here's the input file.[3] The proofs are all automatic.
This is more automation than many modern provers seem to have.
Toward the end, it proves McCarthy's theory of arrays, something I wrote. McCarthy stated those as axioms, but they are in fact provable. The proofs are clunky because there's no set theory yet. I should have built up constructive set theory first, but I didn't realize that back then.
Beautiful! Imandra is a modern Boyer-Moore style theorem prover for higher-order functional programming (its logic is based on a typed higher-order subset of OCaml rather than Boyer-Moore's basis on an untyped first-order subset of Pure Lisp (and later Common Lisp with ACL2)), but there are a lot of similarities! It also integrates computable counterexamples in a really cool way (they're first-class values reflected in the runtime and you can directly compute with them, including counterexamples to higher-order goals which involves function synthesis!). Imandra is used a lot for formal verification in finance, for example.
Also, not CL, but I've got bored with Scheme and computer abstractions requiring you to proof every step on recursion by induction. I just put a base case at f(0) with an invariant and I tested it against f(1).
A lot of people recommend the Natural Number Game for learning Lean, but, as an experienced functional programmer, I don't like it when the code is abstracted away from me. This is not a critique of the tutorial, which is fantastic for it's audience (presumably people more interested in theorem proving without a background in programming), but it might not be ideal for people who want to really get the feel of working in Lean.
For experienced programmers who prefer to learn in the IDE, I've found Theorem Proving in Lean 4 [0] to be a wonderful introduction to an incredible language and area of research. Lean 4 has really nice tooling if you use the VS Code plugin.
Functional Programming in Lean [1] also looks great, but doesn't seem to focus as heavily on understanding theorem proving, which, for me, is the entire point of learning Lean.
I’ve really been getting into Lean, and these tutorials are fantastic. It’s always interesting to see how many hidden assumptions are made in short and informal mathematical proofs that end up becoming a (very long) sequence of formal transformations when all of the necessary steps are written out explicitly.
My guess is that mathematical research utilizing these systems is probably the future of most of the field (perhaps to the consternation of the old guard). I don’t think it will replace human intuition, but the hope is that it will augment it in ways we didn’t expect or think about. Terence Tao has a great blog post about this, specifically about how proof assistants can improve collaboration.
On another note, these proof systems always prompt me to think about the uncomfortable paradox that underlies mathematics, namely, that even the simplest of axioms cannot be proven “ex nihilo” — we merely trust that a contradiction will never arise (indeed, if the axioms corresponding to the formal system of a sufficiently powerful theory are actually consistent, we cannot prove that fact from within the system itself).
The selection of which axioms (or finite axiomatization) to use is guided by... what exactly then? Centuries of human intuition and not finding a contradiction so far? That doesn’t seem very rigorous. But then if you try to select axioms probabilistically (e.g., via logical induction), you end up with a circular dependency, because the predictions that result are only as rigorous as the axioms underlying the theory of probability itself.
To be clear, I’m not saying anything actual mathematicians haven’t already thought about (or particularly care about that much for their work), but for a layperson, the increasing popularity of these formal proof assistants certainly brings the paradox to the forefront.
The selection of which axioms to use is an interesting topic. Historically, axioms were first used by Hilbert and cie. while trying to progress in set theory. The complexity of the topic naturally led them to formalize their work, and thus arose ZF(C) [1], which later stuck as the de facto basis of modern mathematics.
Later systems, like Univalent Fundations [2] came from some limitations of ZFC and the desire to have a set of axioms that is easier to work with (for e.g. computer proof assistants). The choice of any new systems of axioms is ultimately limited by the scope of what ZFC can do, so as to preserve the past two centuries of mathematical works.
>>> The HoTT library is a development of homotopy-theoretic ideas in the Coq proof assistant. It draws many ideas from Vladimir Voevodsky's Foundations library (which has since been incorporated into the UniMath library) and also cross-pollinates with the HoTT-Agda library. See also: HoTT in Lean2, Spectral Sequences in Lean2, and Cubical Agda.
Then you add in the Brouwer Hilbert mess that still causes challenges with talking about where the a priori assumption of the law of the excluded middle causes real problems with modern computational challenge.
> My guess is that mathematical research utilizing these systems is probably the future of most of the field (perhaps to the consternation of the old guard). ... Terence Tao has a great blog post about this, specifically about how proof assistants can improve collaboration.
Yeah, that was a great blog post. As someone who started in mathematics but ended up in computing, I read it and realized, hang on, he is talking about taking a devops approach to mathematics. It definitely struck me that this is the way of the future: mathematics transformed from almost a humanities-like discipline into an engineered enterprize.
It's a text book with many examples for you to try. There are sample solutions which if you get stuck (which I often do) you can take a peek for some help. I actually just grep the solution file -A1 , -A2 etc to peek at the solutions one line at a time
A user-interface wishlist item: I keep wishing for the ability to hit the up arrow to get the previous line I just entered (and hit up repeatedly to get lines before that).
(The "editor" mode helps with this, and I ended up switching to that.)
This is a seriously well-designed tutorial. There were several times that the instructions anticipated exactly what mistake I was about to make - clearly a lot of work and testing has gone into this.
(I'm the author of the game) Unfortunately it did not. That comment was made when I was optimistic that an undergraduate who'd added more levels as a summer project would go on to PR them, but then the term started and they were distracted by their degree. We have some kind of prototypes for even/odd world (e.g. "prove odd * even is even") and prime number world (boss level: prove 2 is prime), plus a hard world consisting basically of unsolved problems such as Goldbach, Twin Prime Conjecture etc. But they never made the transition from "lean file containing a bunch of theorems" to "lots of files each containing one theorem and some rambling".
Sure--I think I saw a very old version of this, so most of it is new to me--but it also refers to "Prime Number World and more", and at least Prime Number World doesn't seem to be on the map.
You only type in the reasons, not the results. Lean computes the result for you. Otherwise, if you do all the work yourself, you wouldn't need Lean :-)
If it shows you something of the form `X=X` like `37=37`, you then type in `rfl` to assert that they are equal by the reflective property which completes the proof.
This is pure constructive mathematics. No existence proofs. Everything is computable.
The first theorem: X + 0 = X
In Boyer-Moore notation:
FIX is just if it's a number, then the number, else 0. That makes the function total.The second theorem: X + (Y + 1) = (1 + (X + Y)
ADD1 is a primitive of Boyer-Moore theory. ADD is actually recursive ADD1.The user just lists the proof goals. Here's the input file.[3] The proofs are all automatic. This is more automation than many modern provers seem to have.
Toward the end, it proves McCarthy's theory of arrays, something I wrote. McCarthy stated those as axioms, but they are in fact provable. The proofs are clunky because there's no set theory yet. I should have built up constructive set theory first, but I didn't realize that back then.
[1] https://github.com/John-Nagle/nqthm
[2] https://github.com/John-Nagle/pasv/blob/master/src/work/temp...
[3] https://github.com/John-Nagle/pasv/blob/master/src/work/temp...