That's a lot of links to take in, and I don't really do anything with ML, but feel free to head over to https://leanprover.zulipchat.com/ and start a discussion in the Machine Learning for Theorem Proving stream!
My observation at the moment is that we haven't seen ML formalize a cutting-edge math paper and that it did in fact take a lot of experience to pull it off so quickly, experience that's not yet encoded in ML models. Maybe one day.
Something that I didn't mention is that Terry Tao is perhaps the most intelligent, articulate, and conscientious person I have ever interacted with. I found it very impressive how quickly he absorbed the Lean language, what goes into formalization, and how to direct a formalization project. He could have done this whole thing on his own I am sure. No amount of modern ML can replace him at the helm. However, he is such an excellent communicator that he could have probably gotten well-above-average results from an LLM. My understanding is that he used tools like ChatGPT to learn Lean and formalization, and my experience is that what you get from these tools is proportional to the quality of what you put into them.
Communication skills, leadership skills, research skills, and newer tools for formal methods and theorem proving.
The example prompts in the "Teaching with AI" OpenAI blog post are paragraphs of solution specification, far longer than the search queries that the average bear would take the time to specify.
Is there yet an approachable "Intro to Arithmetic and beyond with Lean"? What additional resources for learning Lean and Mathlib were discovered or generated but haven't been added to the docs?
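For flavor, a sketch of what the first page of such an intro might look like (Lean 4 + Mathlib assumed; the exercise selection here is mine, not from any existing doc):

    -- Hypothetical starter exercises for an "Intro to Arithmetic with Lean"
    -- (Lean 4 + Mathlib assumed): decidable facts, norm_num, and a first induction.
    import Mathlib.Tactic

    -- A closed arithmetic fact the kernel can simply compute.
    example : 2 + 2 = 4 := by decide

    -- Mathlib's norm_num handles arithmetic goals, including primality.
    example : Nat.Prime 37 := by norm_num

    -- A first induction proof: 0 + n = n is not definitional (unlike n + 0 = n).
    theorem my_zero_add (n : Nat) : 0 + n = n := by
      induction n with
      | zero => rfl
      | succ k ih => rw [Nat.add_succ, ih]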
Perhaps to understand LLMs and their applications; e.g., the Cuttlefish algorithm fills in an erased part of an image, like autocomplete. So: can autocomplete plus guess-and-check (and mutate, crossover, and selection; EA methods, too) test [math] symbolic expression trees against existing labeled observations that satisfy inclusion criteria, all day and night, in search of a unified model with greater fitness?
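A rough sketch of that loop (hypothetical; the target law, the operator set, and all of the parameters below are made up for illustration):

    # Hypothetical sketch: guess-and-check plus mutate/crossover/select over SymPy
    # expression trees, scored against labeled observations of a made-up law y = x**2 + x.
    import random
    import sympy as sp

    random.seed(0)
    x = sp.Symbol('x')
    LEAVES = [x, sp.Integer(1), sp.Integer(2)]
    OPS = [sp.Add, sp.Mul]

    def random_expr(depth=2):
        """Guess: build a random expression tree."""
        if depth == 0 or random.random() < 0.3:
            return random.choice(LEAVES)
        return random.choice(OPS)(random_expr(depth - 1), random_expr(depth - 1))

    def mutate(expr):
        """Mutate: graft a fresh random subtree onto the candidate."""
        return random.choice(OPS)(expr, random_expr(1))

    def crossover(a, b):
        """Crossover: splice two candidates together."""
        return random.choice(OPS)(a, b)

    def fitness(expr, data):
        """Check: mean squared error against the labeled observations (lower is fitter)."""
        return sum((float(expr.subs(x, xi)) - yi) ** 2 for xi, yi in data) / len(data)

    # Labeled observations that satisfy the inclusion criteria (here: exact samples).
    data = [(xi, xi**2 + xi) for xi in range(-5, 6)]

    population = [random_expr() for _ in range(50)]
    for generation in range(30):
        population.sort(key=lambda e: fitness(e, data))   # selection
        parents = population[:10]
        children = [mutate(random.choice(parents)) for _ in range(20)]
        children += [crossover(*random.sample(parents, 2)) for _ in range(20)]
        population = parents + children

    best = min(population, key=lambda e: fitness(e, data))
    print(sp.simplify(best), fitness(best, data))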
What have you to say about methods such as these:
"LeanDojo: Theorem Proving with Retrieval-Augmented Language Models" (2023) https://arxiv.org/abs/2306.15626 https://leandojo.org/
https://news.ycombinator.com/from?site=leandojo.org
( https://westurner.github.io/hnlog/#story-38435908 and then, manually, Ctrl-F "TheoremQA" to find the citation and its references in my local personal knowledgebase HTML document, which has my comments archived in it )
"TheoremQA: A Theorem-driven [STEM] Question Answering dataset" (2023) https://github.com/wenhuchen/TheoremQA#leaderboard (they check LLM accuracy with Wolfram Mathematica)
"Large language models as simulated economic agents (2022) [pdf]" https://news.ycombinator.com/item?id=34385880 :
> Can any LLM do n-body gravity? What does it say when it doesn't know, or doesn't have confidence in its estimates?
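For concreteness, "doing n-body gravity" means roughly the following computation (a minimal hypothetical sketch with a naive Euler integrator, not from the cited thread):

    # Hypothetical sketch of the computation "n-body gravity" asks for:
    # pairwise Newtonian accelerations plus a (naive) Euler time step.
    import numpy as np

    G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

    def accelerations(positions, masses):
        """a_i = sum_{j != i} G * m_j * (r_j - r_i) / |r_j - r_i|**3"""
        acc = np.zeros_like(positions)
        for i in range(len(masses)):
            for j in range(len(masses)):
                if i != j:
                    r = positions[j] - positions[i]
                    acc[i] += G * masses[j] * r / np.linalg.norm(r) ** 3
        return acc

    def step(positions, velocities, masses, dt):
        """One Euler step; a real integrator would use e.g. leapfrog/Verlet."""
        acc = accelerations(positions, masses)
        return positions + velocities * dt, velocities + acc * dt

    # Two bodies: a 1000 kg satellite in a roughly circular orbit at ~7000 km.
    masses = np.array([5.972e24, 1.0e3])                  # kg
    positions = np.array([[0.0, 0.0], [7.0e6, 0.0]])      # m
    velocities = np.array([[0.0, 0.0], [0.0, 7546.0]])    # m/s ~= sqrt(G*M/r)
    for _ in range(1000):
        positions, velocities = step(positions, velocities, masses, dt=1.0)
    print(positions[1])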
From https://news.ycombinator.com/item?id=38354679 :
> "LLMs cannot find reasoning errors, but can correct them" (2023) https://news.ycombinator.com/item?id=38353285
> "Misalignment and Deception by an autonomous stock trading LLM agent" https://news.ycombinator.com/item?id=38353880#38354486
That being said, guess-and-check and then developing a fitting causal explanation is (or is not) the standard practice of science, so there's that.
https://news.ycombinator.com/item?id=38124505 https://westurner.github.io/hnlog/#comment-38124505 :
> What does Mathlib have for SetTheory, ZFC, NFU, and HoTT?
> Do any existing CAS systems have configurable axioms?
Does LeanDojo have configurable axioms?
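For context, in Lean 4 the kernel's axioms (propext, Classical.choice, Quot.sound) aren't swappable, but #print axioms reports which of them a given proof actually depends on, and Mathlib.SetTheory.ZFC.Basic provides a model of ZFC sets. A small sketch (Lean 4 + Mathlib assumed):

    -- Sketch (Lean 4 + Mathlib assumed): auditing which axioms a proof depends on,
    -- plus Mathlib's model of ZFC sets.
    import Mathlib.SetTheory.ZFC.Basic

    -- A proof that should need no axioms at all.
    theorem no_axioms_needed (a b : Nat) : a + b = b + a := Nat.add_comm a b
    #print axioms no_axioms_needed   -- expected: does not depend on any axioms

    -- Excluded middle pulls in Classical.choice (via Diaconescu's argument).
    theorem uses_classical (p : Prop) : p ∨ ¬p := Classical.em p
    #print axioms uses_classical     -- expected: propext, Classical.choice, Quot.sound

    -- Mathlib's SetTheory: ZFSet models ZFC sets; omega is the first infinite ordinal.
    #check (ZFSet.omega : ZFSet)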
From https://news.ycombinator.com/item?id=38527844 :
> TIL i^(4x) == e^(2iπx) ... But SymPy says it isn't so (the automated test assertion of the equality fails); and GeoGebra plots it as a unit circle and a line, but SymPy doesn't have a plot_complex() function with which to compare one tool's output against the other's.
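For the record, a quick SymPy sketch (mine) of why the assertion fails even though the identity holds on the principal branch (i = e^(iπ/2), so i^(4x) = e^((iπ/2)·4x) = e^(2iπx)):

    # A quick check of i**(4*x) vs e**(2*I*pi*x) in SymPy (principal branch assumed).
    from sympy import I, pi, exp, log, symbols, simplify

    x = symbols('x', real=True)
    lhs = I**(4*x)
    rhs = exp(2*I*pi*x)

    print(lhs == rhs)            # False: '==' is structural equality, not mathematical equality
    print(log(I))                # I*pi/2, the principal branch of log
    print(exp(4*x*log(I)))       # exp(2*I*pi*x): substituting i = e**(I*pi/2) by hand
    print(simplify(lhs - rhs))   # 0 if simplify recognizes the identity; otherwise unevaluated

For the plot comparison, sympy.plotting.plot_parametric(cos(2*pi*t), sin(2*pi*t), (t, 0, 1)) traces the same unit circle that GeoGebra shows; a plot_complex() is, if I recall correctly, in the third-party sympy-plot-backends package rather than in SymPy itself.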