I discovered you can get eval to recognize natural numbers without introducing more primitives if you take the number in binary, enter it into a list, reverse it, and replace the 1's with t's and the 0's with nil's.
If you do it that way, you can binary left shift a number (double it) by consing a nil onto the front, or binary right shift it (halve it) by taking the cdr. You can increment, decrement, add and subtract efficiently, too. Even take logarithms.
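Concretely, something like this (a sketch in Scheme; the names are mine, just for illustration):

  ;; A natural is the list of its binary digits, least-significant bit
  ;; first, with t for 1 and '() (nil) for 0.  6 = 110 in binary -> (() t t)
  (define (shift-left n) (cons '() n))   ; n * 2: push a 0 onto the low end
  (define (shift-right n) (cdr n))       ; n / 2: drop the low bit
  (define (inc n)
    (cond ((null? n) '(t))                     ; 0 + 1 = 1
          ((null? (car n)) (cons 't (cdr n)))  ; ...0 + 1 = ...1
          (else (cons '() (inc (cdr n))))))    ; ...1 + 1 = ...0, carry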
I had to use this to replace Lisp's namesake linked lists with another data structure that allowed many of the same niceties, but required some math, and therefore naturals, to find elements in the data structure. It was an interesting project. If you made it through the whole article, you might want to take a look:
http://dcussen.posterous.com/lisp-in-lisp-without-linked-lis...
> AM worked by generating and modifying short Lisp programs which were then interpreted as defining various mathematical concepts; for example, a program that tested equality between the length of two lists was considered to represent the concept of numerical equality, while a program that produced a list whose length was the product of the lengths of two other lists was interpreted as representing the concept of multiplication.
This is the old representation of numbers in raw Lisp, which uses lists of length n to represent the number n. Incrementing and decrementing are easy, you simply cons or cdr. Subtraction and addition are slower, though, and multiplication, division, left- and right-shifting are really slow.
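In code it's something like this (a sketch in Scheme; not AM's actual programs, just the idea):

  ;; The number n is represented by any list of length n.
  (define zero '())
  (define (inc n) (cons 't n))        ; n + 1: one more element
  (define (dec n) (cdr n))            ; n - 1
  (define (add m n) (append m n))     ; O(m)
  (define (mul m n)                   ; O(m * n): one copy of n per element of m
    (if (null? m)
        '()
        (append n (mul (cdr m) n))))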
Beyond that, I'm talking about creating natural numbers for internal use; no outside interpretation happens, as it does in Automated Mathematician. At no point does the computer output these natural numbers; it just uses them to navigate a data structure so it can run a REPL.
John Shutt's Kernel language [1] decomposes lambda into unevaluated operands, a lexical environment, and the function body:
'(λ args body)
can be expressed as:
($vau args env body) [2]
Interestingly, because $vau does not evaluate `args` and explicitly receives a lexical binding, macros are not necessary in Kernel, and hygiene is a simple matter of (eval)ing with the right environment.
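For instance (an untested sketch in Kernel-style code), a short-circuiting $when can be written directly, evaluating its operands only as needed and only in the caller's environment:

  ($define! $when
    ($vau (test . body) env
      ($if (eval test env)
           (eval (cons $sequence body) env)   ; run the body in the caller's env
           #inert)))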
Arguably then, Kernel is more fundamental than Lisp.
I think you mean to say that Kernel doesn't need to implement macros because they can be built using $vau.
---
Besides that, I think you're overstating its significance.
a) Maxwell's equations don't claim to be 'most fundamental'. They are a concise basis set for a space of problems. They occupy a memetic sweet spot; abstractions both above and below them are more verbose. The metacircular interpreter of Kernel can't compete with McCarthy's version.
b) Any language with fexprs is 'as fundamental' as Kernel. Including LISP 1.
c) In its search for hygiene Kernel eliminates quote. Scheme folks will likely find this a useful tradeoff, while Common Lisp folks with their backquoted macros will shudder. In Kernel you 'create macros' by explicitly consing code.
I concede your point about Maxwell's equations. Their historical significance in particular is the unbelievably rich research that ensued from their implications.
By this metric, Lisp is obviously the analogue in the programming world. It would have been nice though if Lisp had triggered a sudden paradigm shift instead of this painful 50 year burn we're living through.
"The metacircular interpreter of Kernel can't compete with McCarthy's version."
First, it doesn't cover macros, while Kernel subsumes functions and hygienic macros with a single construct. Second, I don't think a toy version of a Kernel metacircular evaluator would be much more complex than the toy McCarthy evaluator.
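To gesture at why, here's a minimal sketch of the core in Scheme (untested; lookup and the environment representation are assumed helpers):

  ;; Operatives receive their operands unevaluated plus the caller's
  ;; environment; applicatives are wrapped combiners whose operands get
  ;; evaluated first.  Everything else ($vau, $define!, wrap) just builds
  ;; or manipulates these two kinds of records.
  (define (make-operative proc) (list 'operative proc))
  (define (wrap combiner)       (list 'applicative combiner))

  (define (k-eval expr env)
    (cond ((symbol? expr) (lookup expr env))
          ((pair? expr)
           (k-combine (k-eval (car expr) env) (cdr expr) env))
          (else expr)))                        ; self-evaluating

  (define (k-combine combiner operands env)
    (case (car combiner)
      ((operative)   ((cadr combiner) operands env))
      ((applicative) (k-combine (cadr combiner)
                                (map (lambda (o) (k-eval o env)) operands)
                                env))
      (else (error "not a combiner" combiner))))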
"Any language with fexprs is 'as fundamental' as Kernel. Including LISP 1."
Only if said language is lexically scoped and fexprs receive, as a parameter, the lexical environment in which they are called, as in Kernel. Otherwise you lose hygiene.
Yeah I'd like to see a toy version of a Kernel metacircular interpreter.
Maybe it's just me, but I don't consider 'better supports hygiene' to imply 'more fundamental'. It doesn't add new capabilities. It merely prevents certain kinds of errors.
When you say ($lambda (x) x), a new fexpr (that ignores the environment in which it is called, because lambdas don't need it) is constructed -- ($vau (x) #ignore x) -- and then it is wrapped, so that its operands are evaluated.
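From memory (so treat it as a sketch rather than gospel), the Kernel report derives $lambda roughly like this:

  ($define! $lambda
    ($vau (formals . body) env
      ;; build ($vau formals #ignore . body) as data, evaluate it in the
      ;; definition environment, then wrap it so operands get evaluated
      (wrap (eval (list* $vau formals #ignore body)
                  env))))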
That beams you right back to the 1960s where FEXPRs were common in Lisp. They died a slow and horrible death at the end of the 70s under the influence of Scheme and Maclisp.
Kernel makes the point that much of the vilification of fexprs was actually caused by dynamic scope. Combining them with lexical scope makes for a surprisingly reasonable language.
If you read Pitman's paper, you can see that his critique is not so much about theoretical issues with dynamic binding as about practical issues. Pitman was somebody who wrote and used lots of Lisp code that had real users (Macsyma, for example). Dynamic binding was a side issue; practical compilability was a much bigger one.
a) I have read Pitman; please stop insinuating that I haven't.
b) Pitman actually points out that dynamic binding makes compilation of fexprs harder. That was also one of the original selling points of Scheme: lexical binding simplifies compiler implementation.
c) Issues of practical compilability are extremely likely to be context-sensitive and hard to generalize from. That some Lisp 35 years ago had a hard time compiling some language involving fexprs doesn't help decide if this language here and now can use them.
d) Finally, you're attacking a strawman of your own creation. Pitman didn't focus on dynamic binding, sure. Who said otherwise? That insight was one of Kernel's contributions, that a world where lisps are lexically scoped is actually quite a good fit for fexprs.
Pitman said that compilation of FEXPRs is difficult, even with lexical binding. Pitman argued for a macro expansion phase. Etc. If you read the original Kernel paper, it does not even discuss Pitman in more than two sentences. Since Pitman is still alive, the author could even have asked him. The Kernel paper had a separate addendum where vague ideas about compilation were discussed.
Pitman was working on large software systems where compilation was the norm and useful.
All but one of the references to lexical scope in Pitman's paper are annotations added decades later. The one early reference said "dynamic scope harder than lexical scope."
In any case, the claim that fexprs make compilation challenging for large software systems is much more nuanced than what you were implying earlier, that Kernel's no different from "the 1960s".
Now that compilation is less challenging thanks to lexical scope, and with computers being so much faster, it's worth considering whether fexprs can be compiled to be fast enough (say, to Ruby levels for a start). That would still be valuable. Right?
I think it's very superficial to claim that only 'two sentences' of Shutt's thesis were about 'Pitman'. He mentioned the concerns in the abstract[1], for crying out loud. He's addressing the issues throughout even if he isn't constantly paying homage to some sort of Pitman deity.
Certainly there's something very intellectually appealing about a completely transparent stack of abstractions in which all the upper layers can ultimately be expressed in terms of atomic operations.
However, I'm not so sure this is really so important in practice. We tend to operate within bounded layers of abstraction for any given problem and it's far more important that those layers are predictable, comprehensible, and, ideally, well-documented.
This is why a language like Java, for instance, can be very productive even though the fundamental abstractions are not plastic. And this is also why more malleable languages like Scheme or Lisp can be less productive because there's not enough consensus on how the higher layers should be defined.
I agree, and I think we see this same thing borne out time and again in our field. This is also the distinction between monolithic and microkernels, part of the reason Haskell is succeeding while ML atrophied, the success of statistical AI over "classical" AI, and so forth. We are attracted to minimalism, but it's too much of a hair shirt in practice.
Coincidence: two days ago I stumbled across Racket. Pretty nice, because it's extremely easy to use and supports a number of languages. Just wanted to throw it in for everyone who (like me) always said "sometime". Everything is easier than one might think. :)
Oh, and everyone who wants to get into functional programming very quickly (and the hard way) simply has to read this Wikibook. It's short, but if you work through it seriously, you will most likely have _understood_ two functional languages (Scheme and Haskell):
So: Racket for a great (free) time, and the latter for hackers. At least that's how it looks to me. I have only skimmed through them, but they are the two best things I have found in a while.
A great book I started reading recently, Lisp in Small Pieces, also starts with a simple Lisp-in-Lisp interpreter. It got me so inspired that I haven't gotten much further than the first 20 or 30 pages since I started writing my own Lisp (in C) a couple of weeks ago. ( https://github.com/deliciousrobots/lisp ) It's still only about 1600 lines of C. The only thing that was even slightly tricky so far was adding TCO, which turned out to be pretty trivial. It's some of the most fun I've had on a personal project in a while.
I have a little git project called 'cetis' where I throw cute/dangerous ideas for a next-gen language, and actually the idea of eval/apply as the "Maxwell's equations of software" is in there due to a talk by Gerald Jay Sussman which is transcribed. He says something to this effect: "Look at what these equations say. They say that a changing magnetic field causes a changing electric field, and a changing electric field causes a changing magnetic field. Look at what we have -- eval calls apply, apply calls eval."
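In miniature, the loop looks something like this (a sketch in Scheme, not the article's code; lookup, extend-env and friends are assumed helpers):

  ;; eval calls apply, apply calls eval.
  (define (my-eval expr env)
    (cond ((symbol? expr) (lookup expr env))
          ((pair? expr)
           (my-apply (my-eval (car expr) env)
                     (map (lambda (e) (my-eval e env)) (cdr expr))))
          (else expr)))                        ; self-evaluating

  (define (my-apply proc args)
    (if (primitive? proc)
        (apply-primitive proc args)
        (my-eval (proc-body proc)
                 (extend-env (proc-params proc) args (proc-env proc)))))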
So, cool, but why not push this metaphor further? That's the dangerous idea.
The dangerous idea is that Maxwell's equations don't use voltages. We love voltages. They're useful everywhere. In Einstein's special relativity, once you invent voltages (and their magnetic counterpart, the vector potential), all four Maxwell equations stop being separate: they all take the same form. So you tack the charge density ρ onto the current j to get the 4-current Jⁿ = (ρ c, j). You tack the voltage V onto the vector potential a to get the 4-potential Aⁿ = (V/c, a). Then, in the Lorenz gauge, the entire set of Maxwell's equations becomes a single wave equation for your voltages, emanating from your currents:
∂ᵢ ∂ⁱ Aⁿ = μ₀ Jⁿ
There are four of them, one for each n, but they all have the same form: if you understand one, you understand them all. (There are a couple of conventions here -- repeated indices like i are summed over.) And if you really wanted to know the kinematics, you would want the fields, which are given by:
Fᵤᵥ = ∂ᵤ Aᵥ − ∂ᵥ Aᵤ
The space-to-time components of the field tensor F are the electric field, and the space-to-other-space components are the magnetic field; the minus sign guarantees that the space-to-same-space and time-to-time components are all zero.
Now, before this becomes a rant, here is the idea. Is there a way in which apply can look like eval, and eval can look like apply? If Einstein is a lesson, we might have to invent a means of programming where you look directly at invariants which you want to create.
The goal is that you should just specify an inhomogeneity, a little piece of data and some information on how it should be. The computer then constructs the invariants through some sort of "computational wave" -- and from this the computer can derive the actual field, the instructions needed to actually run the process on a computer.
There are some tantalizing suggestions and possibilities. The Meta II compiler (recently extended in JS as the OMeta project, which is definitely on my source-code reading list) works by realizing that a bunch of different stages in compilation are all just the same idea of pattern matching.
Perhaps one day we'll do what we do with photomultiplier tubes: we just visualize the voltages that we'd need to focus and accelerate electrons into the metal plates, and not even pay attention to the lower-level picture of how these fields are being generated to do the right thing.
This is brilliant and well explained - I saw where you were headed just before you got there. And had a little clap of delight to see my suspicions were correct.
However, don't you think the reason for Maxwell's success is that the objects his equations describe are well behaved and can be completely described by a set of partial differential equations? Whereas the constituent atoms (hehe) of software are themselves complex objects - more than just point particles whose state can be completely described by, say, a wave function and whose behavior is effectively solvable by simple deduction from physical postulates and the axioms of mathematics. I believe I saw a story about Feynman deriving a bunch of partial differential equations to predict the behavior of a program. But I do not think such an approach is scalable.
But the core of your idea, I believe, is that by specifying invariants in some theory which encapsulates the possible evolution of all possible programs, you can write programs far more simply? Essentially the next level of a synthesis of programming with unit tests, data, and automatic programming. I believe the main impediment to this is that any such general evolution function would be incomputable. And whereas with Maxwell's equations the space is limited and the boundaries defined, embodied, and easily traversed deductively, for a program the search space would be massive and difficult to search effectively. In effect, I believe you have just specified an AI-complete problem.
Notice also that even a classical theory like general relativity yields few analytic solutions, and numerical simulations are expensive. Perhaps this hints at why a unified theory is so difficult to come by? While the constituents are still simple particles, a theory/program which encompasses in totality all possible behaviors in nature might require a chain of reasoning so deep that it lies beyond the ken of unaugmented humans. I believe a unified theory of efficiently searchable programs would be even more difficult - such a theory would yield a unified theory of nature as a special case.
I remember seeing a program that would output Haskell functions given just their type signatures. If you think of type signatures as invariants, this is exactly what drostie described. Alas my Google-fu was unable to find a reference to it.
I have considered this road; see this LtU discussion for why this approach, though wonderful, is still too limited for general programming: http://lambda-the-ultimate.org/node/1178
Essentially, as is found in many automatic program derivation attempts, recursion explodes complexity.
Nice idea, but I don't believe that eval and apply are two sides of the same coin. In the article, apply is just a part of eval, which was originally called evalquote.
Cool article, but I am not a particular fan of using lousy metaphors from physics or mathematics as a way to give more credibility or importance to ideas (regardless of the intentions, this is the effect those metaphors end up having on some people) - the ideas should stand on their own. The meta-circular interpreter is certainly beautiful, but it is not clear to me that it stands among Computer Science's most important results (as Maxwell's equations do in physics), or why someone would count it among those. I am not saying there is no ground to it - maybe there is - but I would like to see some more justification for calling it the "Maxwell's equations of software".
It's a pity that there is no discussion of the fact that the two interpreters use different scoping: the first uses lexical scoping, and the second uses dynamic scoping.
It's also a pity that there is no discussion of the “label” special form, even though its implementation was right there in the code being discussed. This is a very fertile area for conversation.
For example, why is it that McCarthy was able to implement recursive procedures with “label” so simply without using any assignment or mutation?
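For reference, a use of label looks something like this (in modern s-expression notation, roughly McCarthy's own subst example; not directly runnable in a modern Scheme, which spells atom and eq differently):

  ;; label binds the name subst to the whole lambda expression within its
  ;; own body, so the function can call itself -- no assignment needed.
  ((label subst
          (lambda (x y z)
            (cond ((atom z) (cond ((eq z y) x)
                                  (t z)))
                  (t (cons (subst x y (car z))
                           (subst x y (cdr z)))))))
   'a 'b '(b c b))
  ;; => (a c a)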