OK, you asked! LISP was originally developed as a language for writing recursive functions of symbolic expressions (S-expressions). It was roughly based on lambda calculus, the language developed by Alonzo Church. But roughly is the key word. S-expressions are given by the context-free grammar:
S ::= A | [] | (S . S)
One can write data structures this way and lists by using the abbreviation:
The terms that manipulated these S-expressions were called M-expressions. They were first-order terms. McCarthy's key idea was to use a conditional in conjunction with a label form to define recursive functions (in the service of various AI applications). The M-expressions were defined roughly as follows:
M ::= S | x | if[M; M; M] | f[M; ...; M]
f ::= lambda[[x1; ...; xn]; M] | label[g; M]
(I'm not 100% confident that I remember the exact details on this syntax but the idea is correct!)
McCarthy wanted to show that his new language was Turing-complete. So he wanted to exhibit a universal function APPLY (derivable from f above) such that for any function f and arguments M1; ...; Mk such that f[M1; ...; Mk] evaluates to S-expression S, well, given a representation of f[M1;...;Mk], lets call it, hat(f[M1;...;Mk]), well
APPLY[hat(f); hat(M1);...;hat(Mk)] would evaluate to hat(S). This is pretty much a standard formulation of the recursion-theoretic argument. In order to close the sale, McCarthy had to exhibit such an APPLY and also the hat(.) function.
Sadly for all of us LISP lovers (!) he botched the definition of hat(.)! It left people utterly confused for 30+ years. Such a shame.
Fixed a couple of typos. Sorry! Fixed one other typo! It's been a while...
Fair enough, I felt I was droning on but it's true that I didn't show the key mistake. Here it is.
If you want to represent an arbitrary M-expression as an M-expression, it's most natural to use S-expressions for the representation language, these are the -values- in M-expression LISP. (In lambda calculus we have more choices, normal-forms or weak-head normal-forms). McCarthy defined hat(.), naturally enough, by induction on the structure of M-expressions. For each M-expression, we need an S-expression representation. (Note that we use uppercase symbols for the symbolic constants and lowercase symbols for identifiers.) Here goes:
But HOLD ON! The S-expressions have inductive structure(!). The definition of hat(S) should have been:
hat(A) == (SYM A)
hat(()) == (NIL)
hat((S1 . S2)) == (PAIR hat(S1) hat(S2))
There are sensible mathematical properties that this latter representation has that the former doesn't. It's a bit of a long story. But the bottom line is that QUOTE, was defined erroneously. (And John McCarthy burst out laughing when I explained it to him.)
I've written a Lisp compiler that handles quotations this way, so that QUOTE is not part of the target language. (Not that this was my own idea.) I can sort of see why you'd call McCarthy's s-expression Lisp a mistake, in that it adds something (QUOTE) not in the original M-expression Lisp, which you can do without. It still seems to me like a strange thing to emphasize, but OK.
BTW, the M-expression syntax for IF was "test -> consequent; alternative" which got encoded as a COND expression. (Doesn't matter, I'm just pointing it out as long as I'm commenting.)
Thank you for reminding me, McCarthy also invented COND which eventually led to the great modern pattern matching forms.
An ironic side-story of this that may or may not be of interest:
Because QUOTE was mis-defined, McCarthy had to hack his definition of APPLY/EVAL to get it to work. One consequence of this hacking was that the S-expression LISP "defined" by his version of APPLY/EVAL was a higher-order language while the M-expression LISP that he was attempting to model was strictly first-order. So in his S-expression LISP he could write the MAP function (called "mapcar" back in the day) but the syntax of M-expressions leaves no way to express MAP.
I find it so ironic that it took this little representation error to lead to LISP having the essential property of lambda calculus. (Guy Steele fixed most of the trouble with the grammar and introduced proper lexical scoping in Scheme but he didn't catch the quote bug.) It's also fair to say that M-expression LISP wouldn't have changed the world as S-expression LISP did.
I don't know if Paul Graham reads HN but Paul once wrote a book on macros in LISP. As far as I know, he doesn't know this story about QUOTE. It doesn't seem to have slowed him down.