The Kernel Programming Language (2010) (wpi.edu)
62 points by sillysaurusx on Feb 17, 2021 | 26 comments



I was obsessed with this for a little while several years ago and ended up making my own version based on the same paper: https://github.com/breckinloggins/vau/blob/master/README.md

It was my second try at a lisp, the first being erlisp (https://github.com/breckinloggins/erlisp).

Neither of these projects really went anywhere, but they really scratched an itch I had and I learned so much about programming language implementation while working on them.

If you haven't tried yet I highly recommend writing a little lisp of your own. It doesn't need to take over the world to be a meaningful project in your life.


Indeed, Kernel is to Scheme as Scheme is to Common Lisp. Kernel really is a truly beautiful little idea.


I played around with erlisp and LFE for a little while. I didn't write anything substantial but it got me interested in Common Lisp.


Would you recommend Common Lisp as the lisp dialect for a new beginner and for an eventual hobby project? Do you recommend SBCL over other Common Lisp implementations?


(not the parent poster)

It depends on the nature of your project. If it's some purely academic CS exploration, Scheme sometimes (but not always) allows you to express the concepts more eloquently. If it has to deal with practicalities of the broader world however, CL quickly takes over. The development experience is generally smoother in CL too.

If you run your things on a major architecture (x86 or aarch64) SBCL is a great starting point. However on certain platforms other compilers are more feature complete.


However, most of what you write in one lisp can be ported over to another lisp without too much hassle.


My current side project is an implementation of a Clojure-like language greatly inspired by and built similarly to Kernel.

Absolutely everything user accessible is a first class immutable closure that takes two objects, for arguments and calling environment.

If your object has been sent a message, you need to inspect it - by sending the message a message! Then maybe the message messages the environment to evaluate your message, which...

The turtles all the way down dependency is handled by equality testing being short circuited in the case of a pointer compare. If you (msg* my-list [:first]), my-list will message [:first] its own [:first] to inspect it. But if you send it the interned [:first] message, it'll recognize it implicitly.
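The interning short-circuit described above can be sketched in Python. This is illustrative, not the project's actual code: the intern table and the `equal` fallback are stand-ins for the real message-passing machinery.

```python
# Sketch of the interning short-circuit: structural equality would
# normally recurse (messaging the message), but interned values compare
# by pointer identity first, breaking the turtles-all-the-way-down loop.
_interned = {}

def intern(value):
    # Return the canonical object for this value; equal values share it.
    return _interned.setdefault(value, value)

def equal(a, b):
    if a is b:            # pointer compare: interned fast path
        return True
    return a == b         # otherwise, fall back to structural comparison

m1 = intern((":first",))
m2 = intern((":first",))
print(m1 is m2, equal(m1, m2))   # True True
```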

This is, as you can imagine, absolutely horrendously slow - as the Kernel inventor found their own implementation to be. The core.clj file [1], which basically implements fn, destructuring let and cond, takes twenty seconds and 88 million function calls to execute.

However, the simplicity of evaluation (there aren't even local variables built in) and pervasive immutability means you get partial evaluation almost for free. Running in partial mode, eval evaluates what expressions it can and returns simplified code containing the unbound variables.

E.g. where x is unbound and y is 2, (+ (+ 1 y) x) => (+ 3 x)

You can partially evaluate multiple times as more arguments are known. In theory this allows you to elide all of the boilerplate when compiling. I'm working on the compiler now.

You can also cache whatever you want by writing it to disk, even if it closes over its environment. Every result is written and read to enforce that this works.

The initial objects are written in a bootstrap language (js). The goal is to keep that under 1k lines, although it could probably be half that. Then Clojure-like is bootstrapped, although not cleanly. Then the compiler is written in Clojure, which lets you JIT to js and get sane performance. I also have a C version using tcc, but it's harder to develop and significantly lags the js version.

It's been pretty interesting, although I struggle to see any practical uses. Writing code in a language with the clean mental model of something like the SICP metacircular evaluator is a joy.

[1] https://gist.github.com/reitzensteinm/d95d13609f6d6fd9629133...


I didn't realize that Kernel ends up being necessarily very slow. Can you give a quick explanation of why?


A naive implementation has to do a ridiculous number of scope lookups, even before you get to anything else.

Also instead of compile time macros, one has runtime fexprs, which means the equivalent of macro expansion happens every invocation.

However, one can have kernel style semantics with at least moderately reasonable performance by basically applying a combination of constant folding and expansion caching.

It's a bunch of extra effort, and non-trivial to make work, but I'm currently experimenting myself with something where I'm willing to do that.

A key part is to be able to recognise fexprs that are 'really' just macros and pre-expand them and cache the expansion - without at least that you're screwed.
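A rough sketch of that expansion caching in Python (the `CallSite` structure and the notion of a "macro-like" operative are illustrative assumptions, not from any real implementation):

```python
# Sketch of expansion caching for macro-like fexprs: if an operative's
# output depends only on the shape of its operands (never on runtime
# values from the dynamic environment), its expansion can be computed
# once per call site and reused on every subsequent invocation.
class CallSite:
    def __init__(self, operands):
        self.operands = operands
        self.expansion = None        # filled in on first evaluation

def expand_once(site, expander):
    if site.expansion is None:
        site.expansion = expander(site.operands)   # runs only once
    return site.expansion

calls = {"count": 0}

def and_macro(operands):             # a fexpr that is "really" a macro
    calls["count"] += 1
    return ("if", operands[0], operands[1], False)

site = CallSite(("a", "b"))
for _ in range(1000):                # re-entering the same call site
    expand_once(site, and_macro)
print(calls["count"])                # 1 -- expanded once, not 1000 times
```

The hard part, as noted, is proving an operative is safe to treat this way; a fexpr that inspects its dynamic environment cannot be pre-expanded.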

I'm also experimenting with what else I can cache given the principle that everything is a lexical environment.

I'm rather enjoying myself ... but understand the risks.


Because everything is a fexpr, with lambda built on top of it, when you call a function you're not passing it values, you're passing it code and a first class environment to use to execute it.

Further, because macro like features are first class, they're executed each time they're used as opposed to generating code.

I believe there are decent implementations but I bet they put a fair amount of work in to making them fast. The naive reference implementation is extremely slow.

While my toy project is a bit different and probably even more wasteful (with a list calling convention and immutability), it faces similar challenges. Take a read through core.clj and try to spot where the 88 million function calls are spent.


Yah, my kernel-like thing ... oh gods the number of reductions.

lambda (wrap) basically has to be VM provided code, and that's just the start.

But for sheer flexibility it's beautiful and I don't feel like giving up yet :)


klisp is quite efficient (though still slow compared to other Lisps).

Bronze-age-lisp optimizes some of the most critical parts in assembly language. This is probably the fastest Kernel you will find. Some details on the design were published in 4 posts to this group: https://groups.google.com/g/klisp


I remember reading this thesis and implementing just such a Lisp-like language. The separation of lambda into vau and wrap was really mind-bending at the time, but I was really interested in the fewest primitives required to bootstrap a whole language.


Can anyone explain this in more layman terms?


I'll try:

In most languages, if you call a function f(x), the value x will be implicitly reduced before the resulting value is passed as an argument to the body of the function f. Most languages don't offer any alternative, although Lisp does offer alternatives, which are to quote x, or to implement f as a macro which is expanded at compile time.

In Kernel, the fundamental means of combination is an operative, which does not reduce its operands. It passes the operands verbatim, as they appear in code. For example, in $f(1 + 2), the body of $f does not receive the value 3. It receives the expression "1 + 2" (as a list). The operative also receives a reference to the current dynamic environment of the caller of $f, which can be used to evaluate (1 + 2) explicitly if needed, or this environment may be mutated by $f (though this mutation is limited to the local scope of the caller and none of its parents).

One motivating example for operatives is the || and && operators in other languages. These operators both have short-circuiting behaviour if the left-hand side evaluates to true/false respectively. If these were implemented as functions, then both the LHS and RHS would be reduced before the operator itself is called, which is not what we want. In other lisps, these operators are special forms, which the compiler is aware of. However, as special forms, they are second class citizens of the language. You cannot assign them to variables. They must appear in their own name when you wish to use them.

The `$and?` operative in Kernel is implemented in the following way in the standard library (it isn't a language primitive). This definition supports an arbitrary number of operands too - it isn't just a binary operator.

    ($define! $and?
        ($vau x e
            ($cond ((null? x)         #t)
                   ((null? (cdr x))   (eval (car x) e))
                   ((eval (car x) e)  (apply (wrap $and?) (cdr x) e))
                   (#t                #f))))
So what of regular functions? In Kernel, they are referred to as applicatives. They are constructed using the primitive `wrap` (which is itself applicative). This takes a single argument which must be another combiner (A combiner is either an operative or an applicative). Usually the argument to wrap is operative because the use-case for doubly-wrapped combiners is limited. Operatives and applicatives are disjoint types, known by the runtime. (All types in Kernel are disjoint and there is no subtyping). An applicative causes the operands of a combination to be reduced before passing the resulting arguments to the underlying combiner of the applicative.
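The short-circuit behaviour that an operative like $and? gets "for free" can be sketched in Python, using zero-argument thunks to stand in for unevaluated operand expressions (all names here are illustrative; real Kernel operands are expression trees, not thunks):

```python
# An operative receives its operands unevaluated, so it alone decides
# which of them ever get evaluated. Thunks model "code not yet run".
def evaluate(expr, env):
    # stand-in for Kernel's eval: force a thunk, pass values through
    return expr() if callable(expr) else expr

def and_(operands, env):             # models the $and? operative
    for o in operands:
        if not evaluate(o, env):
            return False             # later operands are never evaluated
    return True

hits = []
def boom():                          # records a side effect if evaluated
    hits.append("boom")
    return True

print(and_((False, boom), {}))       # False
print(hits)                          # [] -- boom was never evaluated
```

A function (applicative) version of `and_` would have forced `boom()` before the call even began; the operative version never touches it.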


> Most languages don't offer any alternative, although Lisp does offer alternatives, which are to quote x, or to implement f as a macro which is expanded at compile time.

Mathematica allows you to pass symbolic expressions in this way too, no?


For more detail on how this works, I'll try to describe how it is implemented.

When the evaluator sees a combination such as (f (x y)), it evaluates f and checks if the result is a combiner (and terminates if not). It then tests if f is operative, and if so, it passes the operand list (x y) to the body of f, along with the current environment. If f is applicative, it evaluates each item in the operand list (by recursively calling the evaluator on each individually) to produce a list of arguments. The evaluator then conses the underlying combiner of the applicative with the argument list, and recursively calls the evaluator with the result. Something like this:

    ($define! eval
      ($lambda (object env)
        ($if (not? (environment? env))
          (error "Second argument to eval must be an environment")
          ($cond ((symbol? object) (lookup env object))
                 ((pair? object)
                     ($let ((comb (eval (car object) env)))
                         ($cond ((operative? comb) (call comb (cdr object) env))
                                ((applicative? comb)
                                   ($let ((app (unwrap comb))
                                          (args (eval-list (cdr object) env)))
                                       (eval (cons app args) env)))
                                (#t (error "The first item of a combination must be a combiner")))))

                 ;; If object is neither symbol nor pair, it must be a self-evaluating term.
                 (#t object)))))
The primitives used here, which I'll leave out for brevity:

    (lookup env symbol) ; returns the expression mapped to the symbol in the dynamic environment env.
    (call operative operand env) ; calls operative with operands and the dynamic environment.
    (unwrap applicative) ; returns the underlying combiner of the applicative.
    (eval-list objs env) ; evaluate each item in the list objs using the dynamic environment, in no specific order.
You can see the simplicity of this interpreter. It does not need special consideration for operators like || and &&, or any other special form. The only special forms are the primitives themselves, which are all operatives. The behaviour of `call` depends on whether or not an operative is primitive or not. Primitives are implementation specific as they're part of the interpreter itself.

Operatives which are not primitive are called compound operatives, and are constructed using $vau, as in the above example for the $and? operative. It has the form:

    ($vau formal-parameter-tree  symbol-for-dynamic-env  operative-body)
You can consider a compound operative to merely be a structure (or 4-tuple) which stores its 3 arguments plus the environment which is passed to `call` when the `$vau` operative itself is evaluated. (The static environment of the operative)

When the compound operative is later called, it constructs a new environment with its static environment as the parent, called the local-environment. It then maps, in the local-environment, each symbol from its formal parameter tree to the respective operand in the operand list passed to it (the structures of the operands must match the parameter tree). It also maps the symbol-for-dynamic-environment to the environment passed to it by eval. Call then simply evaluates the body of the combiner with `local-environment` as the environment. E.g.:

    ($define! call-compound-operative 
      ($lambda (operative operands env)
        ($let ((local-env (make-environment (get-static-env operative))))
            (set-parameters! local-env (get-parameter-tree operative) operands)
            (set! local-env (get-symbol-for-dynamic-env operative) env)
            (eval (get-body operative) local-env))))


A key part of the design is the way environments are implemented. An environment in Kernel consists of two parts: A local map of bindings from symbols to expressions, and an ordered list of references to parent environments. Importantly, the references to parent environments may not be accessed directly, and the parent list is immutable.

When a symbol is looked up in an environment, it first checks the local bindings, and if not found, it will then look through the bindings of the parents, in order. This forms a depth-first search through a potential DAG of environments, with the current environment at the root.

The constructor (make-environment) takes an optional list of parents. The order specified is the order in which they will be searched when a lookup occurs.

Mutation is only possible for the local bindings of the root environment. You cannot mutate the bindings of a parent environment. This is one of the key improvements over fexprs, which would be able to mutate anything in the dynamic environment. Since it is only possible to mutate the local bindings, and you can only do so if given a direct reference to an environment, this places a constraint on the potential side-effects an operative could perform. An operative can only mutate the local bindings of its caller, unless explicitly given reference to some other environment.
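The environment structure just described can be sketched in Python (an illustrative model; real Kernel environments are encapsulated objects whose parent references are genuinely inaccessible):

```python
# Kernel-style environment: a mutable local map of bindings plus an
# immutable, ordered tuple of parents searched depth-first on lookup.
class Environment:
    def __init__(self, *parents):
        self._bindings = {}
        self._parents = tuple(parents)   # immutable parent list

    def define(self, symbol, value):
        # Mutation touches the local bindings only, never a parent.
        self._bindings[symbol] = value

    def lookup(self, symbol):
        if symbol in self._bindings:
            return self._bindings[symbol]
        for parent in self._parents:     # depth-first, in parent order
            try:
                return parent.lookup(symbol)
            except KeyError:
                pass
        raise KeyError(symbol)

ground = Environment()
ground.define("x", 1)
local = Environment(ground)
local.define("x", 2)                     # shadows ground's x; never mutates it
print(local.lookup("x"), ground.lookup("x"))   # 2 1
```

Because `define` can only reach the local map, handing someone an environment grants them at most the power to add or shadow bindings there, which is exactly the constraint on operative side effects described above.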


Last part - though if you have any questions I will expand.

I mentioned all types in Kernel are disjoint, but didn't mention that they are encapsulated. All primitive types are encapsulated, but you can also create your own encapsulations. There are no bolted-on object or record libraries like in Lisp or Scheme to achieve this: just one simple primitive, (make-encapsulation-type), which takes no arguments.

The primitive returns a 3-tuple consisting of a constructor, a predicate to test if an object is of the encapsulated type, and an elimination function to access the value with which it was constructed. Generally, you would expose the predicate by binding it to a symbol in an accessible environment, but you would not reveal the constructor and eliminator - you would use them only inside the static environment where defining your encapsulation.
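A Python sketch of that primitive (illustrative; the fresh-class-per-call trick is an assumption about how to model disjointness, not Kernel's actual mechanism):

```python
# make-encapsulation-type: returns a constructor, a type predicate, and
# an eliminator. Each call creates a fresh private type, so only holders
# of this particular triple can construct or open its values.
def make_encapsulation_type():
    class Box:                        # fresh, anonymous type per call
        __slots__ = ("_value",)
        def __init__(self, value):
            self._value = value
    construct = Box
    predicate = lambda obj: isinstance(obj, Box)
    eliminate = lambda obj: obj._value
    return construct, predicate, eliminate

make_point, point_p, open_point = make_encapsulation_type()
p = make_point((3, 4))
print(point_p(p), open_point(p))      # True (3, 4)

# A second call yields a disjoint type: its predicate rejects p.
_, other_p, _ = make_encapsulation_type()
print(other_p(p))                     # False
```

Exposing only `point_p` while keeping `make_point` and `open_point` in a private scope is precisely the $provide! pattern shown below for the compound-operative type.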

To give an example, let's use the compound operative type I referred to above - which is essentially a 4-tuple. I used the functions `get-static-env`, `get-parameter-tree` etc without defining them earlier. I will define them here.

    ($provide!
      ($vau
       get-static-env
       get-parameter-tree
       get-symbol-for-dynamic-env
       get-body
       operative?)

        ($define! (make-operative operative? eliminate-operative) (make-encapsulation-type))

        ($define! $vau
            ($vau-primitive (formals symbol body) static-env
                (make-operative (list formals symbol body static-env))))

        ($define! get-static-env
            ($lambda (operative)
                ($let (((#ignore #ignore #ignore static-env) (eliminate-operative operative)))
                    static-env)))

        ($define! get-parameter-tree
            ($lambda (operative)
                ($let (((formals #ignore #ignore #ignore) (eliminate-operative operative)))
                    formals)))

        ($define! get-symbol-for-dynamic-env
            ($lambda (operative)
                ($let (((#ignore symbol #ignore #ignore) (eliminate-operative operative)))
                    symbol)))

        ($define! get-body
            ($lambda (operative)
                ($let (((#ignore #ignore body #ignore) (eliminate-operative operative)))
                    body))))
$provide! here binds the list of symbols in its first operand into the dynamic environment where $provide! was called. The remaining operands to $provide! mutate the local environment of $provide! but not the dynamic environment from which $provide! was called. This means that the definitions of 'make-operative' and 'eliminate-operative' are only accessible from within the local scope of $provide!, and are no longer accessible directly once $provide! has been evaluated. Note that $provide! itself is not a language primitive, but part of the standard library: it is a compound operative implemented using $define!

The user of this 'library' can only use the 6 combiners provided at the top. The details of how operatives are implemented (as a 4-tuple) are hidden from any use of these combiners.

This clever way in which environments, encapsulations and operatives work together is truly a gem. The decision to forbid mutating parent environments, and the ability to encapsulate objects fixes the problems of old fexprs, which were very unintuitive to work with.


This is excellent. Thanks for putting in the time of writing it up!


Most languages have subroutine/function calls that receive evaluated arguments and special constructs that don't. For example, in `sin(x)` the sin() function receives the value of x, but `if (x) { print(1); } else { print(2); }` doesn't evaluate both arms.

Even homoiconic languages like various Lisps have so-called special forms which treat their arguments differently from regular functions. Kernel makes everything a special form, so `(sin x)` actually receives a symbol x and a way to evaluate it in the context of its caller, which it immediately does.

It's not of much practical use, but it's a very beautiful design, reducing the number of concepts required to describe the semantics of a language even further.


I hope you mean it's not of much practical use to languages that already support macro definitions :) because otherwise, some would consider it very useful to be able to define arbitrary macros/syntactic forms!


Kernel's operatives aren't syntactic forms. They're first-class expressions which are evaluated at runtime. The behavior is quite different from macros and provides greater abstractive power.

IMO, Kernel is the ideal playground for testing out language designs and features. Its performance limits it from being a practical language for deploying applications.


Most popular languages don't let you define custom syntactic forms. C and C++ have a preprocessor, Scala has macros, and you can squint and pretend that passing blocks in Ruby counts, but that's it. It's not a killer feature for the vast majority of programs; Go's boneheaded straightforwardness is a good counterexample.


Isn't this a terrible name for a computer language? Never mind that I have to search "Ruby programming" or "Python programming" - "Kernel" is not even the top result for "Kernel language".

Naming things, indeed the hardest thing in CS.


"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html

Threads are super sensitive to initial conditions, so it's particularly important not to do this when the thread is fresh.





