Mezzano – An operating system written in Common Lisp (github.com/froggey)
213 points by arm on Aug 25, 2016 | 63 comments



Very impressive.

Since the demise of Lisp machines, there have been several attempts at developing a Lisp-based operating system which didn't deliver anything: LispOS, Tunes, and Loper; and one successful attempt at getting Lisp to run on the bare metal: Movitz.

Lisp was its own operating system on the original Lisp machines developed at MIT, and this evolved into Genera at Symbolics. Is this the case here?

I notice that Mezzano has files. Files are inherited from operating systems running non image-based languages. They're alien to Smalltalk, where everything is just stored in the image (or world, in Lisp parlance). This could also be done in Lisp. Text files could be replaced by long strings contained in the world. Hypertext, provided tags are balanced, could be stored as lists. When the system is shut down, and periodically beforehand, the world is written to disk.

You can go further, by using a single address space. Particularly large blocks of text which don't fit into RAM, and so have to be stored on disk, can be addressed by treating the disk as an extension of RAM. You can even go further on 64 bit machines by treating the entire contents of the internet as an extension of RAM.

I once wrote a small bare-metal Lisp interpreter in x86 assembler, which ran off a 1.4Mb floppy on an old laptop. The hardest part was writing the floppy disk driver, which has to handle frequent hardware errors. USB drivers are many times harder and I didn't attempt that. If the Mezzano developer(s) succeeded, I'm even more impressed.


I think files have proven themselves to be a useful abstraction in enough use cases across enough device form-factors to be worth backporting into whatever your idea of an ideal lisp machine might be.

You could, with a non-trivial amount of effort, replicate the file-like conveniences of global tagging and sorting and organizing of all the objects in your image. You could, also with a non-trivial amount of effort, work out the schemes for permissions, etc. so that objects with all those file-like conveniences can be shared, like files, on multi-user systems or between machines or over networks.

It's not clear to me that the above would offer any tangible benefit over files. So if you're going to put a non-trivial amount of effort into a lisp machine, why not just teach it what a file is?


> I think files have proven themselves

You will need to store data which doesn't fit into RAM in secondary storage. But that doesn't mean you need a file system or even files.

> to be a useful abstraction

Files are a necessary evil in non-image based systems because you need to store data somewhere when the programs using them aren't running. As the different objects they contain, such as plain text, hypertext, photographs, sound recordings, executables, etc. have nothing in common, they seem an unnecessary abstraction. They require that programs which use them parse/serialize their contents. This is unnecessary if the contents are already in memory, already in the format the program needs.

> replicate the file-like conveniences of global tagging and sorting and organizing of all the objects in your image

Why remember a file name and where it is in the directory tree when you could use a search engine to search for it based on content? Or simply chain through objects, going to the field you want and following the link? Programs, of course, will just directly link to the object.

> work out the schemes for permissions

I'd go for capabilities, rather than access control.

> why not just teach it what a file is?

Building a file system is a major undertaking. If it can be avoided, and to the extent it can be avoided, it should be. You would only need to know about files when you interact with systems which are based around files.


I think files are pretty cool because they're loosely coupled with programs. I.e. you can open and edit text or audio files in whatever you like, and even highly-specific binary formats like .doc or .psd have various levels of support in 3rd-party tools. This serves a very useful purpose of ensuring your data is truly yours, and will outlive the program in which it was created. This is a place where IMO we're taking a huge step backwards now, with increasing amount of work being done in the cloud and mobile ecosystems - both shed the concept of files for some amorphous database entries somewhere, thus taking away your control over your data.

Can you preserve this flexibility/loose-coupling feature in image-based storage? I don't know. I'd be interested to learn if it can be done so.


You can do the same with objects. The difference is that the semantics are more flexible, with an implementation that can be as simple or complex as you want. INFOSEC research used this to advantage: things like PSOS built files on top of a simpler store, and things like DASD could build in analysis or encryption at the disk/object level.


BBN Lisp and its successor Interlisp, popular in the 1960s and 70s, had a persistent data store that was more closely integrated with the running Lisp image than typical general-purpose file systems (slide 41 in [1])

The proposed (but apparently un-implemented) LispOS [2] was to have a single-level store where objects in the running Lisp image would be transparently checkpointed to disk, with no conventional file system.

[1] http://www.international-lisp-conference.org/2005/media/bake...

[2] https://github.com/robert-strandh/LispOS/blob/master/Documen..., also chap-checkpointing.tex


Being file based is very much in MIT Lisp DNA, certainly MACLISP and the CADR MIT Lisp Machine. (Re-)Building from scratch from files and saving out an image for faster loading times (or resaving an image with your project's stable files), vs. the Smalltalk working on an image once you get one going.

The single address space OS was well known from Multics, but I think it would have been too much additional work for the developers, and running such a system with the much less reliable hardware of those days required a lot of tape backup. E.g. MIT-Multics would back up any dirty segment (file) after ~1/2 an hour, which of course required 24x7 operator staffing. That sort of Information Utility thing wasn't really in this group's DNA.

The Incompatible Timesharing System (ITS) was in fact named in humorous opposition to the Multics predecessor, the Compatible Time-Sharing System (CTSS). The latter was compatible with batch: those jobs would soak up any spare cycles if possible, which was a big deal back then. ITS instead had features where you could pretty much take over the machine for a robotics experiment or demo. That sort of resource allocation was handled socially, and such features were hidden by obscurity and a sort of apprenticeship system, instead of being enforced by the OS.


One of my favorite things to learn when I come across any experimental OS is: "why?" Redox wants to do microkernel architecture The Right Way, and wants to use a guaranteed memory-and-type safe language (Rust). ReactOS wants to create a drop-in replacement for Windows that not only supports older Windows programs, but also Windows-compatible device drivers. MenuetOS wants to build something approximating the OS experience most users are used to, but on bare metal in assembly.

So: why Mezzano?


Basically, the point of lisp OSes, is that they allow you to seamlessly write code at all layers, from down into the darkest bowels of the system, thru and up to the highest level application scripting, all in a single integrated language and libraries and frameworks.

You don't have to reboot a machine just because you make a patch to the kernel, and at the same time, you can patch the system using the same high level development tools as you would use to develop an application.

The thing is that there's what we call an impedance mismatch between a unix kernel and applications running on it, in that the data types processed by applications, which are of higher level, don't match the data types processed by the unix kernels. For example, a Ruby Integer is actually a bignum, and you cannot pass a bignum to the unix kernel: you have to provide a 32-bit or 64-bit bit field. This is work, and this is pain. It is already pain when you write your applications in C or C++ where assumedly you already have bit field types, because your application could be compiled to run on kernels and processors using different word widths! Imagine the pain it is when you write your applications in Common Lisp with very high level data types (ratios, closures, pathnames!), and when you have to map those data objects onto the lame bits accepted by unix syscalls. Consider all the literature written about logging, the byte vector logging provided by unix syslog vs. high level object logging provided by higher level libraries or other systems.
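To make the bignum point concrete, here is a minimal sketch in Python (chosen only because its integers are arbitrary precision, like the bignums described above). The helper name `to_u64_field` and the little-endian 8-byte layout are assumptions for illustration, not any particular kernel's ABI.

```python
import struct


def to_u64_field(n: int) -> bytes:
    """Pack an arbitrary-precision int into a fixed 8-byte little-endian field.

    Hypothetical helper: stands in for the marshaling an application has to do
    before handing a number to a fixed-width, syscall-style interface.
    """
    if not 0 <= n < 2**64:
        # The high-level integer has no representation in the fixed-width field.
        raise OverflowError(f"{n} does not fit in an unsigned 64-bit field")
    return struct.pack("<Q", n)


# A small value fits and becomes eight raw bytes.
small = to_u64_field(4096)

# A bignum is a perfectly ordinary integer in the language,
# but it cannot cross the fixed-width boundary as-is.
try:
    to_u64_field(2**80)
    fits = True
except OverflowError:
    fits = False
```

The range check and the packing are exactly the kind of glue the comment calls work and pain: the application must decide what to do whenever the high-level value has no fixed-width equivalent.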

And of course, this is not limited to the interface between applications and systems, but goes beyond to the interface between applications. When the system provides a pipe abstraction where all you can pass between applications are streams of bytes, this leads to a lot of suffering. You have to serialize and deserialize your data, you have to consider formats (a java float doesn't have the same syntax as a Common Lisp float!), encoding (utf-8? iso-8859-1, -15?). Have you ever been advised to never parse the output of /bin/ls? For good reasons!

Instead, if your system is written in Lisp, then you can directly pass lisp objects from one lisp application to another lisp application, and there's no need to serialize/deserialize, to parse or otherwise mangle the data: you just have lisp objects and you can use them directly. You can pass closures (which enclose the lisp object data along with the lisp functions needed to process them).
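A rough illustration of the contrast, sketched in Python rather than Lisp. Everything here is hypothetical: the two "applications" are just functions in one process, and JSON stands in for whatever wire format a byte pipe would force on you.

```python
import json
from fractions import Fraction


def make_scaler(factor):
    # A closure: it carries `factor` together with the code that uses it.
    def scale(x):
        return x * factor
    return scale


# Byte-stream style: the producer must flatten the ratio into bytes,
# and the consumer must know the format and encoding to rebuild it.
def producer_bytes(value: Fraction) -> bytes:
    fields = {"num": value.numerator, "den": value.denominator}
    return json.dumps(fields).encode("utf-8")


def consumer_bytes(raw: bytes) -> Fraction:
    fields = json.loads(raw.decode("utf-8"))
    return Fraction(fields["num"], fields["den"])


# Shared-image style: the consumer receives the object itself,
# plus a closure describing what to do with it. Nothing is parsed.
def consumer_direct(value, transform):
    return transform(value)


third = Fraction(1, 3)
round_tripped = consumer_bytes(producer_bytes(third))
scaled = consumer_direct(third, make_scaler(3))
```

Note that the closure has no obvious byte encoding at all, which is the part a pipe cannot express without shipping code as text and re-evaluating it on the other side.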


> Instead, if your system is written in Lisp, then you can directly pass lisp objects from one lisp application to another lisp application, and there's no need to serialize/deserialize, to parse or otherwise mangle the data: you just have lisp objects and you can use them directly. You can pass closures (which enclose the lisp object data along with the lisp functions needed to process them).

So, if I understand correctly, the memory block in which a particular object is stored doesn't belong a priori to this or that process, but rather you can simply hand down objects to other processes? This is very cool! I've been thinking for quite a while that the very idea of memory protection is just a lame workaround to deal with the fact C is memory-unsafe.


From what I can tell looking at the code, Mezzano doesn't have processes, just threads. There's no distinction between the kernel and user-space.

Whilst convenient, it isn't an unqualified good. Bugs in any part of the system have the potential to catastrophically break everything. It's also terrible for security. Without the notion of a distinct kernel - or "unsafe" code that only privileged users can compile or load - there's nothing to stop user code reading or writing directly to/from IO ports or arbitrary memory locations.

Being able to pass closures isn't something that's particular to Lisp, nor does it require a lack of memory protection. It just requires that the calling convention can denote a closure as such, and can arrange for it to be invoked in the right security context - i.e. the one in which it was created.

Another issue with having a single global environment backing everything is that you can't easily experiment with changing built-in Lisp functionality without immediately crashing the system.

This sort of thing is probably perfectly fine for embedded systems, and "micro-service" VMs, though. The benefits may well outweigh the risks then. And of course it's great as a starting point for experimentation.


> From what I can tell looking at the code, Mezzano doesn't have processes, just threads.

I'd rather there were no distinction between processes and threads. It's artificial. I want to live in a world where processes cooperate to produce useful results for the user, rather than assume other processes are out there to corrupt their data. If there's any memory protection, I'd rather it be a compile-time, rather than runtime check.

> Bugs in any part of the system have the potential to catastrophically break everything. It's also terrible for security.

Proof-carrying code seems like a better solution than memory protection.

> Another issue with having a single global environment backing everything is that you can't easily experiment with changing built-in Lisp functionality without immediately crashing the system.

To be honest, I'm not so interested in “experimentation”. (I'm culturally not a Lisper. I like thinking and getting things right before I write code.) I just think concurrency would be a lot simpler if you could pass objects directly between programs written by different people.


This seems like a well-articulated point, written with a civil tone. Rather than downvoting, perhaps people could make a counter-argument? The approach of "thinking and getting things right" before writing code might not be common with this audience, but there are fields where it's the only way to work.

I think the idea of compile-time safety proofs is interesting. The number of runtime cycles spent on policy enforcement could be greatly reduced. Would it ever be possible, though, to derive such proofs for programs written in "unsafe" languages, or for arbitrary binaries?

An OS which only supports one language would be unlikely to be generally useful, although it might provide sufficiently compelling benefits for specialized use cases. I believe MirageOS represents an OCaml implementation of this idea, although it may in fact support other languages.


> It's also terrible for security.

The security angle is interesting.

The thing about independent processes is that we're relying on the OS to provide a wall around each process to limit its impact by default. Maybe a better approach would be to include something like chroot/containerization for execution of closures within the language? I.e., if program 'A' receives data from program 'B', it can evaluate that data in a sandbox with limited permissions (IO limitations like no networking, or CPU resource limitations and time limits) and receive exceptions if the closure tries to exceed its permissions. That could fit in pretty cleanly with Common Lisp's condition system.
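A sketch of the control flow this would give you, in Python. It only models the idea: `Sandbox`, `SandboxViolation`, `run_sandboxed` and the permission names are all made up for illustration, and plain Python provides no real isolation (the closure could simply ignore the `box` argument), which is exactly the part the language runtime or OS would have to enforce.

```python
class SandboxViolation(Exception):
    """Raised when sandboxed code asks for something it was not granted."""


class Sandbox:
    def __init__(self, allowed, max_steps):
        self.allowed = frozenset(allowed)  # permissions granted by the caller
        self.steps_left = max_steps        # crude stand-in for a CPU/time limit

    def request(self, permission):
        # Every privileged operation is assumed to go through this check.
        if self.steps_left <= 0:
            raise SandboxViolation("step budget exhausted")
        self.steps_left -= 1
        if permission not in self.allowed:
            raise SandboxViolation(f"permission denied: {permission}")
        return True


def run_sandboxed(closure, allowed=(), max_steps=100):
    """Run a received closure; report violations instead of crashing the caller."""
    box = Sandbox(allowed, max_steps)
    try:
        return ("ok", closure(box))
    except SandboxViolation as exc:
        return ("violation", str(exc))


def well_behaved(box):
    box.request("read-clock")
    return 42


def tries_network(box):
    box.request("network")  # not granted below, so this raises
    return "unreachable"


result_ok = run_sandboxed(well_behaved, allowed={"read-clock"})
result_bad = run_sandboxed(tries_network, allowed={"read-clock"})
```

In Common Lisp the violation would presumably be signalled as a condition, so the receiving program could choose a restart instead of just getting an error tuple back.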

A con of this though: you could argue that it really breaks encapsulation; it's bad enough when different parts of a large program become too closely coupled - allowing different applications to depend on the internal state of each other could be a nightmare.



I've really enjoyed your link-sharing lately. Thanks for spreading knowledge :)


No runtime analysis can simultaneously allow all intended sharing and prevent all unintended sharing. Best you can hope for is a static analysis that rejects programs that would share objects in unintended ways.


I'm not talking about automated runtime analysis per se. The type of permission setting I had in mind would be specified by the programmer and enforced by the runtime/OS.


Ah, okay, makes sense.


Well, if you're running on a typical processor then any process can use pointer arithmetic to access memory, so you still need some sort of low-level memory protection mechanism.

On some hypothetical Lisp/high-level-language processor, that needn't be the case, of course, but I honestly don't think we'll ever see something like that again.


> than any process can use pointer arithmetic to access memory,

The damage this can potentially cause can be prevented using proof-carrying code.


I'm not sure why you're being downvoted.

Store applications as byte code or source code and use a trusted JIT to generate machine code. With a system-wide garbage collector and bounds checking you would have a memory safe single address space operating system.


> I honestly don't think we'll ever see something like that again.

Why?


Probably fragility and security issues in that type of system.


One of the earlier works combining high security and functional programming was a security kernel for Scheme48:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.4...

http://mumble.net/~jar/pubs/secureos/

That was the Scheme that was mathematically verified in the VLISP project:

https://en.wikipedia.org/wiki/PreScheme

There was also a mathematically-verified, Scheme CPU:

http://www.cs.indiana.edu/pub/techreports/TR544.pdf

I imagine combining even these three components into an integrated system would provide a lot more security and availability than the average desktop. The simplicity of the CPU might also aid in producing a NonStop-like solution for five 9's availability.


This does a very good job of explaining the value-add of lisp as a whole OS! And it does make the prospect of a working lisp OS on modern hardware seem quite exciting.


Mezzano would be an OS on top of a language/runtime for a programming language which allows flexible development from low-level up to high-level (CLOS/MOP). One could experiment with different OS models, which would be integrating interactive programming.

I grew up with a machine which booted into a BASIC interpreter. It was relatively primitive, but fun. How would it look & feel with a more powerful programming language?

This may all have its limits, but it looks like an interesting experiment. At some point in time it might even be useful...


> I grew up with a machine which booted into a BASIC interpreter.

same here (bbc b). i still miss that experience sometimes; a bash prompt is not quite the same thing.


> I grew up with a machine which booted into a BASIC interpreter

ZX Spectrum by any chance?


That's one possible machine. Most Z80 and 6502 machines booted straight into BASIC in the early 1980s.


Apple IIe and IIc. A friend's father had an IIe and I got a IIc for myself, later.


My first computer in 1977 that booted up into BASIC - Commodore PET (Personal Electronic Transactor). It had 8k of memory, and I paid a whopping $400 for the 32k memory expansion module. The original PET cost me $800 used.

My first game was a horse racing / betting game, reflecting my Mom and Dad's penchant for the ponies (OTB, you owe me big time ;) IIRC, I could only use graphics characters if all the letters were uppercase. My game had 3 horses to a race that ran from left to right based on a random number of which horse and how many spaces it moved.

I generated a random number of 0 to 3 for each of the horses with:

  100 FOR R = 1 TO 3
  110 X = INT(4 * RND(1))
  120 NEXT R
  130 SPC(A)

You could bet based on fixed odds, and the payoff would show at the end of the race (bet * odds). My memorable joy was watching my Mom and Dad rooting at the 9 inch monochrome green screen! I was hooked on coding, but only at home. I rarely worked coding for a living.

It booted up into PET BASIC, and aside from some PEEK/POKE limitations, you could access all of it. People hooked up joysticks later to the user port, and hacked speakers or buzzers for sound. I loved the Datasette (cassette tape drive) for storage! You had to put the tape in the drive, instruct BASIC to LOAD "PROG", and then it would prompt you to hit 'PLAY' on the Datasette. I think you then typed RUN "PROG" when it finished loading.

I would go to the store where I bought it in NYC, and they had like 4 or 6 plastic bags with cassette tapes in them and a one sheet or a few sheets of instructions. I wanted FORTRAN or APL, but APL was not available on my PET.

I would love a real LISP Machine, even an historical one for the pleasure of it really being 'turtles all the way down'! I love Lisp more than BASIC, but PET BASIC will always have a special place in my heart, and in the cobwebs of my mind.

[Edit] It would boot up in about 4 seconds or so!


I saw a Commodore PET at that time in a local store. In school I then got access to a Commodore CBM 3032, which was a more robust model of the PET. It had a better keyboard, for example. Booted into BASIC, too.


They were beasts. I left mine at my parents' house, and was hoping to reclaim it, but my cousin John had borrowed it and sold it years later. I should have kept tabs on it!

When I look at the video of Kalman Reti running a Symbolics Lisp Machine in an emulator [1], I am still blown away by how much more sophisticated and aesthetically pleasing it was compared to my PET or the Apple II. The difference seems almost anachronistic, like time-traveler tech.

How you could just drop in to any system library, or even the kernel, and use the same language, Lisp, to modify anything live is astounding. A lot of people have dismissed any talk of how great LISP Machines were as a bunch of nostalgia, but I don't think any of them have watched 15 minutes of somebody operating in that environment. You can't look at it, and keep a straight face when talking about how great the Apple ii was or the Lisa for that matter later on.


The demo by Kalman is pretty cool. Over a period of a decade there were many high-end applications for the machines. The base system could cost from a few tens of thousands of dollars to $250,000 for a full setup used in the TV and broadcast industry: machine, console, software, color screen, huge amounts of memory, large disks, video tape recorder, graphics co-processor, graphics tablet, ... That's far away from the home computers from Commodore.

Example: https://youtu.be/T0PFUC4Nuuk


> So: why Mezzano?

One way to think of it is as a Lisp Machine [1] using an x86 CPU.

[1] https://en.wikipedia.org/wiki/Lisp_machine


Thank you for the link! Turns out I completely missed these pages of computing history.


You'll find that a few on this list of benefits...

http://www.symbolics-dks.com/Genera-why-1.htm

...still aren't available in mainstream OSes despite being totally awesome. I particularly like how they came with an integrated editor and source, so an OS-related problem in an app makes the editor show you the problem (data or whatever), the source code involved, and a REPL letting you live-update it. Academics keep building prototypes that approximate any one of these steps to some degree for Linux or BSD. Not the whole thing, with consistency, at production grade. :)


Because it can be done!


One of the best reasons to write any software!


> Redox wants to do microkernel architecture The Right Way

What is the Right Way? I couldn't find their take on that in 5 minutes on their website; this seems closest: https://doc.redox-os.org/book/introduction/why_redox.html


I'd assume because a Lisp OS for modern-ish machines is interesting if you're into Lisp and its history.


> microkernel architecture The Right Way

http://yarchive.net/comp/microkernels.html


>Redox wants to do microkernel architecture The Right Way, and wants to use a guaranteed memory-and-type safe language (Rust)

Just curious, how much of the Redox code is wrapped around unsafe?


By definition, all of it. The Rust language doesn't and can't encode the semantics of, say, a DMA chip, or some other piece of hardware at some arbitrary point in memory which can do arbitrary things.

Rust is built around wrapping unsafe things safely - i.e. it's possible in nearly every case to develop a performant API that doesn't let you do anything to break Rust's invariant of memory safety. Usually, these are small, meaning they're testable, which is likely good enough for almost everyone.

The next step up in safety is proving a program against a model of a system, which is a heck of a lot of hard work - you have to create a formal model of every single piece of hardware you might ever want to use. The next step after that would be proving a system actually matches the model.


As of about a year ago [1] it looks like there were quite a few, although the Redox devs were aware and intended to cut down on them. Apparently it's something you can't do completely without, but the goal seems to be to have as few of them as possible, and none in userspace.

[1]: https://news.ycombinator.com/item?id=10295187


Wow, this is pretty impressive. The last time I checked its progress, it was very basic. Now it even has a working GUI with transparency! I should find some time to tinker with it.


I hope there'll be a driver for the Symbolics keyboard. I cannot be productive without a "meta", "super", and "hyper" key.


Neat. I rather like the idea, although these days any OS that doesn't target embedded systems and doesn't sport a web browser is instantly filed as a tech demo...


What'll be really awesome is once it's possible to run SBCL on Mezzano (or would it be Mezzano in SBCL?), to take advantage of all the optimisations SBCL offers.


This is a fun toy.

As our forefathers did program upon the metal in Zetalisp, now thou shalt program upon the metal in Common Lisp.

:D


I see instructions to install it in a VM, but is it possible to run it on bare metal?


I'm also interested in that, but I'm having difficulties in finding the bootstrap code since I'm not proficient in Lisp. Did anybody manage to find it?


I assume so: The only real issue would be BIOS compatibility, so it might take some work, but it should be possible.


Well, the main issue would be it requires a specific network card to talk to its file server.


Ah, missed that. Yep.

I think USB, etc. are standard enough, so that should be it.

With Intel motherboard, and Intel integrated networking, it could probably run in my machine.

Wait. Graphics. %$&#!.


Depends on how graphics are implemented, right? Even a modern graphics card has some kind of support for older standards like plain VGA and VESA video modes, it seems like.


Yeah, but I don't even know if Mezzano queries for a GPU or graphics card. It might just decide to use onboard no matter what.


Somewhat off-topic but can anyone point me to any resources that explain the hardware architecture differences between x86 CPUs and the old LISP-machines? Was there hardware level support for s-expression evaluation?


They were microcode machines with certain optimizations for handling Lisp data structures. The memory also had extra tag bits in every word for the benefit of runtime type dispatch and garbage collection.
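For a feel of what tag bits buy you, here is a toy model in Python. The layout is an assumption for illustration only: a 2-bit tag packed into the low bits of an integer "word", with made-up tag assignments, not the actual word format of any Lisp machine.

```python
TAG_BITS = 2
TAG_MASK = (1 << TAG_BITS) - 1

# Hypothetical tag assignments.
TAG_FIXNUM, TAG_CONS, TAG_SYMBOL = 0, 1, 2


def make_word(tag, payload):
    """Combine a type tag and a payload into one tagged word."""
    return (payload << TAG_BITS) | tag


def tag_of(word):
    return word & TAG_MASK


def payload_of(word):
    return word >> TAG_BITS


def is_pointer(word):
    # A garbage collector can tell pointers from immediate data by tag alone,
    # without knowing anything else about the object.
    return tag_of(word) in (TAG_CONS, TAG_SYMBOL)


def describe(word):
    # Runtime type dispatch: choose behaviour from the tag, then use the payload.
    dispatch = {
        TAG_FIXNUM: lambda p: f"fixnum {p}",
        TAG_CONS: lambda p: f"cons cell at address {p:#x}",
        TAG_SYMBOL: lambda p: f"symbol at address {p:#x}",
    }
    return dispatch[tag_of(word)](payload_of(word))


seven = make_word(TAG_FIXNUM, 7)
cell = make_word(TAG_CONS, 0x100)
```

On stock hardware the tag check and masking are extra instructions emitted by the compiler; the point of the comment above is that the Lisp machines did this kind of check in microcode and hardware.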


Wouldn't it be easier to port emacs to common lisp?

Hmm... Actually, probably not.


Depends what you call 'port' and 'emacs'.

Common Lisp already has some native variants of Emacs. Though not GNU Emacs.

The first one was called Hemlock and was developed in the early 80s.

https://en.wikipedia.org/wiki/Hemlock_(editor)


That's pretty nifty. I will have to fire up a VM and try it out.



