Vacietis, a C compiler targeting Common Lisp (github.com/vsedach)
97 points by reikonomusha on Dec 26, 2020 | 27 comments



Fun fact: In Latvian "vācietis" means "A German man". https://en.m.wiktionary.org/wiki/v%C4%81cietis


Could also be a reference to a beloved poet: https://en.wikipedia.org/wiki/Ojārs_Vācietis


It's amazing how different the names for German people are across Europe: german, deutsch, allemand, vācietis


Indeed. Confused the fuck out of me.


But why? Can someone explain the reason behind it? I'm confused.


I wanted a way for a Lisp operating system to be able to use OpenBSD or NetBSD hardware device drivers without modifying the driver source code. NetBSD hackers started writing their device drivers with well-specified interfaces between the driver and kernel (eventually that turned into the rump kernel idea https://wiki.netbsd.org/rumpkernel/), OpenBSD hackers borrowed a lot of the ideas (like bus_dma) into OpenBSD drivers. The Lisp operating system runtime would need to implement the applicable kernel functions used by the drivers, and designate certain pointers (hardware registers and buffers) as real memory writes/addresses, and the drivers should "just work."

Obviously this is also useful for userland stuff.

There are other approaches to this, but Vacietis maps C data and pointers to Common Lisp data structures, so they are inspectable but still memory safe (and you can run a bunch of things in a single address space if you want), and C functions are just Common Lisp functions so all of the development and debugging tools "just work."
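To make the driver idea above concrete, here is a minimal C sketch of the split being described: "driver" code that touches hardware only through accessor functions, and a host runtime that supplies those accessors. All names here (`host_window`, `host_read_4`, `host_write_4`, `toy_driver_enable`, the register offsets) are made up for illustration; they are loosely modelled on bus_space-style accessors and are not the NetBSD, OpenBSD, or Vacietis API. The register window is a plain array so the sketch runs anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side register window. On real hardware this would be
   device memory; here it is an ordinary array (an assumption for the demo). */
typedef struct {
    uint32_t *regs;   /* backing storage for the registers */
    size_t    nregs;  /* number of 32-bit registers in the window */
} host_window;

/* Accessors the host runtime would provide. The names are invented. */
static uint32_t host_read_4(host_window *w, size_t offset) {
    assert(offset % 4 == 0 && offset / 4 < w->nregs);  /* checked access */
    return w->regs[offset / 4];
}

static void host_write_4(host_window *w, size_t offset, uint32_t value) {
    assert(offset % 4 == 0 && offset / 4 < w->nregs);  /* checked access */
    w->regs[offset / 4] = value;
}

/* Toy register layout, also invented. */
#define REG_CTRL    0x00
#define REG_STATUS  0x04
#define CTRL_ENABLE 0x1u

/* "Driver" code: it never dereferences a raw hardware pointer itself,
   so the host decides what a register access actually means. */
static uint32_t toy_driver_enable(host_window *w) {
    uint32_t ctrl = host_read_4(w, REG_CTRL);
    host_write_4(w, REG_CTRL, ctrl | CTRL_ENABLE);
    return host_read_4(w, REG_STATUS);
}
```

Because the driver only sees the accessors, the same driver source could in principle be pointed at real registers or at a simulated window without being modified, which is the property the comment above relies on.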


I see some potential for rapid application development. Shouldn't this give you a C REPL?


It does.


One advantage of C -> Common Lisp vs. FFI is that issues in the C code are less likely to be exploitable since the C is compiled to a more memory-safe language.


Vacietis is interesting and similar to the way C worked on the Symbolics Lisp Machines (Zeta-C), but I think LLVM IR -> CL is more practical these days. Iota (made to port Doom to a pure-CL OS, Mezzano, I believe) is an example of this strategy: https://github.com/froggey/Iota


Either I misunderstand what you're saying or I'm missing an obvious joke. I would have thought a garbage-collected language is typically safer.


If one has plain C code, it is possibly safer to run it compiled to a memory-safe runtime (with bounds checks, no raw pointers, etc.) than to compile it to machine code with a C compiler and call the compiled code from Lisp via a Foreign Function Interface (FFI).
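A rough way to picture "bounds checks, no pointers" is a pointer modelled as a backing array plus an offset, where every load and store is checked. The C sketch below is only an illustration of that idea; the `fatptr` type and the `fat_*` helpers are hypothetical names and are not claimed to be how Vacietis represents pointers internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": backing storage, its length, and an offset.
   On a Lisp runtime the storage would be a Lisp array (assumption). */
typedef struct {
    unsigned char *mem;  /* backing storage */
    size_t len;          /* number of elements in mem */
    size_t off;          /* current position within mem */
} fatptr;

/* Pointer arithmetic only moves the offset; it never touches memory. */
static fatptr fat_add(fatptr p, size_t n) {
    p.off += n;
    return p;
}

/* Checked store: refuses to write outside the backing array. */
static bool fat_store(fatptr p, unsigned char value) {
    if (p.off >= p.len) return false;
    p.mem[p.off] = value;
    return true;
}

/* Checked load: refuses to read outside the backing array. */
static bool fat_load(fatptr p, unsigned char *out) {
    if (p.off >= p.len) return false;
    *out = p.mem[p.off];
    return true;
}
```

With a native C compiler, writing one element past the end of a buffer is undefined behavior and may silently corrupt neighbouring data; in this model the same write is simply rejected, which is the sense in which bugs become less likely to be exploitable.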


Thanks for clarifying!


I could see this being easier than FFI for using some algorithm implemented in C in a one-off Lisp script.


To implement proper tail recursion?


Common Lisp has exactly the same requirements on tail calls as C (that is, none: neither standard mandates tail-call elimination), and in fact has certain features that make tail-call elimination impossible when used (e.g. unwind-protect and dynamic bindings).
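The point about cleanup forms can be shown by analogy in C. In the sketch below, `sum_tail` makes a true tail call, while `sum_with_cleanup` has to restore a saved value after the recursive call returns, much as a dynamic binding or the cleanup form of unwind-protect must run afterwards. The function names and the `depth` variable are invented for illustration, and this is a C analogy, not Common Lisp semantics.

```c
#include <assert.h>

/* Stands in for a dynamically bound (special) variable. */
static int depth = 0;

/* True tail call: nothing is left to do after the recursive call, so a
   compiler is allowed (but not required) to reuse the current frame. */
static long sum_tail(long n, long acc) {
    if (n == 0) return acc;
    return sum_tail(n - 1, acc + n);
}

/* Not a tail call: the old value of depth must be restored after the
   recursive call returns, so in general this frame has to stay alive. */
static long sum_with_cleanup(long n, long acc) {
    if (n == 0) return acc;
    int saved = depth;
    depth = saved + 1;                        /* like entering a binding */
    long result = sum_with_cleanup(n - 1, acc + n);
    depth = saved;                            /* like running the cleanup */
    return result;
}
```

Both functions compute the same sum; the difference is only in whether work remains after the recursive call, which is what decides whether the call is in tail position.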


The last commit to this project was over 8 years ago. Any plans on getting this thing moving along again? Was there anything learned from this project that can aid development of similar projects?


> The last commit to this project was over 8 years ago.

See Jiyuno Megami's repository: https://github.com/jiyunomegami/Vacietis

> Any plans on getting this thing moving along again?

Yes, sometime in the upcoming decade.

> Was there anything learned from this project that can aid development of similar projects?

What is a "similar project?" Yokota Yuki's with-c-syntax¹ takes a completely different approach to parsing and runtime representation (it is similar to ScottBurson's ZetaC², which I studied closely before writing Vacietis). I want to use this to run OpenBSD or NetBSD hardware device drivers on a memory-safe Common Lisp operating system runtime. For me a "similar project" is user-space device drivers.

¹ https://github.com/y2q-actionman/with-c-syntax/ ² http://www.bitsavers.org/bits/TI/Explorer/zeta-c/


I'd be interested in seeing some sample input/output examples.


Duff's device output cut due to length limit, but here you go:

    CL-USER> (vacietis::cstr-noeval "
    void send (char * to, char * from, int count) {
      int n = (count + 7) / 8;
      switch (count % 8) {
      case 0: do { * to++ = * from++;
      case 7:      * to++ = * from++;
      case 6:      * to++ = * from++;
      case 5:      * to++ = * from++;
      case 4:      * to++ = * from++;
      case 3:      * to++ = * from++;
      case 2:      * to++ = * from++;
      case 1:      * to++ = * from++;
              } while (--n > 0);
      }
    }
    int main () {
      char tobuf[32];
      memset(tobuf, 0, sizeof(tobuf));
      char* from = \"Duff the magic dragon's device!\";
      send(tobuf, from, 31);
      printf(\"from: %s\\n\", from);
      printf(\" to: %s\\n\", tobuf);
      return 0;
    }")

    (PROGN
     (PROGN
      (DECLAIM (FTYPE (FUNCTION (* * * ) T) SEND))
      (VACIETIS::VAC-DEFUN/1 SEND (TO FROM COUNT)
       (DECLARE (TYPE (SIGNED-BYTE 32) COUNT))
       (PROG* ((N (VACIETIS.C:INTEGER/ (VACIETIS.C:+ COUNT 7) 8)))
        (DECLARE (TYPE (SIGNED-BYTE 32) N))
        (VACIETIS.C:SWITCH (VACIETIS.C:% COUNT 8) (1 2 3 4 5 6 7 0)
         (0
          (VACIETIS.C:DO
           (TAGBODY
            (VACIETIS.C:=
             (VACIETIS.C:DEREF* (PROG1 TO (VACIETIS.C:= TO (VACIETIS.C:PTR+ TO 1))))
             (VACIETIS.C:DEREF* (PROG1 FROM (VACIETIS.C:= FROM (VACIETIS.C:PTR+ FROM 1)))))
            7
            (VACIETIS.C:=
             (VACIETIS.C:DEREF* (PROG1 TO (VACIETIS.C:= TO (VACIETIS.C:PTR+ TO 1))))
             (VACIETIS.C:DEREF* (PROG1 FROM (VACIETIS.C:= FROM (VACIETIS.C:PTR+ FROM 1)))))
            6
            ... )))))))
     (PROGN
      (DECLAIM (FTYPE (FUNCTION NIL (SIGNED-BYTE 32)) MAIN))
      (VACIETIS::VAC-DEFUN/1 MAIN NIL
       (DECLARE)
       (PROG* ((TOBUF (MAKE-ARRAY 32 :ELEMENT-TYPE '(SIGNED-BYTE 8)))
               (FROM (VACIETIS:STRING-TO-CHAR* "Duff the magic dragon's device!")))
        (DECLARE (DYNAMIC-EXTENT FROM)
                 (TYPE (SIMPLE-ARRAY (SIGNED-BYTE 8) (32)) TOBUF)
                 (DYNAMIC-EXTENT TOBUF))
        (MEMSET TOBUF 0 32)
        (SEND TOBUF FROM 31)
        (PRINTF (VACIETIS:STRING-TO-CHAR* "from: %s ") FROM)
        (PRINTF (VACIETIS:STRING-TO-CHAR* " to: %s ") TOBUF)
        (RETURN-FROM MAIN 0)))))

Once you evaluate the above, you can call main from lisp.


(Psst, indent four spaces for monospace blocks)

    Like this.  :)


Ha, I usually use a C->JVM compiler as my example of why the C standard has some ridiculous parts, but this works too!


Is there a C->JVM compiler out there that's any good and actively maintained? What are the major problems that come up with this approach?


I don't think there's anything that's fundamentally impossible (or even conceptually very hard) to solve; it's just that most of the time you're solving the interop problem with JNI/JNA. The opposite route is massively more complicated to implement and nobody wants to sink their time into that. IIRC, there are in fact many amateur-level C->JVM compilers, but nothing that you'd be comfortable working with on a daily basis.

I guess it boils down to "yeah but why".


One possible approach today is to compile C to WASM, then run WASM on the JVM as GraalVM supports WASM. https://www.graalvm.org/reference-manual/wasm/


If you're using GraalVM already, I think Sulong might be better -- my experience with WASM (for C) is that there's a bevy of POSIX-correct-but-ANSI-incorrect tricks that don't work properly on WASM. Debugging is also much nicer with LLVM IR (worst-case, just compile it to a binary and gdb that) than WASM (Firefox doesn't show the stack, Chrome has sourcemap bugs).


NestedVM works pretty well. http://nestedvm.ibex.org/



