Hacker News
Sxc – An s-expression based language source transpiler for C (github.com/burtonsamograd)
101 points by kruhft on Dec 17, 2016 | 34 comments



I made something very much like this! As I worked on it I slowly rewrote the transpiler in the sexpression language itself, which led to me using it to help guide new features. It's always tempting to write the compiler as the first program despite knowing that it produces a language designed for writing compilers. But it was just a fun programming project not designed to go anywhere.

I remember finding it was hard to remember how many parens to use where. For example, is it (if (condition) (statement)) or is it (if (condition) (list of statements))? If it's the former, you have to explicitly call out a multi-statement if with something like progn, like (if (== foo 3) (progn (= foo 4) (return))), which doesn't feel very C-like. If you pick the latter option, it's easy to get the syntax wrong when you have single-statement blocks, because you need to double-paren them, like (if (== foo 3) ((return))). When you're writing Scheme this isn't such a problem because it tends to be expression-oriented, but when you're writing C with a different syntax I found it pretty confusing.
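
The two conventions described above can be compared with a small sketch. This is a hypothetical Python emitter, not how Sxc or any project in this thread works: s-expressions are modeled as nested Python lists, and the names emit_expr, emit_stmt, and emit_if_list are made up for illustration.

```python
def emit_expr(e):
    # Atoms print as-is; (op a b) becomes a parenthesized infix C expression.
    if not isinstance(e, list):
        return str(e)
    op, lhs, rhs = e
    return f"({emit_expr(lhs)} {op} {emit_expr(rhs)})"


def emit_stmt(s):
    # Convention 1: (if cond stmt), with progn for multi-statement bodies.
    head = s[0]
    if head == "return":
        return "return;" if len(s) == 1 else f"return {emit_expr(s[1])};"
    if head == "progn":
        return "{ " + " ".join(emit_stmt(x) for x in s[1:]) + " }"
    if head == "if":
        return f"if {emit_expr(s[1])} {emit_stmt(s[2])}"
    # Anything else is treated as an expression statement.
    return emit_expr(s) + ";"


def emit_if_list(s):
    # Convention 2: (if cond (stmt ...)), where the body is always a list,
    # so a single statement needs the extra pair of parens.
    _, cond, body = s
    stmts = " ".join(emit_stmt(x) for x in body)
    return f"if {emit_expr(cond)} {{ {stmts} }}"


# (if (== foo 3) (progn (= foo 4) (return)))
print(emit_stmt(["if", ["==", "foo", 3], ["progn", ["=", "foo", 4], ["return"]]]))
# (if (== foo 3) ((return)))
print(emit_if_list(["if", ["==", "foo", 3], [["return"]]]))
```

The only difference is where the extra nesting lives: in an explicit progn form, or in a body that is always a list.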


You could use an if variant with keywords, like if* (http://franz.com/~jkf/ifstar.txt):

(if* (== foo 3) then (foo 4) (return))


That's the choice I made, with (if .. (else... )).


Gauche Scheme also has something called CiSE (C in S-Expressions) [1]. It's a bit under-documented, but you can see examples throughout the Gauche Scheme source code, for example [2].

[1] https://github.com/shirok/Gauche/blob/master/lib/gauche/cgen...

[2] https://github.com/shirok/Gauche/blob/master/ext/bcrypt/bcry...


We do this at work, but for pascal. Most algol-like languages can quite easily be converted straight away into sexpr. I just defined a neat little subset to begin with and mangled the rest in there. We generate a _lot_ of code using a guile/pascal macro system, using guile's syntax-case to destructure the macros and turning them into proper pascal.

It was really just a prototype at first. I was going to have a more pascal-like syntax in the end, but the prototype was fast enough, and manually parsing was more trouble than it was worth.

So, we don't write all of our code in sexprs, but have a preprocessor (PPP - the pascal pre-processor) that reads a sexpr version of pascal and generates code accordingly.


Related, s-expression Python: https://github.com/hylang/hy


also:

  - pixie (sexp + python + llvm)
  - pharen (sexp + php)


I remember this one: https://github.com/tomhrr/dale


Would you know if Hy can be jitted with PyPy?


You should try, but it probably cannot, since it uses internal-ish interfaces of the official Python interpreter.


Finally, a sexpr-based transpiler that actually allows you to manipulate sexprs at compile time.

I don't know how people keep missing this (like, for instance, Lispyscript): newsflash, people: syntax-rules is considered inadequate for many types of macros by the community that invented it. If you want the full power of sexprs, you need to actually have the ability to traverse them.


What do you see as the most important differences (or, what are the main kinds of macros for which syntax-rules is inadequate)?


Syntax-rules has difficulty in any context where referential opacity or the introduction of bindings is desired. This means anaphoric macros, loop structures with break statements, and so on. It's not impossible (at least in Scheme), but it is hard.

In addition, syntax-rules is based on a DSL. If you want to build a macro that cannot be expressed in that DSL, or is hard to express in that DSL (fairly common), tough.

With imperative hygienic macros (provided they give you control over hygiene), or imperative unhygienic macros, this isn't a problem.


This is why R7RS has syntax-case.

HOWEVER... various bugs can be easily introduced if your macros are not hygienic. For many use cases, syntax-rules will suffice, but when you use syntax-case you must explicitly break hygiene.

It is a parallel situation to languages like Rust where unsafe operations are sometimes required, and therefore provided by the language but you must explicitly flag them "unsafe".


That's... very wrong.

>This is why R7RS has syntax-case.

It doesn't. In fact, sc-macro-transformer is a candidate for inclusion in r7rs-large.

>HOWEVER... various bugs can be easily introduced if your macros are not hygienic.

That's true, but you're confusing hygienic/unhygienic with imperative/declarative. sc-macro-transformer, syntax-case and friends are all hygienic imperative macro systems. So none of that matters.


Imperative unhygienic macros are what Common Lisp has, right? So you're pretty much guaranteed you'll be able to do whatever you want to do, the tradeoff being that you are responsible for watching out for unwanted name capture?


Yes, although I'm a fan of imperative hygienic macros myself.
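
The name-capture tradeoff discussed above can be illustrated with a toy expander. This is only a sketch in Python, not Common Lisp or Scheme: code is modeled as nested lists, and swap_unhygienic, swap_gensym, and this gensym helper are hypothetical names invented for the example.

```python
import itertools

_counter = itertools.count()


def gensym(prefix="g"):
    # Returns a fresh name on every call, in the spirit of Lisp's gensym.
    return f"{prefix}__{next(_counter)}"


def swap_unhygienic(a, b):
    # Expands to: (int tmp a) (= a b) (= b tmp)
    # The macro's own "tmp" collides with a caller variable named tmp.
    return [["int", "tmp", a], ["=", a, b], ["=", b, "tmp"]]


def swap_gensym(a, b):
    # Same expansion, but the temporary gets a fresh name, so the macro
    # writer avoids the capture by hand.
    t = gensym("tmp")
    return [["int", t, a], ["=", a, b], ["=", b, t]]


# The caller's tmp is captured: the expansion declares tmp in terms of itself.
print(swap_unhygienic("tmp", "y"))
# With a generated name, the caller's tmp is left alone.
print(swap_gensym("tmp", "y"))
```

A hygienic system does this renaming automatically; in an unhygienic one it is the macro writer's job.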


@qwertyuiop924 Shoot me an email, I'd like to talk with you about this project: kruhft@gmail.com.


I wonder what the most compressible language is. Would it be a C-style language due to {, }, and ; or LISP-like languages due to ()s everywhere?


Very nice.

What I particularly like about this is the C s-exp syntax. The s-exp C actually feels like C. It does not introduce additional tokens to identify functions or types.

This approach could be generalized to write transpilers for other languages with C-like syntax (Java, JavaScript, C++, etc).


Thank you. I tried to stay as much with the 'grammar' of C but with the syntax of s-expressions. Whatever came out most naturally as a C programmer was what I used in the S-Expr language.



cmacro is its own language that doesn't have the power of Common Lisp. Same with magma; to each their own, but I prefer the capabilities of CL over a homegrown macro language.


This is nice, but only hardcore Lisp users prefer writing C-in-sexpr's to just writing C.

The "industrial-strength" alternative in the open-source world is just to parse C into an AST; several important projects like the Frama-C analyzer (and, IIRC, the Coccinelle project, best known for "spatch") use CIL (https://people.eecs.berkeley.edu/~necula/cil/), and of course many recently-started projects use the LLVM infrastructure.

sexprs do make the AST more obvious than relying on an external compiler, but you need to write a lot of macros before "more-convenient macros" beat "more-convenient code".


> This is nice, but only hardcore Lisp users prefer writing C-in-sexpr's to just writing C.

I object! The choice between this and straight C isn't completely up to personal preference; using s-expressions gives you objectively more power because it lends itself to lisp macros, which are really the strongest reason to use any lisp at all.


I think he's suggesting that you could add the same functionality using lisp as a preprocessor without having to write what is objectively C semantics wrapped in s-exps.


That's more or less obvious, and even has prior art by Java father James Gosling in the Ace preprocessor:

https://swtch.com/gosling89ace.pdf

Well, minus the "using Lisp" part.


I disagree. I hated C and found LISP's style weird when I used it for a BASIC-like 4GL. I used LISP since it made it trivial to do parsing, transformations, rapid prototyping, and extraction to C for portability. I still effectively coded in a structured & imperative style in BASIC. I only knew enough C... a subset of it... to convert the BASIC commands.

So, I could program super-fast in a high-level BASIC with C output. LISP was worth using to reduce effort as I have no idea how I'd do all of that in just BASIC or C with what little experience I had.


I would assume this is for writing a program generator. That is, you have a lisp program that will generate a C program.


In the end, that's where its true power would come from: using the capabilities of the much higher-level language to compile ideas to the lower-level language.

Also, this allows for far greater compile-time data generation for static initialization. Even files can be loaded in to do away with runtime IO, building programs that are closer to self-contained 'images' rather than a collection of files.
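
As a rough illustration of generating static initializers at build time, here is a sketch in Python rather than Lisp. The function name c_static_array is hypothetical and this is not Sxc's actual mechanism; it only shows the general idea of turning data into a C initializer before compilation.

```python
def c_static_array(name, data):
    # Render bytes as a C static array initializer.
    # Assumes data is non-empty, since a zero-length array is not valid
    # standard C.
    body = ", ".join(f"0x{b:02x}" for b in data)
    return f"static const unsigned char {name}[{len(data)}] = {{ {body} }};"


# The bytes could just as well come from a file read at build time,
# which removes the need to open that file at run time.
print(c_static_array("greeting", b"hi"))
# prints: static const unsigned char greeting[2] = { 0x68, 0x69 };
```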


I was trying to think of how to say the same and thought "sexy-c" might work. Cool project.


I just say 'sexy' :) Thanks.


This is a really fun approach.

How about an (unfinished) common lisp to go transpiler?

https://github.com/aaron-lebo/centro


This makes me think of the terra project! http://terralang.org

Terra looks nicer to me, mostly because of syntax, but I can see those two projects having about the same power



