Hacker News new | past | comments | ask | show | jobs | submit login
Firm: a C library with graph-based IR suitable for compilers (kit.edu)
64 points by vmorgulis on Jan 18, 2016 | hide | past | favorite | 19 comments



Another interesting comparison between clang/llvm and cparser/firm has to do with their respective preprocessors and underlying AST structures. cparser's entity_t and related structures are clean, clear, and very easy to work with. That, together with the fact that cparser is written in C, allows you to effortlessly walk the AST while having access to its entire informationwithout the help of an iterator, and without having to call any specialized functions. Having worked with both (lib)clang and (lib)cparser, I consider cparser the better and easier library for source analysis, as well source-to-source transformation tools. Other areas where cparser's AST could be nicely exploited: a source editor that provides, in addition to syntax highlighting, information about linkage and effective optimization, and a source generation tool based on the user's own code gallery.


The following code crashes the compiler but is accepted by GCC.

    static float b5[0][0] = {};
    int main(void)
    {
        return 0;
    }
I have filed a bug report.


0-length arrays are a GNU C extension[1].

In ISO C99, they're only permitted as the last member of a struct with another member, and are written with an empty pair of brackets[] and no 0.

1. https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html


The compiler is described on its home page as "accompanying GCC-compatible C frontend." So what does this mean with respect to GCC features? In any case, I don't think the compiler should crash but report a syntax error.


It's sufficiently compatible to handle e.g. glibc, which is quite a feat, because it uses many and somewhat obscure gcc extensions.


Thanks for the bug report! :)


Fixed, thanks for the report. (: (stupid gcc extensions sigh)


Howdy! Very interesting project, coming from an avid LLVM user. Do you or your organization have a Twitter/FB/social page and/or GitHub page?


We have an IRC channel (#firm on FreeNode), a mailing list (see libfirm.org) and there are github mirrors for cparser (https://github.com/MatzeB/cparser) and libfirm (https://github.com/MatzeB/libfirm/). If you want to report a bug, there's our bugtracker (also linked on libfirm.org)


I'd be curious about the following aspects: since we have enough good C compilers we rather need code generators for languages that do things differently than C. In particular, how difficult is it to generate code for the following things:

1. Exceptions. There are several ways to implement exceptions: stack walking or stack cutting.

2. Calling conventions. A language might want to use internally a different calling convention than the one dictated by the ABI.

3. Garbage collection. This requires interaction with the runtime system which must know which values are pointers into the heap.

4. Tail call optimization.

There are probably other modern language features that require support from the code generator. Could you comment on how the design of the code generator (in rough sketches) account for these?


libFirm is not tied to compiling C. It is a more general intermediate representation library, which provides an API to attach a frontend, e.g. there is one for X10, too. Further, I doubt that we have enough good C compilers.

1. How exceptions are to be implemented is dictated by the runtime environment. Support for exceptions in general is currently work in progress.

2. libFirm's register alloctor does not make any restrictions about the calling convention. If you want to have a call, which passes the first and seventh parameter in %esi and %edx, then you can specify that in the intermediate representation and the register allocator will handle that. Of course there is no generic mechanism to export this insanity to the frontend, but the backend itself can generate code for that without problem.

3. There is currently no support for GCs, i.e. we do not readily export information, where pointers are. Though libFirm's type system is strong enough to contain that information, but somebody has to do the work to export this and connect it to a GC. There was not enough interest and too little manpower available to do this so far. For all projects, where a GC is used, we use a Boehm GC.

4. Yes, libFirm has a pass for this. It can even eliminate pseudo-tail calls like 'int fac(int n) { return n == 0 ? 1 : n * fac(n - 1); }'. This is no proper tail call, because the '' is the last operation, which happens (and not the call to fac()). Because '' is associative and commutative, we can still transform this into a loop. And it has many more analyses and transformations: local optimizations (aka "instcombine"), common subexpression elimination, partial redundancy elimination, scalar replacement, reassociation, implicit (due to being dependency-graph based) dead code elimination, Click's combo optimization, bitwise constant information, load-store optimization, control-flow optimization, conditional jump threading, if-conversion, operator strength reduction, division-by-constant optmization, function inlining, register-pressure aware instruction scheduling, block scheduling, belady spiller, SSA-based register allocation and copy coalescing, ...


Due to the development of the X10 frontend, Firm gained more support for object oriented features (stack allocation, devirtualization) and exceptions are next.

Firm is well suited for anything natively compiled (Rust, Go, etc). Support for precise garbage collection would be nice. Nobody is working on this atm.


Having a second Rust compiler would be awesome and a true validation of the language.


Differences between this and LLVM? I feel it should at least be mentioned on their page.


There's also this mail in the freebsd list archives on the subject. Not sure how current it is, but it gives more names of algorithms and people in case one wants to dig deeper. https://lists.freebsd.org/pipermail/freebsd-hackers/2008-Nov...


The main points of this (i.e my) post on the mailing list still apply. Nowadays everything is even more nice. (:


Documentation->LLVM: http://pp.ipd.kit.edu/firm/LLVM ;-)


It might be interesting to use LLVM as a backend to Firm, especially considering that two most important backends (ARM and amd64) are "in very early phases"


One important difference to LLVM is the backend (machine code generation). libFirm uses its graph-based intermediate representation with explicit data dependencies not only in the "middleend" (i.e. for architecture-independent optimizations), but also in the backend. In contrast, LLVM uses completely different representations for the middleend and backend. libFirm still has explicit data dependencies and SSA form in code generation, which simplifies several important code-generation phases. (LLVM has a more classical virtual/physical register approach.) It's simpler from the theoretical viewpoint of compiler construction and it's also easier for engineering a compiler. E.g. it allows for decoupling spilling and register assignment. It also enables checkable register allocation, because in our representation the register assignment is just an annotation and the explicit data dependencies (which operation uses which operands) is still the primary encoding in the graph, which allows having a checker for the register allocation. SSA form gives the usual benefits that the question "what is the definition of this value?" is directly encoded in the graph and is readily available. Having the same representation as the middleend allows for reusing passes in the backend. E.g. we run the middleend's code placement (loop-invariant code motion) in the backend after instruction selection.

Also there's been a lot of work on the AMD64 backend and for example it is able to run the SPEC benchmarks. The info on the page is just a bit stale. (Note to self: Update the page)




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: