All good sibling comments. Also, Clang/LLVM is Chris Lattner's baby; LLVM dates from his days back at the University of Illinois. It was never intended to be "GCC without the GPL". It demonstrates an entirely different compilation paradigm where you optimize abstractly based on LLVM's IR. If you can compile to IR (that is, write an LLVM front end), you can apply the same optimizations universally. Obviously it has its practical hiccups, like still needing an architecture-specific optimization pass, but it's a cool idea and has proven rather successful.
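A toy sketch of that "compile to a shared IR, optimize once" idea, in Python. Everything here is hypothetical and made up for illustration: the tuple-based IR, the two miniature front ends (`lower_prefix`, `lower_rpn`) and the `fold_constants` pass are not LLVM IR or any LLVM API. It only shows that one pass can serve any front end that emits the same IR.

```python
# Toy illustration only: this is not LLVM IR and not LLVM's API.
from itertools import count

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def lower_prefix(expr, ir, fresh):
    """Front end A: nested tuples such as ("*", ("+", 2, 3), "x")."""
    if not isinstance(expr, tuple):
        return expr  # an int constant or a variable name
    op, lhs, rhs = expr
    a = lower_prefix(lhs, ir, fresh)
    b = lower_prefix(rhs, ir, fresh)
    dest = f"t{next(fresh)}"
    ir.append((dest, op, a, b))
    return dest

def lower_rpn(text, ir, fresh):
    """Front end B: reverse Polish text such as "2 3 + x *"."""
    stack = []
    for tok in text.split():
        if tok in OPS:
            b, a = stack.pop(), stack.pop()
            dest = f"t{next(fresh)}"
            ir.append((dest, tok, a, b))
            stack.append(dest)
        else:
            stack.append(int(tok) if tok.isdigit() else tok)
    return stack.pop()

def fold_constants(ir):
    """Shared pass: evaluate instructions whose operands are both known ints."""
    known, out = {}, []
    for dest, op, a, b in ir:
        a, b = known.get(a, a), known.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            known[dest] = OPS[op](a, b)  # folded away, remembered for later uses
        else:
            out.append((dest, op, a, b))
    return out

ir_a, ir_b = [], []
lower_prefix(("*", ("+", 2, 3), "x"), ir_a, count())
lower_rpn("2 3 + x *", ir_b, count())
print(fold_constants(ir_a))
print(fold_constants(ir_b))
```

Both front ends happen to produce identical IR for this expression, so the single pass folds `2 + 3` for either of them without knowing which source syntax was used.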
In defence of the OP, notice that they did not talk about the original intent of clang, which is as you mention. They talk about the main reason of its corporate sponsors, which is an entirely different thing. It seems plausible to me that these sponsors may have different motivations than the creator of the project.
That idea is nothing new; it has been used in IBM's RISC project, the Amsterdam Compiler Kit, Microsoft's Phoenix compiler toolkit, IBM i, IBM z/OS, Unisys ClearPath and plenty of others.
And getting back to GCC, I had to study GIMPLE during my compiler assignments.
What LLVM has going for it versus GCC is the license, which is especially beloved by embedded vendors; companies like Sony, Nintendo, SN Systems and CodePlay can save some bucks in compiler development.
Based on that, my understanding was that while intermediate representations were certainly not new, being strict about not mixing the layers was still quite rare. He specifically claims that GCC's GIMPLE is (was?) not a fully self-contained representation.
I'm not an expert in any of this. Just sharing the link.
My experience with the Amsterdam Compiler Kit (ACK) is that while ACK successfully managed to separate frontends, optimizers and backends using a single intermediate language, in the end it was the intermediate language that held it back.
The intermediate language was strongly stack oriented, to the extent that local CPU registers were mapped to stack locations. This worked well for pdp-11, vax, mc68k, and to some extent x86.
But when Sun's SPARC got popular it became clear that mapping the stack onto register windows was not going to result in good performance.
One option would have been to define a new register-oriented intermediate language, just like LLVM has now. But by that time research interests at the VU Amsterdam had shifted and this was never done.
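To make the stack-oriented versus register-oriented distinction concrete, here is a small Python sketch that evaluates `a * b + c` in both styles. The instruction names and tuple layouts are invented for illustration; this is neither ACK's actual intermediate language nor LLVM IR.

```python
# Toy sketch: neither ACK's actual intermediate language nor LLVM IR.

def run_stack(code, env):
    """Stack-oriented form: operands live implicitly on an evaluation stack."""
    stack = []
    for instr in code:
        op = instr[0]
        if op == "load":
            stack.append(env[instr[1]])
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack.pop()

def run_register(code, env):
    """Register-oriented form: every intermediate value has an explicit name."""
    regs = dict(env)
    for dest, op, a, b in code:
        regs[dest] = regs[a] * regs[b] if op == "mul" else regs[a] + regs[b]
    return regs[dest]  # value of the last instruction's destination

# a * b + c, written once per style
stack_code = [("load", "a"), ("load", "b"), ("mul",), ("load", "c"), ("add",)]
register_code = [("r0", "mul", "a", "b"), ("r1", "add", "r0", "c")]

env = {"a": 2, "b": 3, "c": 4}
print(run_stack(stack_code, env))
print(run_register(register_code, env))
```

In the stack form, a backend has to rediscover which stack slot holds which value before it can assign machine registers. In the register form, the names `r0` and `r1` are already explicit, which is a better starting point for targets with many registers.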
Key point: a memory-safe systems programming language, which apparently would have been too slow for the target hardware, achieved the goal of being usable for writing an OS on 1970s hardware thanks to the compiler's IL representation and multiple execution phases (sound familiar?).
> What LLVM has going for it versus GCC is the license, which is especially beloved by embedded vendors; companies like Sony, Nintendo, SN Systems and CodePlay can save some bucks in compiler development.
The license is probably considered an advantage by many companies. However, it is definitely not the only reason for LLVM's success. There are many technical reasons as well, e.g. cleaner code and architecture. My personal impression is that a lot of research and teaching has moved from GCC to LLVM as well, and universities usually do not care that much about the license.
Yes, GCC has GIMPLE (and before that just RTL), but it is not as self-contained as LLVM's IR. In GCC, the front end and middle end are quite tangled, on purpose, for political reasons. Nevertheless, I agree that LLVM isn't as revolutionary as the poster you are replying to is claiming; reusing an IR for multiple languages was done before. However, I don't think any other system was as successful at this as LLVM. E.g. Rust, Swift, C/C++ via clang, Julia, Fortran, JITs like JSC and the Azul JVM (which are/were using LLVM as a compilation tier), GPU drivers, etc. Those are all hugely successful projects, and if you ask me this is already an impressive list even though it is not complete. It seems most new languages these days use LLVM under the hood (with Go being the exception). IMHO this is also because LLVM's design was flexible enough to enable all those widely different use cases. GCC supports multiple languages as well, but it never took off to the degree that LLVM did.
I don't know all the compilers you mentioned, but how many of those were still maintained and available on the systems people cared about by the time LLVM got popular? Are those proper open-source projects?
Yes that is an impressive list, but I bet if LLVM had gone with a license similar to GCC, and everything else remained equal, its adoption wouldn't be as it is.
No, those projects aren't open source at all. They used their own compilers, or forked variants of GCC that they couldn't reveal thanks to NDAs. Now, thanks to clang's license, they have replaced their implementations, contributing back to open source only what they feel is relevant.
Is that really an "entirely different compilation paradigm?"
Most (all?) GCC frontends compile to a common IR. The main difference really is that GCC doesn't market that as an interface for interacting with the compiler. In LLVM, the IR is the product; in GCC it's the individual language compilers.
And for those who think there's some malicious (as opposed to friendly) competition between the two projects: Steve Naroff was also instrumental (both technically and managerially) to get Steve Jobs to pay us to get Objective C++ working in gcc back in the NeXT days.
I think that’s a typo. Objective-C, as you may know, is about as old as C++ (Wikipedia says it’s from 1984 and C++ from 1985, but both were evolved over several years, so I wouldn’t categorically say Objective-C is older)
I think Objective-C++, basically an implementation of Objective-C that makes it possible to link with C++, must be from 2000 or later, but even a semi-exact date isn’t easy to find (probably in Clang’s release notes)
Yeah nowadays it is very hard to find anything about Objective-C++, even Apple has taken the documentation down, so only old timers still have some references.
Even the link I provided, who knows for how long it will still stay up.
So there was a real technical need. IIRC, there was also a personal need, as Steve wanted to do something else, and I am sure not being dependent on gcc was also a big deal.
From long experience, Apple doesn't like to be dependent on others for crucial bits of their tech stack. Relations with the gcc team weren't the best, even without the GPL issues, although the new GPL v3 was also seen as a problem. I think Apple switched before having to adopt v3.
You seem to be implying having a competitor to GCC is a bad thing? While there is something to be said for duplicated effort, I think it's actually helped improve GCC a lot because they also 'copied' good things Clang did (like much better warnings, built-in static analysis etc.). So really everybody has benefited.
It's good to have a bit of diversity and options, especially when they're all trying to be compatible.
Having two good compliant compilers is great for C++ too: it keeps both compilers honest and also makes spotting weird behaviour easier in big codebases (i.e. to first order if your code results in different behaviour in GCC or Clang then it could be dangerous).
Also has the added benefit of reducing the need to use compilers like Intel's (not sure on current benchmarks) but I really wouldn't want to ship something for AMD CPUs with Intel's compiler.
Besides GCC, Clang, and MSVC++, most other C++ compilers license the EDG frontend, so there's less heterogeneity than you would expect. I don't actually know if there's another implementation of C++11 besides the ones I listed. Even Microsoft now uses the EDG frontend for Intellisense, rather than their own (more incorrect) frontend.
> It's good to have a bit of diversity and options
It can be consistent to hold that those two statements do not hold in this case. GCC requires that developers uphold certain user freedoms (i.e. compiler extensions must be free software). Clang allows users' freedoms to be more easily violated.
It's fine if you want to hold those two opinions, but if you state them as if they're the only opinions, that's not great. If you don't acknowledge that they're predicated on beliefs such as "user freedom isn't important enough to come at the expense of software improving in other ways" (or such), then of course you'll have trouble understanding why one might view clang/llvm in the negative light of effectively enabling "GCC, but without GPL".
I've heard this said before. Why would someone want this? AFAIK the GPL isn't really relevant unless you're modifying and redistributing GCC itself. Even if you use modified GCC internally for compiling for-profit software you just need to allow the employees who use GCC to see your modified code, which doesn't seem like a big deal since you already trust them with your application code.
One argument I've heard before is that GCC's architecture is intentionally designed to make it difficult to extend GCC or integrate it into other tools such as IDEs, because that sort of modularization could enable interoperability with non-GPL software. I'm not sure whether this argument has any merit, but see e.g. this email from ESR: https://gcc.gnu.org/legacy-ml/gcc/2014-01/msg00209.html
Edit: that email from ESR cites a talk from an LLVM developer which goes into more detail about this argument and about a lot of the architectural differences between GCC and LLVM: https://www.youtube.com/watch?v=lqN15lrADlE. I'm not sure how up-to-date this is though, as it looks like it was recorded in 2012.
At least in the Apple ecosystem, the results of LLVM certainly spoke for themselves. Immediately after hiring Chris Lattner and adopting LLVM, Apple's developer experience began to improve massively in a very short time, coming out with automatic reference counting, Objective-C 2.0, the Clang static analyzer, live syntax checking and vastly improved code completion, bitcode, Metal shaders, and Swift within just a few years. Of course, I don't know how much of this was due to technical reasons rather than legal reasons (but Apple did make most of this work open-source).
While RMS is opposed to such a thing there is no architectural reason for it not to happen, just little to no support for anything not GPLed. ESR has no idea what he is talking about. It is true that GCC's architecture dates to the 80s and so isn't modular in the same way LLVM is, but that is hardly due to some sort of policy-embodied-in-code.
As well, RMS no longer controls gcc, and has not since the late 90s when I engineered a fork of gcc from the FSF and into the hands of an independent steering committee. At the time such an idea was radical...thankfully it is now commonplace.
Thank you for the clarifications, this is really quite interesting. Compiler development is fascinating, and I love these sorts of threads because I always learn something new about the technology/culture/history.
> AFAIK the GPL isn't really relevant unless you're modifying and redistributing GCC itself.
A lot of interesting LLVM use-cases are all about that, adding custom frontends or backends used in software that is distributed. Some random examples:
And how many of those were/are commercial or similar? The point is that GCC requires you to adhere to the GPL, which may not be desirable.
The Island platform that I linked to is one such commercial example, made and sold by RemObjects[1]. OpenCL support in graphics drivers is another example.
It is fairly easy to work with and modify LLVM tools, an optimization pass for instance. Also, LLVM tools were designed to be interchangeable. One can pick specific parts of LLVM and integrate them into other workflows.
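As a rough picture of what "interchangeable passes" means, here is a Python sketch in which each pass is just a function from IR to IR, so a pipeline is a list you can reorder or trim. This is an assumption-laden toy: real LLVM passes are written in C++ against LLVM's pass infrastructure, and none of the names below (`strength_reduce`, `eliminate_dead_code`, `run_pipeline`) exist in LLVM. It also assumes each name is assigned only once.

```python
# Toy sketch of composable passes; not LLVM's pass manager or API.

def strength_reduce(ir):
    """Rewrite x * 2 as x + x."""
    return [(d, "add", a, a) if op == "mul" and b == 2 else (d, op, a, b)
            for d, op, a, b in ir]

def eliminate_dead_code(ir, live_out=("result",)):
    """Drop instructions whose result is never used, scanning backwards.

    Assumes every destination name is assigned exactly once.
    """
    live, kept = set(live_out), []
    for d, op, a, b in reversed(ir):
        if d in live:
            kept.append((d, op, a, b))
            live.update(x for x in (a, b) if isinstance(x, str))
    return kept[::-1]

def run_pipeline(ir, passes):
    """Apply each pass in order; any subset or ordering is allowed."""
    for p in passes:
        ir = p(ir)
    return ir

program = [
    ("t0", "mul", "x", 2),
    ("unused", "add", "x", 1),
    ("result", "add", "t0", "y"),
]
print(run_pipeline(program, [strength_reduce, eliminate_dead_code]))
print(run_pipeline(program, [eliminate_dead_code]))
```

Because every pass takes and returns the same IR shape, a tool that only wants dead code elimination can use that one pass and ignore the rest.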
Distribution is a common use case. Not as much for a compiler, but for other things. My company spends a lot of time (money) ensuring we can ship the right source code when asked, even though paying us $5 for it is silly when you can get it from the internet for free.
I'd say nobody has ever asked, but that isn't true. One person in the test group actually read the entire EULA and sent us $5 to get the source code. (I found out when his letter was returned to sender. We officially went through the entire release process just to fix the address, at great expense; even though the product was still internal and not released, legal demanded that we show good faith in correcting the problem.) Testers like that are worth far more than anyone pays them.
You don't technically have to disclose internal mods to all employees because everyone is part of a single legal entity and no binary distribution occurs.
Perhaps but then you will kick yourself the first time you want to give a customer or business partner access to your modifications. Unforeseen business needs or opportunities can always make you need to distribute a tool you previously thought would only be internal. I'm not saying this is a slam-dunk reason to not use GPL tools, but I'm just pointing out that there is a legitimate worry here that is not fully alleviated by your point.
- As somebody else mentioned, Apple redistributes developer tools, clang being the poster child
- Since they release OS products, they don't want to co-mingle their software with GPL code. (So they use an older bash on Mac OS X.)
- fear of an Apple developer quietly copying GPL source into a commercial product (well-founded, actually)
- Apple Legal exerting an "abundance of caution" on IP
- at this point, it's institutional. When I worked there, Linux and MySQL were forbidden, for example, but that has relaxed recently.
Also, I think you misunderstand the GPL. If you distribute modified gcc, anybody receiving it can ask for sources. So employees plus end-users.
(One of the strangest examples is that Yamaha uses real-time Linux in their synths, and you can download the GPL portions. I can't imagine a musician ever wanting to do that!)
Specifically GPLv3. One reason, I think, is the tivoization clause. Users must be able to modify GPLv3 software and then run the modified version. If they included a GPLv3 bash into iOS, iOS users must be able to modify bash and use the modified one instead.
Your synth example is a good one actually. As a synth owner, I'd probably love to be able to replace the software on it with modified versions from the Internet or modify it myself. Linux is GPLv2, though.
How that morphed into disallowing GCC, idk. Maybe they want to prohibit users from installing their own compilers at some point?
Others have mentioned the patent clause. That one seems reasonable as well. In fact LLVM uses the Apache 2.0 license which also has a patent grant, albeit with a smaller scope. Apple could probably file a patent for a feature, then get a university department to implement that feature, then sue other LLVM users (like Sony). With the GPLv3 that loophole does not exist.
Which I think is reasonable - I do think GNU went too far with the version 3 licenses. I think Rob Landley's perspective is interesting: he was involved with Busybox and the legal action against companies violating the GPL with that software, but later created a more permissively licensed alternative to Busybox because he felt that the whole GPL legal action exercise had been counter-productive for open source software in general (the net effect was not to encourage many companies to contribute back, but instead to make many companies avoid GPL-licensed software altogether).
Isn't Apple clang open source anyway? So what is really the point?
Even distributing GPL software isn't a big deal; nowadays even Microsoft does that, shipping an entire Linux distribution in Windows!
There are no technical motivations for not using GPL software; you can do that as long as you respect the GPL license (i.e. release the modified source code).
I think that what Apple does is more a policy to go against the FOSS community for political reasons than anything else, and to me that is bad, in a world where even Microsoft is now opening up a lot to the open source world.
> I think that what Apple does is more a policy to go against the FOSS community for political reasons than anything else
This is the real reason why FAANGs push for non-GPL licenses.
GPL's end goal is to build a community where developers, testers, power users and regular users connect with each other and share knowledge, not just code.
FAANGs want to wedge themselves in as the middleman between developers and end users. They view such a community as a threat.
> Isn't Apple clang open source anyway? So what is really the point?
The best explanation I have seen is the speculation that Apple's software patents are seen as a critical part of Apple's business model and competitive strategy, especially 10 years ago when their anti-GPL stance was formed. The GPLv3 patent clause adds risk, especially the patent agreement clause, and if you intend to spend millions on software patent lawsuits then staying away from GPLv3 looks much more reasonable, especially if you ask the patent lawyers.
Presumably you mean that Apple specifically avoids GPL3 then, because bash was never distributed under anything other than the GPL to my knowledge. Bash moved from GPL2 to GPL3 though.
I don't think that was the original motivation, and Clang also introduced other features that GCC didn't / doesn't have: good error messages, cross compilation without having to recompile the compiler, library interface, etc.
Pretty much the sole point of Clang/LLVM to the corporate sponsors is to get GCC, but without the GPL.