Intel C/C++ compilers complete adoption of LLVM (intel.com)
183 points by pella on Aug 12, 2021 | 163 comments



I'm not sure I love this. I mean, LLVM is awesome and everything, but software monocultures are pretty bad. It's not like the situation is dire yet (given that there are also GCC and MSVC), but you can imagine a situation in a couple of years where Microsoft goes "yeah, we're also gonna go with LLVM now" and GCC also starts to fade away. It's a bit concerning.

Also: if icc is going to go with LLVM as a backend, then what is the point of using icc at all? Why not just use clang?


> Also: if icc is going to go with LLVM as a backend, then what is the point of using icc at all? Why not just use clang?

From the blog post:

"Not all our optimization techniques get upstreamed—sometimes because they are too new, sometimes because they are very specific for Intel architecture. This is to be expected and is consistent with other compilers that have adopted LLVM."

I don't see any good technical reason in this marketing language though, tbh:

- "Our optimizations are too new" -> So put them behind a feature flag when upstreaming (see the sketch after this list)? And why inflict them on icx customers if they are "too new"? What does that even mean?

- "Our optimizations are architecture specific" -> ...yes? Emitting good architecture-specific code is the whole point of an optimizing compiler? How is that an argument against upstreaming?

- "Other compilers also don't upstream everything" -> That's a non-argument.
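(For reference, gating experimental work behind an opt-in flag is a well-trodden pattern in LLVM itself, via llvm::cl::opt. A minimal sketch of what upstreaming "too new" work could look like; the pass and flag names here are invented for illustration:)

    // Hypothetical example: an experimental pass that is a no-op unless
    // explicitly enabled, using LLVM's real cl::opt command-line machinery.
    #include "llvm/IR/PassManager.h"
    #include "llvm/Support/CommandLine.h"

    using namespace llvm;

    static cl::opt<bool> EnableShinyOpt(
        "enable-shiny-opt", cl::init(false),
        cl::desc("Enable the experimental 'shiny' optimization"));

    struct ShinyOptPass : PassInfoMixin<ShinyOptPass> {
      PreservedAnalyses run(Function &F, FunctionAnalysisManager &) {
        if (!EnableShinyOpt)
          return PreservedAnalyses::all(); // off by default: touch nothing
        // ... the "too new" transformation would live here ...
        return PreservedAnalyses::none();
      }
    };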

I get it, there's no money to be made in just implementing all your optimizations in clang directly. But these reasons (together with the repeated emphasis on how they do contribute to LLVM) seem silly. To be clear, I'm all for Intel helping improve LLVM, but the technical arguments for maintaining a separate commercial version don't convince me at all.


What that sentence actually means is "we don't want to help our competition (AMD, NVIDIA), but we don't want to reinvent the wheel either".

The API and fundamental part of LLVM is its IR, LLVM-IR, which is where most optimizations happen.

From this point-of-view, LLVM is a "platform", and an extremely brittle one: every release has breaking changes to the IR, the IR is constantly evolved to support new hardware and new optimizations, etc.

When you build a tool on top of LLVM, you are buying into this rapidly changing platform.
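To make "buying in" concrete, downstream tools drive LLVM through its C++ API, roughly like the sketch below (written against a recent LLVM; it's exactly these headers and classes that churn between releases, with no stability promise):

    // Minimal IR-generation sketch: build int double_it(int x) { return x + x; }
    // and print the textual IR. Every type here is LLVM's C++ API surface.
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/raw_ostream.h"

    int main() {
      llvm::LLVMContext Ctx;
      llvm::Module M("demo", Ctx);
      llvm::IRBuilder<> B(Ctx);

      auto *FT = llvm::FunctionType::get(B.getInt32Ty(), {B.getInt32Ty()}, false);
      auto *F = llvm::Function::Create(FT, llvm::Function::ExternalLinkage,
                                       "double_it", M);
      B.SetInsertPoint(llvm::BasicBlock::Create(Ctx, "entry", F));
      B.CreateRet(B.CreateAdd(F->getArg(0), F->getArg(0)));

      M.print(llvm::outs(), nullptr); // dump the textual IR
      return 0;
    }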

The only proven way of using the LLVM platform competitively is to be part of its future: follow upstream closely, upstream most of your code, and actively participate in the platform evolution so that your competitors can't turn it against you.

If you keep most of your code private, following upstream gets very hard. You skip one release, and then it's 10x harder, so you skip another release and stop participating in LLVM's evolution, because you'll have to wait years for changes to upstream to land in your compiler. Your competitors do what's best for them, and those can be things that are bad for you, and then you are proper screwed, because you can't migrate away from LLVM either.

Companies like Intel and NVIDIA do this, e.g., nvcc and ISPC are stuck on LLVM 7 (~4 years old), but these companies have built huge technology stacks like CUDA or DPC++ on top of it!

Intel and NVIDIA might have enough manpower to maintain their own outdated fork of LLVM forever, but at some point it just stops being LLVM, and these companies are not really much better off than where they started.

IMO, building all your technology on top of a platform that either is or can be under your competitors' control is just a really bad idea.

One would hope that these companies would realize this and contribute back and help the community as much as possible, but in practice they just don't. By the time they realize it, it's already too late.


ISPC is not stuck on LLVM 7. Only fails on 12 currently, works on 11 and earlier:

https://github.com/freebsd/freebsd-ports/blob/85cccf4f15c42d...

But hopefully that will be fixed soon.


+1, thanks for the correction, I should have verified this. Last time I used ISPC it was using an extremely outdated LLVM version (5 or so), and it took years to move it to LLVM 7. I wasn't aware that they have been slowly migrating to newer LLVM versions.


I think it's even simpler than that. They don't want to reveal any of their secret sauce through available source, be it tipping off competitors to hardware designs or just revealing compiler optimizations they believe are valuable enough to keep as trade secrets. Why would they make AMD's job easier?

Intel's C compiler has always played games with non-intel x86 architectures compared to their own.


Same thing with C++Builder, whose last version from this year is still based on LLVM 5, I think. Even Apple's clang is behind LLVM upstream, if I'm not mistaken.

The joy of permissive licenses!


Copyleft licenses have no way of forcing anyone to follow upstreams more closely.

Android is a great example. Typical Qualcomm/Mediatek/etc. behavior: take a "stable" kernel, stuff it with custom undocumented drivers and junk, be stuck on that version forever. The only thing the GPL changes is the vendor posts a source dump of dubious usefulness in some obscure section of their website.


> Android is a great example. Typical Qualcomm/Mediatek/etc. behavior: take a "stable" kernel, stuff it with custom undocumented drivers and junk, be stuck on that version forever.

I don't think that's such a great example in this context.

Qualcomm, Mediatek, etc. are hardware vendors after all. Software to them is a necessary evil, not a raison d'être or complementary tool (as it is for NVIDIA and Intel). Their customers fall in the same category - smartphone manufacturers want to sell units, not keep software up-to-date.


Intel and NVIDIA make their money selling hardware.


> "NVIDIA is a software-defined company today," Huang said, "with rich software content like GeForce NOW, NVIDIA virtual workstation in the cloud, NVIDIA AI, and NVIDIA Drive that will add recurring software revenue to our business model."

https://www.msn.com/en-us/money/technologyinvesting/nvidia-i...

It's a very similar story for Intel. NVIDIA and Intel are selling hardware because of their software portfolio. How many GPUs would NVIDIA sell in the HPC market if it wasn't for CUDA and the various support libraries around it?

Intel's non-client solution revenue was over 40% of their total revenue in 2020. This includes their (software) AI solutions, applications, licence business and services. So Intel, too, makes a significant amount of money from software and services around their software ecosystem.


> How many GPUs would NVIDIA sell in the HPC market if it wasn't for CUDA and the various support libraries around it?

We don't have to hypothesize about this. The HPC GPU market has multiple vendors, just look at how many HPC GPUs AMD or Intel are selling: AMD and Intel have ~0.8% or so market share. NVIDIA has >99%.

Pretty much every review of HPC GPUs states that AMD GPUs are both faster and cheaper.

So how come they don't sell?

The answer is software.


Yes, that's true: the licence alone is not enough.

That said, this "junk" dump can still be used by the community of users to upgrade the software on that hardware themselves, so that's still a nice improvement.

Back to LLVM: the question is whether these companies decide not to contribute upstream because they don't bother to make clean patches or because they want to keep it for themselves.

I'd argue that they would be much better off long-term making clean patches anyway, so that's not a valid reason for not contributing.

And even if they just dumped their patches, the community could still take them and incorporate the nice optimizations into upstream.


Apple's is up to 6 months old at release, so currently around clang 10/11. Every so often it gets two releases behind, as both are on 6-month cycles. It's not bad really, and currently Clang 12 brings few compelling reasons to upgrade. The issue will be Clang 13, which is supposed to have a bunch of C++20 things, I think, and it will also be one of those releases that sits for a long time before Apple ships their next one.


>The only proven way of using the LLVM platform competitively is to be part of its future: follow upstream closely, upstream most of your code, and actively participate in the platform evolution so that your competitors can't turn it against you.

>If you keep most of your code private, following upstream gets very hard. You skip one release, and then it's 10x harder, so you skip another release and stop participating in LLVM's evolution, because you'll have to wait years for changes to upstream to land in your compiler. Your competitors do what's best for them, and those can be things that are bad for you, and then you are proper screwed, because you can't migrate away from LLVM either.

This is an interesting take. I've heard the claim that GCC kept its codebase cryptic to prevent companies running home with it and not upstreaming changes, maybe that's LLVM's strategy.


This is not really a strategy, but rather a consequence of LLVM having hundreds of developers, most of whom use LLVM upstream, and therefore don't care about maintaining an API compatibility that they themselves do not need.

It turns out that it is impossible to convince people working on LLVM in their free time, or academics, to invest part of their time in "preserving" API compatibility instead of adding new features, fixing bugs, or improving perf, for the benefit of companies that don't want to contribute their improvements to the community.


I like this strategy of encouraging contributions back to open source: be huge and foundational, and move so fast that the cost of forking is almost never worth it except in extreme circumstances. This feels more like voluntary cooperation than being smacked in the face with ~~the book~~ a license. Though I suppose it wouldn't always work if you're not a big project with lots of contributions.


Besides C++ Builder, already referred to in a sibling comment, that is how you get stability of the bitcode for watchOS apps.


You over-estimate how much LLVM might want Intel's code dumps.

Based on (minor) personal experience of gcc forks, it's not unusual to make some fairly significant change to some major data structure to make your CPU work better, but which would break several other backends.

There may not be any nice way of integrating these changes in a way which would make them acceptable upstream, without many months of work refactoring huge chunks of the compiler (which would still also need lots of work on all those other architectures, to make them compatible with your changes).


> sometimes because they are very specific for Intel architecture

Like the famous check-for-intel-model-instead-of-feature-flag optimization? [0]

[0] https://www.agner.org/optimize/blog/read.php?i=49#49
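(For anyone who hasn't read that thread: the complaint is about dispatching on the CPUID vendor string instead of on feature bits. A hedged sketch of the two styles, using GCC/Clang builtins rather than Intel's actual runtime code, which isn't public:)

    #include <cstdio>

    int main() {
      // Feature-based dispatch: takes the fast path on any x86 vendor
      // that actually implements the extension.
      if (__builtin_cpu_supports("avx2"))
        std::puts("use the AVX2 kernel");
      else
        std::puts("use the scalar fallback");

      // Vendor-based dispatch, roughly what Agner Fog documented: the fast
      // path is reserved for CPUs reporting "GenuineIntel", so equally
      // capable AMD parts get the slow path regardless of feature bits.
      if (__builtin_cpu_is("intel"))
        std::puts("vendor check passed");
      return 0;
    }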


I never understood why Intel sold their compilers anyway. The revenue from compilers must be a tiny drop in Intel's revenue bucket. And if you're fine tuning your compiler to work best with your architecture it seems like you'd want to be giving that away to further promote sales of your chips.

Now that they're using LLVM they should just upstream everything to the LLVM project and quit selling icc.


I agree they should upstream everything to gcc/clang to make their processors smoke the competition, but always assumed icc being a paid-for product came with additional support and access to Intel's expertise in using their tools and processors.


"I get it, there's no money to be made in just implementing all your optimizations in clang directly."

Sounds really shortsighted. I would imagine revenue from Intel dev tools is a rounding error in Intel's complete revenue stream?


It's a marketing tool to make their competitors look bad. That it is sometimes useful is a secondary side effect.


The Intel post didn't say "Other compilers also don't upstream everything". So you shouldn't put that in quotes.

However, that is an actual argument. This isn't GCC where that would be required. This is LLVM where that is allowed. This is a choice, Intel's choice, because LLVM allows that, and it's probably at the corner case level of an Intel-specific icc-specific feature, as in, not particularly generally useful.

There is a benefit of this for normal LLVM development as well. It means that Intel is responsible for maintaining it and normal LLVM developers aren't. If I do something that breaks something in the LLVM AMD GPU backend, that's on me. If I do something that breaks something in Intel's code, that's on them.


> I get it, there's no money to be made in just implementing all your optimizations in clang directly.

ICC Classic is legacy/dead and also free now. The LLVM-based DPC++ compilers are free as well. These are only making money indirectly and through support contracts.


Yeah, but if you give all the golden eggs away, others can offer support contracts as well.


They can, but who knows more about the issue in question, Intel who designed it, or a third party? This is also a curse for Intel though: they need to ensure their people actually know something; level-one support with a bad script and no access to anyone with more ability will get caught out quickly when a third party offers better support and can generally figure things out.

There are pros and cons for sure. I'm not sure what is best.


>> I'm all for Intel helping improve LLVM, but the technical arguments for maintaining a separate commercial version don't convince me at all.

This is exactly why GCC is GPL and why RMS didn't want to make it more modular. Taken to the extreme, we could end up with proprietary hardware that requires a proprietary (closed) compiler even though it's built on open source. Going back to "trusting trust", things might not be so good, and we know Intel is happy to build untrustworthy chips.


What does “trusting trust” have to do with this? That problem is a bootstrapping problem when trying to establish a trusted base. Simple summary is: if the first compiler was backdoored to detect when it is compiling other versions of itself and then inserting the same backdoor in the assembly of the target then how can you trust anything compiled by it?


You're right, the "trusting trust" compiler hack is way more complicated.


LLVM is a somewhat modular system. If Microsoft decided to build their own MSVC front end and AMD decided to build their own AMD64 back end, and Intel decided to build their own special set of x64 optimisers, is it really a monoculture? That sounds more like the GNU form of real software diversity—where you can make your own perfect tool by assembling it from parts.


> and AMD decided to build their own AMD64 back end

They kind of already did. And it's on LLVM, too.

https://developer.amd.com/amd-aocc/


Because not everything in icc is going to be upstreamed to clang, thanks to the license.

It is not like upstream clang enjoys C++ Builder's RAD abilities, the PS4 and PS5 optimizations, or the bitcode format used by watchOS.


Yeah, I guess LLVM is preferred mostly because of the license.

Companies can keep their optimizations secret. Just like why PlayStation's OS is based on BSD.


LLVM is also preferred because it has much cleaner internals than gcc (although gcc has gotten hugely better over the years). And of course, all production compilers become monumentally complex over the years (I once worked with Open64 - gah).


It's nearly impossible to keep optimizations secret. They are the easiest thing in the world to reverse engineer: just look at the output assembly. (And you can't exactly output obfuscated assembly because that would make the performance worse, nullifying the benefits of the optimizations.)


Though in a practical sense, if you've done the work of implementing useful optimisations and keep them closed source, it's not necessarily a trivial matter for others to recognise and replicate how exactly your optimisation is implemented, when it can be used, when it can't be, etc.


I don't get it.

Looking at output assembly language won't tell me anything about the algorithms behind solving the register allocation problem (aka the liveness problem, which is a knapsack and/or graph-coloring problem, IIRC).

I'd be able to see that yes, compilers are good at deciding which registers should hold which data. But that's not sufficient at actually learning how the algorithm / register selection process works.
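(To illustrate the gap between output and algorithm, here's a toy greedy colorer over an interference graph. A production allocator layers spill-cost heuristics, coalescing, live-range splitting etc. on top, and it's precisely those choices you can't read back out of the emitted assembly:)

    #include <iostream>
    #include <map>
    #include <set>
    #include <string>

    int main() {
      // Interference graph: an edge means two values are live at the same
      // time, so they cannot share a register.
      std::map<std::string, std::set<std::string>> graph = {
          {"a", {"b", "c"}}, {"b", {"a", "c"}},
          {"c", {"a", "b", "d"}}, {"d", {"c"}}};
      const int numRegs = 2; // pretend the machine has r0 and r1

      std::map<std::string, int> color;
      for (const auto &[value, neighbors] : graph) {
        std::set<int> taken;
        for (const auto &n : neighbors)
          if (color.count(n))
            taken.insert(color.at(n));
        int r = 0;
        while (taken.count(r))
          ++r; // lowest color not used by an already-colored neighbor
        color[value] = r;
        if (r < numRegs)
          std::cout << value << " -> r" << r << '\n';
        else
          std::cout << value << " -> spill\n"; // ran out of registers
      }
      return 0;
    }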


That is for the simpler optimizations. A more complicated optimization algorithm can generate quite different code depending on the inputs, so you might need many different input programs to fully explore its internals, similar to the problems faced by fuzzing and/or symbolic execution.


I don’t think the license meaningfully changes anything. For example: rustc has a WIP GCC backend, despite not being GPL. I imagine icc could do something similar


>Also: if icc is going to go with LLVM as a backend, then what is the point of using icc at all? Why not just use clang?

I assumed the icc secret sauce was in the backend: deep knowledge of how each and every single operation is implemented in every uarch.

Maybe they've written their own x86 backend for LLVM, and are using the rest of LLVM for IR->IR transformations and vectorization?


Exactly, many comments here are saying the difference is in LLVM IR ("middle-end") optimizations but IMHO more secrets are hidden in the backend. For example the latency of different instructions (and their operands) which has huge impact on the overall scheduling.


I like to think of LLVM as a piece of infrastructure, like neutral fiber wiring that any ISP is allowed to use. From that perspective, I'm a lot less concerned about having few options there than I am about browsers.

Although the diversity of C/C++ compilers is declining, the number of compiler backends in general is still pretty high (Go, .NET compilers & runtimes, Java compilers & runtimes, JS engines, etc.), and LLVM can't fill all the niches. I don't think that we're at a point where research is stifled.


Everyone says software monocultures are bad, without evidence. In fact, there seems to be evidence against this: the Linux kernel.


The Linux kernel isn't a monoculture yet. For example, there is no indication Apple or the BSDs are going to stop developing their kernels, nor is there any indication Windows is dropping NT, and most anything can be (and is) easily moved between them (particularly the *NIX kernels). That's not to say there aren't certain places where the only choice is the Linux kernel, but that alone isn't enough to make it a monoculture in the sense of nearing domination of the space.

Browsers are the most oft cited example of monoculture issues be it the old days with IE or the new days with Chrome, I'm surprised you haven't run across comments on that one over the years.


Linux powers most of the internet, most mobile devices, and most embedded and IoT devices. Maybe it's not a complete monopoly, but it's close.


Except it doesn't, go check termux's future on Android: its developers refuse to adopt the Java APIs, which are what is actually public on Android (Linux kernel syscalls aren't a public API).

https://developer.android.com/ndk/guides/stable_apis


> Also: if icc is going to go with LLVM as a backend, then what is the point of using icc at all?

They're not upstreaming all their optimisations. I'd be surprised if they upstreamed all their FPGA support as well.


> They're not upstreaming all their optimisations

That's quite reassuring for users of LLVM on non-Intel hardware, where ICC "optimisations" turn into pessimisations.


The recent CPPCast episode on LFortran might shed some light on that?

https://cppcast.com/lfortran/

In it, the guest talks about how he couldn't rely on LLVM for things like optimizing array operations; LLVM apparently does a poor job of that so he had to implement his own. Given that one of the key selling points of Intel's compiler is that it does a better job with SIMD optimizations, it may be exactly the same story here.


Without listening to that podcast, I'm gonna go out on a limb and say this is a slightly Fortran-specific problem (or at least, not a problem encountered for C or C++), namely how to optimize Fortran array expressions.

The LLVM IR doesn't understand array operations, so the Fortran frontend must 'scalarize' the array expressions, that is, turn them into the equivalent loops. Only after that can the LLVM middle and back-ends try to vectorize that scalar IR for execution on SIMD (or vector, or GPU) HW. The problem is that this scalarization loses some information, and thus for good performance on array code the Fortran frontend must implement some optimizations on array operations before scalarizing.
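Rough illustration: the Fortran array expression `C = A + 2.0*B` has to be lowered to something like the loop below (shown in C-style code) before LLVM ever sees it, and the fact that all iterations are independent then has to be rediscovered by the vectorizer:

    #include <cstddef>

    // What scalarizing the Fortran expression C = A + 2.0*B boils down to.
    // LLVM IR has no array operations, only scalars and fixed-width
    // vectors, so this loop is what the middle end actually gets to see.
    void scalarized(double *c, const double *a, const double *b,
                    std::size_t n) {
      for (std::size_t i = 0; i < n; ++i)
        c[i] = a[i] + 2.0 * b[i];
    }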

There is an LLVM subproject called MLIR (https://mlir.llvm.org/) that aims to build a higher-level IR that understands arrays, and can be useful for things like optimizing deep learning graphs, but also things like Fortran frontends could make use of it. AFAIK the flang Fortran project aims to make use of it, but I haven't followed development that closely.


I'd imagine it'd be hard for MSVC and gcc to fade away --- gcc has some notable architectural differences from llvm (such as having multiple, layered IRs), and on some optimizations gcc is stronger, while MSVC targets Microsoft's own ABI. Unlike other technologies that already feature a monocultural landscape, I guess it's fair to say there's no one, single, best way to design a compiler that accounts for the wide-ranging differences of backend architectures and frontend languages (in addition, when you compare the cost of developing a brand new compiler with, say, developing a new CPU architecture, you'll find the former is much more affordable, and might be reasonable under different circumstances).


Is gcc really starting to fade away? Wasn't Rust working on some backend interop with it? That shiny cool Cosmopolitan thingie was also built on top of it IIRC.

I was also under the impression GCC still outputs better asm than LLVM overall.

Conversely, I recall reading the zig team ran into some woes with the latest LLVM release, and that there's a lot of churn in LLVM APIs from version to version.

(I'm legitimately curious)


Yes, though I think gcc will go first.


GCC will never completely die. MS, though, has been moving in the Clang/LLVM direction for years, so I would have zero surprise if VC++ was completely removed tomorrow.

My feeling is, as soon as they can be sure they have achieved 100% binary compatibility, MS will jump. The optimizers in VC++ are light years behind, and companies like Apple are embarrassing them on the compatibility front with things like Rosetta 2 (which is a much harder problem to solve)


My understanding was that LLVM was made because the codebase of gcc is too complicated. In hindsight that seems to have been a killer move.


That was a long time ago, and the actual primary reason was apparently that Stallman missed an email [1], although I think the intentional technical hurdles to plugins were another of many reasons. Then GNU moved all their software to GPLv3, the business community decided they didn't like that license, and Apple brought LLVM up as a serious contender.

GCC hasn’t remained stagnant though.

[1] https://lwn.net/Articles/632062/


Thanks for the article; I am not sure I fully understand it. Could you point to the section about the missed email? Also, do I understand it right that emacs doesn't integrate well with llvm? (Maybe that explains why in llvm presentations people use the 'other' editor; I always wondered why they don't use emacs.)


> An interesting side note to the debate emerged when Liang Wang posted about LLVM creator Chris Lattner's offer to try to get LLVM's copyright assigned to the FSF back in 2005. It was part of an effort (that seemingly went nowhere) to integrate LLVM and GCC. Stallman never heard about the message:

So basically, if Stallman hadn't missed the message, there's a very real possibility that LLVM would have ended up licensed under GPLv2 and then re-licensed to GPLv3. The engineering world would probably look radically different if that had happened.


That was eons ago, and if nothing else, the Linux kernel will keep GCC around.

Azure Sphere OS is also GCC only, despite Microsoft's newfound love for clang.


There is significant investment in building Linux with Clang so I wouldn't count on Linux keeping GCC relevant forever. Doubly so with the inevitable rustc requirement.


Isn't that the other way around? I thought clang was trying to support all gcc extensions to be able to build the Linux kernel.

I would rather see both of these compilers to stay competitive to push themselves higher.


Kind of both. They've added some gcc-specific extensions to clang, but also removed some gcc-isms from the Linux kernel.


clang has already been building the Linux kernel in the Mountain View dungeons for a couple of years now.

Those changes just don't get to upstream.


That's...not true?

https://www.kernel.org/doc/html/latest/kbuild/llvm.html

I could be missing something, but I don't see any suggestion that you need a specific forked tree with patches to build with LLVM, and I've seen people filing bugs about using LLVM sanitizers to build the vanilla tree, so I don't think the expectation is that you need to apply a huge out of tree patchset for this to work any more?


There's a semi-official GitHub[0] for this.

AFAICT from the issues page, Clang and the binutils/LLVM tools work fine with no patches for the mainstream archs when not trying to be super-fancy with custom flags. The more non-mainstream one goes with arch or flags, the more likely one is to run into something.

[0] https://github.com/ClangBuiltLinux/linux/issues (Note they use github for issues/wiki, not code, so no surprise the 'linux version' in code is oldish).


I follow Android, not Linux itself.

My latest update was that not all patches were accepted upstream, or Google didn't care about upstreaming them, whatever.

There are some Linux Plumbers talks, and some from Linaro, about this from a few years back.


The ongoing efforts to add a Rust frontend to GCC prove otherwise.

No Rust in GCC? No play.


Unlike C/C++, there is one canonical rustc where the language is developed and GCC will likely always play catch up.


It doesn't matter, if only GCC gets to play with the Linux kernel.

Rust is only following in the footsteps of D and Go.


The Android Linux fork has compiled with clang for ages now. Upstream doesn't want them.


Why would the Linux kernel keep GCC around?

I'm fairly confident the FSF would keep it alive for a long time, but I don't see why it would be necessarily be a priority for the Linux devs to keep GCC forever.


The Linux kernel makes use of GCC C, not ISO C, and also targets hardware that LLVM doesn't support.
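(Concretely, think of things like statement expressions and __typeof__, which core kernel macros lean on. A small sketch; it compiles with gcc/g++ and clang as a GNU extension, but is not ISO C/C++:)

    #include <cstdio>

    // Roughly how the kernel's min() macro avoids evaluating its
    // arguments twice: a GNU statement expression plus __typeof__.
    #define my_min(x, y) ({ __typeof__(x) _x = (x); \
                            __typeof__(y) _y = (y); \
                            _x < _y ? _x : _y; })

    int main() {
      int a = 3, b = 7;
      std::printf("%d\n", my_min(a++, b)); // a++ evaluated exactly once
      return 0;
    }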

The Android Linux kernel fork actually builds with clang, and the Linux kernel is yet to accept the changes made by Google.


There are probably lots of MSVC-specific extensions (COM, C++/CLR, some pragma file_hash-ing https://www.reddit.com/r/cpp/comments/ep77ey/cl_ph_or_pragma... that would need to get ported) - just some examples that come to mind.


They could deprecate most of that stuff and let the old compiler still exist for those who need it, much like how they've been handling XP compatibility for the past several years.

But I see no signs of MSVC going away any time soon; Microsoft has a very active and capable compiler team.


But they also had an active and capable browser engine team, didn't they?


I mean GCC kinda picked their path back when they made it hard to be modular. The market has spoken and looks like LLVM made the right call.


> Why not just use clang?

Because clang is painful to use.


GCC still has time to relicense if they want to survive.


Why would GCC relicense? It exists for the license, not the other way around.

The entire point of GCC and the GNU project is to have an entire system with only free software, and the way to achieve that is not by allowing in more proprietary code.

Let GCC die if it has to die, but changing the license would make it meaningless.


One more proprietary compiler suite [1] bites the dust. Hopefully this will mean improvements will flow to the open source upstream, rather than just being a cost-cutting measure for the vendors with all improvements kept proprietary.

[1] A year ago IBM announced they are switching their "XL" compilers over to LLVM: https://developer.ibm.com/components/ibm-power/blogs/c-and-f... . (I haven't followed up what has happened since that announcement.)


It is interesting that ispc is also LLVM-based: https://ispc.github.io/


It's still a little strange to me the whole free compiler thing. I used to pay hundreds of dollars each for, first Turbo C/C++, then Borland C/C++ and Zortech C/C++. I think IBM's C/C++ for OS/2 was the first time the compiler came free with an OS that I had on my computer and I was literally shocked that it included—for free—the GUI for building GUI applications. I guess being old is what makes me perfectly fine with paying for tools like IntelliJ which are a relative bargain.


What’s even stranger is that as a 26 year old this is my first time hearing that compilers used to be paid for :) what a great indication of the FOSS movement’s success


"used to be"

They still are in many cases. Proprietary compilers for embedded devices (MCUs, FPGAs, etc.) are still commonplace. The license for the full Intel compiler suite discussed here isn't free either.


Intel's compiler is free (as in beer) now. It used to be commercial though.


Early Unix systems came with a bundled compiler toolchain. I think it was SunOS that unbundled it first.


Yes, and made plenty of people rush to GCC's implementation effort, which was largely ignored until that moment.


This is part of why I specified “on my computer.” I used to use university/work systems and whether it was paid for separately or as part of the OS, I generally had a variety of compilers available to me. My personal favorite oddity was that IBM had two different Pascal compilers for VM/CMS. One was called Pascal/VS and the other VS/Pascal. One was slightly more capable than the other, but I don't remember which it was or what, precisely the difference was. It never came up in compiling TeX and its related software. I do remember the long process of installing PL/I on the UIC mainframe back in the 80s though.


I'd add that in the early 90s, Linux was very much a fringe OS. I had contemplated running FreeBSD (or was it BSDFree?), but as I only had one computer and getting a second seemed an unimaginable expense, I never did so (I don't think virtual machines on x86 were a viable thing yet either).


It wasn't much of a prize. At school, the Sun C++ compiler, required for the OOP class, was responsible for a lot of people switching out of CS.


Apple has been the longtime LLVM stalwart, starting from when they first brought out Chris Lattner from the University of Illinois. There are whole subsystems which they've driven such as GlobalISel. However with the M1, Apple has moved wholesale over to ARMv8. Consequently, their efforts are with the AArch64 backend (which they also wrote) and not with X86.

Yes, this announcement has to do with the Intel toolchain above the IR layer but I think it signals that they'll be contributing more patches to the X86 backend as well.

The X86 backend is upstream, and LLVM owns it. Indeed, its named code owner works for SiFive. Intel won't be grabbing ownership of it. But I think they'll be contributing more to it. This is a good thing for LLVM.


Apple also kind of removed itself from clang contributions, as their focus is on Objective-C and Swift, with most of their C++ use cases covered by ISO C++14.

Which is one of the reasons why ISO C++20 support in clang is lagging, alongside Google also kind of reducing their involvement.

Concepts support was mostly done by one dev.


This is exactly why I've been concerned about LLVM's use of a non-copyleft license: unlike with GCC, vendors can create their own proprietary extensions and optimizations without contributing them back to the community.

(In the past, people have dismissed such concerns, saying that it would be impractical for anyone to actually do such a thing. But this project by Intel shows that it isn't just possible--in fact, it's almost inevitable.)


> In the past, people have dismissed such concerns

The problem is GCC made these concerns completely irrelevant. You wanted refactoring support for C or C++? Either build it on clang or get screamed at by RMS for leaking GCC internals, when text-based search and replace was enough for him in the early 80s.

The simple fact is that there is no GNU based alternative to clang/llvm.


The “good” news on this front is that RMS has allegedly been removed from GCC’s steering committee as a side-effect of the flames of April. Emphasis on “allegedly”, since some heavyweights say he was never there to begin with.

https://lwn.net/Articles/853230

Somewhere down the thread there are questions about AST, I think.


I think it is relevant to point out that Intel’s compiler team would almost certainly never have chosen LLVM in the first place, if it forced them to release whatever secret sauce is giving them the performance increases we see in those performance plots. So it’s not so much a “look at how much LLVM is missing out on by not using copyleft”, it’s more of a “LLVM has gotten some benefit from Intel adopting their tech and putting some of their work back into the open source project”.

Sure, I’d love for Intel to upstream every optimization they make, but I don’t see that as necessary or even good business sense for either Intel or LLVM.


Why is this worse than totally proprietary & closed source icc?


Leeching on the work that Johnny/Jane dev might have contributed to LLVM as part of their PhD thesis in compiler optimisations.


If Johnny/Jane didn't want their contributions to be used like that, they could have licensed them differently, or contributed to gcc instead. Of course llvm wouldn't accept their contributions if they were licensed under, say, the AGPL. But presumably getting their changes upstreamed is not a condition for completing their thesis, so it's their choice. As a PhD student they are hopefully capable of reading and understanding these licenses (to the degree that someone who hasn't studied law can understand such things).


True, except in this fictional story maybe the condition was set by the professor or university department and was not a free choice of the respective student, who now sees their work being abused by a megacorp too cheap to contribute back.


Sure, in that very specific and somewhat unlikely scenario. Maybe the professor demanded a handjob as well. Or perhaps the university claims ownership over all intellectual property produced by the PhD student as part of their thesis, and they're unable to upstream their source to any project under any license.

In any case, it sounds like you want to blame the llvm license for the bad feelings the student might have, but that blame is misdirected: the student should aim their anger (if any) at their professor or university for introducing unreasonable requirements. Or maybe they should suck it up because without llvm they would have had to pick another subject for their thesis. I suspect that llvm wouldn't have grown to where it is now (for better or for worse) with a copyleft license. And then our student would have been "forced" to contribute to gcc instead, and sign over their copyright to the fsf instead.

Finally I think the academic world is a lot like permissive licenses already. Anyone can read papers, use that information in any way, and there is no obligation to contribute back. Only when writing new papers are you expected to acknowledge your sources.


Nice way to pick the example instead of the actual reason why Intel and others are replacing their C and C++ compilers with clang.

How are those PlayStation 5 optimisations working out?

Guess which compiler is currently lagging in ISO C++ 20 support.


Yes, this is a good reason for people to think twice about using non-copyleft. However, if they choose non-copyleft anyway, it's a bit presumptuous to then say the authors are stupid or misled to choose it.


AMD's own optimizing compiler is also LLVM-based. So we can now wind up with the situation of both compilers adding spiffy optimizations and then never upstreaming them.

It's less of a problem if the two of them don't decide to add their own proprietary "extensions" to the language. That may be something to look out for going forward.


> if the two of them don't decide to add their own proprietary "extensions" to the language.

icc has always had its own dialect of C++, which in practice means that there is "C++" code that only compiles on icc but is rejected by clang++ and g++. With Intel switching to the clang frontend, I would hope that their interpretation of C++ will become more, not less, standard-conforming.


ICC, like many commercial compilers, used the EDG front end, not their own.


From the article it is not clear how the Intel compiler differs from base clang/llvm, except for vague marketing BS ("expected" fp performance?!?).


Proprietary optimisation secret sauce that's not up-streamed most likely.

Basically they cannot go into detail because that would be publishing trade secrets or help the competition. I also suspect that they use the compiler for testing pre-production silicon and the like so those changes won't be publicly discussed either.


Anyone actually use Intel compilers?


A few years back, Intel was striking marketing deals with large game developers that included them giving away free copies of their compiler and support. It's unclear how prevalent that was, though there were a few titles with an Intel splash logo that were part of this.

This is often assumed to be the cause of the difference in performance between AMD and Intel CPUs in games, btw, and while that can definitely come into it, it's not always correct. Sometimes the engines just don't know about the CPUs and treat them like the previous generation (like at Ryzen launch, where some games ignored SMT on those CPUs), sometimes the devs only tested on one arch, and sometimes it's some extra library that makes incorrect assumptions.

Interesting read on the topic with Cyberpunk : https://cookieplmonster.github.io/2020/12/13/cyberpunk-2077-...


Yeah, I used it in my last job (commercial CFD code), and it was widely used by others in our sector. It's nice if you need to support Windows/Linux, as it brings along e.g. MPI, etc., and compiling scientific libs like FFTW on Windows is a PITA, so using MKL, which comes bundled with it, is very easy.


What makes compiling FFTW on Windows so troublesome? That's currently on my to-do list.


Well, most Windows applications you have to actually deploy are compiled with MSVC, and historically most OSS libs didn't use cross-platform build scripts, etc.

FFTW these days I think has a CMakeLists.txt file, and Visual Studio well supports CMake now but it didn’t used to.


They are used in High Performance Computing as far as I know. Not systematically though.


We use the Intel compiler mainly on supercomputers with Intel CPUs. It can produce faster code. However, it is not great to work with. It lacks features of the newer standards and it is really slow. Usually we develop with clang or GCC and then we go on the cluster.


I am sure that plenty of existing HPC clusters will remain Intel; the interesting question is how many of the new HPC computers will be AMD.

Or is there any HPC-specific reason why Intel would still be preferred?


Several new HPCs are already using AMD [1] and that trend will probably continue if AMD continues to put out competitive products.

[1] See the Top 500 list that, by my count, has 26 of the top 100 HPC systems using AMD EPYC CPUs as of their June 2021 listing https://www.top500.org/lists/top500/list/2021/06/


Existing install base most likely. It's not as if HPC clusters get replaced every other year. Even upgrades usually stay with the same platform and don't switch from Intel to AMD or ARM-based systems.


Yes, this is the only place I see alternative compilers used much. We have Intel and Nvidia compilers available to our users.


Pretty much in computer vision / image processing related companies and software products. The performance-related advantage on Intel x86 architectures in particular is not to be neglected.


My understanding (but I'm not an ICC user) is that the majority of the performance improvements come from the vectorized math libraries (which technically you can use from GCC as well).


It defaults to a lower-accuracy floating point model than GCC, so initially people think it's faster, til they realise their results are slightly different and have to mess around with the `fp-model` flag.
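A quick way to see why results drift: a relaxed fp model lets the compiler reassociate reductions, and floating-point addition isn't associative. This sketch simulates by hand the reassociation a vectorizer performs under fast-math-style flags:

    #include <cstdio>

    int main() {
      const int N = 10000000;

      // Strict left-to-right sum: what ISO C/C++ semantics require.
      float seq = 0.0f;
      for (int i = 0; i < N; ++i)
        seq += 0.1f;

      // Four partial sums: the shape a vectorizer produces when the fp
      // model permits reassociation (e.g. under -ffast-math).
      float part[4] = {0.0f, 0.0f, 0.0f, 0.0f};
      for (int i = 0; i < N; ++i)
        part[i % 4] += 0.1f;
      float vec = (part[0] + part[1]) + (part[2] + part[3]);

      std::printf("sequential:   %f\n", seq); // accumulates more rounding error
      std::printf("reassociated: %f\n", vec); // noticeably different result
      return 0;
    }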


Right, of course, IIRC it does the equivalent of -ffast-math by default.


Yes. Very good optimisation (to be expected), but without the insane exploitation of UB that the GCC/Clang crowd seem to think is necessary for optimisation. Instruction scheduling and selection probably makes the biggest difference.


Intel tried to convince us, and I think I still have a todo down on the list to try their compilers out. They gave up their attempts quickly when they discovered we have very little code that would benefit from vectorization.


Lots of people in the HPC and scientific computing sphere. Some of their math capabilities are either much faster or more accurate (or both) than other compilers'.


A lot of game companies use the Intel compiler (anecdotally).


Absolutely sensible move. Time for Microsoft to do the same with MSVC.


I disagree. Competition is good.

Don't want to end up with every C/C++ compiler being a clang skin in the same way as (almost) every browser is a chrome skin.

And for what it's worth cl compiles faster for me than even clang-cl. I like having both available though.


> Don't want to end up with every C/C++ compiler being a clang skin in the same way as (almost) every browser is a chrome skin.

Why not ?

There is one open C++ spec that's really hard to implement, so we have a dozen C++ compilers that are impossible to support properly because they each have their own set of ten thousand different bugs.

The value of supporting all of this is really small in practice, and the cost for everybody involved is huge.

Having a single, e.g., C++ parser with a single set of bugs is a much better value proposition for C++ programmers.


We did have one major C++ compiler on Windows (MSVC) and one major C++ compiler elsewhere (GCC) and they both stagnated in many areas before Clang came along and forced them to advance to stay competitive.


GCC wasn't elsewhere.

Up to 2005 I was using the UNIX vendors' own C++ compilers, meaning aCC, xlC, SunPRO.

Then there were Microchip, ARM, TI,...


There are real benefits to be gained though from new compilers. Look at "Circle" by Sean Baxter as an example. Outside of supporting a powerful form of compile time computation and allowing things such as reflection and writing shaders directly in C++, benchmarks have been floating around showing that it compiles quite a bit faster than GCC and Clang whilst still being able to compile big projects such as Boost. Although LLVM is the backend, the front-end is written from scratch by one guy.

https://www.circle-lang.org/


Clang is lagging behind in C++ 20 support while MSVC is miles ahead. LLVM is also getting slower and slower every release.


So?

I don't see how focusing the available manpower on one implementation instead of splitting it over 10 different implementations would make this worse.

I do see how it would make this much better.

What one implementation lacks, another implementation provides. This is a weakness of the current ecosystem. Most software projects restrict themselves to the minimum common denominator, and splitting manpower across compilers lowers it.


You sound like you haven't worked with many BigCo projects. Clang is turning into an "enterprise" junkyard due to too many cooks spoiling the broth, crumbling so much under the weight of its own complexity that 2022 is almost here and support for C++20 still isn't anywhere near complete (unlike GCC). The more parties that get involved, the worse the code will get. Clang already has more people working on it than MSVC, yet MSVC is iterating faster; this suggests a fundamental problem with the architecture or development processes of Clang, which is not something that will be fixed by more hands on deck (remember the mythical man-month).


Many of the proprietary C++ compilers (Intel included) are just repackaging of the EDG front end. The monoculture has existed for a long time in the C++ world due to the complexity of implementing the language to spec.

Honestly, I think it will be great to have one less set of compiler-specific oddities to worry about.


MSVC does not use EDG for the frontend as far as I'm aware. IntelliSense does, though.


Not anymore, they changed that in the last few years


Funnily, the only reason I ever used ICC was as a linter, since it was the most easily available EDG-based compiler for me and different frontends help find different bugs.


> Competition is good.

On the other hand, collaboration is also good. Why waste time reinventing the wheel?


Because reinventing the wheel is how progress is made. Your car isn't made with wheels from the 17th century.


Who told you reinventing the wheel is how progress is made?

Can we be precise here? I can't recall a major technical advance that was just reinventing the wheel.


To be precise:

1. Just because there is Linux, doesn't mean everyone should jump on the Linux bandwagon and abandon all work on illumos, NT, QNX, Fuchsia etc. Focusing everyone on Linux would kill progress.

2. Just because there is x86 or RISC-V doesn't mean no one should invent new architectures. Apple went with their own, and that's what gives them their edge now.

3. Just because there is already Emacs, doesn't mean that all editors should be Emacs mods.

4. And back to compilers, just because LLVM already has optimizers, doesn't mean that other people shouldn't explore other designs for their backends. Especially that LLVM is really slow, and a major bottleneck for new compilers now (see Rust, Zig and Jai e.g.).


> illumos, NT, QNX, Fuchsia

Who told you these are reinventing the wheel of Linux?

AFAIK, NT was a consumer desktop OS turned into a server OS, meant to serve as the foundation for both consumer and server systems.

QNX was an automotive operating system, for which Linux does not work.

Fuchsia is meant to be a universal mobile OS.

These are not the same thing as Linux.

I did not mean to label these as reinventing the wheel. If I left you with that impression, that was my fault in communication.


So your car wheels are still made of wood with metal roundings?


If modern tires are considered a reinvention of the wooden wheel, sure, then reinventing the wheel is the main form of making progress. But I doubt that's how people think about reinventing the wheel when they use that phrase to mock others for wasting energy inventing something that already has a very-well-working alternative.


So you would drive a chariot on the same places as a four wheel traction drive Jeep?


Sorry, I was saying that reinventing the wheel does not apply to the invention of modern tires...


So why aren't we all using JOVIAL to develop IoT systems?

Apparently it wasn't required to reinvent any other high level language for embedded systems programming; it was already a solved problem in 1960.

Why reinvent the wheel for embedded systems programming?


Idk about JOVIAL, but I certainly wouldn't mind seeing some modern Ada out there.


Wasn't LLVM reinventing the wheel at the time? They could just collaborate to GCC.


They wanted to, but RMS wasn't interested at that time: https://lists.gnu.org/archive/html/emacs-devel/2015-02/msg00...

I have also been told by more than one person that does professional compiler development that GCC's codebase is very difficult to work with. At least some commentary I've read suggests that this was a deliberate choice on the part of the GNU project.


> I have also been told by more than one person that does professional compiler development that GCC's codebase is very difficult to work with.

Add another. (Now-former professional compiler developer.)

> At least some commentary I've read suggests that this was a deliberate choice on the part of the GNU project.

This is the ‘RMS loophole’ in the GPL. In theory you can do what you want with the source; in practice you need help from the insiders.


Sure, I'm not denying any problems with the GCC codebase and/or how RMS handles GNU software.

Just pointing that reinventing the wheel is not necessarily a bad thing or a waste of time.


Maybe if the GCC project hadn't opposed the ability to easily integrate third-party tools, we wouldn't have LLVM to begin with.


> Don't want to end up with every C/C++ compiler being a clang skin

I've wasted too much of my life dealing with compiler differences, and I certainly do want that.


Well, competition is good for consumers but not for the actual market players. And what's the point of competing in the compiler space?


> And what's the point of competing in the compiler space?

To an extent, processor vendors sell CPUs based on how they perform on SPEC{int,fp}. So what matters is not how fast your HW is, but rather how fast the combination of HW + compiler is.


No thanks, I like having MSVC available for fast iteration builds-- clang builds too slowly.


Your wish is granted. MSVC already supports LLVM as an out of the box option.



