The main reason is that the optimizer gives up trying to figure out the flow of control when half the expressions can throw and so present a path to the catch blocks. Furthermore, all of this inhibits en-registering variables, because exception unwinding doesn't restore registers.
If you want your code to be fast, use 'nothrow' everywhere.
I don't know about newer versions of Clang, but I recall Chandler Carruth mentioning that LLVM abandons much optimization across EH blocks as infeasible.
Java JITs are considerably more advanced than this. It's expensive to put in all the additional control flow edges to model exceptions, but once done, control and data flow analyses just work, as well as loop optimizations, code motion, inlining, register allocation--all of it "just works" (TM). Then you have to spit out a metric crapton of metadata to allow searching for the correct handler at runtime, but that's the slow path.
All of that is to say that Java try blocks do not have any direct dynamic cost when exceptions are not thrown.
Even throwing exceptions and catching them locally in Java can be fast. If everything gets inlined into one compilation unit and the exception object is escape analyzed, HotSpot will absolutely just emit a jump.
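To make the locally-caught-throw case concrete, here's a minimal sketch of the pattern being described. All names here are made up for illustration; whether HotSpot actually reduces a given instance to a plain branch depends on inlining heuristics and escape analysis, so treat the comments as a description of what *can* happen, not a guarantee.

```java
public class LocalThrow {
    static class ParseError extends RuntimeException {
        ParseError() {
            // writableStackTrace=false skips stack-trace capture, which is
            // usually the dominant cost of constructing an exception.
            super(null, null, false, false);
        }
    }

    public static int parse(String s) {
        if (s.isEmpty()) throw new ParseError();
        return s.length();
    }

    public static int parseOrDefault(String s, int dflt) {
        try {
            return parse(s);      // once parse() is inlined, the throw is
                                  // just local control flow in one unit
        } catch (ParseError e) {  // e never escapes, so it can be
            return dflt;          // escape-analyzed away entirely
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("abc", -1)); // 3
        System.out.println(parseOrDefault("", -1));    // -1
    }
}
```

In the good case, the throw/catch pair compiles down to roughly the same code as an explicit `if`/`return`.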
Not that I disagree with your overall point--Virgil just doesn't have exceptions--but from your description here it sounds like your compiler is far behind the state of the art in optimizing exception-heavy code.
Java has a far, far more restricted view of exceptions than C++ and D have,[1] and hence more opportunities for optimization. I did implement exceptions in the JavaScript compiler I wrote 20 years ago, and they were a cakewalk compared to C++. I also implemented a native Java compiler, including EH.
As for clang, see what Chandler said. But maybe things have changed in the last couple years.
[1] for example, Java doesn't have objects on the stack that need their destructors run. That's a massive simplification.
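To illustrate the footnote: in Java, cleanup on unwind is explicit (finally / try-with-resources), so the unwinder only has to transfer control; it never has to locate stack objects and run their destructors the way a C++ unwinder does. A minimal sketch (names are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class ExplicitCleanup {
    // The finally block is the *only* cleanup the compiler must model;
    // there are no implicit destructor calls hiding on the unwind path.
    public static void work(boolean fail, List<String> log) {
        log.add("acquire");
        try {
            if (fail) throw new IllegalStateException("boom");
            log.add("use");
        } finally {
            log.add("release");
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        try {
            work(true, log);
        } catch (IllegalStateException e) {
            // exception propagated; cleanup already ran
        }
        System.out.println(log); // [acquire, release]
    }
}
```

In C++, every local with a nontrivial destructor adds an implicit cleanup action to every potentially-throwing expression in scope, which is a big part of what the EH machinery has to track.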
I don't know the internals of clang very well, but everything I have heard second hand (and third hand) makes me think that its approach to modeling exceptions isn't very good.
I actually agree that they technically aren't zero cost (notice I didn't even write that), but the cost is indirect. I've worked on a number of Java JITs, and in practice not a lot of hot code has catch blocks, and even when it does, inlining is typically so deep that lots of exception edges (e.g. those arising from a possible NPE) get optimized away.
Most of the lost optimization opportunities are second-order costs, not first-order costs. Java JITs make up for the extra flow edges by focusing more on global optimizations rather than local (e.g. GVN vs LVN, global code motion, global flow-sensitive load/store elimination), etc. Generally a possible exception edge splitting a basic block doesn't hurt because the non-exceptional control flow will still benefit from flow-sensitive optimizations (i.e. it has only one predecessor anyway).
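To make the NPE-edge point concrete, here's a sketch (all names invented). Every field access through a reference carries an implicit exception edge; the claim above is that once the producer is inlined, the JIT sees the allocation, proves the reference non-null, and deletes the edge. That's not observable at the source level, so this only illustrates where the edges sit:

```java
public class NpeEdges {
    static class Point {
        int x;
        Point(int x) { this.x = x; }
    }

    static Point make(int x) { return new Point(x); }

    public static int sum(int n) {
        int s = 0;
        for (int i = 0; i < n; i++) {
            Point p = make(i);
            // p.x carries an implicit NPE edge in the IR; after make() is
            // inlined, p is provably non-null and the edge disappears
            // (and with escape analysis, the allocation can go too).
            s += p.x;
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sum(5)); // 10
    }
}
```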
We're splitting hairs anyway. Like I said, Java JITs are significantly more advanced at optimizing exception-heavy code. I'd be really surprised if you saw anything more than a 1% increase in performance. Actually, no, scratch that: I doubt you could even reliably measure a speedup distinguishable from noise from just disabling all support for exceptions in most Java code, unless you are talking about metadata. Top-tier JITs really are that good.
Not sure what you mean here, but Java JITs generally don't use callee-saved registers at all because they need precise stack maps for GC. So whatever small amount of performance they might lose here isn't due to exceptions.
OpenJDK's (HotSpot) stack maps do support callee-saved registers -- and they are used in some special cases, like safepoint stubs (that spill all registers at a poll-point if a safepoint is triggered) -- but you're correct that they've been removed from ordinary Java calls altogether on all platforms, now.
Allocating local variables into registers rather than assigning them stack locations. Registers are faster than memory. EH unwinders restore the stack before jumping to the catch block, but not the register contents.
Stack maps wouldn't be necessary for non-pointers, like an integer variable. Stack maps also have their own performance problems, which is why D doesn't use them.
Inside a single physical frame, which can contain many inlined Java methods (the current default is inlining up to a depth of 15, not counting "trivial" methods, which are always inlined), locals are always in registers (unless registers are exhausted), and are spilled only at safepoints, e.g. when another physical call is made, which is where all languages have to spill, too. Stack maps include only pointers, incur no runtime overhead on the fast path, and are not used when throwing exceptions, even out of the current physical frame. There's additional debug-info metadata associated with compiled code, which also incurs no runtime overhead on fast paths; it maps even primitives to their registers/spill locations, and maps compiled code locations back to their logical, VM-bytecode source locations ("de-inlining"). That debug info is consulted in the creation of the exception when a stack trace is requested.
The term used most often for this is "spilling". I figured this is what you meant by "deregistering" but I wasn't sure, so I didn't want to assume.
> Registers are faster than memory. EH unwinders restore the stack before jumping to the catch block, but not the register contents.
I get that, which is why Java JITs don't use callee-saved registers. I mean, they use all the physical registers, of course, but their calling convention does not have callee-saved registers.
But what's a variable, really? After SSA renaming, optimization, SSA deconstruction, then liveness analysis, coalescing, and finally live-range splitting, variables are history and the register allocator is only dealing with live ranges, typically.
I thought the not-thrown case was pretty much zero-cost, at least in newer versions of Clang... How much of a slow down are we talking here?