Ignition: V8 Interpreter (docs.google.com)
131 points by hittaruki on Aug 10, 2015 | 38 comments




The tweet says "V8 is replacing its baseline JIT with an interpreter, not unlike JSC and SpiderMonkey".


There is an 80-character limit on the title, so I had to cut it short.


See this: http://i.stanford.edu/pub/cstr/reports/cs/tr/94/1520/CS-TR-9... (ADAPTIVE OPTIMIZATION FOR SELF: RECONCILING HIGH PERFORMANCE WITH EXPLORATORY PROGRAMING)

Esp. section 5.3+, starting on physical PDF page 56.


For others wondering whether it's worth clicking: the paper is by https://en.wikipedia.org/wiki/Urs_H%C3%B6lzle and was written in 1994.


Can anyone summarize what this move will mean for performance on the average web app?

Will they be giving up some speed to claw back some memory?


The TL;DR is that non-hot code will run slower (it looks like about 1.5 to 2 times slower), with the benefit of reducing the code space to about 25% of its current size, saving the memory that code would otherwise have used.

"Hot" or optimizable code will still be optimized and run just as fast.


What portion of a "standard" web site (say, Google search) is hot vs. non-hot?


The specifics of this are a bit outside my knowledge area, so take this with a few major grains of salt.

But basically, V8 has heuristics built in that decide when the cost of compiling is worth the likely benefit.

So the AJAX-submission JavaScript will probably never hit the optimizing compiler, but your function inside 3 nested loops will be hit ASAP.
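Roughly, the general mechanism in many VMs is a per-function counter that triggers tier-up once it crosses a threshold. A minimal sketch in C (the names, threshold, and enqueue hook here are all made up for illustration, not V8's actual code):

    /* Toy tier-up heuristic: count invocations and hand hot functions
       to the optimizing compiler. V8's real heuristics are more
       involved (they also weigh things like loop back-edges). */
    #include <stdbool.h>
    #include <stdint.h>

    #define HOT_THRESHOLD 1000  /* illustrative number only */

    typedef struct {
        uint32_t invocation_count;
        bool     optimized;
    } FunctionProfile;

    /* Called by the interpreter on every function entry. */
    static void profile_entry(FunctionProfile *fn) {
        if (!fn->optimized && ++fn->invocation_count >= HOT_THRESHOLD) {
            fn->optimized = true;
            /* enqueue_for_optimization(fn);  -- hypothetical hook; the
               interpreter keeps running the function until optimized
               code is installed. */
        }
    }

    int main(void) {
        FunctionProfile f = {0};
        for (int i = 0; i < 2000; i++) profile_entry(&f);
        return f.optimized ? 0 : 1;  /* hot after 1000 entries */
    }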


Thanks :)


Kind of hilarious that we're worried about code size in browsers these days.


It's not disk size that we care about, it's memory size. Reducing the space that the "parsed" code takes up in memory will reduce the amount of memory used by the engine.


> It's not disk size that we care about, it's memory size.

Yeah, I get it. We have a lot of memory these days. Gigabytes. Not always enough for images and videos, but for code? for client-side browser code that is downloaded & run on the fly? Makes you wonder how we got there.


Using an interpreter instead of a compiler for the lowest JIT tier can reduce startup time. Done well, it can even lead to speedups in common cases.


This might be about RAM-constrained phones; the design doc refers to '--optimize-for-size devices', suggesting it won't (initially?) be on for everyone. Android and Chrome OS already enable Linux zram to get more out of low-memory gadgets. Incidentally, Google announced a (to me very ambitious-sounding) push for $50 Android phones today (http://www.engadget.com/2015/08/10/google-revamps-android-on...) that better low-RAM support could mesh nicely with, though of course it's just a coincidence the dates lined up.


Interesting decision.

The architecture sounds a little unusual:

"The interpreter itself consists of a set of bytecode handler code snippets, each of which handles a specific bytecode and dispatches to the handler for the next bytecode. These bytecode handlers are written in a high level, machine architecture agnostic form of assembly code, as implemented by the RawMachineAssembler class and compiled by Turbofan"

It also seems as if all calls are mediated by code generated by the compiler, which has the advantage of avoiding the awkwardness of different calling conventions between native and bytecode functions (possibly at some cost to performance?).

Fascinating reading. Thanks V8 people for allowing such documents to be public!


The architecture is a threaded interpreter, a fairly old and not particularly unusual interpretation technique.

https://en.wikipedia.org/wiki/Threaded_code

When written in C, this is typically done using GCC's computed goto and the && label address-of operator extension.

Writing the handlers in machine-agnostic assembly is interesting; I'm guessing they want to tune the output more than writing the handlers in C would let them, or they can't rely on something like GCC's computed gotos.
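For reference, the dispatch skeleton looks roughly like this (it uses the GCC/Clang computed-goto extension; a real interpreter would also decode operands and manage a frame, this only shows the dispatch structure):

    #include <stdio.h>

    enum { OP_INC, OP_DEC, OP_PRINT, OP_HALT };

    /* Direct-threaded dispatch: each handler jumps straight to the
       handler for the next bytecode; there is no central loop. */
    static void run(const unsigned char *pc) {
        static void *handlers[] = { &&op_inc, &&op_dec, &&op_print, &&op_halt };
        long acc = 0;

    #define DISPATCH() goto *handlers[*pc++]
        DISPATCH();

    op_inc:   acc++;                 DISPATCH();
    op_dec:   acc--;                 DISPATCH();
    op_print: printf("%ld\n", acc);  DISPATCH();
    op_halt:  return;
    #undef DISPATCH
    }

    int main(void) {
        const unsigned char program[] = { OP_INC, OP_INC, OP_PRINT, OP_HALT };
        run(program);  /* prints 2 */
        return 0;
    }

It's the same "each handler dispatches to the handler for the next bytecode" structure the design doc describes, just expressed with labels instead of generated machine code.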


There's a post by Mike Pall, author of LuaJIT, which explains why writing the interpreter in C is very suboptimal for maximally performant VMs:

http://article.gmane.org/gmane.comp.lang.lua.general/75426

Worth noting that IIRC, for a while LuaJIT in interpreted mode was able to beat V8 in optimized mode not all that infrequently (although it depended on the use case, they are different languages, and I do not know if this is still the case).


> Worth noting that IIRC, for a while LuaJIT in interpreted mode was able to beat V8 in optimized mode not all that infrequently

V8 had no optimizing compiler when Mike Pall sent his (in)famous mail about "LuaJIT interpreter beating V8 compiler"[1].

Also, the usual disclaimers about cross-language benchmarks apply (e.g. nobody checked how those benchmarks differ between the JS and Lua implementations).

[1] http://lua-users.org/lists/lua-l/2010-03/msg00305.html


Common register allocation across different bytecode interpretation sequences was one of the things specifically on my mind that could be tuned using a high-level assembler.

"Very suboptimal" might be a slight overstatement. I can see a way, given known register calling conventions, to write an interpreter as tail calls and post-process the machine code to JMP instead of CALL. Guaranteed tail calls would save you a bunch of effort, and the register calling convention would give you some guarantees about consistent allocation.
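Something like this sketch, say (plain C; whether the calls actually become JMPs depends on the compiler performing tail-call optimization, e.g. at -O2, or being forced to with Clang's musttail attribute):

    #include <stdio.h>

    enum { OP_INC, OP_PRINT, OP_HALT };

    /* Every handler shares one signature, so the calling convention
       pins pc and acc to the same registers across all handlers. */
    typedef void Handler(const unsigned char *pc, long acc);

    static void op_inc(const unsigned char *pc, long acc);
    static void op_print(const unsigned char *pc, long acc);
    static void op_halt(const unsigned char *pc, long acc);

    static Handler *const handlers[] = { op_inc, op_print, op_halt };

    /* Each dispatch is in tail position; an optimizing compiler can
       emit it as a JMP with no post-processing at all. */
    #define DISPATCH(pc, acc) handlers[*(pc)]((pc) + 1, (acc))

    static void op_inc(const unsigned char *pc, long acc)   { DISPATCH(pc, acc + 1); }
    static void op_print(const unsigned char *pc, long acc) { printf("%ld\n", acc); DISPATCH(pc, acc); }
    static void op_halt(const unsigned char *pc, long acc)  { (void)pc; (void)acc; }

    int main(void) {
        const unsigned char program[] = { OP_INC, OP_INC, OP_PRINT, OP_HALT };
        DISPATCH(program, 0);  /* prints 2 */
        return 0;
    }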


Not that unusual. In fact, the planned architecture sounds exactly like HotSpot's: a fast assembly-based interpreter that profiles the code, tiered fast/slow compilers producing machine code for hot spots, and deoptimization support to allow more speculative optimisations.

It actually seems a bit of a shame that V8 and Nashorn are competing despite heading towards very similar architectures.


I got the same impression from a cursory read. Profiling in the interpreter has its downsides, chief among them the lack of inlining, which can make profiles less useful. Tiered compilation can help here, but requires more careful code-cache management.


Having TurboFan compile the interpreter from machine-independent templates is what struck me as unusual.

The rest is familiar enough, true.


The HotSpot interpreter is done the same way, I believe. It's also a template-based interpreter.


It really reminds me of JavaScriptCore's LLInt.


I like the way Google projects seem to use Google Docs for this sort of thing. I've noticed in the past that docs like this come up for projects like Angular and Go. I'd like to hear more about these documentation policies if anyone knows anything.


I no longer work at Google, so things could have changed, but there was a strong culture of using Google Docs for project requirements gathering and design documents. There were internal templates you could copy if you didn't want to think of the headers yourself.

While it's great that this is part of the culture, discoverability and versioning were still problems, i.e. you had to handle them yourself.


There's no official "policy" that I know of. It's just that most Googlers tend to prefer Docs. It works anywhere you have a browser, is easy to share on the web, and has nice collaboration features and built-in comments.


Would a <hot-code> tag be useful for indicating that optimizations are required for a piece of code?

In Lisp we have compiler options: (declare (optimize ...))


So V8 will be moving from a two tier to a three tier architecture.

SpiderMonkey is three-tier and JSC is four.


Are they adding an extra tier, or replacing the lowest? I thought they were replacing the lowest.


AFAIK, V8 already has 3 tiers: full-codegen (the basic non-optimizing JIT), crankshaft and turbofan. With ignition, they're replacing full-codegen with an interpreter.

It's strange that there's only one mention of crankshaft in the entire document, but turbofan is all over the place. Are they also planning to get rid of the former?


Does this make implementing other languages compiled for V8 easier?


The link to optimizing an ANSI C interpreter is broken; it's missing a colon after the protocol. I dunno if there's anyone here who can fix that, just thought I'd say it anyway.

Actually it's not the colon; there are two protocols:

    http://http://dl.acm.org/citation.cfm?id=199526
    ^^^^^^^
My bad.


[flagged]


It just means collaborative editing is turned off, since it can't scale to very large numbers of simultaneous viewers.


You can still read the document, which gives me a lot more confidence than if the file just became unavailable.


by my deeds I will honor him, V8




