V8 Optimization Killers (github.com/petkaantonov)
244 points by diggan on June 25, 2014 | 80 comments



Which of these (if any) are purely incidental to the current state of the V8 project, rather than being inherently insurmountable problems for JS optimization? In other words, if I memorize this particular list, what subset of my knowledge may become obsolete in a few months or years?


> if I memorize this particular list

Please don't memorize this list. It's just for one JS engine among several, all of which are changing rapidly. For example Apple just shipped FTL, an LLVM-based JIT for Safari (so any list for Safari would have just been largely obsoleted).


> It's just for one JS engine among several

If you are writing code for nodejs, none of the other engines are relevant. If the overwhelming majority of your users are using chrome, then v8-specific optimizations aren't a bad idea.

On the issue of the list, most of the guidelines here apply in every other engine as well. For example, I doubt a future JS engine can optimize `arguments` abuse while preserving ES5 semantics.
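To be concrete, a minimal sketch of the kind of `arguments` abuse meant here; in sloppy mode ES5 requires `arguments` to alias the named parameters, which is part of what makes it so hard to optimize:

    function aliased(x) {
        arguments[0] = 42;  // in sloppy mode this write must also change x
        return x;           // returns 42, regardless of what was passed in
    }

    function leaky() {
        return arguments;   // leaking `arguments` forces the engine to materialize the full object
    }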


Even if you only care about V8, it would be JIT version specific.


I actually don't want to memorize any list, unless it's a list of things that are inherently expensive/impossible to optimize in JS, in which case it's worthwhile. I just hate the idea of having to write all my code according to some list-of-the-week. What I want is a list that separates the always-know stuff from the engine-specific-tweaks or might-not-apply-next-year stuff.


Actually re-reading the article, I see this:

    Currently not optimizable:

    Generator functions
    Functions that contain a for-of statement
    Functions that contain a try-catch statement
    Functions that contain a try-finally statement
    Functions that contain a compound let assignment
    Functions that contain a compound const assignment
    Functions that contain object literals that contain __proto__, or get or set declarations.

    Likely never optimizable:

    Functions that contain a debugger statement
    Functions that call literally eval()
    Functions that contain a with statement
Which I now realize is sort of what I'm looking for.


That list won't ever be accurate. The only way to know is to re-run your performance tests every time Google and Mozilla ship new releases.

SpiderMonkey now optimizes a bunch of difficult cases it never optimized before; some of them seemed like they would never get optimized, but now they're fine. V8 does some nice optimizations now that weren't possible before, similarly.


The try/catch bit is definitely incidental to the current state of V8. Other JITs (e.g. SpiderMonkey) will happily handle try/catch.


|with| makes reasoning about function behaviour extremely difficult, take:

    var x = 5;
    something.propertyOnSomething = 5;
    with (something) {
        x += propertyOnSomething;
    }
    alert(x); // What is the value of x?

eval() causes problems as it can introduce new variables into the local scope making it logically difficult to reason about program behaviour subsequent to the eval(), e.g.

    function f(x) { eval(something); return Math.sqrt(x); }

What is the result of f(9)?

It is possible to set things up to make these not catastrophically harm performance, but the question is whether you would rather the engineers working on these engines make uncommon (and arguably bad) things common, or make normal and frequent code fast. Any time spent working on making uncommon things fast is time not spent making other things fast.


Really great! One note that popped out for me is to always cache the .length property for any array or arguments:

     function doesntLeakArguments() {
         var args = new Array(arguments.length);
         for(var i = 0; i < args.length; ++i) {
             args[i] = arguments[i];
         }
         return args;
     }
becomes:

     function doesntLeakArguments() {
         var len  = arguments.length;
         var args = new Array(len);
         for(var i = 0; i < len; ++i) {
             args[i] = arguments[i];
         }
         return args;
     }
And also, if you've got a switch statement with more than 128 cases, you've probably got bigger problems on your hands than v8 optimizations.

I see some things in here that I can start switching to immediately for some of my node modules.


Huge switches are common in the inner loop of emulators and interpreters. They tend to end up with a 256-case switch to handle the next byte in the instruction stream.


I understand that use case, but in my opinion it would be much cleaner (and, I'd speculate, faster) to have a 256-slot array and route through it with opcodes[byte.charCodeAt(0)](); or something similar.
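A rough sketch of that dispatch-table approach (the opcode handlers here are made up purely for illustration):

    // Build a 256-slot dispatch table once, up front.
    var opcodes = new Array(256);
    for (var i = 0; i < 256; i++) {
        opcodes[i] = function unknown(cpu) { throw new Error('unimplemented opcode'); };
    }
    opcodes[0x00] = function nop(cpu) { /* do nothing */ };
    opcodes[0x3C] = function incA(cpu) { cpu.a = (cpu.a + 1) & 0xFF; };

    // Inner interpreter loop: one array lookup and one call per instruction byte.
    function step(cpu, memory) {
        var op = memory[cpu.pc++];
        opcodes[op](cpu);
    }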


Function calls are expensive if they are not optimized. However, the same is true for switch/case trees, and even for JIT'd code (dynamically created functions) ...

In the Javascript world, "Your Mileage May Vary" truly is a motto.


Would it make sense to use a bitwise test and two 128-part switch blocks?


Why would they use a switch instead of a hashtable?


Good compilers optimize a switch into a jump table, which is much, much faster than a hashtable.

Using a hashtable for instruction dispatch in an emulator would be utterly insane. You'd use an array (but the compiler can do a better job by generating raw asm for a jump table.)


Could you explain why it would be insane? I was under the impression that all you needed was fast lookup time.


A jump table is O(1). A hash table lookup is going to be roughly O(N) in the key length for the hash, plus O(M) for the size of the hash bucket. Small hash tables can end up with very large buckets. Hash buckets are sometimes linked lists as well, so you pay the cost of that pointer indirection to walk the bucket. It's possible that a typical JS runtime can cache the hash value for a given string (especially literals), so that will help a bit for string keys. Integer keys might have a similar optimization.

People overestimate the performance of hash tables. It's true that they are quite fast on a modern CPU, so you get away with using them extensively in dynamic languages like JS, but they are still way way slower than a jump table or directly accessing fields at statically known offsets.


This used to be good advice, but I'm pretty sure that the modern engines all perform this optimization for you now.


Try this benchmark out: http://jsperf.com/fastest-array-loops-in-javascript/56

It looks like you're right, very much so, unless there's something else going on in the perf tests.


There's something going on with the perf tests - most of that code is being optimised away, in one case I got over a billion cycles per second. The trick to these things is writing code which cannot be optimised away, like returning Math.random() from a function and using the return value to increment another variable.


Yes. At present if you want a jsperf test to be remotely accurate:

1) It shouldn't exercise the GC. GC pressure totally skews jsperf results in a way it can't compensate for.

2) It should compute a result on every iteration that is checked at the end to make sure it is accurate. (This ensures that the compiler can't optimize out unused results from your benchmark loop)

3) The check should have actual control flow so it can't be ignored.

It's also worthwhile to warm all your test cases once in the setup portion of the page before the loop runs. Some performance issues only appear once you've passed multiple argument types into a function, so if you run a test case that passes in ints, and then run a test case that passes in floats, the second test case will be slower for no obvious reason. You can swap the test cases around and the second run will still be slower.

(There is actually a huge list of problems with JavaScript microbenchmarks, and jsperf only adds to them by being low-quality software. Please use it sparingly.)
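A minimal pattern along the lines of points 2 and 3 above (the names here are placeholders):

    // Accumulate a result that depends on every iteration...
    var checksum = 0;
    for (var i = 0; i < iterations; i++) {
        checksum += functionUnderTest(i);
    }
    // ...then branch on it, so the compiler can't discard the work as unused.
    if (checksum !== expectedChecksum) {
        throw new Error('benchmark computed the wrong result');
    }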


I can't say for arguments, but optimizing for a changing array.length in-loop must be difficult if not impossible.

    for (var i = 0; i < a.length; i++) if (check(a[i])) a.push('foo');


Compiler optimizations are never perfect. In this case, a compiler would just assume that push could be called and not hoist the length calculation.


JavaScript is complex enough that pretty much any code inside of the loop could end up doing an array push. Even `console.log(myCompletelyUnrelatedObject.x)` (given an evil property descriptor).


I don't know a whole lot about JIT compilers, but would it be possible to emit code with length optimized, and trap changes to it to fall back to a slow path in a way that wouldn't hurt the fast path? For example, you might mark the array so that when other code mutates it, that code also modifies the return address on the stack to point to some fixup code that puts you onto the slow path.


I don't know much about this, either, but I don't think that is a feasible approach.

- where are you going to 'mark the array'? Does that mean every array gets the overhead of having room for that mark or do these marks live elsewhere? If so, where, and how is code checking it going to be performant?

- in nested calls, are you going to allow multiple marks? (Solvable, I think, as you can get away with an integer mark count)

- that would mean that all code that modifies any array must check whether such a mark is present and then find the return address on the stack (and nested return addresses)

- even if you do all of the above, the simple checking for presence of such a mark can kill performance in tight loops, even if you do that with branch prediction hints.

- how do you know what mutating changes are important for the calling code? You can use a function pointer as the 'mark' and call that so that it can figure it out by itself, but that surely will kill performance in tight loops.

I think the normal way is for compiled functions to detect when their own assumptions (may) no longer hold, not for other code to signal any compiled code on the stack that its assumptions (may) no longer hold. In tight frequently run code, you may be able to inline calls and recompile, but that is costly, too.

EDIT: I just realized that it is possible to do something in the spirit of this without needing any marks. The really bad calls, such as those to eval, will have to tell the JIT compiler that the current game is over, and that it can start over. It is too costly to do that from any innocuous call, such as one that adds an item to an array, though.

As to that length optimization: it used to be said that C doesn't have a for loop; because it evaluates the end condition on every iteration, its 'for' is a 'while' in disguise. Pascal, on the other hand, has a for loop.

JavaScript inherits that IMO bad decision.


Not if the function is inlineable or an intrinsic (console.log will likely be either), which is the most basic trick of any JIT.

Unless you can specifically measure caching the length is faster, I wouldn't do it. Most of the time it doesn't matter anyway (won't cause a bump in performance), and even then most of the time the compiler will be able to figure it out for you.


Maybe my comment was misleading - the important part is the property read, not the `console.log`. The following code might be an infinite loop in JavaScript:

  function adder(arr, obj) {
    for (var i = 0; i < arr.length; ++i) {
      obj.x += i;
    }
  }

  var arr = [1];
  var obj = Object.defineProperties({}, {
    x: {
      get: function() { return 1; },
      set: function(v) { arr.push(v); }
    }
  });
  adder(arr, obj);


V8 can inline getters and setters like it can inline normal functions.


Can you elaborate? "pretty sure" sounds like a hunch rather than actual data.


The article is about "optimisation killers". You're right that caching the array length lookup is faster, but not caching it doesn't kill V8's ability to optimise functions the way the rest of the examples do.


I wonder if there is any way to check the optimization status in the browser? It would be awesome to have a chrome audit that parsed through all of the functions in a js file and told you the optimization status of each and why.


Not the friendliest program, but http://mrale.ph/irhydra/1/ can display all sorts of intermediate representations spat out by V8.
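If you just want a quick yes/no outside the browser, you can also ask V8 directly from Node with natives syntax enabled; a sketch along the lines of what the linked article shows (status codes abbreviated to the two common cases):

    // Run with: node --allow-natives-syntax --trace_opt --trace_deopt check.js
    function containsWith() {
        return 3;
        with ({}) {}  // per the article, a `with` anywhere in the body blocks optimization
    }

    containsWith();
    %OptimizeFunctionOnNextCall(containsWith);
    containsWith();

    switch (%GetOptimizationStatus(containsWith)) {
        case 1: console.log('optimized'); break;
        case 2: console.log('not optimized'); break;
        default: console.log('other status');
    }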


My team's developers, who are in a Firefox-only setting (so no V8 to consider), use every optimization/squishing trick they can find on github - grunt, browserify, uglifyjs etc. - thinking that the fewer and smaller the JavaScript resources, the faster the pages will not just download but also render.

Is it probable that SpiderMonkey and other engines suffer from forms of this deoptimization hell when rendering? And would it be a safe guess that whatever mod_pagespeed does to a page's javascript resources is less likely to result in this hell than these other tools? Thanks.


The only size-related optimization that I can think of is that V8 doesn't inline functions (but does optimize them) when they are more than X characters long, comments included.

I'm not sure about SpiderMonkey, but they may very well use this kind of optimization too. However, I would guess that Uglifying files should not increase execution speed for any noticeable amount.


I'm not aware of any common minification techniques that hurt performance. They generally improve parser performance and reduce memory usage (less memory spent on line-ending data and script sources, for example)


It's a good question but minification shouldn't cause runtime performance issues. Keep in mind that these are mostly micro optimizations when running outside a large loop. If you go from 1 mil ops per sec to 20 mil ops per sec your user won't see a difference unless the code is being run many times over.


Which is better, `undefinedVar === void 0` or `typeof undefinedVar === 'undefined'`?

This article says void 0 is safe: http://www.2ality.com/2013/04/check-undefined.html


I was curious too, so here's a jsPerf:

http://jsperf.com/check-for-undefined-argument

On Chrome 35, looks like comparing to void 0 is a bit faster. Would love some more samples though.


I think minifiers use the former because it's shorter, and thus requires less space.


the other benefit is that `void 0` cannot be overwritten in sloppy mode, whereas you can do something crazy like this:

    undefined = 'lol';


No you can't. You can do something like function(undefined) {}, but the global 'undefined' is non-writable and non-configurable.


That's an ES5 change. With ES3, you can overwrite NaN, Infinity, and undefined.


Only in strict mode I thought, no?


Generally. Technically, it's a "breaking change".

As you can imagine, this didn't actually affect anyone.


An excellent overview! A bit worrying that using an object as a hash table and iterating over its keys using ForIn would prevent optimization - I had always thought that this was a common use case that would be well-supported by the optimizer! I suppose in that case, if you need fast reads of all keys and can afford slower writes, you could maintain an array of the keys at insertion time and just loop through that?
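Something like this, I mean (just a sketch, names made up):

    var table = Object.create(null);
    var tableKeys = [];

    function insert(key, value) {
        if (!(key in table)) tableKeys.push(key);  // slower write: track the key
        table[key] = value;
    }

    function forEachEntry(fn) {
        for (var i = 0; i < tableKeys.length; i++) {  // fast read: plain array loop
            fn(tableKeys[i], table[tableKeys[i]]);
        }
    }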


`for ... in` is a relatively slow construct anyway; the faster option (a lot more verbose) is:

    var keys = Object.keys(obj),
        length = keys.length,
        key, i;
    for (i = 0; i < length; i++) {
      key = keys[i];
    }
While this is not exactly the same as for..in, it usually behaves how you'd expect and is significantly faster for a couple of reasons:

1. In a for..in loop the engine must keep track of the keys already iterated over, whereas in the fast version we can simply increment a counter and do a fast array lookup.

2. It's possible to add properties to the object that you're iterating within the body of the for..in statement, and these will be iterated too. Doing such a thing is obviously very rare but it's the kind of edge case the JS engine must keep track of.


Is a plain for loop with a counter significantly faster than a forEach? With Object.keys I've been enjoying the simplicity of statements like

  Object.keys(obj).forEach(function (key) {
    console.log(obj[key]);
  }); 
but does that hamstring my performance?


Your mileage will vary depending on the browser. If you really care (as in profiling identified this as a hotspot) you should measure the alternatives you're considering.

But note that in your specific case chances are the cost of a bunch of console.log() calls completely swamps the cost of either a for loop or a forEach call. console.log() has _incredibly_ complicated behavior.


A native counter for-loop will always be faster: http://jsperf.com/angular-foreach-vs-native-for-loop/13


Is that (#2) defined behaviour, or is it undefined? Because iirc, other languages disallow it for similar reasons.


Adding properties is undefined in ES5 and I believe defined in ES6.

Deleting properties is guaranteed to not break iteration and deleting an as-of-yet unvisited property causes it to never be visited.
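A small illustration of that second guarantee:

    var obj = { a: 1, b: 2, c: 3 };
    for (var key in obj) {
        if (key === 'a') {
            delete obj.c;      // 'c' has not been visited yet...
        }
        console.log(key);      // ...so this typically logs 'a', 'b' and never 'c'
    }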


Doesn't Object.keys() need to malloc()?


Keys is an alloc. That is not 'faster'.


I created a jsperf to measure a hashtable-like object with for-in/Object.keys.

The performance differs by only about 50%; I expected more, actually. Am I doing something wrong?

http://jsperf.com/for-in-with-hashtable-like-object


I guess it depends on what your intent is:

http://jsperf.com/for-in-with-hashtable-like-object/4


The article states "Code compiled by the optimizing compiler can easily be, say, 100x faster than the code generated by the generic compiler"

So I expected quite a bit more than a factor of 2...


Instead of avoiding all those optimization killers, a much higher-impact approach is to fix them in the V8 project itself. You not only make your own code faster, you make thousands if not millions of pieces of code faster.


I was wondering this too. I expect/hope the answer would be "we're working on it" in most cases, but I worry it would be "it simply isn't possible" in some or most scenarios.


Useful stuff. Does anyone have similar information for other javascript engines?


Title should probably be "V8 Optimization Killers".


I'm really sick of these recent JavaScript performance related posts that only talk about V8.


It's like searching for any JS problem on SO: almost every question's top answer is along the lines of "use jQuery".


Here's a page with some performance tips for both V8 and SpiderMonkey: https://github.com/sq/JSIL/wiki/JavaScript-Performance-For-M.... Unfortunately given the pace at which JS engines move it may be slightly out of date.


I would hazard a guess that it's because the intended audience is the node js crowd.

Maybe apart from games, most of this is irrelevant for browser side javascript.


Yes, it's actually a wiki page in a node.js project.


It would be nice to see more information about optimizing for IE and Firefox's JS engines, but to be fair, V8 accounts for a large percent of browser JS and all of Node.js as well.


Node.js is the answer, here. Most stuff you do on the client (outside webgl and game calculations) shouldn't be impacted heavily by these kinds of optimization snafus. But if you're trying to get your server to be as performant as possible, these can really kill you if you stumble into them naïvely.


Just a cursory glance at browser stats seems to show chrome as the most popular which would make sense. I highly doubt anyone using internet explorer is coming to SO.


Umm, why not? I use IE 11 in day-to-day usage and as my primary dev target. Chrome is god awful on my machine (it sucks on high DPI displays, has terrible touch support, worse/slower UI responsiveness and animations due to single UI thread, etc). I only use it for compat testing.

Also nice that IE doesn't still require prefixes for established things like transition/transform. If only Chrome and others would get around to implementing other useful CSS3 features that IE has (like grid) we might be able to actually use them some day.


Chrome on iOS doesn't even use V8.


Chrome on iOS is Safari in a box.


I think the general idea is that asm.js is both well understood and not something a human should ever generate, so why would anyone but a transpiler writer care?


I'm not sure why you're talking about asm.js (I guess you're implying that asm.js is the only Gecko-specific optimization?), but I just want to note that I find it pretty hard to find good documentation about how to code with it.

The "not something a human should ever generate" doesn't seem like a right statement : before a transpiler can generate this kind of code, a human has to understand it.


I don't know why you are bringing up asm.js.


Well V8 is the one that matters most. More users use Chrome than Firefox, so it's more important on the client side. And server-side JavaScript is probably going to be running on V8: Node.js and Google's scripting platforms all run V8.

It's like complaining if a C++ performance article only tested with GCC. Sure, there's other compilers out there, but that's the big one.


To be fair, many of these apply to other JS engines as well:

- manipulating `arguments` should be avoided

- when possible, avoid looping over the properties of an object (using `in` or `Object.keys`)

In the process of optimizing for v8, you may expose other bad patterns (e.g. megamorphic functions) whose replacement will boost performance everywhere.

For example, even though https://github.com/petkaantonov/deque was developed and tuned for v8, it is still much faster than alternatives in other engines.
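(For context on "megamorphic" above: it means a call site or property access that has seen many different object shapes. A rough sketch with made-up objects:)

    function getX(point) { return point.x; }

    getX({ x: 1 });                      // one shape: the access site is monomorphic and fast
    getX({ x: 1, y: 2 });
    getX({ y: 2, x: 1 });                // different property order means a different hidden class
    getX({ x: 1, y: 2, z: 3 });
    getX({ x: 1, label: 'p' });          // after enough distinct shapes the site goes megamorphic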


While true, some other bits of advice here (like pulling your try/catch into a separate function) are very V8-specific and may well produce _worse_ code in other implementations.

http://jsfiddle.net/G83mW/14/ has a testcase for the try/catch thing that is interesting to compare in different browsers...
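For reference, the V8-specific pattern being referred to looks roughly like this (a sketch, with a hypothetical handleError):

    // Keep the hot work in a function of its own, with no try/catch, so V8 can optimize it...
    function doHotWork(data) {
        return data.length;  // stand-in for the performance-sensitive code
    }

    // ...and confine the try/catch to a thin wrapper that stays unoptimized.
    function doHotWorkSafely(data) {
        try {
            return doHotWork(data);
        } catch (e) {
            handleError(e);  // hypothetical error handler
        }
    }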


We changed the title to say "V8".



