Node.js: Some quick optimization advice (medium.com/c2c)
203 points by SanderMak on Oct 12, 2015 | 72 comments



This is a good thing to raise awareness of, but the solution is poor advice.

Asking developers to remember the "gotcha" of 600 characters is not really viable. Instead, if this is important to you, consider the addition of minification or comment stripping to your production deployment process. Minification will also make the variable names and syntax use shorter, saving you further precious characters.

Learn the problem, but automate the solution.


Given that the article is about Node.js code: I don't know about other developers, but I don't minify my production Node.js code, because it runs directly on the server. Whether I have `var nameThatIsStupidlyLong` or `var nTISL` doesn't make a difference, because the size of the code isn't a concern.

What do other developers out there do?


Yeah, but the whole point of the article is that the names of your variables (and even your comments!!) do make a difference in performance.


In this one context, this one micro-optimization applies, but in 99% of other cases it doesn't (and shouldn't).

Note the 'benchmark' in OP is run 500 million times to see the performance difference. This is definitely not a common scenario.


For me, it's more the principle of the thing. This just seems like an insane way for an interpreter to behave. Even more insane is the proposal of minifying server-side JS!


Not that insane considering many people are using things like Babel to run ES6/7 code. If you're already compiling that code to ES5 for production, there's no harm in also minifying things.


I use https://babeljs.io/ to compile es6 code for production. It could support this kind of transformation though I'm not sure if it does by default.


Babel doesn't, by default.


This is a microbenchmark. It's a tight loop that calls the function 500 million times. The function itself just adds two numbers. It's pretty close to the best possible improvement for inlining a function. If that's what your program does, and it's a big part of what your program does, then inlining it may be a big performance win. Even then, addressing this may not be a worthwhile tradeoff. Even then, as others have pointed out, you could use a minimizer.

If your program is not CPU-bound, or if it doesn't have a function that's called millions of times in a tight loop, or if that doesn't make up the majority of time spent on CPU, or if that function does more than execute a couple of instructions, then the performance difference will likely be enormously less.
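For reference, the shape of such a microbenchmark can be sketched roughly as follows. This is a minimal illustration and not the article's actual code: the names `add` and `run` are assumptions, and the iteration count is kept far below 500 million so it finishes quickly.

```javascript
// Hypothetical microbenchmark sketch; not the code from the article.
// The callee does almost no work, so call overhead dominates the loop.
function add(a, b) {
  return a + b;
}

function run(iterations) {
  let total = 0;
  for (let i = 0; i < iterations; i++) {
    total = add(total, 1);
  }
  return total;
}

const iterations = 1e6; // the thread mentions 500 million; reduced here
const start = process.hrtime.bigint();
const total = run(iterations);
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

console.log(`total=${total}, elapsed=${elapsedMs.toFixed(2)} ms`);
```

A real program rarely looks like this, which is why the measured difference is close to a best case for inlining.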


I think the oddity here is that the function length takes the comments into account.
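A quick way to see that comments are part of a function's source text is to compare `toString()` lengths. This is only a sketch: the function names are made up, and source length is used here as a stand-in for whatever V8 actually measures when deciding whether to inline.

```javascript
// Two functions with identical behaviour and equally long names.
function addBare(a, b) {
  return a + b;
}

function addNote(a, b) {
  // This comment sits inside the body, so it is part of the source text.
  return a + b;
}

// Function.prototype.toString() returns the source text, comments included.
const bareLength = addBare.toString().length;
const noteLength = addNote.toString().length;

console.log({ bareLength, noteLength });
```

The commented version reports a larger length even though both functions do exactly the same thing.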


Agreed that's interesting (in a mostly academic sense), but the thrust of the post is about using that fact to make choices in writing code, and there's an awful lot of discussion here about that idea. That's what I was addressing.


If you're wondering why comments are a factor at all, and aren't just discarded by the lexer, remember that comments in JS are preserved and available to things like `Function.prototype.toString()`. I've seen this used to do evil multiline string support a few times. Slap the multiline string or template into a comment inside a function and then have another function that toStrings it and strips the boilerplate.
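Roughly, the multiline-string trick looks like the sketch below. The `heredoc` helper is a hypothetical name made up for illustration, and the whole thing assumes the engine returns the original source text, comments included, from `toString()`.

```javascript
// Hypothetical helper: pull the first block comment out of a function's source.
function heredoc(fn) {
  const source = Function.prototype.toString.call(fn);
  const match = /\/\*([\s\S]*?)\*\//.exec(source);
  return match ? match[1].trim() : '';
}

// The "string" is really a comment inside an otherwise empty function.
const template = heredoc(function () {/*
<ul>
  <li>first</li>
  <li>second</li>
</ul>
*/});

console.log(template);
```

Any tool that strips comments (a minifier, for instance) silently turns `template` into an empty string, which is part of why this is considered a hack.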

This sounds like a horrifying hack but it's also pretty similar to how Angular's shorthand DI works.


    > This sounds like a horrifying hack (...) how Angular's shorthand DI works.


> I've seen this used to do evil multiline string support a few times.

Love me some FOAM: https://github.com/foam-framework/foam/blob/master/apps/todo....


That might be one of the scariest things I've looked at all day.



It's incredible that that project has 11,000 commits. A huge amount of effort has gone into it, but the code is really unpleasant to read. I feel like I must be missing something about FOAM: who is using it, and why?


Let's stop trashing on people's work.

The success of an endeavor is proportional to the number of shitty hacks that have come before it. Sometimes this is true in a literal sense -- sometimes a project consists of shitty hacks. But the astute reader will notice that a hack is only known to be shitty because someone did it, and had the courage to make their example public.

Do we reward their courage? No. We act like cliquish teenagers and rip them apart.

I don't mean to single you out. But this subthread consists of a developer at Amazon, an unknown, and a founder -- the very types of people I wanted to respect -- yet the content is little more than "Look at how stupid these people are." What kind of example are we setting here?

Be excited about X! Whether X is FOAM, Javascript, COBOL, C++, Python, Erlang, Scheme, NPM, ASDF, Vim, Stallman, or a song. Be happy. Be amused. Be anything but bitter.

In this instance, FOAM shows what's possible. Is it necessarily a good idea? Who cares! We've learned something new! Be excited!

How certain are you that an idea that strikes you as bad is actually bad? For every possible circumstance? What about with a slight tweak? In fact, "An idea that seems bad" is the short definition of "startup."

Most ideas that seem bad are, in fact, bad. But it's important to fully explore the problem space before dismissing them, else you'll dismiss Facebook.

This subthread is about a software technique, not a startup idea. But is it really such a different domain? Would the idea of Python have survived if it had been introduced in the era of Multics? It wasn't a compiled language, so it couldn't hope to survive back then. Yet its time was coming, whether or not it would have been dismissed at the time.

"Under what circumstances could this be a good idea?" That's the valuable question. And if you also have a good answer to "Why now?" then you may be onto something. In fact, you might be one of the first people to notice that an idea has flipped from bad to good. Seems like a pretty powerful position.

It seems like most influential work started as a hack. (TeX is a notable exception.) So if you want to do influential work, be delighted by hacks. It'll make them easier to explore, and you might end up in a position few others realize is valuable.


Most everyone agrees with you in general about the usefulness of "trashing other people's work", but I think your response here is disproportionate to the circumstance.


For what it's worth, I was asking a genuine question. Most experimental projects don't get 11k commits, so I assume that people are actively using it. My question is: given the plethora of alternatives available, what is it about FOAM that leads people to use it?

I shouldn't have said that the code was unpleasant, which is obviously subjective; I should have said "the code style is quite unusual for JS", which it is. It was certainly not my intention to "trash their work".


Reminds me of Magento code for some reason.


ES6 template literals will be a godsend for stuff like this. But that's particularly ugly.


The bug has been marked as WontFix by the V8 team: https://code.google.com/p/v8/issues/detail?id=3354. It looks like they are considering fixing it for TurboFan and not for Crankshaft, but I'm not sure what that means. Does that only apply to asm.js code?


I think Turbofan aims to supersede Crankshaft, but gradually.


Turbofan is an additional layer of optimization after Crankshaft.


Wow, the v8 optimizer can sometimes be a pretty blunt instrument. Shouldn't it look at the size of the inlined function after it compiles it to some intermediate state that erases things like comments and names?


Such a state does not exist inside V8. Things happen as they are parsed. Since V8 has no idea what a piece of JS will do until it gets all the context, it has to use a silly thing like character count to perform inlining. It could create an intermediate state, but that would require a code overhaul that would likely slow down the entire thing anyway.

If I had to suggest a change, I would use the 600-char count as a starting point for one-time functions, and then count the number of calls to a function, counting how many instructions it produces (basically tossing an incrementing variable into the compilation state, and storing it per Function object).

But then again, I've never played with the inside of V8, only the outside. I consider myself lucky to be in that position, because V8 is pretty dang sweet.


No, V8 parses your Javascript source into an AST. From the AST, there are two paths it can take. The first step is the "full-codegen" compiler, which walks the AST directly and emits unoptimized native code:

http://wingolog.org/archives/2011/07/05/v8-a-tale-of-two-com...

If a function is hot, then it gets promoted to V8's optimizing compiler, which is currently Crankshaft but will soon be replaced by Turbofan. This compiles the AST first into a high-level IR, Hydrogen, which is an architecture-independent SSA-like form vaguely reminiscent of LLVM. Then that's lowered into an architecture-dependent IR, Lithium, where register allocation and instruction scheduling are performed.

http://jayconrod.com/posts/54/a-tour-of-v8-crankshaft-the-op...

If you read the bug for this that someone posted up-thread, the AST information is fully available when inlining in Hydrogen, but a patch to remove the character limit tanked one of their benchmarks. Also, they didn't preclude fixing it in Turbofan, only in Crankshaft, and that's largely because Crankshaft is on its way out.

(I've played a little with the internals of V8, but am not an official developer. Also, one of my friends is the Bay Area TL for V8.)


> Things happen as they are parsed.

If comments aren't completely ignored by the parser, then there has to be some case where a comment inside a function has an observable effect?


Yup, see @pauljz's comment above - https://news.ycombinator.com/item?id=10375297


I still don't understand that. Function.prototype.toString() wouldn't break because the parser stripped the comments, it would just output the source without comments? Does any code anywhere depend on the comments being preserved?


People literally toString functions and parse them to implement features. Those features depend on there being comment text in the function that they parse. Pretty much a terrible hack but it is true that you can't remove the comments without breaking current code.


Wow. That's just something that should be an obvious fix in a future ES. Not only does it hurt the optimizer, it also seems to imply that the content of comments sometimes has semantic meaning in the code, such as people having invented directives etc.

Surely this must be undocumented features of the language spec?


Are you sure that the size of the intermediate representation is a good indication of suitability to inline? I think we'd be replacing one arbitrary heuristic with another. If we wanted to do really well we'd look at the cost of the actual operations in both the caller and callee, and reason about how they interact with inlining and things like that, but that's a bottomless pit of a problem and by then the user has browsed to a different page.


Hotspot and V8 serve very different niches, but doesn't Hotspot use the size of the bytecode for this decision?

(Or rather, that's the first step of deciding. It will inline larger methods if they're sufficiently hot).


True, just a better heuristic (Hotspot uses it). Not clear that with javascript you have time to do a better job. Even if it was done in the parser I think you could make a better size counter that didn't include comments and names.


One comment mentions a quite significant speedup when switching the variable initialization step of the for loop from let to var. As expected, you'll also get this if you keep the "let" keyword, but move it outside of the loop.
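The three variants are not semantically identical, which may or may not be related to the speed difference. A `let` declared in the loop header gets a fresh binding on every iteration, while `var`, or a `let` declared before the loop, shares a single binding. A small sketch (the function names are illustrative only):

```javascript
// Each function returns what the closures created in the loop see afterwards.
function withVar() {
  const readers = [];
  for (var i = 0; i < 3; i++) readers.push(() => i);
  return readers.map((read) => read());
}

function withLetInLoop() {
  const readers = [];
  for (let i = 0; i < 3; i++) readers.push(() => i);
  return readers.map((read) => read());
}

function withLetOutside() {
  const readers = [];
  let i;
  for (i = 0; i < 3; i++) readers.push(() => i);
  return readers.map((read) => read());
}

console.log(withVar(), withLetInLoop(), withLetOutside());
```

Only the `let`-in-header version has to keep a separate `i` alive per iteration.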

What kind of optimization step is prevented here?


Posted that yesterday, didn't get much traction.

Whenever I begin to convince myself that JS and Node are actually fine languages / environments to code on, some weird edge case like this pops up.

Not entirely sure if it's just its popularity, which is bound to expose its rough corners, or if it's fundamentally badly designed (written in 10 days in the late 90's, etc)


Pretty much every language that gets rapidly adopted has a lot of weird edge cases. You want to see weird edge cases, take a look at C++. Java got plenty of criticism too for things like the primitive/Object distinction, or that when they tried to fix it with autoboxing, it now meant that ordinary addition could throw an exception. And don't get me started on Perl or PHP.

The one possible exception may be Python, but this is because the PEP process requires that you very rigorously lay out interactions with existing language features, and Guido is pretty conservative about accepting them. This means Python evolves slower than many other languages, and the last attempt to fix many of the awkward rough corners (Python 3) led to a multi-year transition that hurt Python's popularity for a long time.

As Bjarne Stroustrup said, "There are only two kinds of languages: the ones people complain about and the ones nobody uses."


I've had the same experience with node, but normally there aren't that many problems with V8 or the language itself. Most of the rough corners seem to be in node's core libraries.

IMO this is what we need:

* a better designed standard library. Maybe more like Dart's[1] - we already have Bluebird as the de-facto replacement for callbacks - all we need now is something like Dart's streams to fix the built in quirky streams.

* built from the ground up with optional types i.e. TypeScript or Flow (eliminates most of JS's type-related quirks while preserving most of the dynamism and duck typing and being very helpful when the time comes to refactor)

Now that typescript has support for type definitions inside node_modules[2] there are no barriers remaining to do this. node + V8 + typescript can provide a development experience that matches Dart (exceeds it in some cases, e.g. REPL) without throwing away the entire existing JS ecosystem in the process.

[1]: https://api.dartlang.org/1.12.1/dart-async/dart-async-librar...

[2]: https://github.com/DefinitelyTyped/tsd/issues/208

p.s. [2] can actually detect some situations in which there would be problems when multiple library versions coexist together (e.g. when you pass instances that come from library v3 to a function of library v2, explained in detail at https://news.ycombinator.com/item?id=8213273). However if the interfaces are structurally satisfied, the compiler will not complain. As a result I'm pretty sure that it's possible to write a tool that detects semver violations based on .d.ts files (similar to what Elm does) or suggests the exact amount of version number increment by comparing the definition files.


Well, to be fair, Node is not a language and this hack is not a "feature" or whatever of the language; rather, it's a hack to get V8 to inline functions. JavaScript as a language is all right, but as always, making this language work and perform in multiple environments is tricky sometimes.


Sure. And CPython isn't a language. It's the canonical Python implementation that is used 99% of the time.

When it comes to non-browser production code (e.g. servers), it's NodeJS or nothing.


I don't think it's fair to blame something like this on Javascript per se, but that doesn't mean it isn't fundamentally badly designed ;) However, people actually appear to have started to notice this, and the latest result, ECMAScript 6, looks actually like it might be OK. At last, when people tell you JS isn't that bad, you can actually believe them - though if they told you that before, maybe do your own research too.

With the latest version, you have block scope, you have a shorthand function syntax, the "this" nonsense is a teeny bit less awful, there's help at hand for the inevitable callback chains, and strict mode is still there. There's also a new class syntax (though actually - aside from "this"! - I thought the old-style classes were the least of its problems).
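A few of those in one place, as a rough sketch (the `Counter` class and `blockScoped` function are made-up examples, and promises are left out for brevity):

```javascript
// Class syntax plus an arrow function: the arrow takes `this` from addAll,
// so there is no need for the old `var self = this` workaround.
class Counter {
  constructor() {
    this.count = 0;
  }

  addAll(values) {
    values.forEach((value) => {
      this.count += value;
    });
    return this.count;
  }
}

// Block scope: the inner `label` is a separate binding that ends with the block.
function blockScoped() {
  const label = 'outer';
  {
    const label = 'inner';
    void label; // only here to show the inner binding is usable in the block
  }
  return label;
}

console.log(new Counter().addAll([1, 2, 3]), blockScoped());
```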

I won't apologise for rolling my eyes a bit at how it's taken until ver 6 to get block scope, but better late than never.


>written in 10 days in the late 90's

Netscape browser has not been used in a long time, and JS has been standardized as ECMAScript since the late 90's.


"Standardized" does not equate to "good" though. ECMAScript is a very confused language that keeps borrowing from other languages and turns itself into a mess. Not to mention that "Universal JavaScript" is an oxymoron, as most real-world JavaScript (well, ECMAScript) is not supported by major browsers and developers need to use things like Babel. And I don't see an end to it.


> a very confused language that keeps borrowing from other languages and turns itself into a mess

I could describe almost any language that builds on previous language idioms like this. C++, most of the later Lisps, Java for sure, C# especially, Objective-C certainly... etc.

You're assuming your conclusion then trying to prove it with personal opinion.


Well, JavaScript became popular, because it was omnipresent and was simple (so that even Web Designers without a Computer Science degree could use it). Now, there are vast differences between browsers and the toolchain has grown huge.

We need to revisit the scripting in browsers. Instead of twisting arms to use "JavaScript", because of its "universality", we should focus on WebAssembly and the likes. Have a safe scripting VM in the browser, use any language that compiles to it - no need for Babel and the likes.


> we should [...]

No, again, you're reforming "I would like..." into "we should". There's no should about it.

JavaScript is nowadays quite a nice language. ES2015 and the way forward gives us a language that could happily go for another 20 years with few issues.

WebAssembly doesn't solve the problem btw - most of the things people hate on JS about (that aren't just dumb "I don't like loose typing or understand prototypical inheritance" complaints anyhow) are to do with how it integrates with the key web feature: the DOM.

DOM is where everything goes to hell on the web and it's going to be just as much of a mess in C/Rust/Haskell/Clojure/brainfuck as it is in JavaScript.

Also your comment about Babel is just strange. The reason to use Babel is to support older parsers that only understand ES5 syntax. That's never going to go away. We'll be dealing with older parsers for 10 years after WebAssembly launches - so guess what? Now your toolchain got longer, more complex, and you still have to output JavaScript anyhow.

Yaaaaaay.

Or, you know, you could learn JavaScript properly and realize that it's a completely useful and great language that is just as good as whatever your pet favorite is when applied to the tasks at hand.


People have been revisiting scripting in browsers for 15+ years. The hard part isn't getting everyone to revisit scripting in browsers. That's the easy part. If Google couldn't make it happen despite their efforts with Dart, I don't know who else could. :( The push wouldn't come from the W3C either. I suspect that in 5-10 years, we will have a safe scripting VM in the browser, but it's still going to be JavaScript.


This cheerfully reminded me of the thing in MRI Ruby of (relatively recent) yesteryear where strings longer than 23 characters were twice as slow, qv. http://patshaughnessy.net/2012/1/4/never-create-ruby-strings...


Good to know as a "gotcha", but typically by the time you get to an optimization like this you should already have covered all the bigger optimizations, like minification, so this won't actually yield much.


Who needs this? People writing really, really performant code, like parsers. Who writes parsers? 1% of people? Sad that this gets so much exposure. Stop wasting time on micro-optimization; this is a bug, that's it.


I'm at work and cannot test this out, but does adding whitespace increase the function length here?


Isn't there a way to force the inline optimization of a function, regardless of the character count?


I had this exact thing happen to me. It drove me crazy. Thank you!


So, is this a point in favor of not commenting your Javascript code or using inline docs?

Historically, I preferred to use inline JsDoc style comments as the source of documentation for my public APIs. Recently though, I decided that I didn't like them and that I wanted something better. I was hoping to find some tool that parses my JS to AST, figures out what was being exported (e.g. what was public) and writes a JSON document that I could diff over time, to figure out what was documented and what wasn't (in my external docs). So far, I have found this project called `doctor` which sounded close to what I'm looking for, but it still relies on comments ~ https://github.com/jdeal/doctor ~ So, I might have to write it myself using something like Esprima to get the AST...


As others have said, you can always strip comments in production builds. But I agree that this is utterly silly - I'd like to see a movement to deprecate including comments in parsing at all. Anything that makes use of such a feature is hacky weirdness from the start.


Function.prototype.toString() should be deprecated anyway. There are very few to no legitimate uses of it that are any better than awful eval() hacks.

Re: this comment, if you just don't make huge comments inside the body, but rather above it - as is the usual standard - this is less of an issue.


> Function.prototype.toString() should be deprecated anyway. There are very few to no legitimate uses of it that are any better than awful eval() hacks.

The toString() on a function is very useful in more ways than awful eval hacks. The most useful way is how I use it in msngr.js for creating keys for methods that handle events.

So let's say your custom object has a way to handle events. Surely you want to allow more than one method to be hit for each event, right? So internally your object keeps track of this by using the method's contents as a sort of key or hash that always points to that specific method. Then you can remove handlers by simply passing in a method. No special keys are required to identify a function, as a function is its own key.

Make sense? It's most useful for developers who work on frameworks or objects that require some custom eventing that can be used in multiple places.

If you really want to deprecate this functionality then you need to provide a way to create a unique key based on a function itself. The only other way around it is adding more verbosity to the language or event handler calls, which doesn't add any more clarity.
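As a rough sketch of the idea (a hypothetical `Emitter` written for illustration, not msngr.js's actual code), handlers can be stored under a key derived from the event name and the function's source text:

```javascript
// Hypothetical registry; class and method names are assumptions for this sketch.
class Emitter {
  constructor() {
    this.handlers = new Map();
  }

  // The key is the event name plus the handler's source text.
  on(event, fn) {
    this.handlers.set(event + ':' + fn.toString(), { event, fn });
  }

  // Removing a handler only needs the function itself; returns true if found.
  off(event, fn) {
    return this.handlers.delete(event + ':' + fn.toString());
  }

  // Calls every handler registered for the event and returns how many ran.
  emit(event, payload) {
    let calls = 0;
    for (const entry of this.handlers.values()) {
      if (entry.event === event) {
        entry.fn(payload);
        calls += 1;
      }
    }
    return calls;
  }
}

const emitter = new Emitter();
const seen = [];
const record = (payload) => seen.push(payload);

emitter.on('save', record);
const firstCalls = emitter.emit('save', 'a');
const removed = emitter.off('save', record);
const secondCalls = emitter.emit('save', 'b');
```

One caveat with keying on source text: two distinct functions with identical source share a key, so registering the second replaces the first.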


Every JS dependency injection framework that I've ever seen uses it.
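The trick usually looks something like the following sketch. It is a hypothetical mini-injector written for illustration, not Angular's actual implementation; `parameterNames`, `inject`, and the `services` registry are made-up names.

```javascript
// Hypothetical: read parameter names out of a function's source text.
function parameterNames(fn) {
  const source = Function.prototype.toString.call(fn);
  const match = /^[^(]*\(([^)]*)\)/.exec(source);
  if (!match) return [];
  return match[1]
    .split(',')
    .map((name) => name.trim())
    .filter(Boolean);
}

// Hypothetical service registry, looked up by parameter name.
const services = {
  logger: { name: 'logger' },
  db: { name: 'db' },
};

// Call fn with the services whose names match its parameters.
function inject(fn) {
  const args = parameterNames(fn).map((name) => services[name]);
  return fn(...args);
}

const result = inject(function (db, logger) {
  return db.name + '+' + logger.name;
});

console.log(result);
```

It depends on parameter names surviving untouched, so a minifier that renames parameters breaks it.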


Serious question: why use a dependency injection framework with JS?


Yes, and that's widely been regarded as a mistake.

Now that we have ES6 imports and wide CommonJS support I don't see any good reason to use hacky DI with Function#toString.


Thank you. So, due to the extra work that is required I'm going to say YES - this is one point in favor of not preferring inline docs.

I think all the dissenters missed that part (about it being "one point" in favor/not in favor). I thought programmers were supposed to be good with subtle details, but it seems like the majority of them lose that ability when talking about religious topics.


I wonder what performance gains/losses you would experience if you had the comment outside the function block, before the function declaration.


Exactly, if the comment were outside the function, there would be no issue.


If the function is nested within another function, then the comments to document the inner function are going to exist inside the outer function. And having nested functions is a very common scenario in JS.
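Both points can be checked with `toString()`, again treating source text as a stand-in for what the engine counts. A minimal sketch with made-up function names:

```javascript
/** Doc comment placed above the outer declaration. */
function outer(x) {
  /** Doc comment for the nested helper. */
  function inner(y) {
    return y * 2;
  }
  return inner(x) + 1;
}

const outerSource = outer.toString();

// The comment above `outer` is not part of its source text,
// but the comment above `inner` is, because it sits inside `outer`'s body.
const includesOwnDoc = outerSource.includes('placed above');
const includesNestedDoc = outerSource.includes('nested helper');

console.log({ includesOwnDoc, includesNestedDoc });
```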


> So, is this a point in favor of not commenting your Javascript code or using inline docs?

You can of course strip comments when making builds. In fact you can even write functions of more than 600 chars and get away with it as long as you use a minifier. If you have a minified function of more than 600 chars then that's a bulky function which should instead be split into smaller ones.


no, just minimize your code prior to production, and that'll rip out the comments. That's been best practice for years.


> just minimize your code prior to production

In frontend JS code sure, but minifying backend node.js code is certainly not "best practice".


Exactly. And if you did, you would have to look at source maps to decode your stack traces.


No one minifies NodeJS code. The reduced file-size of minified JS is irrelevant on the server.


At first I thought the comment was actually serious and couldn't believe it before I read the whole article. Thanks for pointing this out



