How JavaScript works: memory management and common memory leaks

pizlonator · on Sept 14, 2017

> As of 2012, all modern browsers ship a mark-and-sweep garbage-collector. All improvements made in the field of JavaScript garbage collection (generational/incremental/concurrent/parallel garbage collection) over the last years are implementation improvements of this algorithm (mark-and-sweep), but not improvements over the garbage collection algorithm itself, nor its goal of deciding whether an object is reachable or not.

WebKit uses a constraint-based garbage collector that does not rely on reachability alone. This is an improvement over the classical garbage collection algorithm.

mbudde · on Sept 14, 2017

For reference: https://webkit.org/blog/7122/introducing-riptide-webkits-ret...

randomguy1254 · on Sept 13, 2017

Nice article. Minor gripe about the static vs dynamic memory section. The requirement that data sizes be known at compile time for static memory (with the example of an array allocated to a user-inputted size) seems to be based on a past restriction of the C language. C has since removed the requirement that stack-based arrays are sized with a compile-time constant; there is nothing at the hardware/assembly level which prevents such arrays. So these stack-based non-compile-time-sized arrays don't fit into either the static or dynamic memory categories presented here.

phyllostachys · on Sept 13, 2017

I was bothered by saying that static memory is assigned on the stack (implied that it is only on the stack). As an embedded guy, at least in bare metal situations, local function statics, any const variables, and any globals are either in the read-only data section[1] or in the general data section[2], both before the heap and definitely not in the stack section.

[1] - .rodata section in gcc

[2] - I think this is in the .data and .bss sections, where .data is copied from the file and .bss is zero initialized before calling main by the crt. If you peek into a linker script, or dump one by passing --verbose to ld, you can even see where it puts all the C++ bits and pieces. A dump I did on Debian with g++ is here: https://gist.github.com/Phyllostachys/0682a3bda13ef9c6b49d04...

phyllostachys · on Sept 13, 2017

After looking at that default linker script dump, I noticed it didn't have the stuff I saw in the ARM linker script that I'm used to. So I attached one for a Silicon Labs EFM32 part. It's below the x86-64 linker script and is a little easier to read.

phyllostachys · on Sept 13, 2017

After a little investigation[1][2], it seems things get weird with an OS, as would be expected with virtual memory, et al.

[1] - https://stackoverflow.com/questions/16360620/find-out-whethe...

[2] - http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in...

kahlonel · on Sept 13, 2017

Except for the fact that the arrays you are talking about are still put on the stack.

randomguy1254 · on Sept 13, 2017

Apologies if unclear, that is what I meant by stack-based. The article implies that you cannot create a dynamically sized object on the stack.

vvanders · on Sept 13, 2017

Wait, is #3 for real? If that's the case it seems like a huge oversight.

IgorPartola · on Sept 13, 2017

Yeah, of the four things listed, I think it's the one that would trip me up the most. I think the safety net here would be dead code elimination: `unused()` should be detected as never called and removed during compilation/transpilation so that this wouldn't be an issue.

I am sort of surprised that `unused` is not picked up by the garbage collector in the first place though. Since JS functions are objects, shouldn't it detect an object that's not referenced during the mark and sweep?

In general, I really hate having to debug memory leaks in JS or Python. The interpreter for both will randomly allocate additional memory as it runs, so using tools like Valgrind is next to impossible. The only reliable method I've found is to pepper my code with logging statements that show what the current memory usage is, run the code like 1000-10,000 times, and see the points between which the memory usage goes up without coming down on a consistent basis. Python's built in `gc` module seems nearly useless for determining what's actually stuck in memory, and having a billion libraries that can have their own memory leaks is also not fun. These are the times I miss C: when you leak memory in C you know it because it becomes painful fast and it's usually easy to find, if your code is sane.
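For what it's worth, the "log memory usage and run it a few thousand times" approach can be sketched roughly like this. It assumes Node.js (`process.memoryUsage()` is a Node API, not a browser one), and `sampleHeap`, `leakyWork` and `retained` are made-up names for illustration, not anything from the article:

```javascript
// Rough sketch, assuming Node.js. All names here are hypothetical.
const retained = [];

// Deliberately leaky unit of work: every call keeps a small buffer alive.
function leakyWork() {
  retained.push(new Array(100).fill(0));
}

// Run `work` repeatedly and record heap usage every `sampleEvery` iterations.
function sampleHeap(work, iterations, sampleEvery) {
  const samples = [];
  for (let i = 1; i <= iterations; i++) {
    work();
    if (i % sampleEvery === 0) {
      // global.gc only exists when node is started with --expose-gc;
      // forcing a collection makes the readings less noisy.
      if (typeof global.gc === 'function') global.gc();
      samples.push({ iteration: i, heapUsed: process.memoryUsage().heapUsed });
    }
  }
  return samples;
}

const samples = sampleHeap(leakyWork, 1000, 100);
for (const s of samples) {
  console.log(s.iteration + ': ' + s.heapUsed + ' bytes');
}
```

A single run proves little because the heap grows and shrinks on its own schedule; what matters is whether the readings keep climbing across many samples.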

wahern · on Sept 14, 2017

The fix is to implement the closures properly, by only closing over individual variable slots. It looks like the engines are implementing closures by closing over entire windows of slots--that is, if two functions have the same scope, they inherit the _union_ of the variables they reference as a single window/block of variable slots.

The original article has a much simpler explanation and solution: https://blog.meteor.com/an-interesting-kind-of-javascript-me...

To learn more about closures than you ever thought possible, try reading this paper describing how closures are implemented in Lua: http://www.cs.tufts.edu/~nr/cs257/archive/roberto-ierusalims...

favorited · on Sept 13, 2017

I haven't done JS in a while, but it sounds to me like it is referenced, just not in source code. `someMethod` references it implicitly.

Is my understanding correct?

gsnedders · on Sept 13, 2017

Yes. The function's frame contains a reference to the object storing local variables of the parent frame. To do better you need to store a list of variables referenced (which you can only do if there's neither direct-eval nor a with statement).
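To make that concrete, here is a minimal sketch modeled on the example from the Meteor post linked above. Whether it actually leaks depends on how the engine lays out closure scopes, so treat it as an illustration rather than a guaranteed reproduction. The `breakChain` parameter is my own addition to show the usual workaround:

```javascript
// Sketch of the shared-scope closure leak. `breakChain` is a hypothetical
// flag added for illustration; it is not part of the original example.
let theThing = null;

function replaceThing(breakChain) {
  let originalThing = theThing;

  // Never called, but it references originalThing, so originalThing lives in
  // the scope record shared by every closure created in this call.
  const unused = function () {
    if (originalThing) console.log('hi');
  };

  theThing = {
    longStr: new Array(1000000).join('*'),
    // Uses no outer variables, yet in engines that keep one context per scope
    // it still holds that context, and with it the previous theThing.
    someMethod: function () {}
  };

  // Workaround: drop the reference so the shared context no longer points
  // at the previous object.
  if (breakChain) originalThing = null;
}

// In a page or a long-running process this would be driven by a timer, e.g.:
// setInterval(function () { replaceThing(false); }, 1000);
```

With `breakChain` false, each new `theThing` can keep the previous one reachable through `someMethod`'s context, forming an ever-growing chain. Setting `originalThing` to null at the end of the function cuts that chain.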

Ajedi32 · on Sept 13, 2017

> In general, I really hate having to debug memory leaks in JS or Python. The interpreter for both will randomly allocate additional memory as it runs, so using tools like Valgrind is next to impossible.

I don't know about Python, but for JavaScript can't you use the built-in devtools? Maybe try grabbing a heap snapshot, or recording an allocation profile, and go from there?

paulddraper · on Sept 13, 2017

Yeah. I thought so too (and still think so).

SO post: https://stackoverflow.com/questions/19798803/how-javascript-...

Chrome bug report: http://crbug.com/315190

Meteor blog (linked in article): https://blog.meteor.com/an-interesting-kind-of-javascript-me...

Live example (will crash due to memory leak): https://s3.amazonaws.com/chromebugs/memory.html

---

The reason this exists in all JS engines is performance; it's easier to have one context record instead of several.

Other languages do not do this. Off the top of my head: Lua, Java, Scala

vvanders · on Sept 14, 2017

Yeah, there's a nice link in the comments on the chrome bug on how Lua does it with upvalues: https://bugs.chromium.org/p/chromium/issues/detail?id=315190...

dingo_bat · on Sept 14, 2017

> Live example (will crash due to memory leak): https://s3.amazonaws.com/chromebugs/memory.html

Happy to report that no crash in Nightly.

Edit: No crash in Edge either.

paulddraper · on Sept 14, 2017

It'll crash Chrome because it puts stricter limits on JS memory. (Or something.)

Firefox and Edge won't crash, but you'll be using 3GB+.

dingo_bat · on Sept 14, 2017

Yup Firefox was at 3.6GB and edge at 2.8GB.

maxxxxx · on Sept 13, 2017

Would it be better to have to explicitly declare what variables you want to import into the closure like PHP or C++ do? C# also captures everything by default and by reference, which has tripped me up quite a few times.

jhgb · on Sept 13, 2017

That might help but it seems like either an implementation or language spec fix is in order. There doesn't seem to be a reason for a function without free variables to turn into a closure at all, thus preventing the issue.

paulddraper · on Sept 14, 2017

Yep.

Currently, the ECMAScript specification says nothing about GC.

And it seems every major JS engine has decided that this type of memory leak is okay.

So it's rather unlikely something will change.

bzbarsky · on Sept 14, 2017

It's for real, in V8. Other JS implementations may not have the same problem.

Jach · on Sept 13, 2017

The particular memory issue is new to me (though now I can watch out for it, yay) but I'm not surprised... JS lacking proper lexical scope causes many issues.

barrkel · on Sept 13, 2017

Lexical scope is a semantics concern - observable behaviour that is required for correctness. The case under discussion is an implementation concern - there's no necessity for it to leak by creating a linked list of activation records as further analysis could break the chain. The two are not related.

chasd00 · on Sept 13, 2017

When you say "proper lexical scope" do you mean just block vs function level scoping of variables? If so, I wouldn't say JavaScript is wrong, it's just different.

fenwick67 · on Sept 13, 2017

Who is this article written for? There's a whole section on "What is memory?". If you are optimizing to remove memory leaks I really hope you already know what memory is.

vijaybritto · on Sept 13, 2017

It was helpful for me. I don't have a CS background and I have been coding javascript for around 4 years now. There are a ton of other devs who don't have a proper knowledge of basics. If you are well versed then you can skip over to the next section.

mhh__ · on Sept 13, 2017

In my experience there are some programmers (I don't know how many) who, given how they were taught/learnt, can do productive work but generally don't know computing/programming in the abstract, e.g. memory at the machine level, or type systems.

userbinator · on Sept 14, 2017

> can do productive work but generally don't know computing/programming in the abstract

> e.g. Memory at the machine level,

I think it's the other way around --- their usual level of abstraction is too high to understand such things...

> or type systems

...and slightly too low to understand others.

skullum · on Sept 13, 2017

if you program you have some concept of what memory is. I found the review of basic terms helpful in understanding the common JS leaks. YMMV

SadWebDeveloper · on Sept 13, 2017

I'm with you pal. Memory leaks are the last topic you touch when optimizing a function: usually you start by lowering the execution time, followed by I/O blocking issues, then proceed to server-related issues and network latency, and only after all that is covered do you start looking for "memory leaks". So explaining memory is usually unnecessary at this point, since junior devs are usually more focused on producing software than optimizing it.

IMHO memory leaks are important only on embedded software because you start there with a really low memory available for your software to run.

lhnz · on Sept 13, 2017

It's also very important when building long-running applications (e.g. electron applications).

styfle · on Sept 13, 2017

Jump straight to "The four types of common JavaScript leaks" section:

https://blog.sessionstack.com/how-javascript-works-memory-ma...

dualogy · on Sept 13, 2017

Didn't jump for me, but this should: https://blog.sessionstack.com/how-javascript-works-memory-ma...

styfle · on Sept 13, 2017

Oops, it looks like medium removes the hash on load so my copy/paste didn't work. I fixed my link.

fn1 · on Sept 14, 2017

The main problem with JS being sluggish (in electron apps for example) is memory and especially the GC.

I can optimize CPU usage all I want, but only after I optimized for minimum allocations, the tiny, but noticeable lags now and then would disappear.

The average javascript-GC must be really simple/naive compared to seasoned workhorses like the JVM's various GCs.

There I can happily create millions of short-lived objects before getting problems in a single-user application.
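As a toy illustration of what "optimizing for minimum allocations" tends to look like in practice (the function names and the `scratch` object are made up for this sketch, and a modern engine may well optimize the first version on its own):

```javascript
// Allocates a fresh temporary object on every iteration, producing
// short-lived garbage for the collector to clean up.
function sumLengthsAllocating(points) {
  let total = 0;
  for (const p of points) {
    const delta = { x: p.x - 1, y: p.y - 1 };
    total += Math.hypot(delta.x, delta.y);
  }
  return total;
}

// Reuses one preallocated scratch object, so the loop itself allocates nothing.
const scratch = { x: 0, y: 0 };
function sumLengthsReusing(points) {
  let total = 0;
  for (let i = 0; i < points.length; i++) {
    scratch.x = points[i].x - 1;
    scratch.y = points[i].y - 1;
    total += Math.hypot(scratch.x, scratch.y);
  }
  return total;
}
```

Both functions compute the same result. The second trades readability for fewer allocations, which is only worth it in hot paths where GC pauses are actually measurable.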

chillacy · on Sept 14, 2017

Well, you can run JS on the JVM through Rhino (since replaced in the JDK by Oracle's Nashorn), but apparently perf is still largely worse than SpiderMonkey or V8. The language doesn't lend itself to optimization as much as Java does for JVM bytecode: https://blogs.oracle.com/nashorn/nashorn-architecture-and-pe...

unkown-unknowns · on Sept 14, 2017

Probably part of the problem is also the fact that JavaScript is a very dynamic language.

I think even the JVM team would struggle to improve on the state of the art in js vm tech. Their experience in making JVM might not be all that useful in the context of js.

irtefa · on Sept 13, 2017

Haven't read such an easy to read technical article in a while. Kudos!

dispo001 · on Sept 13, 2017

having stuff do stuff for you is useful until it isn't

haburka · on Sept 14, 2017

I'm really glad that I use a garbage collected language. Unless I'm doing low level programming that requires controlling allocation and freeing, it's amazing. Yes, I still have to understand the basics of memory but I'm just very glad that most of the time, the basics are far more than enough.

theprotocol · on Sept 13, 2017

I agree. I strongly dislike "magic" in programming. I prefer to call it "denial" because you need to know about the complexities anyway, and "magic" often means sweeping them all under the rug.

prophesi · on Sept 13, 2017

Exactly, you need to de-mystify the garbage collector so that it does what you want it to do.

abritinthebay · on Sept 13, 2017

At some point every abstraction above Assembly meets that criterion though. The lines are personal and mostly arbitrary.

taeric · on Sept 13, 2017

Meh, if you try and split hairs, even assembly has magic in it nowadays. Not all instructions take the same amount of time. Some flush caches and thus cause unexpected memory behavior, etc.

However, I think it is fair that most people learn roughly what the side effects are of each line at a local level.

Ironically, this is an argument against many functional languages. There are no side effects of the logic, per se. However, there are massive implementation side effects that are not necessarily easy to reason about.

The saving grace for the vast majority of people is that typically you can get by without knowing all of this. The people that care, do care. But statistically you are not one of them. :)

abritinthebay · on Sept 13, 2017

Like I said - it's mostly arbitrary. ;)

That said I think the issues with assembly you mention aren't magic as such, they're just consequences of the commands. They don't really hide much (if anything) behind the scenes that you'd have access to anyhow.

It's just that CPUs do so much more than they used to.

horsawlarway · on Sept 13, 2017

It's not that simple though. Most modern CPUs that support the x86_64 instruction set don't actually run them as instructions on the hardware. They do all sorts of magic to queue operations, increase pipeline throughput, manage register access, make branch predictions, etc...

You can think of assembly on those cpus as a high level language. It has little correlation with what's actually happening in hardware.

This is EXACTLY the same type of "magic" that is getting complained about above. The real implementation details are hidden and unknown, but the abstraction is useful.

abritinthebay · on Sept 13, 2017

Ah interesting. Not familiar with x86_64 really. Mostly 16 & 32 bit experience here.

taeric · on Sept 13, 2017

I worded my post poorly. The "However, I think it's fair" was me agreeing with you. Pretty much completely.

I was just musing on how the arbitrary line is probably not as difficult to see as many other lines we have out there. I think this would fall into "systems languages" and related things.

theprotocol · on Sept 13, 2017

True, in this context the side effects are not intentionally hidden under the pretext that "it works automagically."

theprotocol · on Sept 13, 2017

It's certainly subjective, but for me, what I pejoratively refer to as "magic" is things that there's just no escaping knowing, yet are abstracted in a way that obfuscates what's going on. Often, it's presented as "it just works" which ends up being a hindrance since there's just no getting around the thing that it's hiding for you.

def0wt · on Sept 14, 2017

> To prevent these mistakes from happening, add 'use strict'; at the beginning of your JavaScript files. This enables a stricter mode of parsing JavaScript that prevents accidental global variables.

I don't think using strict will prevent accidental global variables, such as this.var in global scoped function calls. Strict mode's main goal is to prevent inadvertently misspelled variables from going unnoticed.

ufo · on Sept 14, 2017

IIRC, if you use strict mode, `this` gets set to undefined by default instead of the global object.

jnordwick · on Sept 13, 2017

More extremely junior posts being rated to the front page. Y Combinator is changing and I don't like its new junior tutorial level.