Tail Calls, Optimization, and ES6 (duartes.org)
95 points by luu on May 24, 2014 | hide | past | favorite | 59 comments



It's important to explain why (or when) TCO is important. You could read a dozen articles like this that explain what a tail call is in simple terms and still not know.

The core demographic that cares about TCO is implementers of languages like Scheme, because if you have TCO:

* Loops can be written as recursive functions. In Scheme, you don't write loops, you write functions that re-invoke themselves at the end. This is efficient with TCO and impractical without (because the stack grows with every iteration).

* You can implement continuations. Imagine if every function took an argument "next", and instead of returning it called "next()". Without TCO, the stack would grow and grow, while with TCO, the old stack frame is torn down when you call "next()".
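To make the second point concrete, here's a hypothetical JavaScript sketch of that "next" style (continuation-passing): every function finishes by calling its continuation, so with TCO each frame could be discarded at the hand-off.

```javascript
// Continuation-passing sketch: each function's last act is calling `next`.
// With TCO those calls would not grow the stack; without it, each one does.
function add(a, b, next) { return next(a + b); }
function square(x, next) { return next(x * x); }

// compute (2 + 3)^2 by threading continuations
add(2, 3, sum => square(sum, result => console.log(result))); // 25
```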


There's a nice example from Shriram Krishnamurthi of implementing automata using functions, where the functions represent states and state transitions are just function calls. It's an elegant way of creating automata IMHO, but the efficiency of the approach depends on TCO:

http://cs.brown.edu/~sk/Publications/Papers/Published/sk-aut...
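The idea can be sketched in JavaScript (a hypothetical toy, not Krishnamurthi's actual code): each state is a function, and following a transition is just a tail call, so with TCO arbitrarily long inputs run in constant stack.

```javascript
// States of a tiny automaton accepting /^ab*$/: each state is a function,
// each transition a tail call to the next state's function.
function stateA(input, i) {
  if (i >= input.length) return false;  // need at least one 'a'
  return input[i] === "a" ? stateB(input, i + 1) : false;
}
function stateB(input, i) {
  if (i >= input.length) return true;   // accept at end of input
  return input[i] === "b" ? stateB(input, i + 1) : false;
}

stateA("abbb", 0); // true
stateA("ba", 0);   // false
```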


In your second example you just described most async JavaScript, though it tends to be called a "callback".


Nope, they're completely different.

In OP's second example, each caller function stays on the stack until its next() returns, thus the stack keeps growing in size.

In async functions, the caller is thrown off the stack once it hands off the callback to the event loop. This is why a callback can't return values or throw exceptions.
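A small sketch of the difference: by the time an async callback runs, the caller has already returned, so there is no frame left to return into or throw through.

```javascript
// The caller's frame is gone before the callback ever runs, which is why
// a callback can't return a value to (or throw into) its caller.
const order = [];
function later(cb) { setTimeout(cb, 0); } // hand the callback to the event loop

order.push("caller starts");
later(() => order.push("callback runs"));
order.push("caller returns"); // reached before the callback fires
```

Logging `order` after the event loop turns over shows `caller starts`, `caller returns`, `callback runs`, in that order.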


The problem is that this forces programmers to tightly couple interfaces and implementations.

Take for example a cache interface that wraps multiple implementations. The cache might be in-process or it might be a centralized memcached, etc. In order to support both of these, the "get" method must take a key and a callback. But the most reasonable in-process implementation actually grows the stack before calling the callback, while the other implementations reset it. If you iterate a large list of keys, you'll blow the stack with the in-process implementation.

So you end up doing things like "setImmediate()" on every get to reset the stack. The "async" library has a neat trick to detect if the callback is called without deferring to the event loop, in which case it does the deferral itself. But in general, programmers are forced to think about these kinds of problems constantly, or hit weird bugs. For example, I once had a periodic task that iterated every account in the database. It began crashing when our database client began returning larger batches, causing iteration not to defer to the event loop.
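The coupling can be sketched like this (names hypothetical): two implementations of the same `get` interface, one calling back synchronously and one deferring, and a key-iteration loop whose stack behavior silently depends on which one you hand it.

```javascript
const store = new Map([["k", 42]]);

// Same interface, different stack behavior:
function inProcessGet(key, cb) { cb(null, store.get(key)); }                     // sync: grows the stack
function deferredGet(key, cb) { setImmediate(() => cb(null, store.get(key))); }  // async: resets it

// Chain gets over n keys. With inProcessGet and a large n this recursion
// blows the stack; with deferredGet it runs in constant stack.
function iterate(get, n, done) {
  if (n === 0) return done();
  get("k", () => iterate(get, n - 1, done));
}

iterate(deferredGet, 1000, () => console.log("done"));
```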


* shouldn't throw exceptions


can't throw exceptions that the caller can catch. If an exception falls in the forest, and there's no one there to catch it…


Loops can be written as recursive functions, so that the compiler can use TCO to turn them back into loops...

If your language religiously omits loops, it's understandable why you'd need to emulate loops this way, but the alternative solution is: have loops in the language.

This leaves us with continuations, which is a great hack, but again, a hack, for languages which don't implement fibers/coroutines/generators natively (see Python generators, see C# await/async, etc.).

This is not a path we should go down: layering hacks upon hacks so we can implement other hacks on top of them.

If we need loops, have a language with loops. If we need coroutines, the language should implement coroutines.


There are a number of algorithms that, while able to be expressed as loops, are more succinctly expressed recursively. Obviously, the reverse is also true, but having both options in the language means there are more cases where the code is a cleaner expression of the algorithm being implemented.


The problem with this is that not every recursion qualifies for TCO, and when it doesn't, you're in big doodoo if your recursion goes deep.

This means a programmer has to tell apart two types of recursion based on how a compiler optimizes them. It's possible to learn, but this qualifies as a "leaky abstraction".

It's far better to use loops when you want to loop, and recursion when you want to recurse, and if the compiler happens to use TCO, so be it. But don't rely on TCO.

Instead, we see here whole algorithms which depend on TCO to work without filling up the stack. It's a path towards more obscure code, not more succinct code.


It's dead simple to tell a tail call apart. Does it look like

  return call(a, r, g, s)
It is also very simple to test if the compiler does TCO. Just write

  def f(a):
    return f(a + 1)

  f(0)
and wait for a stack overflow.
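For contrast, a quick JavaScript sketch of the same test: only the version whose recursive call is the very last thing it does is eligible for TCO.

```javascript
// NOT a tail call: the multiplication happens *after* fact returns,
// so the caller's frame must be kept alive.
function fact(n) {
  if (n === 0) return 1;
  return n * fact(n - 1);
}

// Tail call: the recursive call is the whole return expression,
// so an engine with TCO can reuse the frame.
function factAcc(n, acc = 1) {
  if (n === 0) return acc;
  return factAcc(n - 1, n * acc);
}

fact(5);    // 120
factAcc(5); // 120
```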


That gives an indication, but not if it sometimes does TCO but not always (like gcc for C).


The point here is the guarantee -- as specified by the language standard. There is no guarantee for TCO in C. The ES6 standard only guarantees the TCO for certain constructs which rapala's example showed (informally).


Trampolining is a technique to achieve tail-call elimination in ES5 [1].

However, I found this to only work for a linearly recursive algorithm (e.g. factorial) and not for a tree-recursive algorithm (e.g. tree construction).

I'm thinking TCO in ES6 might solve this problem? Anyone tried?

[1] http://raganwald.com/2013/03/28/trampolines-in-javascript.ht...
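For reference, a minimal trampoline along the lines of the linked post (a sketch, not raganwald's exact code): tail calls return a thunk instead of recursing, and a driver loop bounces until a non-function value comes back.

```javascript
// Drive thunk-returning functions in a constant-stack loop.
function trampoline(fn, ...args) {
  let result = fn(...args);
  while (typeof result === "function") result = result();
  return result;
}

// Linear recursion rewritten to return a thunk instead of calling itself.
function sumTo(n, acc = 0) {
  if (n === 0) return acc;
  return () => sumTo(n - 1, acc + n);
}

trampoline(sumTo, 100000); // 5000050000, with no stack overflow
```

This works because the recursion is linear; with two recursive calls (as in tree construction) a single thunk can't represent all the pending work, which matches the limitation described above.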


Another problem with trampolining in ES5 is that it produces GC pressure; not particularly ideal. TCO in future versions of JavaScript should be a big win, if only in that area.


I blogged about the advantages/disadvantages of trampolines in JavaScript a while back. See http://taylodl.wordpress.com/2013/06/07/functional-javascrip....

I'm preparing to analyze TCO in ES6. I'll do a write up of my analysis.



Apart from knowing whether your environment supports TCO, which could be an issue.


If you write ruby and want TCO put this at the top of your .rb file:

    RubyVM::InstructionSequence.compile_option = {
      :tailcall_optimization => true,
      :trace_instruction => false
    }


Why is it optional? What is the benefit of having it disabled by default if there is a decent implementation?


They disabled it for error reporting/debugging reasons. Because you don't need to create a new stack frame, you eliminate that information in the backtrace.


I've seen at least one paper that shows a way to fix this... hm.


A lot of people prefer having complete, explicit stacks for debugger use over having tail calls.


loops don't leave a stack trace either...


And also apply this patch, if you're using MRI (the C implementation, and the most common Ruby runtime): https://gist.github.com/plexus/11201946


This is absolutely the best news I've had all day. I've been a fierce advocate of proper tail calling since 1999, and I'm incredibly happy to see it gaining traction outside of niche functional languages like Racket. Many, many, many thanks to dherman and jorendorff and all of the others who made this happen.

Now, off to brag about this to my PL class!


A tail call is basically goto with arguments. You can use it to build arbitrary extensible state machines.

Like a goto, it's useful but potentially confusing. You shouldn't use it when a more structured loop construct will work.


> A tail call is basically goto with arguments.

At the lowest level, yes, but you could say the same about regular function calls, return statements, and even loops, conditionals, break and continue, etc.

The main selling point of using recursion for looping, IMO, is immutability. As jacquesm said, you don't need mutable local variables to track the state of the loop.

And in real-world functional programming you rarely use recursion to replace a loop; you mostly use abstractions like map/reduce/etc. along with lambdas.


Sure, it's all gotos at the bottom, but structured programming was introduced for a reason: it makes loops easier to reason about. In functional programming, functions like map and reduce serve the same purpose. My point is that most of the time you should use some higher-level control structure to represent a loop rather than bare gotos or bare tail recursion.

Also, avoiding mutation of local variables is a poor reason to prefer recursion. Mutable local variables are harmless so long as they don't escape (as in a closure). If the inputs and outputs are immutable then it's still a pure function. You can use them if it makes the program simpler and/or faster and the program as a whole is just as pure.

It's really too bad that functional languages avoid mutable local variables; it would make the divide between imperative and functional languages easier to jump.


Though people often criticise immutability on the grounds that machine code mutates registers and memory addresses, since those are scarce resources, compilers don't often generate such code directly. Compilers generate IR in static single assignment (SSA) form for better analysis and optimisation. That said, you should know why mutation is not inherently good, but a compromise with the pathetic reality of the hardware. And now you know the technique to avoid mutation in general as well.



> you don't need mutable local variables to track the state of the loop.

Erm, that's not really true, is it? Isn't the main way in which TCO is (usually? always?) done by introducing a counter/accumulator?

It does alleviate the kind of thing you see in c (or other similar languages):

    for (int i = 0; i < 10; i++)
    {
      i--; // whoops
      blah();
    }
But you can hide that counter in many, many (useful) ways, like with iterators, or just some syntax that avoids modifying (or even checking) the counter variable in the loop.

So recursion without TCO is generally a way to avoid local state variables (at the cost of an ever-expanding, eventually overflowing stack); implementing TCO tends to re-introduce a counter to avoid growing the stack (and then you have clever ways to emulate stack frames based on how far you got in the loop... sort of giving you the best (and/or worst) of both worlds).

Anyway, if all you want is a safe loop construct, recursion isn't the only option.

(All that said, I like the elegance of "lets pretend the call stack is infinite"-recursion -- and like the idea of TCO. Just don't oversell it.)


>> you don't need mutable local variables to track the state of the loop.

> Erm, that's not really true, is it? Isn't the main way in which TCO is done by introducing a counter/accumulator?

Yes, but accumulators aren't mutable. You're just passing something like acc*n or acc+1 to the tail call... acc is still the same.

Also, accumulators are not analogous to loop counters, they're more like the "sum" variable in this example (also mutable):

   var list = [0,1,2,3];
   var sum = 0;
   for (var i = 0; i < list.length; i++)
     sum += list[i];
> Anyway, if all you want is a safe loop construct, recursion isn't the only option.

I agree! The point of having TCO in a language isn't replacing every for/while loop with recursion. TCO is just an optimization. The right way (at least IMO and IME) is using better abstractions, like you mentioned. Recursion is just a way of easily building those abstractions, as are loops.

Btw, I probably write way more recursive functions in C than I do in Haskell, because in Haskell I have folds and catamorphisms.

I mean, the idiomatic way of summing those numbers as I did above in functional style is probably this:

    [0,1,2,3].reduce(function(a,b){return a+b;}, 0);
Or:

    sum [0,1,2,3]
in haskell ;)


>A tail call is basically goto with arguments.

Every function call is "goto with arguments". Sometimes there's even an argument that tells the code block where to jump back to when it's done.

>Like a goto, it's useful but potentially confusing.

How is it confusing? You don't even have to be aware of TCO to take advantage of it. Many recursive algorithms are tail-call-optimizable out of the box.

>You shouldn't use it when a more structured loop construct will work.

Ironically, there are many programmers who consider loops to be too low-level and unstructured compared to recursive algorithms.


There is a really nice example of the "goto" semantic for proper tail calls in "Programming in Lua". The example is a maze or adventure game where you need to move into different rooms (north, south, east west).

http://www.lua.org/pil/6.3.html

It's a nice example because it's simple, it doesn't confuse/intermix the concept with recursion, and it's an effective solution because it allows you to achieve the goal of moving between rooms (i.e. a state machine) elegantly. And in a language without proper tail calls, it is easy to envision how you could accidentally blow up the call stack.
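The same structure translates directly to JavaScript (a hypothetical port of the PiL example, not its code): each room is a function and each move is a tail call.

```javascript
// Two-room maze: the "current room" is whichever function is running,
// and moving is a (tail) call to the next room's function.
function room1(moves) {
  if (moves.length === 0) return "room1";
  const [dir, ...rest] = moves;
  return dir === "east" ? room2(rest) : room1(rest);
}
function room2(moves) {
  if (moves.length === 0) return "room2";
  const [dir, ...rest] = moves;
  return dir === "west" ? room1(rest) : room2(rest);
}

room1(["east", "east", "west"]); // "room1"
```

Without proper tail calls, a long game walks the stack one frame per move until it blows up; with them, it's a flat loop between states.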


Loops are a much higher level construct than tail calls. It's much easier to make a mistake iterating through a list using tail calls than it is using a loop or a higher order function. (E.g. you can forget to handle the base case.)


While I agree that higher-order functions are higher-level than tail calls, I'd say that the prevalence of things like off-by-one errors makes the comparison between loops and tail calls far more ambiguous.


I was thinking more of for-each loops etc. than low-level loops with explicit counters/iterators.


In FP loops are typically frowned upon and then the tail call is presented as an elegant solution, because it can be optimized into a loop but still look like a function call with all the goodies a function call normally gives you, minus the new stack frame.

I'm really not convinced of this: it looks like a function call, but it is a loop. If it is a loop under the hood then I'd like to see that loop; in my mind I always see this 'stack overflow' neon sign hovering over every tail call (of course it isn't there, but years of debugging have ingrained all kinds of interesting detection habits).

Making it look like a loop would introduce all kinds of syntactical complexity, and suddenly you'd be re-using your local variables. So that's not a solution either. But it feels a bit like a kludge.


It looks like a function call and it is a function call. There's nothing about a function call that requires pushing a stack frame ... as the existence of TCO demonstrates.


I'd argue that's the wrong interpretation. Function calls absolutely require pushing a stack frame, and TCO is no exception. TCO just says "oh hey, we're done with the current stack frame, let's toss it out before we call the next function". That's why TCO can apply to function calls which aren't recursive.

The recursive case still follows this rule; it's just that the old and new stack frames are identical, so the compiler makes another optimization of not bothering to remove the stack frame and simply reusing it for the next call (since it's going to be the same size anyway).

Even then, there are cases where this doesn't apply: if (in C) you use VLAs at the beginning which depend on a supplied argument, then even with TCO you still have to remove and create a new stack frame on every call.


I suggest you read "Debunking the 'Expensive Procedure Call' Myth, or, Procedure Call Implementations Considered Harmful, or, Lambda: The Ultimate GOTO": <http://library.readscheme.org/page1.html>, in particular section C.


I'm not so sure that in FP communities recursive-call-based algorithms are thought of as the elegant solution; rather, higher abstractions are desirable (e.g. map, filter, fold, etc.).


How do you implement map without TCO or loops? (I'm not saying this is impossible, just that I can't think of a way to do it off the top of my head.)


Basically you want to always use the least powerful tool you can to solve a problem. When applicable, map is better than foldl not because it's more concise, but because it tells the reader that the result is a list of the same length of the input, and the elements of the input are processed independently. Similarly, foldl is better than raw recursion because it tells the reader that every element in the input list is being processed in order, rather than potentially anything happening.

There's nothing wrong with using the more powerful tools when you need them (such as when implementing the more restricted tools). It's just that needing them frequently is a sign that you're doing weird things and should take a second look at your design to make sure there isn't a better way to do things.


Sure, it's all gotos at the machine level. There's nothing wrong with implementing map that way. The point is to use a higher-level, less free-form control structure when it matches the problem, because it gives the reader more information.


It's considered elegant because that way you only need functions, no other constructs required.


If you look at it from the perspective of continuations then tail calls are very natural. A function call is always a goto, the only difference is that calls in tail position do not need to add extra data to the continuation.


Of course! But that still doesn't mean you should use them most of the time. Continuations are great for writing compilers but confusing to use elsewhere. Just like you don't normally write assembly (but it's necessary sometimes), you shouldn't normally use these exceptionally flexible but unstructured abstractions in a program you want other people to read.

Instead you want to use constructs that are familiar and easy to reason about. It's hard to beat a for loop.


Sure, whether you should use them is a software engineering question. I was merely speaking to their theoretical elegance, pointing out that from a continuation perspective tail call optimization isn't really an optimization, it's just the most natural way to implement procedure calls to begin with.


By continuations I believe Jules means Continuation-passing style, a popular optimization/intermediate format in compilers:

http://en.wikipedia.org/wiki/Continuation-passing_style


I agree wholeheartedly with the main thrust of your argument, but I think it is slightly disingenuous to put the confusability of gotos and tail calls in the same league. Yes, tail calls are gotos, but a tail call is an order of magnitude better than a goto at conveying the meaning of a code snippet to a human.

This is anecdotal, but for one thing, goto labels are not as well named as function names, and with a goto it is more difficult to guess what data the destination code is expected to touch.

Tail calls are an improvement on both counts. I often do not have to break context from reading a piece of code to start reading the piece of code that the program counter jumps to (unless it is a very local jump). The arguments passed by a tail call and the name of the function being tail-called often (i) make breaking code-reading context less necessary and (ii) keep larger semantic jumps within the realm of easy human reasoning. With gotos, the jumps had better be within 20 lines of text.


So ClojureScript will get TCO before Clojure? (JavaScript would get this before Java) Interesting times we live in.


Well, Java does have TCO already, just depending on the JVM you're using. J9 supports it, just not HotSpot.

http://stackmonthly.com/2010/9/tail-call-optimization


How will this affect the debugging of stack frames in JavaScript programs that utilize TCO?


Scheme implementations have had to deal with this problem for a long time. (Of course, one can argue that languages with ordinary while loops have a similar problem, in that values from previous iterations of the loop get overwritten by later iterations.)

You can alleviate it by hanging on to stack frames for a while. Scheme48 heap-allocates stack frames and then relies on garbage collection to reclaim space from tail calls, so this happens automatically. And MIT Scheme uses a ring buffer to keep around the last 10-or-so frames. (Actually it is even nicer than that: it uses a ring-buffer-of-ring-buffers to keep 10-or-so frames from each of the last 10-or-so non-tail recursive calls. So when debugging you can pop up the call stack, and then at each level examine the last couple of iterations of the loops done at that level.)
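The ring-buffer idea is easy to sketch (a toy in JavaScript, not MIT Scheme's implementation): record each tail-call "frame" into a fixed-size buffer and keep only the most recent few for the debugger.

```javascript
// Fixed-size history of tail-call frames: old entries are overwritten,
// so memory stays bounded while the last few frames stay inspectable.
function makeFrameRing(size) {
  const ring = new Array(size);
  let next = 0, count = 0;
  return {
    record(frame) { ring[next] = frame; next = (next + 1) % size; count++; },
    recent() { // oldest-to-newest of whatever survives
      const n = Math.min(count, size);
      return Array.from({ length: n }, (_, i) => ring[(next - n + i + size) % size]);
    },
  };
}

const frames = makeFrameRing(3);
for (let i = 1; i <= 5; i++) frames.record(`iteration ${i}`);
frames.recent(); // ["iteration 3", "iteration 4", "iteration 5"]
```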


in theory it could make them less informative, but I suspect in practice most engines that implement it will figure out a way to make stack frames usable with TCO


...And then there's Python.

(Although, amusingly enough, you can actually hack TCO into Python via decorator weirdness)


Not TCO. Tail recursion can be optimized, though. See this[1] discussion at LtU.

I wish GvR would let TCO through. With TCO, Coffeescript or Clojurescript might become new favorites.

1. http://lambda-the-ultimate.org/node/1331



