Tame.JS: Flow-control by the makers of OkCupid.com

aston · on July 18, 2011

I've been waiting for this sort of thing for the longest time. The first time I saw the callback passed to a callback of a callback's callback style in Node.js, I wondered why the code couldn't look as nice as the async C++ I'd written at OkCupid. From my lips to Max & Chris's minds... (telepathy!)

How long 'till the port to CoffeeScript syntax?

malgorithms · on July 18, 2011

It's funny: when Max started programming Tame.JS a couple of weeks ago, I exchanged emails with a developer who had tried to make something similar for CoffeeScript but couldn't convince others it was needed. He said the common response was that if you need something like Tame for async, you're not "doing it right." (Obviously we disagree.)

jashkenas · on July 18, 2011

I'd be more than happy to explore the addition of Tame.js-style CPS to the CoffeeScript compiler -- but there's a lot of prior work there already:

https://github.com/jashkenas/coffee-script/issues/241

https://github.com/jashkenas/coffee-script/issues/287

https://github.com/jashkenas/coffee-script/issues/350

Edit:

Things look a little less promising after running a simple test. This input JavaScript:

    while (i--) {
      twait {
        fs.readFile("one");
        fs.readFile("two");
      }
    }

Gets compiled into this resulting "tamed" JavaScript:

    var tame = require('tamejs').runtime;
    var __tame_fn_0 = function (__tame_k) {
        var __tame_k_implicit =  {};
        var __tame_fn_1 = function (__tame_k) {
            if (i --) {
                var __tame_fn_2 = function (__tame_k) {
                    var __tame_ev = new tame.Event (__tame_k);
                    var __tame_fn_3 = function (__tame_k) {
                        fs .readFile ( "one" ) ;
                        fs .readFile ( "two" ) ;
                        tame.callChain([__tame_k]);
                    };
                    __tame_fn_3(tame.end);
                    __tame_ev.trigger();
                };
                tame.callChain([__tame_fn_2, __tame_fn_1, __tame_k]);
            } else {
                tame.callChain([__tame_k]);
            }
        };
        __tame_k_implicit.k_break = __tame_k;
        __tame_k_implicit.k_continue = function() { __tame_fn_1(__tame_k); };
        tame.callChain([__tame_fn_1, __tame_k]);
    };
    __tame_fn_0 (tame.end);

... not so nice to work with or debug. The general conclusion of that series of tickets was that the code generation required to make this CPS transformation work with all edge cases is a bit too hairy to be worth it on balance. Depending on how much sequential async you're doing, YMMV.
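For comparison, a hand-written callback version of the same loop might look like the sketch below. It assumes each read is handed a completion callback (the input above passes none), and `readFileStub` is a hypothetical stand-in for `fs.readFile` that calls back synchronously so the example runs without touching the disk.

```javascript
// Hypothetical stand-in for fs.readFile: records the request and calls back at once.
const reads = [];
function readFileStub(name, cb) {
  reads.push(name);
  cb(null, 'contents of ' + name);
}

// Hand-written equivalent of the loop: each pass starts two reads and
// moves on only after both callbacks have fired.
function loop(remaining, done) {
  if (remaining === 0) return done();
  let pending = 2;
  function joined() {
    if (--pending === 0) loop(remaining - 1, done);
  }
  readFileStub('one', joined);
  readFileStub('two', joined);
}

let finished = false;
loop(2, () => { finished = true; });
```

With real asynchronous reads the structure stays the same; the only difference is that the stack unwinds between passes instead of recursing.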

maxtaco · on July 18, 2011

It's on the list of todos to preserve the input line-numbering in the output code. This would mean some changes to the code emitter, and also to the code emitted to preserve the ordering of statements.

In tame/C++, debugging either compiler or runtime errors is straight-ahead, as the developer doesn't need to examine the mangled code. Now if only JavaScript had the equivalent of cpp's #line directive....

jashkenas · on July 18, 2011

We're getting there ... maybe this summer.

https://bugs.webkit.org/show_bug.cgi?id=63940

https://bugzilla.mozilla.org/show_bug.cgi?id=618650

jjm · on July 18, 2011

Progress was made but looks like it's stalled atm. Since Tame has an endpoint maybe easier to just suck it in?

Edit:

That compiled script looks a wee scary. I need to be able to fully dive into a debugger with clarity and can't imagine if there were tens or hundreds of lines of this.

cpr · on July 18, 2011

As jjm says, with Tame as a concrete, proven solution, it'd sure be great to have CS adopt something like it. Turning the problem "inside out" like this may be the best approach.

I.e., it has well-defined semantics and "only" involves JS rewriting, which is CS's forte.

po · on July 18, 2011

The branch which attempted the defer keyword also had working code. The input was clean and the edge cases of what to do with return, try/catch, etc had been thought through.

The stopping issue was twofold: CoffeeScript cares about the readability of both the compiler input and output, and CoffeeScript cares about not inserting dynamic lookups and calls into the program, which can add an unpredictable performance impact.

Perhaps a fork of CoffeeScript that is targeted at writers of async Node.js code would be able to safely trade those off. I think jashkenas believes that JavaScript will benefit from many little languages springing up to solve specific problems.

Cushman · on July 18, 2011

Not surprised. A rough transliteration of `huntMen` for example into CoffeeScript using normal node callback syntax is under 20 lines and relatively easy to understand:

  huntMen = (buffy) ->
    soulmates = (buffy, cb, mates = []) ->
      getMatches buffy, 10, (userids) ->
        for u in userids
          do (u) ->
            getThumbnail u, (thumb) ->
              isPicAVampire thumb, (is_vamp) ->
                unless is_vamp
                  getPersonality u, (personality) ->
                    # `match` is left as in the original; it is not defined in this snippet
                    getLastTalked u, match, (last_talked) ->
                      # accumulate into `mates`; `soulmates` is the function itself
                      mates.push userid: u, thumb: thumb, personality: personality, last_talked: last_talked
                      if mates.length >= 10
                        cb(mates)
                      else soulmates(buffy, cb, mates)
                else soulmates(buffy, cb, mates)

    soulmates buffy, (soulmates) ->
      # Do whatever you need to with soulmates

Obviously it's still more of a hassle to deal with than a more featurey async library, but there isn't quite the panic of trying to do the same thing in JS.

Edit: For comparison, the Tame.JS style transliterates to ~15 lines of CS— but presumably that would compile to dozens more lines of JavaScript.

In exchange, it is lying to you about what the program is actually doing, as opposed to the callback syntax which obscures nothing. So yeah, not surprised if you don't get too much traction with examples like that.

judofyr · on July 18, 2011

"In exchange, it is lying to you about what the program is actually doing, as opposed to the callback syntax which obscures nothing."

I'm really fascinated with this thought that adding sugar/coroutines is "lying". Everything is based on abstractions; this is just another one. This is not a different "lie" than any other abstraction (assembly, C, OS, JavaScript).

Cushman · on July 18, 2011

That's a fair point— it's certainly subjective. I'd disagree that an abstraction is an abstraction, though; some are more deceptive than others.

Node comes with single-threaded, callback-based async that works quite well, but it's a bit of a hassle to use for complex stuff.

Nothing wrong with that— abstraction time! The async module, for example, comes with a few different callback-based flow-control patterns. The Buffy example would look something like this:

    huntMen =(buffy)->
        soulmates = []
    
        getSoulmate =(callback)->
            mate = {}
            # `u` and `match` are left as in the original; neither is defined in this snippet
            async.series [
                (cb)->getThumbnail u, (thumb)->cb null, mate.thumb = thumb
                (cb)->isPicAVampire mate.thumb, (is_vamp)->cb('vampire' if is_vamp)
                (loaded)->async.parallel [
                        (cb)->getPersonality u, (p)->cb null, mate.personality = p
                        (cb)->getLastTalked u, match, (l)->cb null, mate.last_talked = l
                    ], loaded
                ], (err)->
                    soulmates.push mate unless err
                    callback()
        async.whilst (->soulmates.length < 10), getSoulmate, ->
             #Do whatever with soulmates

(I don't actually use async much, so forgive me if there's an error there.)

Async's abstractions are what I would call "honest". It's still using callbacks, it's obvious what the relationship between them is. The meanings of 'series' and 'parallel' are clear; I could write them out myself, it'd just take longer. Nothing about what this code does is being obscured by the abstraction, just made prettier. I know (or can easily work out) exactly what will be executed.
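To illustrate the "I could write them out myself" point, here is a minimal sketch of a series runner in plain JavaScript. It is not the async module's implementation; `series` and the stub tasks are hypothetical, and the tasks call back synchronously to keep the example self-contained.

```javascript
// Run tasks one after another; stop at the first error.
// Each task has the shape task(cb), where cb(err, result).
function series(tasks, done) {
  const results = [];
  function next(index) {
    if (index === tasks.length) return done(null, results);
    tasks[index](function (err, result) {
      if (err) return done(err, results);
      results.push(result);
      next(index + 1);
    });
  }
  next(0);
}

// Stub tasks (hypothetical): the second one reports a "vampire" error,
// so the third task never runs.
const log = [];
series([
  (cb) => { log.push('thumbnail'); cb(null, 'thumb.png'); },
  (cb) => { log.push('vampire check'); cb('vampire'); },
  (cb) => { log.push('personality'); cb(null, 'nice'); },
], (err, results) => {
  log.push('done: ' + err + ' after ' + results.length + ' result(s)');
});
```

Every step is an ordinary callback, which is the sense in which the abstraction hides nothing about control flow.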

Tame is abstracting the same single-threaded, callback-based async, but it's trying to make it look like it's using threads. I know in theory it must be turning my code into callbacks which are being passed around, but it's deliberately trying to make that unclear. The result is that I have a somewhat worse understanding of what my program actually does.

To be clear, I don't have a huge beef; like you say, abstractions are necessary, and Tame seems fine to me. It's just my personal preference for more honest abstractions over more dishonest ones (no doubt heavily influenced by the fact that I actually love callback-based async, which I think puts me in a small minority.)

malgorithms · on July 18, 2011

I believe this code is painfully unreadable, and that's despite the fact that CoffeeScript is very elegant and easy to read. (It's not you, it's async.) Further, maybe I'm misreading this, but it appears getPersonality and getLastTalked are fired in serial. Can someone who knows CS well fix that and reply? Thanks!

Cushman · on July 18, 2011

Ah, that's a good catch. Missed that. Calling those in parallel is legitimately a hassle to roll yourself, something like:

              ...
              isPicAVampire thumb, (is_vamp) ->
                unless is_vamp
                  mate = userid: u, thumb: thumb
                  # runs after each callback; proceeds only once both fields are set
                  finish = ->
                    return unless mate.personality? and mate.last_talked?
                    mates.push mate
                    if mates.length >= 10
                      cb(mates)
                    else soulmates(buffy, cb, mates)

                  getPersonality u, (p) -> finish mate.personality = p
                  getLastTalked u, match, (lt) -> finish mate.last_talked = lt

Anyway, I'll grant you that code is pretty harsh. It's much nicer with syntax highlighting and wide tabs, but still, fair cop. The flip side is that that code compiles into JavaScript that does exactly what it says.

sawyer · on July 18, 2011

I think adding Tame style CPS to CoffeeScript would be amazing; it looks like an incredibly clean way to write the async code necessary for complex Node apps.

From my understanding of the prior work, the issue with adding defer or <- to CS was that it required too much overhead to get right in all cases. Does Tame.JS's approach reduce that overhead in any way, or is this essentially the same work that's already been explored for CS, broken out into a dedicated compiler?

tjholowaychuk · on July 18, 2011

adding something like this at the grammar level is a massive hack... we can just use coros

statictype · on July 18, 2011

Ever since I first heard of CoffeeScript, I'd been hoping that features like this would make it into the language. It's not realistic to wait for JavaScript interpreters in the browser to catch up, but this would be a perfect addition for a compiler like CoffeeScript.

pmjordan · on July 18, 2011

The problem this solves is a serious one, in my experience, even though I find their choice of syntax rather curious. Given that JavaScript 1.7 introduces the yield keyword, it would make sense to add support for that to V8 and implement the infrastructure for concurrent asynchronous I/O around that as a library. The concurrency aspect is, after all, orthogonal to the blocking vs. callback situation, and can easily be done even when using callbacks, with a single callback function called upon completion of all concurrent I/O. I believe the Dojo framework provides such a utility, and I wrote my own simplistic stand-alone mini-library for exactly this a while back. [0]
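A minimal sketch of that "single callback on completion of all concurrent I/O" idea follows. The `whenAll` helper is a hypothetical name for illustration, not the API of Dojo or of the library linked in [0]; the stub tasks park their callbacks so they can be completed out of order by hand.

```javascript
// Call done(results) once every task has reported back; results keep task order.
// Each task has the shape task(cb), where cb(result).
function whenAll(tasks, done) {
  const results = new Array(tasks.length);
  let pending = tasks.length;
  if (pending === 0) return done(results);
  tasks.forEach((task, index) => {
    task((result) => {
      results[index] = result;
      if (--pending === 0) done(results);
    });
  });
}

// Stub tasks (hypothetical): each parks its completion so we can trigger it later.
const parked = [];
let combined = null;
whenAll([
  (cb) => parked.push(() => cb('personality')),
  (cb) => parked.push(() => cb('last talked')),
], (results) => { combined = results; });

parked[1]();                 // second task finishes first; still waiting on the first
const afterOne = combined;   // still null at this point
parked[0]();                 // now both are done and the single callback fires
```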

I've run into the problem of endless chained callbacks in C, where it's much worse due to the lack of nested functions, let alone closures or garbage collection.[1] I ended up using the switch block "coroutine" hack [2] for the worst cases, along with dynamically allocated "context" structs to hold "local" variables. A proper macro system would have helped transform blocking code into CPS form. I tried to integrate an SC [3] pass into our build, which could have done it, but ran into all sorts of practical/yak shaving problems, so I ended up with the C preprocessor macro/switch solution for now. In user space, explicit stack-switching with something like swapcontext() is probably preferable, if you can get away with it, but in the kernel this is rather problematic.

[0] https://github.com/pmj/MultiAsync-js

The reason I wrote my own was because I originally needed it in Rhino, the JVM-based JS implementation, and I couldn't find anything similar that worked there.

[1] Yes, there are garbage collectors that work with C, but to my knowledge, none of them can be used in kernel modules. In any case, the other 2 issues are worse and aren't solvable within the language via libraries.

[2] http://www.linuxhowtos.org/C_C++/coroutines.htm

[3] http://super.para.media.kyoto-u.ac.jp/~tasuku/sc/index-e.htm...

maxtaco · on July 18, 2011

My first thought for implementing tame.js was with yield, but V8 doesn't currently support it (though it's reserved as a "future" keyword). A direct use of yield (without anything like twait) would probably make node code look more like Python/Twisted code, which while better than the status quo, still can get unmanageable in my experience.

Agreed that twait conflates concurrency with blocking/callbacks, but in my experience, it's a natural and useful combination.

bdarnell · on July 18, 2011

I think you could do something manageable with yield alone (at least with python-style generators). I've been meaning to try something like this with tornado. The general idea is that yielding would either immediately produce a callback or asynchronously "wait" for a previously-generated callback to be run. It would look something like this:

    doOneThing(yield Callback("key1"))
    andAnother(yield Callback("key2"))
    res1 = yield Wait("key1")
    res2 = yield Wait("key2")

pmjordan · on July 18, 2011

True, you wouldn't get the syntactic sugar of the result variables - you would have to return them as an array or object with named properties. Lack of V8 support is of course the killer, and when I wrote my original comment, I for some reason assumed you'd modified V8 (at which point you may as well have implemented yield), but it turns out you just do the CPS transform in a preprocessing pass. Nice!

snprbob86 · on July 18, 2011

Looks a lot like C# 5's await/async keywords:

http://blogs.msdn.com/b/ericlippert/archive/tags/async/

Cool to see growing interest for this at the language level.

contextfree · on July 18, 2011

It looks even more like F# async workflows, which compared to the C# async feature have the advantage of being implemented in the current shipping version rather than the next one.

reustle · on July 18, 2011

I've been very happy with "parallel" in the async library by caolan https://github.com/caolan/async

sjs · on July 19, 2011

Why hasn't anyone brought up error handling yet? What happens when an error is thrown inside a twait block? What happens when 2 errors are thrown inside a twait block?

Tame.js looks nice in that it's very simple to learn but ~300 lines of plain old JavaScript[1] can give you a general purpose deferred/promise library with more flexibility should you need to do something other than wait on N async operations and then use the results.

[1] https://github.com/heavylifters/deferred-js/blob/master/lib/...

geuis · on July 18, 2011

In what significant ways is this different from deferreds/promises/futures?

sjs · on July 19, 2011

Your code is indented less than it might otherwise be indented, and you type fewer literal functions. Oh and there wasn't any mention of error handling so apparently there is nothing to help you when one of the calls in a twait block throws. I wonder how twait interacts with try/catch. If it's purely a CPS transform then yeah, you're just fucked on error handling.
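The concern can be seen without Tame at all: once the rest of a function has been turned into a callback, a try/catch around the call that schedules it is no longer on the stack when the callback runs. A minimal sketch, using a hypothetical `later` queue in place of a real event loop (it says nothing about what Tame itself emits):

```javascript
// Hypothetical stand-in for the event loop: callbacks are parked, then run by hand.
const pending = [];
function later(fn) { pending.push(fn); }

let caughtAtCallSite = false;
try {
  // Scheduling the continuation does not throw; the error happens later.
  later(() => { throw new Error('read failed'); });
} catch (e) {
  caughtAtCallSite = true;  // never reached
}

// Later, the "event loop" runs the callback outside the original try block.
let escaped = null;
try {
  pending.shift()();
} catch (e) {
  escaped = e.message;
}
```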

mkevent is like creating a Deferred and twait is like creating a DeferredList and calling when/then on it with the remainder of the code after the twait block used as the callback. Conceptually this is a very specific and concrete subset of functionality that promises give you.

voidfiles · on July 18, 2011

I was wondering the same thing. If the spec evolves tools to handle this stuff, that's one thing, and of course this could be an example solution, but right now I think the problem has a solution without adding extra keywords to the language.

bialecki · on July 18, 2011

I would've thought the same thing until I wrote substantial async code in JS and Python.

It's not that it doesn't work and you can't do it, it's that the code becomes a mess and their example gets at that. It probably doesn't seem like a big deal, but when you constantly have to pass around callbacks and chain requests together you get this feeling the code could look a lot cleaner than it is. You want asynchronous behavior but with synchronous syntax.

This isn't possible without adding something, and, having seen a lot of the solutions out there, it's nice to see someone take a stab at it by changing the language. The cost is huge (preprocessing, etc.), but, speaking from experience, the simplicity of the code you write might make the change worth it. You get the feeling in 5-10 years this will be a solved problem, but I'm not sure any of the solutions out there yet will be the accepted solution.

Pewpewarrows · on July 18, 2011

This seems neat, but after reading through it twice I can't seem to understand how this provides any advantage over just using Deferreds. Someone care to enlighten me?

valyagolev · on July 18, 2011

That reminds me of monads in Haskell (they can solve this problem, and others as well, mathemagically). I've seen a proposal somewhere to add them to JavaScript, but I doubt the idea will be loved by the public.

(By "add them to JS" I mean some syntactic sugar, not a library)

pmjordan · on July 18, 2011

Proposal? JavaScript 1.7 introduced generators circa 2006, which I do believe lets you implement something like this (in addition to a bunch of other cool things).

https://developer.mozilla.org/en/New_in_JavaScript_1.7#Gener...

Unfortunately, V8 doesn't support generators, as far as I can tell.

valyagolev · on July 18, 2011

sorry? generators? well, I like generators, but what's your point?

pmjordan · on July 18, 2011

  yield;

This suspends the generator and returns control flow to the caller until the generator is resumed. I haven't actually tried it, but it should be pretty straightforward to wrap a generator function in such a way that it's driven (resumed) by callbacks from async I/O calls.
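A sketch of that wrapping, assuming a runtime with ES2015-style generators (`function*`), whose syntax differs from the JavaScript 1.7 generators mentioned above. The `run` driver and `fetchStub` are hypothetical names; each yielded value is a function that takes a callback, and the driver resumes the generator with the callback's result.

```javascript
// Drive a generator: every yielded value is a function expecting a callback.
function run(genFn, done) {
  const gen = genFn();
  function step(value) {
    const next = gen.next(value);   // resume; `value` becomes the result of `yield`
    if (next.done) return done(next.value);
    next.value(step);               // resume again when the async call reports back
  }
  step(undefined);
}

// Hypothetical async call; it calls back synchronously to keep the sketch runnable.
function fetchStub(name) {
  return (cb) => cb(name.toUpperCase());
}

let output = null;
run(function* () {
  const a = yield fetchStub('one');   // reads like blocking code
  const b = yield fetchStub('two');
  return a + ' ' + b;
}, (result) => { output = result; });
```

The generator body reads sequentially, while the driver is the only place that deals with callbacks.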

janetjackson · on July 18, 2011

This is the wrong solution to the problem, and it's implemented poorly.

Use a proper control-flow ( not "flow-control" ) library like https://github.com/caolan/async.

Furthermore, why would you write an entire custom JS parser for this? Why not use one of the many pre-existing ones that are much more stable, more developed, and well supported?

__david__ · on July 18, 2011

Seriously? The library solution is pretty ugly, though it's a good solution if you are absolutely dead set against compiling your "JavaScript". Tame, being a code transformer, makes the equivalent code so much more readable and clean looking.

How on earth is readable, clean looking code "the wrong solution to the problem"? I would argue that it's almost always the right solution to a problem.

frankdenbow · on July 18, 2011

Slightly unrelated: There was a site recently on HN that was a listing of various JS libraries, like this one, on one page. What was it?

matthiaswh · on July 18, 2011

There are two that fit your description and are really useful:

http://www.everyjs.com/

http://microjs.com/

frankdenbow · on July 18, 2011

thanks! microjs was what i was thinking of

yaix · on July 18, 2011

This is nice in the browser, but not very useful in nodeJS.

twait{} will block and stop my nodeJS process from doing anything else.

It would be more useful if I could give twait{} a callback to fire when all its async events completed. Then my nodeJS process could do other stuff while waiting for a twait{} bundle to finish.

malgorithms · on July 18, 2011

No, that's not the case. twait won't block your Node process from handling other events.

yaix · on July 19, 2011

Isn't it waiting to execute anything that follows after a twait code block?

If so, then it is blocking. Otherwise, how do you manage to let other code be executed, except the code you don't want to be executed until all functions in the twait block have returned?

There would need to be a callback attached to the twait block, but there isn't. So it's blocking.

Because that is its purpose: to block further execution until all data from the non-blocking functions has returned.

baudehlo · on July 19, 2011

From looking at it, each twait block is blocking as a unit, but other code elsewhere from the twait block will still run.
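One way to picture this: the code after a twait block becomes a continuation that runs once the block's callbacks have fired, while the event loop stays free for other work in between. The sketch below models that with a hand-rolled queue; `later`, `runLoop`, and `handler` are hypothetical names, and this is not Tame's generated code.

```javascript
// Hypothetical stand-in for the event loop.
const queue = [];
const log = [];
function later(fn) { queue.push(fn); }
function runLoop() { while (queue.length) queue.shift()(); }

function handler(name) {
  log.push(name + ': start');
  // "twait" equivalent: the rest of this handler is parked as a continuation,
  // and control returns to the caller straight away.
  later(() => log.push(name + ': after wait'));
}

handler('A');
log.push('other work');   // runs while A is still "waiting"
handler('B');
runLoop();
```

Each handler pauses at its own wait point, but nothing stops other handlers or other code from running in the meantime.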

teyc · on July 19, 2011

Also see JSCEX, which uses the C# keyword await

http://groups.google.com/group/nodejs/browse_thread/thread/3...

lzm · on July 18, 2011

The next version of Visual Studio will have something similar: http://msdn.microsoft.com/en-us/vstudio/async.aspx

trungonnews · on July 19, 2011

So the code we write is clean and easy to understand, but the debugger only works with the compiled version of the code?

In other words, write in C, and debug in assembly...

starwed · on July 19, 2011

There's a bug in huntMen. if (! is_vamp) should instead be if ( is_vamp)... :P

tjholowaychuk · on July 18, 2011

why not just use node-fibers?

maxtaco · on July 18, 2011

A key advantage of fibers of course is that they preserve exception semantics, whereas tame can't in all cases. I'm not too interested in reviving the stalemated thread v. event religious war. I prefer explicitly-managed events, but if others prefer a more thread-like API, by all means....

jrockway · on July 18, 2011

In the end, both are the same thing; managing action-specific state in an application-defined per-action data structure. With OS threads, you let the OS manage the state instead. This can be inefficient with a large number of actions, because most of the state kept has nothing to do with the application itself; it's OS bookkeeping overhead.

sandstrom · on July 18, 2011

I remember reading something about fibers from Asana (maybe that's the same as node-fibers): http://asana.com/2010/10/adding-fibers-to-v8-efficiency-clar...

tlrobinson · on July 18, 2011

Same idea, not the same implementation: https://github.com/laverdet/node-fibers

_mnjb · on July 18, 2011

For those looking for the plain JavaScript version of the examples: https://gist.github.com/1090228