Hacker News new | past | comments | ask | show | jobs | submit login
What's a Closure? (nathansjslessons.appspot.com)
254 points by gulbrandr on Aug 5, 2011 | hide | past | favorite | 57 comments



I found that structure of teaching awesome!

If you already know some JavaScript you can skip the beginning and start right away at 8, "Nested Functions". Otherwise if you know any other programming language just start with the lessons and reach the lessons and learn about closures in less than 10 minutes, and -- maybe even more important -- learn many of JavaScript's basic on the way.

I really hope the author of this does many more lessons. I found this to be a really great way to introduce some of JavaScript's somewhat awkward features, like local variables and function scope.

Compared to Eloquent Javascript that also features try-as-you-read exercises I found this site to be more condensed. Decide for your own whether this is an advantage or a disadvantage. I'd suggest taking the best of both worlds, use this to get the basics and see if you get a hang on JS, and if yes continue with the examples from Eloquent Javascript.


See also http://ejohn.org/apps/learn/ and http://code.pageforest.com/ and http://www.stanford.edu/class/cs101/

And similar ideas for teaching java interactively in the browser: http://ijava.cs.umass.edu/ http://math.hws.edu/javanotes/

The idea itself is old, search for 'active essays' for example, and most recently wolfram alpha's computational document format.


code.pageforest.com is in early alpha now - but we'd be happy to have comments on the system we're building. The goal is to build a site that can allow anyone to not only solve JavaScript coding problems, but also to author new ones.


Maybe somebody could explain what the difference between a lambda, a closure and a monad is, and how/if they are different from a function pointer (or generalization of it like a signal), or from unary or binary function objects. (my C++ bias in asking this question may be obvious ;) )

I see the first three concepts used interchangeably (as far as I understand), but maybe that's because they're used slightly different between different communities?


A lambda is an anonymous function (i.e., a function without a name). It may or may-not be a closure, depending on if it has access to surrounding lexically scoped variables.

A closure is a function, combined with the set of variables declared outside its definition (a 'lexical scope').

A monad has absolutely nothing to do with functions/lambdas/closures. The 5-second explanation is that it's a "design pattern" that's useful for managing control flow (often chaining operations together). But Google around a bit if you're interested.

A function-pointer/function-object in C++ are functions that don't have access to external lexically scoped variables. They can only access their arguments, or dynamically scoped (roughly 'global') variables. They are 1/2 of a closure (the missing half is the definition's lexical environment).


Monads, in the context of programming, are used as abstractions for computation. That's all - some people mention state, because some useful computational strategies maintain state - but not all do. In Haskell:

The List monad represents nondeterministic computation. Each possible result from the previous step in the computation is fed into the next step of the computation.

The Maybe monad represents computation that can fail. If one step of the computation fails, then the rest fail. The previous step of the computation is only fed into the next step of the computation if it succeeds.

The IO monad represents globally stateful computation. The entire state of the program is fed into each step of the computation. Kind of. That's sort of the idea, anyway - it's a model.

The Reader monad represents computation in the scope of a single shared bit of data.

Only the latter two mention state. The idea of monads in programming is just abstracting computational strategies, and sometimes those strategies require or provide state.

Monads are neat in Haskell/ML because you can write code that works on arbitrary monads. Think of it this way: you know how you can write code that operates on a `List` in Java, and it will work on `ArrayList` and `LinkedList`? Well in Haskell, you can write code that operates on/with a given function regardless of computational strategy used.


I see the first three concepts used interchangeably (as far as I understand)...

Either you misunderstood, or the people using them misunderstood.

Lambdas are anonymous functions. In javascript terminology:

    _.map(someNumbers, function (x) { return x +1; } /* <- lambda */)
Closures are a way of hiding state (or at least data) in anonymous functions:

    makeCounter() {
        var x = 0; 
        function numCalls() { return x++; }; <---the variable x is "closed over"
        return numCalls;
    }
Monads are a method of encapsulation, with goals similar to object orientation. OO is about hiding data inside objects, whereas monads are about hiding most of the outside world from your functions.


And it is important to remember that all functions in JavaScript can (and some say "should") be defined this way.

    var c = 2;
    var f = function (x) { return (2 * x); };
    var g = function (h, x) { return h(x); };


I see several explanations here already, none of which gets at the essence of the difference between a lambda expression and a closure.

A lambda expression (or juat "lambda" for short) is a syntactic construct: something you write in your program, like "(lambda (x) (+ x 1))" or "function (x) { return x + 1; }".

Lambda expressions, like any other evaluable expressions, have values. Just as the expression "2 + 2" has the value 4 which is represented in the machine as binary 100, so a lambda expression has a value, which is a function, and that value has a representation in the machine.

What is that representation? It turns out that the most general representation of a functional value is a closure, which is simply a code pointer with some associated data (or a pointer thereto). A function pointer in C is just a code pointer; this is an impoverished representation that requires the programmer, in order to write general higher-order functions (functions that take functions as arguments), to explicitly pass around a data pointer along with each function pointer, and pass the data pointer when calling through the function pointer:

  void foo(int (*f)(void* data, int x), void* data) {
    ... (*f)(data, n) ...
  }
You've surely seen this idiom if you've worked with function pointers much. A closure, again, packages up the two pieces together so you don't have to deal with them separately.

You can see the similarity between instances and closures. An instance has a vtable (in C++ parlance), which is an indexed collection of code pointers, along with its data. A closure, instead of having a vtable, has a single code pointer. Instances and closures are duals: an instance provides a collection of operations, while a closure provides only one. Other than that, they're identical.

So, to summarize: a lambda expression, like a `new' expression, is a syntactic construct you find in your code. Lambda expressions evaluate to closures just as `new' expressions evaluate to instances; closures and instances are implementation constructs, like binary integers, inside the machine.

Finally, note that the only difference between a function defined in the traditional way and one defined by a lambda expression is that the latter has no name. It's fine to call these "anonymous function expressions" if your language doesn't use the word "lambda".


There are two kinds of lambda expressions: combinators, which only combine their arguments and make no reference to any other variables; and open functions, which refer to variables that are not bound by the lambda expression.

A closure is an open function paired with an environment that maps the unbound variables to values. The pairing is called a closed function, or the closure of an open function.


Very informative, thank you!


One of these is easy, and two of these are hard.

A lambda is basically a function literal. Just like you have string literals ("string"), number literals (3), and array literals ({1, 2, 3}), you can have function literals (function(x, y){x+y}). C++ does not have these, so I used JavaScript syntax as it is the most similar.

A closure is a dynamic binding of parameters to a function in the scope of its definition. This is not a valid concept in C, because there is only one scope for function definitions. In C++, there are namespaces and classes, but these are static, so it still isn't quite right.

You'll need to imagine that you can define a function inside another function. Then, suppose you have something like this:

  int f1(int x, int y) {
    int f2(int z) {
      return x+z;
    }
    return f2(y);
  }
What should f1(1, 2) return? If C++ is extended with only the feature to define functions within functions, this will result in a compiler error: x is used undeclared in f2. If it is extended with static or lexical scoping, it will return 3. Lexical scoping is generally what you want; the alternative (dynamic scoping) is useful in a few cases but generally more confusing and less powerful.

Then, suppose you want to return a reference to f2 instead. We need to keep the x value around in order to be able to actually make use of the reference, so we store the reference along with that x value. The data structure containing the function reference and any values needed to use it is called a "closure". You can think of it as a function object containing a function pointer and some extra values, plus some behind-the-scenes magic to initialize it.

A monad is something entirely different. The most analogous thing in C is if you had the ability to define an extra computation that is applied at every semicolon. That computation might be for maintaining state, or for propagating errors, or for deciding which lines are actually run, or really anything.


I'll try to tackle the first two :-)

A lambda is an anonymous function; it's a function that doesn't have a name. Of course you can assign it to a variable if you like to. So for example in Javascript these two are equivalent:

  function foo(arg) {//function body}
  foo = function(arg) {//function body}
Or in Python

  def foo():
    return 42
  foo = lambda: 42
Note that functions (and lambdas) are 'first class citizens'. This means we can pass functions around like any other type of value. You could sort of compare this to function pointers.

Now a closure is related to these first class functions. It's basically a function that carries with it, the environment from when it was defined. Consider this Python/pseudocode:

  def foo():
    a = 42
    def bar():
      return a
    return bar

  myClosure = foo()
  myClosure() #this returns 42
The function foo returns a function that we assign to myClosure (in the function foo we named it bar, but we could have used an anonymous lambda as well). The returned function returns the value of a, but note that a was only around when we defined bar. When we call myClosure, it takes the value for a from the environment when it was defined.

So, lambdas and closures are not the same thing, but they're both related to first class functions. In practice when a language supports first class functions, it tends to support lambdas and closures as well.

Monads are something different again. They're used in purely functional languages (such as Haskell) to work with state. However, that's about as far as my knowledge goes so someone else can try explain that :-)


Other people have already answered this, but since you come from a C++ background, perhaps this will help.

Lambdas and closures get mentioned together because they don't make sense apart. "Lambda" is just a fancy term for an anonymous function, but it also implies that the function is created (sans optimizations) dynamically (i.e. at runtime). This doesn't ever happen in C++, so hence the confusion.

A closure is the concept that a dynamically created function carries with it the state in which it was created (in C++ terms, the callstack). Imagine for a moment we had a C++ keyword called "lambda" that could create functions at runtime. What would the following code do?

  typedef int (*function_t)(int);
  function_t accum(int x) {
    return lambda (int y) { return x+=y; };
  }
You see the problem? "x" is no longer on the stack once foo() returns, but the function we return from foo accesses it! In fact doing anything meaningful with "x" is bogus. The solution is to have dynamically created functions like this to logically copy the call-stack that they are created in, so that they can access any of these stack variables. This is a closure.

It is clear that you could have lambdas without closures, by either disallowing accessing any non-globals in your lambdas, or just having the behavior be undefined if the function accesses any variables that have left scope. In practice this makes for lots and lots of bugs, so the solution is closures.

The C++ way of combining a function with state, is to use a class (or a struct). Here's the C++ way of doing the above:

  class accum { 
    int x;
    accum(int x) : x(x) {}
    int operator()(int i) {return x+=i}
  }
Notice how you need to be explicit about what state you keep in the class. With closures the state kept around is implicit, but it is kept around too.


This was a clever use of unit tests to help you "pass" each lesson. I've read a bunch of tutorials on closures, but this one really helped me "get" it. Sure, it takes liberties and glosses over things... but seeing it grow right before my eyes was really helpful.


Agreed! It would have been even more helpful if the unit tests showed both the expected result and the actual result. Now I had to guess at what the problem might be (which led to me giving up on the last problem).


I had the same problem, and these JavaScripts even don't allow for alert, to see the variable value.


Wow, I love the flow!

Except JSLint. JSLint barfs all over otherwise valid JavaScript and provides really unhelpful error messages. For a minute, I thought that I forgot how to write valid JavaScript.... Hate! Hate! Hate! You're teaching folks to program, not write syntactically pure JS.

But drop that and the experience is great.


(Creator here) Ha, yeah JSLint really provokes strong emotions.

Before I added in JSLint, if you misplaced one semicolon or forgot a parentheses, none of the tests would run and it would just say "program failed". That was incredibly frustrating to me. I figured that if I was getting frustrated, and I was the creator of the tests, then random users would be REALLY frustrated. So I added in JSLint. Now you get line and column numbers for exactly where the syntax broke down.

I'm actually happy following all of JSLint's suggestions. But my first test user, my wife, had other thoughts. She filled out her first answer, saw the whitespace errors, and said "hells no". So I turned off the whitespace checking. She seemed OK with it after that.

Which checks in particular are annoying? I can change the default parameters of JSLint to be less strict about some things. But I don't want to give up syntax checking entirely.


JSLint complained when I had two `var` statements; it wanted me to merge the two into one. I repeat `var`, so this annoyed me. (Of course, it wasn't hard to fix, but the examples didn't show any multi-`var` usage so it may be a problem for someone new to the language or programming in general.)


Use JSHint with

  /*jshint asi:true, eqnull:true, laxbreak: true */
and you'll make everyone happy :)


SPOILERS

I created a successful bothC function that looks more like the simplified version of seqC, calling fC first. But when I switch to calling gC first, I get the following failures. Am I missing something, or is this a bug in the unit test?

  PASSED No JSLint errors
  PASSED bothC(S, S, A, B) === undefined
  PASSED output === "SSA"
  PASSED bothC(S, F, A, B) === undefined
  FAILED output === "SSASFB"
  PASSED bothC(F, S, A, B) === undefined
  FAILED output === "SSASFBFSB"
  PASSED bothC(F, F, A, B) === undefined
  FAILED output === "SSASFBFSBFFB"


I'm not sure that we used the same method to get to this part, but I too ran into this output.

Interested in seeing how my solution was off, and if I was thinking about it the wrong way. Any luck?


Continuation Passing SPOILERS:

f_success = function() { gC(success, failure); }; f_failure = function() { gC(failure, failure); }; fC(f_success, f_failure);

For me, the trick here was to forget about the words "success" and "failure" because they have too much semantics embedded. I had to re-word the problem like this: call f3, if first argument of f1 and f2 are called. All other roads lead to f4. Both f1 and f2 has to be called in the chain no matter what.

Also, this is the first time I've seen such a pattern. Is there a practical application of such pattern? I think it's ok to have a single level of closure to continue execution, but if it goes to that many levels like the sample problem, it'll be a nightmare to maintain for people who have not seen your code yet.


Generally, any check that isn't a syntax exception on commonly used browsers. Specifically, it barfed when I didn't include optional semicolons.

   var x = 5
shouldn't fail.


Do you know a way to disable strict identity over equality checking (=== vs ==)? It's a Crockford-ism, but I think the jury's still out on the value of strict type checking vs coercion and I'd like the choice.



I would love to see how others solved these problems as a reward for completing it on my own.


The only thing that annoys me is two “Expected ';'” errors after e. g.

var f = function () { return 0 }


Please make more lessons (html, css, javascript, node, php...). I would pay.


It would be awesome if the web version of LPTHW had an interactive tool just like this.


Agreed. Once I converted all the exercises to coffeescript (syntactically similar to python) they made more sense.


This was a great tutorial. There's a long discussion on reddit: http://www.reddit.com/r/programming/comments/iviw3/whats_a_c...


I have no idea what I'm supposed to do in problem 12. Everything else was fairly obvious. I just can't comprehend what's going on in number 12.


I'm late to the party again.

To understand the importance of closures it might be interesting to look at this poster[1]. Look at the node in the top left that says "Functional programming". Every paradigm that descends from that node depends on closures. THAT's how important they are.

[1] http://www.info.ucl.ac.be/~pvr/paradigms.html


Or there is this -- http://reprog.wordpress.com/2010/02/27/closures-finally-expl... -- which takes a much more informal approach. (Disclaimer: I wrote it myself, so it reflects my own rather practical moment of enlightenment rather than rigorous computer science.)


Nice. I also took a crack at explaining them - here's my version, in case it helps anyone.

http://sleeplessgeek.blogspot.com/2009/12/so-what-are-these-...


Sorry if this is a stupid question but when you use continuations-passing style, what happens to the stack in javascript? Does it clean it self up or does the language just keeps going functions all the way down and fills up?


If you're doing CPS in a language with tail-call optimization then the same stack frame will be used for all your function calls. I don't know if V8 does this. In something like Node you don't do full CPS, you have portions of your code in CPS but usually only a few layers deep as you need to give control back to the event loop at some point.


I don't think JS has tail-call optimization. At some point you'll likely hit an error similar to too much recursion, but it would require a scope chain in the thousands.


Thanks for sharing. Running through this tied up some loose ends for me.


Focused on javascript, but this is a great introduction to closures: http://jibbering.com/faq/notes/closures/


My C++ brain must be getting too old -- head is swimming :( From all the talk, these must be very useful, but I don't understand why? Why not just use an object?


Many a brain have pondered the same question since the beginning of time. For instance, you can meditate over the following koan:

-----

The venerable master Qc Na was walking with his student, Anton. Hoping to prompt the master into a discussion, Anton said "Master, I have heard that objects are a very good thing - is this true?" Qc Na looked pityingly at his student and replied, "Foolish pupil - objects are merely a poor man's closures."

Chastised, Anton took his leave from his master and returned to his cell, intent on studying closures. He carefully read the entire "Lambda: The Ultimate..." series of papers and its cousins, and implemented a small Scheme interpreter with a closure-based object system. He learned much, and looked forward to informing his master of his progress.

On his next walk with Qc Na, Anton attempted to impress his master by saying "Master, I have diligently studied the matter, and now understand that objects are truly a poor man's closures." Qc Na responded by hitting Anton with his stick, saying "When will you learn? Closures are a poor man's object." At that moment, Anton became enlightened.

-----

Anton van Straaten in http://people.csail.mit.edu/gregs/ll1-discuss-archive-html/m...


Because there's no need to burden ourselves (and, perhaps more importantly, the people who'll have to maintain the code after us) with a whack of boilerplate for a fairly simple function.

(Also, you can use enclosed variables inside of map/fold constructs, which would be hard to express via objects.)


Anyone else try to do a lazy AND evaluation for chapter 12 and have it fail? I did

fC(gC(function(){success,failure};),failure);

but the desired answer checked gC even if fC already failed..


You're passing in the return value of gC() as the first parameter to fC(). gC() doesn't return anything, let alone a function. Also, you're only passing in one parameter into gC().

I'm not sure that makes sense?


My first attempt was about the same, until I realized I missed a requirement:

> Your function bothC should call both fC and gC no matter what

What happens in your code if fC fails?


Ah good point, forgot about that requirement.


A closure is a function that can access variables that don't have to be passed in as parameters.


is it me or this is confusing?

"Define a function named generate that takes one argument, a function f. It should return an array that consists of the function f and another function that doubles its input."


Define a function

  var ... = function( ... ){ ... };
named generate

  var generate = function( ... ){ ... };
that takes one argument, a function f

  var generate = function( f ){ ... };
It should return an array

  var generate = function( f ){ return [ ... ]; };
that consists of the function f

  var generate = function( f ){ return [ f, ... ]; };
and another function

  var generate = function( f ){ return [ f, function( ... ){ ... } ]; };
that doubles its input

  var generate = function( f ){ return [ f, function( x ){ return 2*x; } ]; };


Your Answer is superb but can you tell me what is wrong with this

var generate = function (f) {

   var double = function (x){return x * 2;};
    
   var list = [f,double];
    return list;
};

On the other hand this one passes the JSLInt test given here at : http://nathansjslessons.appspot.com/lesson?id=1050

var generate = function (f) {

    function double(x){return x * 2;}
    
    var list = [f,double];
     return list;
    
};


I'm sorry I can't. I'm not into the details of Javascript (yet) and know nothing about JSLint.


The closure articles are a perfect reason for merging threads on HN. One of these appears weekly!


I found that structure of teaching awkward.

If someone already knows some basic javascript, you could show them a basic, familiar function but then show them how it can become a closure. Then make them realize that everything can be treated essentially as an object in javascript, nearly everything, but especially functions.

I didn't know the formal names of "Stateful Closures" or "Continuation Passing" but I knew that I can make a copy of a function as another object so it retains different values than another copy and that I can pass a function as a return value.

They should also cover self-executing closures and using timers in closures via setTimeout(function(){ - I don't think those concepts are too advanced.


The pedagogical structure he is using is known as "programmed instruction." It was popular in the 60's through the 80's for many kinds of technical instruction. One of the interesting characteristics about it is that its progression always seems very easy (often too easy) because it takes small steps and uses repetition. This ensures that the learner is ready for the subsequent step.

It is highly effective if you want the learner to understand a specific set of details quickly. I like it in this context.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: