> Note that for well-known procedures, all variables that are free in the lambda are also free in the caller
I don't think this is right, unless given an unnecessarily strict definition of `well-known`. A contrived counterexample:
let example x y =
let f =
let z = big_calculation x in
(fun t -> if t then z else 0) in
(fun q -> f q + f y)
Here f is bound to a known function, which is only called and never escapes. All of the known-function optimisations should apply. However the variable z is not in scope at the call sites of f.
Optimising this example effectively is a bit tricky - the 'standard' thing to do is construct a closure that contains z (but need not contain a code pointer, since all call sites are known), which is itself contained within the closure for (fun q -> ...). A more aggressive method would be to discover that all references to f are within (fun q -> ...) and inline the free variables of f into its closure, avoiding an indirection.
It's quite a fun subject - there are various implementation tricks for curried functions and partial application too, which aren't very important in Scheme but are crucial for ML.
This article considers "closure" to mean a type of internal representation of a function which references free bindings which would otherwise be stack-allocated.
This goes against the usage of "closure" that I've seen elsewhere, including the current wikipedia page for the topic[1]. Ususally I see "closure" used to denote the concept of a function that references free variables, not any specific implementation. Maybe within the guile codebase they have a construct called "closure" which implements closures in the most general way possible (which involves heap allocation no doubt).
But it's true: lambdas ≠ closures. "Lambda" refers to the syntax that a language uses to allow anonymous functions. A "closure" is a type of function that references free variables. Java 8, for example, has lambdas, but they are not full closures, because they can only reference local variables declared in higher scopes if those are declared as final.
Python, on the other hand, sort of has lambdas, but they are limited to a body consisting of a single expression, which doesn't allow e.g. conditionals. Even if it didn't support the lambda syntax, though, it would support closures, because it lets you declare a named function during a function call, then pass it elsewhere by referencing its name. These functions can reference free variables:
def mkincer():
n = 0
def inner():
nonlocal n
n += 1
return n
return inner
No, this isn't what the Wikipedia article says. To quote:
In programming languages, closures (also lexical closures
or function closures) are a technique for implementing
lexically scoped name binding in languages with first-
class functions.
which is the same definition that wingo is using. Stack allocation is another technique for implementing lexical scope, if the relevant bindings are still on the stack.
This is evident in PHP especially. The type of an anonymous function is 'Closure'. The problem is you have to explicitly declare the variables you are closing over and only until then does it become a closure. It's entirely possible to have a Closure that isn't actually a closure.
As far as I understand the article, a lambda generally is a closure (or more accurately: is generally implemented as a closure), but that's expensive and it can often be optimized into something cheaper. At least by Guile.
No. Lambdas are a language notation representing a computation with parameterized inputs. A closure is a lower level data structure in the language implementation, which a lambda may or may not compile into. A closure consists of a chunk of code (whether s-expressions to be interpreted, vm bytecode, or native object code) together with a record that provides the locations of the code chunk's free variables in the surrounding environment. (In Scheme, variables bind to locations, not values.) If a lambda has no free variables, then the chunk of code itself suffices to represent it, and this code may be inlined, optimized away, jumped into instead of "called" when it appears in tail position, etc.
One of the things that makes lambda so powerful is that it does not force a particular implementation on the language implementer, and allows for plenty of room for optimization.
And now if only we can stop people from referring to lambadas as closures, and get them to understand lexical scope instead of thinking of closures as "Magic."
Seriously, people. At least learn the abstractions that your using.
(In Guile Scheme). Also, probably other languages.
Very cool to read about these kinds of compiler optimizations. The "scheme class" I've taken was more about programming techniques that scheme encourages (tail recursion, continuation passing) than specific details about the language itself. Im going to have to go read more about the partial evaluation the Guile compiler is performing, I'm curious what heuristics it uses for the definition of "evaluateable".
Ah great pointer. Interesting digression in there about limits of applying these techniques to Javascript - in short, isn't done by current implementations of the language (not sure which engines the author has looked at) but nothing about the spec forbids it with regards to strict-mode code.
(Does anyone write non-strict JS these days? Why?)
Can someone explain why lambda expressions are being picked out?. I.e. why not say 'An expression is not necessarily a closure' as wouldn't this stuff apply to any old expressions, functions etc.
I am not a lisp person but thinking of this in terms of say C# or Haskell.
They're being picked out because you see people do that all the time (i.e., calling an anonymous function a "closure"), while I don't recall ever seeing someone referring to a random expression as a closure.
I'm not overly fond of calling anonymous functions "lambdas", for that matter. Sure, several languages use the term lambda, but many do not. More specifically, "anonymous function" is much more descriptive than "lambda".
Swift even does this officially: ALL lambdas are called closures, even though there's nothing being closed over.
Similarly, people refer to constructor functions that always return objects of the same class/prototype as factories, despite both the GoF patterns that popularised the term showing dynamic output. IIRC Angular JS misuses the term this way too.
AFAICT it's either genuine ignorance of how/why these words describe or some kind of weird demonstration of cleverness by using a particular term, regardless of whether it's applicable or not.
It's probably because, in an implementation of a staticly typed language, you have to treat all functions being passed around as closures. E.g. the type of a lambda expression needs to be one thing, and that is a closure.
Whether or not any variables are actually closed over is irrelevant.
Lambda is easy to reason about, since it is guaranteed to not have any side effects. To juggle the other kinds of expressions like this you'd need a more elaborate purity analysis. So, lambda is a very easy prey for the DCE.
Also, there are some specific optimisations, like beta-reduction and closure inlining.
If we take "Functions" as a baseline, then we have...
Floating Functions / First class functions (thanks to comments for reminding): Functions that can be passed around as variables. e.g. All functions in Javascript are floating functions.
function callback() { ... }
// function assigned to a variable
var x = callback;
// function is an argument to another function
doSomething(..., callback)
Anonymous functions: functions with no name.
// an anonymous function
(function() { .... })()
// anonymous function as an argument
doSomething(..., function() { .... })
Lambda functions: A style of writing functions without a block, where the function and return value are expressed as one single expression without the usage of a block or explicit returns.
employee => employee.name
Closures: When a function references a variable outside its scope, and that "frees" the variable from its normal lifetime, thus creating a "closure" between the function and the now-free variable.
function a() {
var x = 1;
return function() {
// here a closure is created
// now "x" outlives its lifetime and
// becomes a free variable
return x + 1;
}
}
To use academic terminology, we would refer to "floating functions" as "first class functions" (since they are not second-class citizens in the language).
I would argue the true distinguishing feature of a "lambda function" in a PL is that it is first-class, nest-able, and lexically-scoped (that is, it captures bindings from its definition site).
A closure is simply a way of implementing lexical scope for functions.
A variable is a "free variable" not because it is "freed" from its lifetime, but because it is not bound by its enclosing lambda. (It would still be a free variable even if the function and the variable binding had the same lifetime.)
> Lambda functions: A style of writing functions without a block, where the function and return value are expressed as one single expression without the usage of a block or explicit returns.
employee => employee.name
I had this argument with the Rust folk. They use the term "closure" for things that don't reference any variables in the outer scope. Sometimes, all you need is a function pointer. For example:
This is really just a function call. There are no references to an outer scope. But Rust requires a closure there. You can't pass a pointer to a function.
You'd like for this to optimize down to a function call, without saving any state from the outer scope. Don't know if Rust does this yet.
The term "closure" is used in Rust for that syntax, regardless of whether it actually closes over its environment. You need some term for the syntax, and closure is used even if it doesn't necessarily close over anything; because much of the time, you do use it because you want it to close over its environment.
And, you can pass a pointer to a function just fine:
fn call_twice<F>(f: F) where F: Fn() {
f();
f();
}
fn hello() {
println!("Hello");
}
fn main() {
call_twice(|| println!("Hi!"));
call_twice(hello);
let x = 5;
call_twice(|| println!("{}", x));
}
And all of this optimizes down just fine; if the environment being closed over is empty, it will just be an anonymous function with no extra overhead. Heck, if you try it out yourself on the playpen in release mode, and take a look at the asm, all of these just get inlined right into main: http://is.gd/p9Kd9y
How about memory size? A pointer to a function is just a single pointer to the called code, while a pointer to a closure has at least 2 pieces of data (pointer to code + closed over data or reference) which seems incompatible.
Rust's lambdas are structs containing (references to) exactly the data from the parent scope that their bodies refer to, no more, no less. It's fine for that set to be empty, resulting in an empty, zero-sized value. As we discussed[1] recently, unique types + monomorphisation means that many/most calls of closures (i.e. even ones that capture variables) can compile down to an inlineable static function call. If there's no references to external data in the body, this function call is exactly what you want with a function pointer, except better, because it is not a virtual call. That is, the scheme that Rust and C++11 use for closures automatically gives a better result than passing pointers. (Passing a function value is the same, they have unique types just like closures, but with the ability to be coerced to C-style function pointers.)
By the way, function pointers have been able to be passed directly where closures, both the new and old forms, are expected since at least 0.13.
Ah. Pre-1.0 we didn't have the Fn* traits. Closures were boxed (two closures with the same signature would have the same type due to this), and closures/functions were different types. You could declare an argument to be a closure or a function but not both.
Post "unboxed closures", closures are now stored on the stack (thus they usually optimize away; for example if you use an iterator with a bunch of adapters like filter/map it will optimize down to a for loop). Each closure has a unique anonymous type (you can box them if you want them to have the same type), and all closures implement one or more of the Fn* traits, as do function pointers. So arguments can be generic over all callables, and you can pass in a closure or a function or your own custom callable.
Only `proc` closures were boxed, `|...| -> ...` closures were pointers into the stack. However, it is true that all closures were type erased, and were hence equivalent to certain trait object types (`Box<FnOnce(...) -> ...>` for `proc`, and `&mut FnMut(...) -> ...` for the pipes).
Also, functions could coerce to either of the closure types, and hence be passed straight to functions expecting them.
> You'd like for this to optimize down to a function call, without saving any state from the outer scope. Don't know if Rust does this yet.
It does. Rust closures translate to a stack struct containing the capture and a method call on that struct. In this case, the struct is zero-sized, so it doesn't exist, so it's just a function call. Additionally, unwrap_or_else will probably be monomorphised and then inlined, so there will be no call at all.
> You'd like for this to optimize down to a function call, without saving any state from the outer scope. Don't know if Rust does this yet.
My (limited) understanding is that all bindings closed over by a closure are represented as anonymous struct passed to the function, if nothing is closed over that will be a unit struct which will of course be optimized to nothing.
Actually, even though in the language spec that usage is refered to as a "closure", when configured for a release build, that will compile to exactly the same code as if you had written a static funtion and passed a pointer to it.
Rust is very good at optimizing away closures to have as little overhead as is possible for the semantics they describe.
Actually, a lambda expression is never a closure, though it can evaluate to a closure.
Confusing lambda expressions with closures is like confusing the string "new Foo()" with the object it constructs. In each case, the former is a syntactic entity, the latter an implementation construct.
Ok, sure, the author means "result of compiling a lambda expression". I think it's reasonable phrasing given that this is describing compiler transformations pre-interpretation rather than interpreter behavior.
My quibble is not really with the author, who knows what he's talking about. But I've seen lots of people using "closure" as if it were something in the source code.
To a language implementor, these distinctions are not pedantic at all. A declaration of an array is a thing with a type, a name, and a size, each of which we have to pay attention to. It's really nothing like a sequence of consecutive locations in memory.
To a language user, certainly, such distinctions usually don't matter. But conflating "lambda expression" and "closure" is particularly problematic. On the one hand, as this article points out, not all lambda expressions are evaluated by constructing closures. On the other, it's certainly not the case that all closures come from lambda expressions -- in many languages (certainly Scheme, Common Lisp, JavaScript, and Python, and I think Ruby and PHP) you can perfectly well create a closure from a named function; it's done all the time. Here's an example in Python:
def f(a):
def g(b):
return a + b
return g
See? f returns a closure, but there's no lambda expression in the code.
So given that lambda expressions and closures, though related, are not in 1-to-1 correspondence, I think it's important to understand the distinction.
Even passing over that, it's only a closure if it keeps a lexical environment.
(lambda (x) (+ x 1)) doesn't close over anything, so it's not a closure, nor does it evaluate to one. For that matter, (let ((x 0)) (lambda (x) (+ x 1))) isn't really a closure in any sense because the environment can be inlined out.
Doesn't (lambda (x) (+ x 1)) capture +? You could evaluate (lambda (x) (+ x 1)) in a scope of, say, (let ((+ *)) (lambda (x) (+ x 1))) - in some lisps.
Ah, I guess in Scheme it does (I use Common Lisp, where the two namespaces are different). But even then you would need the actual environment there for it to be a closure. The lambda expression is only half of a closure. It's "let over lambda" that is a closure.
There's a subtle but important difference. They're not necessarily tied to each other anyway. For example in Python you frequently have lots of named functions that capture some variables in an outer scope, and are closures but not lambdas.
Pointless until you start dealing with macros. It can be important in some cases to know whether the closure "code" has been evaluated at that point or not.
I don't think this is right, unless given an unnecessarily strict definition of `well-known`. A contrived counterexample:
Here f is bound to a known function, which is only called and never escapes. All of the known-function optimisations should apply. However the variable z is not in scope at the call sites of f.Optimising this example effectively is a bit tricky - the 'standard' thing to do is construct a closure that contains z (but need not contain a code pointer, since all call sites are known), which is itself contained within the closure for (fun q -> ...). A more aggressive method would be to discover that all references to f are within (fun q -> ...) and inline the free variables of f into its closure, avoiding an indirection.
It's quite a fun subject - there are various implementation tricks for curried functions and partial application too, which aren't very important in Scheme but are crucial for ML.