I love lambdas, but a lot of commenters are throwing around the word "closure" here, and C++ lambdas are definitely not closures. You can capture outside variables by value or by reference, but a captured reference can dangle if the referent goes out of scope before the lambda runs, in which case you are in trouble. Unlike a true closure (as in Lisp or other languages), where the closed-over value stays around.
If we talk only about C++ capture lists by value (i.e. [=]), then you could make a case for a more appropriate use of the word "closure", but since many lambdas do more than this, I think the distinction is necessary.
However, even with by-value captures, if you are capturing a pointer by value, the issue remains. So it is really not a good idea to think of lambdas as closures in the functional sense typically used in other languages.
I find that pointlessly pedantic. By the same measure, languages that don't offer bignums don't offer integers. Even Lisps need to implement closures in one way or another, and you may be surprised to see how they actually do it.
I think you miss the point: the OP was referring to the property that in most languages, the outer variables a closure brings into scope stay alive (even after the outer function returns). In C++ they can expire (is what I got from his comment, anyway).
Closed-by-value variables (i.e. the default) in C++ don't expire. Referenced or pointed-to objects might, but this is completely consistent with the rest of the language.
Remember that C++ is a by-value language. Pointers are explicit, first class, and distinct from the pointed-to object.
You would have to combine copying and reference management, e.g. "[=]" and std::shared_ptr<>. It definitely requires the programmer to pay more attention though, compared to other languages/constructs.
The thing to keep in mind is that copying a shared_ptr isn't cheap at all. It's a class with a pointer and atomic reference count inside and the atomic inc/dec takes many cycles.
How does this compare to the cost of a closure in other languages? Yeah atomic reference counts are not cheap, but basically that's the point of a shared_ptr.
To avoid confusion: only an empty parameter list is optional. You can't omit the parameter list if you take args, unlike lambda shortcuts in languages like Clojure, which let you implicitly refer to arguments inside the body using universal variable names and therefore omit the parameter list for all lambdas. That possibility does not exist in C++.
For those of us still stuck on C++98 at work, would you mind explaining what's going on here? In particular, I can't figure out why the ellipsis is so separated from `args` here:
    (std::cout << ... << args) << std::endl;
That looks like some black magic to me. The rest makes sense, I think.
Heh, great. I used to put an ellipsis into a programming example to mean “fill in whatever you actually do here”, and now C++ went and made it mean something. :)
It's a trade off more than a necessity. For example, Rust doesn't have explicit capture lists, and if you want explicit control, you make new bindings and capture those. You almost never need to do this in Rust, so it's optimized for that case; I haven't written many closures in C++, so I can't say as much about the frequency there.
To make this more concrete:
    let s = String::from("s");
    let closure = || {
        println!("s is: {}", s);
    };
    closure();
If you wanted to capture s in a different way:
    let s = String::from("s");
    let s1 = &s;
    let closure = || {
        println!("s1 is: {}", s1);
    };
    closure();
Rust doesn't have explicit capture lists, but it does have the `move` modifier on closures which is like C++'s `[=]`.
Strictly speaking, Rust probably could have gotten away with having neither the `move` modifier nor capture clauses at all, but it would have had wide-ranging implications on the ergonomics and capability of closures.
What if you want to move some things, but copy others?
e.g.
    auto shared = std::make_shared<MySharedType>(...);
    auto unique = std::make_unique<MyOwnedType>(...);
    function_that_accepts_lambda([shared, u = std::move(unique)] {
        shared->foo(...); u->bar(...);
    });
    // Outer scope can still use shared.
    shared->foo(...);
If the type implements Copy, it'll be implicitly copied when moved into the Rust closure (I think). Or you can declare a scope var and clone() manually.
The syntax is not the prettiest, but it is legible once you understand what [](){} means.
In C#, there is no such thing, but there is a part of me that wishes we had such a thing. I like the ability to explicitly state what variables are being captured.
No. What I have done on the other hand was add unused lexical variables to an anonymous function so the runtime wouldn't optimise them out of the closure and I could still see them in the debugger.
I'm not 100% sure, but C#'s compiler should automatically capture what you need (and leave out the rest).
I think the primary need for manual declaration is because in C++ you need to differentiate between pass by copy semantics and pass by reference semantics.
> I think the primary need for manual declaration is because in C++ you need to differentiate between pass by copy semantics and pass by reference semantics.
That's not actually a need; C++ includes [=] and [&] (capture everything by value or by reference). You can get a mix by creating references outside the body then capturing the environment by value (capturing the references by value and thus getting references).
On the one hand it has a bit more syntactic overhead (you have to take and declare a bunch of references before creating the closure), on the other hand there's less irregularity to the language, and bindings mean the same thing in and out of the closure.
FWIW that's what Rust does[0], though it may help that Rust's blocks are "statement expressions", some constructs would probably be unwieldy without that.
[0] the default corresponds to C++'s [&] (capture by ref), and a "move closure" switches to [=] instead
Yep, in Microsoft’s C# compiler, only the closed over local variables of a function are captured (which, in C#’s case, means generating a class with fields corresponding to each closed over local, and then replacing those locals with references to their respective fields of an instance of that class).
The only thing I find somewhat frustrating about the syntax is that the notation messes with my existing expectations. Up until now, in C-like languages a [] was just for collections and indexing into them, in the code I used at least.
I mean I'm not really complaining; I don't see better syntax to fit short anonymous functions into the existing syntax, without defeating the whole purpose of it either.
I suspect it's just a matter of getting used to this extra meaning for square brackets.
Well, you need something to indicate the beginning of a lambda. So you can think of "[]" as serving that role, instead of "lambda" in Python or "\" in Haskell.
The declaration for lambda variables is almost identical to function pointers, just with a ^ instead of a *, so there's nothing to learn (or unlearn, like C++ forces you to). The ^ looks like a lambda, and historically the lambda of lambda calculus actually was a caret accent over the variable. The argument list can be elided.
Looking here (https://developer.apple.com/library/ios/documentation/Cocoa/...) for the details, I think this is going to largely be a matter of opinion. I prefer the C++ syntax, especially when it comes to capture by reference, for which the Apple syntax seems to require the __block storage type modifier.
Caret as the embryonic form of lambda is apparently a myth propagated by Barendregt, and lambda is just a random Greek letter to go with alpha, beta, and eta.
Agreed. From the template library on down, it seems like the C++ community is hellbent on making the syntax for what should be clean, common operations seem like arcane Sanskrit. I don't know what their problem is.
Often, the problem is that the clean simple syntax you might want to use already means something else in C++, and the bias against breaking existing code is very strong.
Refusing to break backwards compatibility is their problem. I respect them for that; if you do want to break it make another language that plays nicely with C++ instead.
This is the first time I've looked at C++ lambdas. They appear magnificently powerful and also like another pile of easy ways to get completely screwed up.
I really like the explicit capture of C++'s lambdas more than the implicit one in most other languages (C#, Java, Python...) where you can easily end up with a closure not referencing the expected variable. See: https://blogs.msdn.microsoft.com/ericlippert/2009/11/12/clos...
"Explicit is better than implicit". Therefore, I agree with you that an explicit capture list, with the ability to copy or reference captured variables, is actually something C++ does right, not wrong.
The explicit capture list is only necessary in C++ because memory ownership and lifetimes are managed by the programmer in C++. Compare that to a garbage-collected language like Scheme or C#: when the implementation can figure out where memory needs to be freed and ensures you can't use-after-free, it frees the programmer from thinking about ownership (but not necessarily lifetimes: you can still wind up with memory leaks in GC'd languages if you're not careful to let go of references you no longer need). As mentioned elsewhere in this thread, Rust also offers the same level of explicit control without capture lists (though I'm on the fence about which way I prefer).
My point is that in languages with automatic memory management, explicit capture lists don't make much sense because the programmer is not tasked with managing memory and can safely capture references all the time. There's no need to ask oneself, "Do I own this pointed-to memory? Do I need to worry about it being freed before this closure? Should I make a copy?", etc. This is because, in a sense, the garbage collector itself owns the memory, but checks to make sure nothing else can use it anymore before it frees it.
You only talk about the memory management part and I guess most language designers think the same. What you and they fail to account for is that an explicit capture list can reduce logical bugs.
For one, if I were allowed to explicitly capture the counter variable by copy, the surprising behavior mentioned above would never occur. In languages with mutability, the ability to make some part immutable is a virtue.
For two, in languages without explicit variable declaration, which variable is defined where quickly becomes murky when you have implicit capture. I have so many frustrations where the inner `i` variable clashes with the outer `i` in Python. Yes, I could just use a different name, but naming is hard, and with a new scope I should be able to reuse the name. That is almost the whole point of opening a new scope!
For three, in Javascript where closures are everywhere due to the amount of callbacks, the reference graph is just impossible to analyze. A closure may close over another closure which closes over an object with a reference to the original closure. An explicit capture list makes the programmer think, and eases the job of anyone who tries to spot memory leaks from the source code. (But I guess that is just not the Javascript style, as they are so fond of never letting the programmers know about their mistakes. At least in C++ we trade that for speed. I don't know what Javascript trades that for.)
> You only talk about the memory management part and I guess most language designers think the same. What you and they fail to account for is that an explicit capture list can reduce logical bugs.
I suppose, as a language designer, I tend to think that the more I do automatically, the more I ease the programmer's burden. However, as you point out, that's not always true. That said, my point wasn't (isn't?) that explicit capture is only a good idea sans automatic memory management (it may well be -- you've certainly given me some food for thought here), but rather that it's only necessary in that case, and I think that point still stands.
> For one, if I were allowed to explicitly capture the counter variable by copy, the surprising behavior mentioned above would never occur.
That's a failure of language design and I don't think the proper solution is to force explicit capture on closure creation (also note that you need more than just explicit capture because to prevent such an error, you need the ability to specify that the "captured" variable ought to be copied rather than actually captured). I think the proper solution to that problem is the one that the C# team went with: limit the scope of iteration control variables to the iterated block. This is typically what programmers used to block-structured languages would expect, anyway, unless the variable were clearly declared outside the scope of the iteration.
> In languages with mutability, the ability to make some part immutable is a virtue.
That's an orthogonal issue, and can be done in many other (and more general) ways.
> For two, in languages without explicit variable declaration, which variable is defined where quickly becomes murky when you have implicit capture. I have so many frustrations where the inner `i` variable clashes with the outer `i` in Python. Yes, I could just use a different name, but naming is hard, and with a new scope I should be able to reuse the name. That is almost the whole point of opening a new scope!
You're right: that is the point of opening a new scope! That sounds like a flaw in Python's design and could be remedied by making variable definition syntax different from assignment syntax. Consider Lua with its `local` syntax, C and kin with their type annotations, the Lisps with their completely separate forms for variable definition and assignment, and so on. There's also the Tcl strategy of "it's a definition unless it was imported into this scope with `global` or `upval`; otherwise it's an assignment".
> For three, in Javascript where closures are everywhere due to the amount of callbacks, the reference graph is just impossible to analyze. A closure may close over another closure which closes over an object with a reference to the original closure. An explicit capture list makes the programmer think, and eases the job of anyone who tries to spot memory leaks from the source code. (But I guess that is just not the Javascript style, as they are so fond of never letting the programmers know about their mistakes. At least in C++ we trade that for speed. I don't know what Javascript trades that for.)
JavaScript is a shitty language to begin with, and fixing it wouldn't be as simple as fixing C# or Python... You make a good point here, but I still think that better tooling for data-flow analysis is a more attractive choice than a compulsory explicit capture list. On the flip side, an optional capture list could be a good compromise.
> On the flip side, an optional capture list could be a good compromise.
That is exactly what I am thinking about. Or, rather, what C++ has done right: You can let the compiler infer what to capture, like [=] or [&], or you can explicitly list the variables to capture.
> you need the ability to specify that the "captured" variable ought to be copied rather than actually captured
Yes, that is what I am talking about, and again, what C++ has done right. Most other languages give you no choice whether the capture is by copy or by reference.
> Most other languages give you no choice whether the capture is by copy or by reference.
That's because in languages that have traditionally had GC (i.e., languages in the Lisp tradition or in the ML tradition), the distinction didn't matter. Those languages did not "suffer" from a value/reference dichotomy (e.g., in Scheme, you're literally capturing the variable rather than a copy or reference to the value stored within -- under the hood, that variable might always store a reference for convenience, or it might store a value for performance, but it doesn't matter as it's strictly an implementation detail).
I'm glad that the C++ committee didn't just dump closures into the language without considering this sort of interaction with other aspects of the language. Without the capture lists, closures in C++ have the potential to really suck. That the explicit capture lists even exist is evidence that they've carefully considered how the new features are going to play with existing characteristics of C++. Kudos to them for that!
That is almost true, but there's one exception in those GC'ed languages due to the dichotomy of value types and reference types. The confusing behavior on capturing the iteration variable is one example.
Ah, yes! You're correct. I spend most of my GC'd time in languages that don't have such a value vs. reference dichotomy, and I'd completely forgotten about it.
Spores seem like an interesting solution. The language designer in me has a distaste for it, though :p
For case 1 (capture of mutable references), an explicit copy operator might be better (as in, "I want whatever value this variable is bound to, rather than the storage location") (or even vice versa, where value is the default and there's an operator for location). In a way, spores accomplish this by forcing you to do the copy manually -- but then programmers have to always remember to use the extra syntax, and they need to do it for every captured variable. I'm not quite happy with even this solution, and it may be possible to come up with something even better. Concurrency is always a can o' worms :)
For case 2 (capture of implicit "this"), I'd argue that if (a) the compiler is smart enough to know that "helper" is implicitly "this.helper" and (b) that "this" will be captured by the closure, then (c) the compiler is also smart enough to create an implicit binding for "helper" and capture that instead. This would lead to less-surprising behavior, and intentional capture of "this" could still be done via explicit access. Another option is to, rather than treating "this" as being in an enclosing scope, treat it as though it were an implicit argument to the method (albeit a covariant one). This avoids capture altogether.
Agree on the copy operator, not only for spores, have wanted it more than one time in other languages too.
Not sure how the `this` binding should work though. When calling a method you need to (a) dispatch on the runtime type and (b) provide the instance to the method when it is called.
The compiler would essentially emit the same code that it would in the case of the spore, but it would be automatic. You still get to dispatch on the runtime type, because the binding is created after the method invocation, but before the scope of the lambda to be closed.
I think when a programmer writes "foo.combobulate()", the vast majority of the time they intend to capture "foo". If they didn't and were being clever, I don't think it's unreasonable for the compiler to expect them to be explicit and write "this.foo.combobulate()" instead. In the former case, the compiler creates the implicit binding to capture; in the latter it does nothing implicit and just closes over "this".
I'm certain that the compiler has enough information to do this, and that it's in accordance with the principle of least surprise ;)
What do you mean? The only time you can mess up a lambda is if a pointer that you're using gets changed. And this kind of duplicate ownership is a general problem with pointers.
Yes, I work in a template heavy C++ codebase and the (already quite good) situation is getting better with each language standard.
C++11 was really a turning point for the language - features like 'auto', lambdas, and variadic templates have enabled succinct generic code that is both readable and highly performant.
Increased competition between the gcc and clang teams has also been a major improvement - both have implemented many C++17 features very quickly, and error messages have greatly improved in both compilers. This is especially welcome when developing templates. Clang's licensing has made it possible to integrate libclang in to vim/emacs (ycmd, irony-mode, rtags, etc) for very accurate completion/syntax checking, etc. Clang-format has also seen quite a bit of adoption, bringing the benefits of standardized formatting to large projects.
The sanitizers have also been a huge boon - automatic detection of memory leaks, buffer/heap overflows, use-after-free, uninitialized memory reads, integer overflow, etc. is now as easy as compiling with '-fsanitize=[address|undefined|memory|etc]'.
Overall, C++(11+) is a very productive language if you have stringent performance and latency requirements and you need powerful abstraction facilities.
* `unique_ptr` as a local variable. Before C++11, I needed to either (a) define a holder class for anything that should be deleted at the end of a scope or (b) delete it manually and pray that there isn't an exception thrown. Now, I can just declare it, and trust the destructor to clean up after me.
* `unique_ptr` as a return value. Previously, if a function returned a pointer, there was no way of knowing who was responsible for calling `delete`. Now, I can clearly indicate intent. `unique_ptr` means that the caller now owns the object, while a C-style pointer or reference means that the callee still owns the object.
* With lambda statements, I can call `std::sort` in-place, with the sorting criteria immediately visible. Previously, I would need to define a function elsewhere in the code, obscuring what may be a simple `a.param < b.param`.
* With range-based for loops, I can loop over any container without needing the very long `std::vector<MyClassName>::iterator` declaration.
* `= delete` to remove an automatically generated method, such as copy constructors. Previously, you would declare that method to be private, then never make an implementation of it. `= delete` shows your intent much more clearly.
* `static_assert`, so that you can bail out of templates earlier, and with reasonable error messages.
* Variadic templates. These aren't needed in 99% of cases, but they are incredibly useful when designing libraries.
* `std::thread` No more messing around with different thread libraries depending on which platform you are on.
True, I didn't mention it, because it has its own issues. The move-on-copy semantics of auto_ptr makes it incompatible with std containers, and makes for some rather unexpected behavior.
Contrary to popular belief, new features can make a language more elegant and simple, if they make clunky old features obsolete with a simpler alternative.
One aspect of C++ lambdas that I really don’t like is the visual confusion caused by allowing "return", since at a glance this seems to affect the parent function. I have already found myself adding comments inside lambdas like "return x; // return-from-lambda" to make sure that I see what is really happening. Python by contrast does two things better: Python makes it really hard to write long expressions as lambdas, and no "return" is used in a Python "lambda". Of course, Python also allows "def" inside a "def" as a convenient way to write longer one-time functions.
I also found that while I could use C++ lambdas for things like iteration, e.g. "object->forEachThing([](Thing const& t, bool& stop){ ... })", this makes the keyword problem worse. In this type of call, if I want to implement something that is logically like a "break" or "continue" of the loop, it has to use the "return" keyword (from the lambda only) with special conditions attached such as a "bool" variable to request the break. And that is confusing to read, even though conceptually it is similar to the Objective-C NSArray "enumerateObjectsUsingBlock:" that takes a similar approach (in that the block takes a "stop" argument).
Not a bad article. I wish the first example wasn't so complicated. C++ lambda syntax is pretty gross. The initial breakdown is great. But why use a std::vector and std::transform in the first real example? Stick to integer addition. Keep things simple.
It's the best reference I found. The paper only talks of capturing "*this" by value (as in the original post of the topic).
I think I read that in a draft about coroutines. The idea was to capture "this" by reference and convert the lambda to a function pointer to make it movable.