132: Data Types: Lambdas.

Lambdas were added to C++ in C++11, and many other languages have them too. Think of them as unnamed methods that come with a few extra concepts you need to know. Without those concepts, you’ll be even more lost than with function pointers.

The C++ language has three ways to call code and other languages will be similar. What we’re mainly talking about here is not executing a method directly but instead passing a method to some other code that will run the method for you, either right away or at some later time.

◦ The first is to just use a method directly. A method’s name converts to a pointer to that method, and you can pass that pointer to other methods to be called for you. This way also includes using function pointers.
◦ The second is to write a function object. This is a class that implements the function call operator. You can pass an instance of a function object class to another method to be called for you.
◦ And the third is to use a lambda expression which will create a temporary unnamed function object behind the scenes. This unnamed function object is called a closure.

All lambdas start out with a lambda capture block that consists of a pair of square brackets. Inside these square brackets, you can list individual variables that the lambda will have access to. These variables get captured by the unnamed function object closure. That’s just a lambda way of saying that the function object will get initialized with either copies of the variables or references to them.

Listen to the full episode for more on lambdas, or you can also read the full transcript below.

Transcript

I’m including lambdas under data types even though they’re not really a data type themselves. They’re actually expressions, and each one produces a closure with its own unique, unnamed type that only the compiler knows. But before all that, why would you want to use an unnamed method?

I mean, I’ve talked at length about the importance of crafting good descriptive names. Why would you now consider methods without names?

Well, many times, you just want some small piece of code that you can send to another class to be run and this may be the only place in your code that you’ll ever need to use this code. You probably won’t need to ever call the code yourself. Creating a full method seems like a lot of unnecessary work.

You have alternatives. The C++ language has three ways to call code and other languages will be similar. What we’re mainly talking about here is not executing a method directly but instead passing a method to some other code that will run the method for you, either right away or at some later time.

◦ The first is to just use a method directly. A method’s name converts to a pointer to that method, and you can pass that pointer to other methods to be called for you. This way also includes using function pointers.
◦ The second is to write a function object. This is a class that implements the function call operator. You can pass an instance of a function object class to another method to be called for you.
◦ And the third is to use a lambda expression which will create a temporary unnamed function object behind the scenes. This unnamed function object is called a closure.
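Here is a minimal sketch of those three ways side by side. The names isEven, IsEven, and the three count functions are made up for illustration, and std::count_if stands in for the "other code" that runs your code once for each item.

```cpp
#include <algorithm>
#include <vector>

// 1. A plain function. Its name converts to a function pointer when passed.
bool isEven(int value) { return value % 2 == 0; }

// 2. A function object: a class that implements the function call operator.
struct IsEven {
    bool operator()(int value) const { return value % 2 == 0; }
};

int countWithFunctionPointer(const std::vector<int>& numbers) {
    return static_cast<int>(
        std::count_if(numbers.begin(), numbers.end(), isEven));
}

int countWithFunctionObject(const std::vector<int>& numbers) {
    return static_cast<int>(
        std::count_if(numbers.begin(), numbers.end(), IsEven{}));
}

int countWithLambda(const std::vector<int>& numbers) {
    // 3. A lambda. The compiler creates the unnamed function object
    //    (the closure) for you, right where it is needed.
    return static_cast<int>(
        std::count_if(numbers.begin(), numbers.end(),
                      [](int value) { return value % 2 == 0; }));
}
```

All three functions return the same count for the same input. The only difference is how much code you had to write somewhere else to get there.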

Like I said, lambda expressions are a fairly new addition to C++ and other languages too. They’re really just a quick and short way of providing a function object. Make sure to listen to episodes 128 and 129 for more information about function objects.

Normally, lambda expressions are just called lambdas. And when I say they’re a quick and short way of writing a function object, I mean that they’re really easy to write. They’re also small enough to fit right into the rest of your code.

Sometimes, this can cause lambdas to get lost in your code because they are so small and blend in with the rest of the code. This is just something to be aware of. I don’t think I’ve come across any code yet that I’d say tried to overuse lambdas.

Probably the two most important aspects to understand are how to properly capture variables for use in the lambda and how the lifetime of a closure affects what you should and shouldn’t do. I’ll explain these right after this message from our sponsor.

When I explained function objects, I said that one of their advantages over a regular method is the ability of the function object class to declare member data. Just like any data variables, this member data can be initialized by copying a value from someplace else or by referring or referencing someplace else. This is critical to understanding lambdas because they highlight this ability and make it really easy to specify what variables will be captured by value and which will be captured by reference.

But what do I mean when I say that a lambda captures a variable? Let’s say that you have a method with a local variable called message. This is a string that you want printed only for items that match some criteria. You have the items in a collection and want to go through each item, test if it meets the criteria, and if so, then print the message. Seems simple, right? You could write all this code just like I said. Or you can use any of several utility methods that will handle most of the code for you. All you need to do is provide some small piece of your own code that will be run for each item in the collection. You don’t need to worry about navigating through the collection anymore. All you need to do is write a small piece of code that receives one item at a time from the collection and then you can do whatever you want with the item.

This is a perfect example of where a lambda can really help. You don’t run the lambda expression yourself. You just need a way to quickly and easily write this piece of code that the utility method will run for you. The question then is how will you get the message text into the lambda? You can’t pass the string message when the lambda is called because it’s not your code that’s calling the lambda. You need to get the message preloaded into the lambda. You can do this with a function object by passing the message to a function object constructor. A lambda also causes a function object to be created behind the scenes, but that’s the problem: you never get to see this automatically created function object, so you can’t call its constructor yourself.

So instead, all lambdas start out with a lambda capture block that consists of a pair of square brackets. Inside these square brackets, you can list individual variables that the lambda will have access to. These variables get captured by the unnamed function object. That’s just a lambda way of saying that the function object will get initialized with either copies of the variables or references to them.

Since lambdas are all about simplicity, you have some shortcuts you can use in the capture block. Here are four ways you can tell the compiler what outside variables should be used within the lambda.

◦ #1 If you use an empty set of square brackets, then that means the lambda is not allowed to use any local variables. Note that the lambda can still use global variables and static variables because those never need to be captured.
◦ #2 If you put a single ampersand inside the square brackets, then that means you can use any local variable inside the lambda and that variable will be a reference to the actual variable. If the lambda modifies this variable, then it’s changed in the original location too.
◦ #3 If you put a single equal sign inside the square brackets, then that means the lambda can use any local variable and the closure will be given its own copy of that variable. The copy is made when the lambda is created, and the lambda can’t modify its copy unless you mark the lambda as mutable.
◦ #4 You can also list individual variables inside the square brackets to be more specific about which ones should be captured by value and which ones captured by reference. Put an ampersand in front of a variable name to capture that variable by reference.
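The four forms can be sketched in one small function. Everything named here (g_calls, CaptureDemo, demonstrateCaptures) is made up for this example. The comments track the values so you can see which lambdas work on the original variable and which work on a copy.

```cpp
#include <string>

int g_calls = 0;  // hypothetical global, usable without being captured

struct CaptureDemo {
    int total;    // the local variable after the lambdas have run
    int copySaw;  // the value seen by the lambda that captured by value
};

CaptureDemo demonstrateCaptures() {
    int total = 10;
    std::string label = "sum";

    // #1 Empty brackets: no local variables, but globals still work.
    auto countCall = [] { ++g_calls; };

    // #2 Ampersand: locals that are used get captured by reference.
    auto addFive = [&] { total += 5; };

    // #3 Equal sign: locals that are used get copied right here,
    //    so this closure holds total == 10 from now on.
    auto readCopy = [=] { return total; };

    // #4 Explicit list: total by reference, label by value.
    auto addLength = [&total, label] {
        total += static_cast<int>(label.size());
    };

    countCall();
    addFive();          // total is now 15
    label = "changed";  // the closure still holds its copy, "sum"
    addLength();        // adds 3, so total is now 18

    return CaptureDemo{total, readCopy()};  // readCopy() still returns 10
}
```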

The topic of a lambda lifetime is the other aspect of using lambdas that you really need to understand. If you don’t understand this, then it’s super easy to write code that looks perfectly fine and might even run okay but is sure to crash at some point.

Let’s say that you want to provide a lambda as a callback. Yes, lambdas make great callbacks as long as you understand how the lifetime of the lambda affects what you can do. Or more importantly, how you do it. If you create a lambda and capture a local variable by reference, then that variable exists and is valid at the moment the lambda is created and the reference is stored inside the unnamed function object closure. But what do you do with the lambda? If you pass it to some code that runs it right away, then you’ll probably not notice anything wrong. But if that code saves the lambda, or to be more accurate, if that code saves the closure, then the variable gets destroyed when the method that created the lambda returns. But the closure is still referencing it. This is one of the ways that you can get an invalid reference.

If the lambda just reads the value from the referenced variable, then it might even be able to read the correct value if nothing has changed that memory. That’s something you really shouldn’t count on though. But if the lambda writes to that variable, then it ends up writing a value on top of memory that could be important for some other purpose. That other code will likely crash and good luck trying to figure out why.

It’s usually a good idea to capture any needed variables by value. If you ever do capture by reference, then make sure that you understand exactly when the lambda code will be run and when the variable will go out of scope and be destroyed.
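Here is a small sketch of the callback problem and the by-value fix. The function makeGreeter is a made-up example, and std::function is just one common way for code to save a closure for later. The dangerous version is left as a comment because running it would be undefined behavior.

```cpp
#include <functional>
#include <string>

// Hypothetical factory that returns a callback. The callback outlives
// every local variable declared inside makeGreeter.
std::function<std::string()> makeGreeter(const std::string& name) {
    std::string greeting = "Hello, " + name;

    // Dangerous: this closure would only hold a reference to greeting,
    // and greeting is destroyed when makeGreeter returns.
    // return [&greeting] { return greeting; };

    // Safe: the closure holds its own copy of greeting.
    return [greeting] { return greeting; };
}
```

The caller can keep the returned callback for as long as it wants, for example auto greet = makeGreeter("Ada"), and call greet() well after makeGreeter has returned.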
