Sunday, March 29, 2015

Named Delegates Compared to Anonymous Delegates

I love to talk about lambda expressions because they are useful in a lot of situations (especially when using LINQ). Lambda expressions are just anonymous delegates. So the question comes up, What are the differences between named delegates and anonymous delegates?

For an overview of delegates in general, check out "Get Func<>-y: Delegates in .NET" (including a video series). For some more specifics on lambda expressions and anonymous delegates, take a look at "Learn to Love Lambdas (and LINQ, Too)".

There are 4 differences that we'll look at: captured variables, removing assignments, re-use, and visibility.

[Update 3/30: When I originally published this article, I left out re-use and visibility. Since these are so important, I amended this article rather than writing a new one. That's what I get for writing an article when I've had a long day.]

Captured Variables
Advantage: Anonymous Delegates
Anonymous delegates (and therefore lambda expressions) allow us to "capture" a variable that is currently in scope when we assign the delegate. Then we can later use that variable even when it would have normally gone out of scope.

Here's an example (from "Learn to Love Lambdas (and LINQ, Too)" -- check Page 8 of the PDF:


In this scenario, we have a method-level variable called "selectedPerson". This holds the currently selected item from a list box in the UI.

Then inside our lambda expression, we use that "selectedPerson" variable in order to locate a matching item in the list box after we load in fresh data.

The cool part of this is that the body of the lambda expression does not run until *after* the containing method (RefreshButton_Click) has completed. As such, the "selectedPerson" variable would normally go out of scope and be eligible for garbage collection.

But since our lambda expression captured this variable, it is not garbage collected, and it remains available for our anonymous delegate to make use of it. The big advantage to doing this is that we can have a method-level variable in this case. If we had a separate named delegate, then we would need to promote this to a class-level variable so that both methods have access to it. This allows us to scope our variables more appropriately.

Even though we use a lambda expression in our example, this works with other anonymous delegates as well. Other languages refer to this as a "closure". For a little more info on captured variables, check out this article: Lambda Expressions, Captured Variables, and For Loops: A Dangerous Combination.

So captured variables is a big advantage to using anonymous delegates.

Removing Assignments
Advantage: Named Delegates
All delegates are multicast delegates. This means that we can assign multiple methods to a single delegate variable. Then when we invoke the delegate (that is, execute the methods associated with the delegate), all of those methods run.

For an example of multicast delegates, take a look at "Get Func<>-y: Delegates in .NET" or check out the 3rd video in the series: Action<> & Multicast Delegates.

Just like we can add multiple methods, we can also remove those methods (at least when we're using a named delegate). Let's look at some code to see an example of this.

The Base Application
Here's a sample application where we can see the assignment differences:


In this application, we have buttons to add and remove named and anonymous delegates. The numbers will keep a running total of adding and removing. Then when we click the "Click Me!" button, the assigned delegates will execute.

You can get this code from GitHub: jeremybytes/delegates-and-func. This project is in the "BONUS-NamedVsAnonymous" folder of the solution.

The Constructor
Here are the basics of the code-behind for this form:


This is pretty straight-forward. We have 2 private fields to hold our running totals. Then in the constructor, we assign a lambda expression to our "Click" event to clear our output list in the UI.

As a reminder, event handlers are just special kinds of delegates. This means that they have all the properties that we expect with delegates -- including multicasting.

Named Delegates
Here's the code to handle our named delegate buttons:


First, notice our method "NamedDelegate". This method signature matches the event handler for a button: it takes an object and RoutedEventArgs as parameters and returns void. This means that we can assign this to our "Click" event.

The body of our method just outputs "Hello from Named Delegate" to our output list in the UI.

Then notice our "AddNamed_Click" method. This is hooked up to the event of the "Add" button. Here we see that we use "+=" to assign the "NamedDelegate" method as a handler for the event. The "+=" will add this delegate to any existing collection of methods that are already assigned.

The rest of this method manages the running total -- incrementing the count and updating the UI.

Now look at the "RemoveNamed_Click" method. This is hooked up to the "Remove" button. Here we see that we use "-=" to remove the assignment of the "NamedDelegate" method. The effect of this is that it will remove an assignment to this named delegate (assuming that it finds one by that name).

Anonymous Delegates
Now let's look at the code that is hooked up to the anonymous delegate buttons:


This code is similar to the "Add" and "Remove" methods that we have for the named delegates. The difference is that we are using anonymous delegates.

In the "Add" method, when we use "+=" we give it a lambda expression (which is an anonymous delegate). This will put "Hello from Anonymous Delegate" into our output list in the UI.

In the "Remove" method, we use "-=" to attempt to remove the anonymous delegate (just like we did with the named delegate). What we'll find is that this does not work quite like we intend it to.

Running the Application
So now that we have the code, let's run the application to see what happens. First we'll click each of the "Add" buttons 2 times, and then click the "Click Me!" button:


We can see that our running total tells us that we expect to have 2 named delegates and 2 anonymous delegates hooked up to our "Click" event. And we see exactly that in the output.

Now, let's click the "Remove" buttons one time each and see what happens:


This output isn't quite what we expect. Our running totals tell us that we expect to only have 1 named delegate and 1 anonymous delegate assigned. But our output shows that we have 1 named delegate and 2 anonymous delegates.
What this tells us is that the removal of the named delegate was successful, but the removal of the anonymous delegate was not.
Let's try again by clicking the "Remove" buttons again:


At least our results are consistent. The other named delegate was successfully removed, but we see that both anonymous delegates are still attached to our "Click" event.

Deconstructing the Behavior
This behavior makes sense when we take a closer look at what's happening. When we use the "-=" operator, we need to give an identifier that can be used to locate the item that we want to remove. When we have a named delegate, that's easy -- the name is "NamedDelegate", and it can remove the first one that it finds in the list.

But with the anonymous delegate, we don't have access to the name. The compiler generates a name, but we never see that in the code. Because of that, we can't give a valid value that will work with the "-=" operator in order to remove an anonymous delegate.

How Important Is This?
So that question is whether we should be concerned about this. The answer is: maybe. We can still clear out the items attached to the "Click" handler by just saying "xxx.Click = null". But this has the effect of clearing *everything*.

Why do we care whether things stay attached to event handlers? Well, depending on the circumstances, these are generally strong references. What that means is that event handler assignments can prevent the garbage collector from clearing objects from memory. Whether this is important really depends on how we do our assignments, the expected lifetime of our objects, and how we deal with these types of references.

As an alternative, we can use a different construct that will give us a weak reference. This is a reference that does *not* prevent the garbage collector from cleaning up an object. (But this is a topic for another day).

So we can see that being able to easily remove assignments is an advantage of named delegates.

[Update 3/30: When I originally published this article, I left out 2 very important differences: re-use and visibility. They're important enough that I added them on here rather than writing another article.]

Code Re-Use
Advantage: Named Delegates
A really big advantage to having named delegates is that we can reuse the same method throughout our application. If we use an anonymous delegate, we basically in-line the code. This is great for visibility (which we'll see in a bit), but it hampers the ability to reuse the same block of code somewhere else.

Here's an example from Get Func<>-y (also in the 2nd video in the series). In this sample, we have a delegate variable:


This represents a method that takes a "Person" as a parameter and returns a "string". And we assign it based on UI elements (in this case, a set of (badly-named) radio buttons).


This uses some static methods that we have in another class. Here's the code for "LastNameToUpper":


We can see that it matches our delegate signature (takes a "Person" as a parameter and returns a "string"), and it simply returns the last name property of the parameter as upper case.

The other methods look similar. And an advantage to having these separate methods (named delegates) is that we can re-use them in other parts of our application. This allows us to eliminate duplication -- which is good -- and adhere to the DRY principle (Don't Repeat Yourself).

So we can see that re-use is an advantage to using named delegates.

Visibility of Code
Advantage: Anonymous Delegates
There's one more difference that is important when we're talking about delegates, and that is visibility of the code. When we use an anonymous delegate, we basically in-line the code. This means that we don't need to go somewhere else to find the functionality.

Let's go back to the example that we just looked at:


If we want to know what any of these methods do, we need to click into them. For the most part, the method names tell us what they are doing, but we would need to click into the "FullName" method to know exactly what format comes out.

When we switch these to anonymous delegates (and in-line the code), all of the code is right in front of us:


Now we can see exactly what each delegate does. If we look at the last assignment now (which was previously assigned to the "FullName" method), we see that it formats the name as "LastName [comma] [space] FirstName".

The great thing about this is that we don't need to go somewhere else to find that. It's right here in front of us.

This is especially useful when we have lambda expressions in LINQ methods (which I really like). Now if our anonymous delegates get more than a few lines long, then they can become difficult to read and we may want to extract them as separate methods. But if they are short, it's great to have the code right in front of us.

So, visibility of the code is an advantage for anonymous delegates.

Wrap Up
There are pros and cons to pretty much every construct that we work with in programming. Delegates are no different.
Captured Variables: Advantage Anonymous Delegates
Removing Assignments: Advantage Named Delegates
Code Re-Use: Advantage Named Delegates
Visibility of Code: Advantage: Anonymous Delegates
As developers, we get to make these types of decisions all the time. It's important that we understand our tools so that we can make the best choices for our particular needs.

Happy Coding!

4 comments:

  1. The point I may have poorly tried to make on twitter is that lambda expressions are not anonymous delegates, they are expressions. The result of evaluating the expression at runtime is an anonymous delegate. So they are not delegates themselves, but a way of creating delegates. Just like we never "write an object", we write classes, then use them to create objects.

    ReplyDelete
    Replies
    1. I won't disagree with you, Jim, because you are technically correct. The best kind of correct. (Thanks Futurama)

      The terminology around delegates has always been a little strange. A delegate is a type that describes a method signature. A method is just a method, but if it matches the signature of our delegate (and we use it as such), then it becomes what I want to call an "instance of the delegate" (just like we have instances of other types), but that terminology doesn't make sense when we're talking about methods. So we end up calling it a "named delegate" (which isn't quite right, either).

      Lambdas were created to make LINQ easier (a shorter syntax for anonymous delegates). And whether we use "delegate(string param) { Console.Write(param); }" or "param => Console.Write(param)" the compiler generates the same IL. And this is how most people run into lambda expressions. (Now when we get into the deeper intricacies of expressions (especially creating and parsing them dynamically), then all bets are off. I have yet to find a really good, understandable explanation of expression trees and how to interact with them. If you've got a reference, let me know.)

      Delete
    2. I think your terminology is fine. If I have the expression: Square s = new Square(2), then it is natural to say that 's' is an instance of type Square.

      To your second note, Expression vs expression gets confusing too. My previous example with the square was an assignment expression (little 'e') from the language perspective. Just like "Hello, Jeremy" is an expression in English. The confusion comes in because of the Expression (big 'E') types available in .net. Now the name makes sense, because you use big E expressions to build little e expressions at run time, but any time we overload a term we create more confusion.

      So, I'd still prefer it if people would stick to saying "a lambda is just an expression that creates an anonymous delegate" rather than "a lambda express is an anonymous delegate." It was a nit-picky point to begin with, but I figured you wouldn't mind.

      Delete
    3. You're right: I don't mind. These types of discussions are always welcome. It makes me think about things in different ways. It's also fun to try to figure out the best way to explain things without being too confusing.

      It looks like the MSDN documentation is somewhere in between: "A lambda expression is an anonymous function that you can use to create delegates or expression tree types." https://msdn.microsoft.com/en-us/library/bb397687.aspx

      The VB definition is even more fun: "A lambda expression is a function or subroutine without a name that can be used wherever a delegate is valid." https://msdn.microsoft.com/en-us/library/bb531253.aspx

      Delete