Tuesday, February 2, 2021

Go (golang) Anonymous Functions - Inlining Code for Goroutines

Anonymous functions in Go (golang) let us inline code rather than having a separate function. These are often used for goroutines (concurrent operations). Similar to anonymous delegates and lambda expressions in C#, anonymous functions in Go also support captured variables (referred to as closures). This can simplify our function signatures. Today, we'll look at how these fit into our code.
Anonymous functions allow us to inline code by creating a function with no name. Combined with closures, this can simplify function signatures for goroutines.
We'll continue with the sample that we've seen in the previous articles ("Go (golang) Channels - Moving Data Between Concurrent Processes" and "Go (golang) WaitGroup - Signal that a Concurrent Operation is Complete") to see anonymous functions with concurrent code.

Note: I would highly recommend reading the previous articles before continuing here. This article builds on the concurrency concepts of channels and WaitGroup.

Motivation: I have been using C# as a primary language for 15 years. Exploring other languages gives us insight into different ways of handling programming tasks. Even though Go (golang) is not my production language, I've found several features and conventions to be interesting. By looking at different paradigms, we can grow as programmers and become more effective in our primary language.

Resources: For links to a video walkthrough, CodeTour, and additional articles, see A Tour of Go for the C# Developer.

Anonymous Function Syntax

Let's start by looking at an example of an anonymous function.

Go
    go func(message string) {
      fmt.Println(message)
    }("hello")

This starts a new goroutine (i.e., kicks off a concurrent operation).

The anonymous function starts with "func" and then parameters in parentheses. This function takes a single string parameter named "message". If there are no parameters, use a set of empty parentheses. The main difference between this and a normal function declaration is that there is no name for the function.

The body of the function (enclosed in braces) is just like a normal function declaration.

After the closing brace, we include the parameter values enclosed in parentheses. This is because we are not just declaring the anonymous function, we are also calling it. In this example, the string "hello" is used for the "message" parameter.

Important: If an anonymous function has *no* parameters, you must include a set of empty parentheses after the closing brace to indicate that you are calling the function. (I keep forgetting this, and the compiler error does not make it obvious to me what is wrong.)
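
To make the syntax concrete, here is a minimal sketch that calls anonymous functions directly (without "go") so the output is predictable. The names "greet" and "answer" are made up for this illustration and are not part of the sample project.

```go
package main

import "fmt"

// greet is a hypothetical helper. It declares an anonymous function with
// one string parameter and calls it immediately with the value of name.
func greet(name string) string {
	return func(n string) string {
		return "hello, " + n
	}(name) // the argument list after the closing brace calls the function
}

// answer is a hypothetical helper. The anonymous function has no
// parameters, but the trailing () is still required to call it.
func answer() int {
	return func() int {
		return 42
	}()
}

func main() {
	fmt.Println(greet("gopher")) // prints "hello, gopher"
	fmt.Println(answer())        // prints 42
}
```

If the trailing parentheses are left off in a "go" statement, the compiler reports an error along the lines of "expression in go must be function call", which is the error that is easy to misread.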

Using Anonymous Functions

Let's take the example we used when discussing channels and WaitGroup and add anonymous functions. As a reminder, here's how we left the "fetchPersonToChannel" function:

Go
    func fetchPersonToChannel(id int, ch chan<- person, wg *sync.WaitGroup) {
      defer wg.Done()
      p, err := getPerson(id)
      if err != nil {
        log.Printf("err calling getPerson(%d): %v", id, err)
        return
      }
      ch <- p
    }

And the code that calls the function concurrently:

Go
    ch := make(chan person, 10)
    var wg sync.WaitGroup

    // put values onto a channel
    for _, id := range ids {
      wg.Add(1)
      go fetchPersonToChannel(id, ch, &wg)
    }

    wg.Wait()
    close(ch)

    // read values from the channel
    for p := range ch {
      fmt.Printf("%d: %v\n", p.ID, p)
    }

For our code, we want to take the "fetchPersonToChannel" function and inline it. This will put the body of the function right after the "go" in the first "for" loop.

Inlining the Code
Here's what the code looks like once we add the anonymous function:

Go
    ch := make(chan person, 10)
    var wg sync.WaitGroup

    // put values onto a channel
    for _, id := range ids {
      wg.Add(1)
      go func(id int, ch chan<- person, wg *sync.WaitGroup) {
        defer wg.Done()
        p, err := getPerson(id)
        if err != nil {
          log.Printf("err calling getPerson(%d): %v", id, err)
          return
        }
        ch <- p
      }(id, ch, &wg)
    }

    wg.Wait()
    close(ch)

    // read values from the channel
    for p := range ch {
      fmt.Printf("%d: %v\n", p.ID, p)
    }

This block of code has the same functionality as the code with the separate function (more or less - there are a few technical differences).

Let's look at the pieces of the anonymous function a bit more closely. Here is just that part of the code:

Go
    go func(id int, ch chan<- person, wg *sync.WaitGroup) {
      defer wg.Done()
      p, err := getPerson(id)
      if err != nil {
        log.Printf("err calling getPerson(%d): %v", id, err)
        return
      }
      ch <- p
    }(id, ch, &wg)

After the "func" keyword, we have the parameters. These are the same parameters as the separate named function (we'll see how we can simplify this in just a bit).

After the body of the function (the last line), we have a set of parentheses with the parameter values for the function. In this case, we get the "id" value from the for loop, and the "ch" and "wg" values from the variables created earlier.

Things look a little strange because the names of the parameters (id, ch, wg) in the function declaration match the names of the variables (id, ch, wg) that are passed into the function at the bottom.
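
The matching names are legal because the parameters are new variables that shadow the outer ones inside the function. Here is a small sketch of that behavior; "shadowDemo" is a hypothetical function written only for this illustration.

```go
package main

import "fmt"

// shadowDemo is a hypothetical example. It returns the outer "id" and the
// value that the anonymous function ended up with for its own "id".
func shadowDemo() (outer int, inner int) {
	id := 1
	func(id int) { // this "id" is a parameter: a separate copy of the argument
		id = id * 100 // changes only the parameter, not the outer variable
		inner = id
	}(id)
	return id, inner
}

func main() {
	outer, inner := shadowDemo()
	fmt.Println(outer, inner) // prints "1 100"
}
```

Changing the parameter inside the anonymous function leaves the outer variable alone, which is the same thing that happens with a separate named function.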

Simplifying the Function with Closures

In C#, anonymous delegates and lambda expressions can capture variables. This means that they can use a variable that is in the scope of the declaration even if the variable is not explicitly passed in to the anonymous delegate / lambda expression. In C#, these are referred to as "captured variables"; in Go (and most other languages), these are referred to as "closures".

For our example, since the channel ("ch") and WaitGroup ("wg") are both part of the enclosing scope, we can use these in the anonymous function without passing them in as parameters.

Here's what the code looks like when we do this:

Go
    ch := make(chan person, 10)
    var wg sync.WaitGroup

    // put values onto a channel
    for _, id := range ids {
      wg.Add(1)
      go func(id int) {
        defer wg.Done()
        p, err := getPerson(id)
        if err != nil {
          log.Printf("err calling getPerson(%d): %v", id, err)
          return
        }
        ch <- p
      }(id)
    }

    wg.Wait()
    close(ch)

    // read values from the channel
    for p := range ch {
      fmt.Printf("%d: %v\n", p.ID, p)
    }

The anonymous function now only takes one parameter: the "id". The channel and WaitGroup are no longer passed as parameters.

The body of the function has not changed. Inside the function we still reference "ch" and "wg", but instead of referring to parameters, these refer to the variables that are created before the "for" loop.
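
For anyone who wants to run this outside of the sample project, here is a self-contained sketch of the same pattern. The "person" type and "getPerson" function below are simplified stand-ins that I am assuming for illustration (the real ones call a web service), and "fetchAll" is a made-up wrapper so the result can be returned and sorted.

```go
package main

import (
	"fmt"
	"log"
	"sort"
	"sync"
)

// person is a simplified, hypothetical version of the type from the sample.
type person struct {
	ID   int
	Name string
}

// getPerson is a stand-in for the service call; it fails for ids below 1.
func getPerson(id int) (person, error) {
	if id <= 0 {
		return person{}, fmt.Errorf("invalid id %d", id)
	}
	return person{ID: id, Name: fmt.Sprintf("Person %d", id)}, nil
}

// fetchAll starts one goroutine per id. Only "id" is passed as a parameter;
// "ch" and "wg" are captured from the enclosing scope.
func fetchAll(ids []int) []person {
	ch := make(chan person, len(ids))
	var wg sync.WaitGroup

	for _, id := range ids {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			p, err := getPerson(id)
			if err != nil {
				log.Printf("err calling getPerson(%d): %v", id, err)
				return
			}
			ch <- p
		}(id)
	}

	wg.Wait()
	close(ch)

	var people []person
	for p := range ch {
		people = append(people, p)
	}
	// Goroutines finish in no particular order, so sort for stable output.
	sort.Slice(people, func(i, j int) bool { return people[i].ID < people[j].ID })
	return people
}

func main() {
	for _, p := range fetchAll([]int{1, 2, 3}) {
		fmt.Printf("%d: %v\n", p.ID, p.Name)
	}
}
```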

What I like about this
I like that when we use an anonymous function, we can simplify the function signature. The channel and WaitGroup parameters strike me as "infrastructure" parameters -- things we need to get the concurrent code to work properly.

I also like that we no longer have to worry about pointers. Previously, we had to use a pointer to a WaitGroup as a parameter so that things would work. Since the WaitGroup is now captured by the closure (which refers to the original variable rather than a copy), we don't have any pointers that we deal with directly.

Don't Capture Indexers

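In C#, there is a danger in using indexers in a closure (see "Lambda Expressions, Captured Variables, and For Loops: A Dangerous Combination"). The same danger exists in Go, at least in versions before Go 1.22 (which gives each loop iteration its own copy of the loop variable).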

In the code above, we keep "id" as a parameter. It is tempting to use a closure for this as well. But if we do this, we end up with unexpected behavior.

Let's see what happens when we capture the "id" value. Here's that code:

Go (DANGER DANGER DANGER - do not copy this code block)
    ch := make(chan person, 10)
    var wg sync.WaitGroup

    // put values onto a channel
    for _, id := range ids {
      wg.Add(1)
      go func() {
        defer wg.Done()
        p, err := getPerson(id)
        if err != nil {
          log.Printf("err calling getPerson(%d): %v", id, err)
          return
        }
        ch <- p
      }()
    }

    wg.Wait()
    close(ch)

    // read values from the channel
    for p := range ch {
      fmt.Printf("%d: %v\n", p.ID, p)
    }

Now the anonymous function takes no parameters. Here is the output:

Console
    PS C:\GoDemo\async> .\async.exe
    [1 2 3 4 5 6 7 8 9]
    9: Isaac Gampu
    9: Isaac Gampu
    9: Isaac Gampu
    9: Isaac Gampu
    9: Isaac Gampu
    9: Isaac Gampu
    9: Isaac Gampu
    9: Isaac Gampu
    4: John Crichton

Instead of getting 9 separate values, we find the "id" value of "9" repeated over and over. This is because of the way that closures work.
The value of a variable in a closure is the value at the time the variable is used, not the value at the time it was captured.
This means that if we capture an indexer or an iterator, we often get the final value of that indexer. For this particular example, the last value is "9", so that is what we see. There is one value of "4" -- we get this because of the "fun" of concurrency. One call to the anonymous function read "id" before the iteration was complete, so it saw an intermediate value. The rest of the calls ran after the iteration was done, so "id" has the final value.
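
The same rule can be seen without any concurrency. In this small sketch (the "lateRead" function is made up for the illustration), the closure captures the variable itself, so it sees whatever value the variable holds when the closure finally runs.

```go
package main

import "fmt"

// lateRead is a hypothetical example: it captures "x", changes "x",
// and only then calls the closure.
func lateRead() int {
	x := 1
	read := func() int { return x } // captures the variable x, not the value 1
	x = 2
	return read() // reads x at the time of the call
}

func main() {
	fmt.Println(lateRead()) // prints 2
}
```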

The bad news is that we do not get a compiler warning if we try to capture an indexer or iterator value. The good news is that a good linting tool (such as the tool included with the "Go" extension for Visual Studio Code) will warn us when we make this error.
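
Besides passing "id" as a parameter (as we did earlier), another common way to avoid the problem is to make a fresh copy of the loop variable inside the loop body and capture the copy. Here is a minimal sketch of that approach; "collectIDs" is a hypothetical function that just sends each captured id to a channel instead of calling a service.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collectIDs is a hypothetical example. It starts one goroutine per id and
// returns the ids that the goroutines saw, sorted for stable output.
func collectIDs(ids []int) []int {
	ch := make(chan int, len(ids))
	var wg sync.WaitGroup

	for _, id := range ids {
		id := id // new variable for each iteration; redundant on Go 1.22+ but harmless
		wg.Add(1)
		go func() {
			defer wg.Done()
			ch <- id // captures the per-iteration copy, not the loop variable
		}()
	}

	wg.Wait()
	close(ch)

	var seen []int
	for v := range ch {
		seen = append(seen, v)
	}
	sort.Ints(seen)
	return seen
}

func main() {
	fmt.Println(collectIDs([]int{1, 2, 3, 4, 5})) // prints [1 2 3 4 5]
}
```

Whether to pass a parameter or copy the variable is mostly a matter of taste; the parameter version makes the dependency visible in the function signature.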

Another Anonymous Function

There is one more anonymous function that we can add to our block of code. We can wrap the code that waits for the WaitGroup and closes the channel.

Here's that code (note that the "id" parameter is back in the first anonymous function):

Go
    ch := make(chan person, 10)
    var wg sync.WaitGroup

    // put values onto a channel
    for _, id := range ids {
      wg.Add(1)
      go func(id int) {
        defer wg.Done()
        p, err := getPerson(id)
        if err != nil {
          log.Printf("err calling getPerson(%d): %v", id, err)
          return
        }
        ch <- p
      }(id)
    }

    go func() {
      wg.Wait()
      close(ch)
    }()

    // read values from the channel
    for p := range ch {
      fmt.Printf("%d: %v\n", p.ID, p)
    }

Notice the code between the "for" loops. We have a goroutine with an anonymous function that takes no parameters. This will kick off this code concurrently.

How does this change the flow of the application?
The change to how the application works is subtle, but it's interesting. (Note: if it doesn't make sense right away, that's fine. It took me a little while to wrap my head around it.)

With this goroutine in place, we end up with 11 total goroutines. This includes the 9 goroutines created in the first "for" loop, the goroutine with the "Wait" and "close", and the "main" function (the "main" function is also a goroutine).

In our previous code, the "main" function blocked at the "wg.Wait" call. With the new code, the "main" function is not blocked here (since it is inside a goroutine). Instead, the "main" function will block with the second "for" loop waiting for the channel to be closed.

The goroutine with the "Wait" will block until the WaitGroup hits 0. But this blocks only within the goroutine itself. Once the WaitGroup hits 0, then the channel will be closed.

Once the channel is closed, the second "for" loop in the "main" function will complete, and the application continues as normal.

From the outside, the behavior is the same. We still have the separate concurrent operations that call a web service and write to a channel. We still have the WaitGroup that signals when these concurrent operations are complete (so the channel can be closed). And we still have the "for" loop that reads from the channel until that channel is closed.

Internally, we have a bit more concurrency. The WaitGroup "Wait" is happening in a separate goroutine.
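
One place where this extra goroutine seems to make a practical difference is with an unbuffered channel. The sketch below is my own illustration under that assumption ("sumSquares" is a made-up function, not part of the sample). If "wg.Wait()" ran in the calling function before the reading loop, each sender would block on the channel, the WaitGroup would never hit 0, and the program would deadlock. With "Wait" and "close" in their own goroutine, the reading loop starts right away.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares is a hypothetical example that sends results on an unbuffered
// channel and adds them up as they arrive.
func sumSquares(nums []int) int {
	ch := make(chan int) // unbuffered: each send blocks until a receive
	var wg sync.WaitGroup

	for _, n := range nums {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			ch <- n * n
		}(n)
	}

	// Wait and close concurrently so the reading loop below is not held up.
	go func() {
		wg.Wait()
		close(ch)
	}()

	total := 0
	for v := range ch { // ends once the channel is closed
		total += v
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3})) // prints 14
}
```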

Anonymous functions make this easier
We may or may not want to add another layer of concurrency to the application. But using an anonymous function with a goroutine makes it a lot easier. If we wanted to use a normal named function, we would need to include parameters for the channel and WaitGroup, and it's probably a bit more effort than we want to go through.

But when we can use an anonymous function with a closure for the channel and WaitGroup, there is less code for us to manage.

A Final Example

When we looked at "WaitGroup", we saw an example where we could use a WaitGroup to stop an application from exiting too early. Let's see what happens when we use an anonymous function here.

Here's the original code:

Go
    func logMessages(count int, wg *sync.WaitGroup) {
      defer wg.Done()
      for i := 0; i < count; i++ {
        log.Printf("Logging item #%d\n", i)
        time.Sleep(1 * time.Second)
      }
    }

    func main() {
      var wg sync.WaitGroup
      wg.Add(1)
      go logMessages(10, &wg)
      time.Sleep(3 * time.Second)
      wg.Wait()
      fmt.Println("Done")
    }

And here's the same code with "logMessages" as an anonymous function:

Go
    func main() {
      var wg sync.WaitGroup
      wg.Add(1)
      go func() {
        defer wg.Done()
        for i := 0; i < 10; i++ {
          log.Printf("Logging item #%d\n", i)
          time.Sleep(1 * time.Second)
        }
      }()
      time.Sleep(3 * time.Second)
      wg.Wait()
      fmt.Println("Done")
    }

Like our other example, we can remove the "WaitGroup" parameter from the anonymous function and rely on a closure for the WaitGroup.

For the "count" parameter, it doesn't really make much sense to have a parameter for this since we are passing in a hard-coded value ("10"). So, I removed the parameter and put the "10" into the conditional of the "for" loop.

The output of this code is the same as we saw in the prior article.

Wrap Up

Anonymous functions and goroutines work well together. We can inline our code and simplify function signatures by using closures. I have found anonymous delegates and lambda expressions to be very useful in the C# world, and anonymous functions in Go have a lot of similarities.

There is more to learn about anonymous functions. For example, we can assign an anonymous function to a variable (similar to using a delegate variable in C#). In this article, we focused on using anonymous functions with goroutines. But there is more to explore.
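
As a small taste of that, here is a sketch of an anonymous function stored in a variable and then passed to another function. The names "double" and "applyTwice" are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// applyTwice is a hypothetical helper that takes a function as a parameter
// and applies it to the value two times.
func applyTwice(f func(int) int, v int) int {
	return f(f(v))
}

func main() {
	// The anonymous function is assigned to a variable and called later.
	double := func(n int) int { return n * 2 }

	fmt.Println(double(4))             // prints 8
	fmt.Println(applyTwice(double, 3)) // prints 12
}
```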

Happy Coding!
