Thursday, January 14, 2021

Go (golang) Goroutines - Running Functions Asynchronously

I've been exploring the Go programming language (golang). A really cool feature is that running functions asynchronously is built right in to the language.

How to Run a Function Asynchronously

Normal function call:

Go
    processData(incomingData)

Asynchronous function call:

Go
    go processData(incomingData)

Add a "go" before the call. Now it is a goroutine, and it runs asynchronously.

That's it. That's the post.

So, there is more...
Okay, asynchronous programming is more complicated than this. Once we have multiple things running at the same time, we need to know when they complete and often communicate results between asynchronous processes. In Go, we can use "WaitGroup" and "channel" to help with this. But that's a bigger topic for another article.
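As a rough sketch of where this leads (this is not code from the sample repository), the following uses "sync.WaitGroup" so that the caller waits for its goroutines to finish. The "square" and "squareAll" functions are hypothetical stand-ins for "processData". Without the wait, the program could exit before the goroutines get a chance to run.

```go
package main

import (
	"fmt"
	"sync"
)

// square is a hypothetical stand-in for the article's processData function.
func square(n int) int {
	return n * n
}

// squareAll runs square for each input in its own goroutine
// and waits for all of them before returning the results.
func squareAll(inputs []int) []int {
	results := make([]int, len(inputs))
	var wg sync.WaitGroup
	for i, n := range inputs {
		wg.Add(1)
		go func(i, n int) {
			defer wg.Done()
			results[i] = square(n) // each goroutine writes to its own index
		}(i, n)
	}
	wg.Wait() // block until every goroutine has called Done
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // prints: [1 4 9]
}
```

Each goroutine writes to a different element of the slice, so no extra locking is needed in this small example.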

Motivation: I have been using C# as a primary language for 15 years. Exploring other languages gives us insight into different ways of handling programming tasks. Even though Go (golang) is not my production language, I've found several features and conventions to be interesting. By looking at different paradigms, we can grow as programmers and become more effective in our primary language.

Resources: For links to a video walkthrough, CodeTour, and additional articles, see A Tour of Go for the C# Developer.

Happy Coding!

Wednesday, January 13, 2021

Go (golang) Multiple Return Values - Different from C# Tuples

One of the features that I like in the Go programming language (golang) is that functions can return multiple values. In C#, we can mimic something similar using tuples, but they are not quite the same.
Go functions can return multiple values.
Let's take a look at how this works and how it differs from what is available in C#.

Multiple Return Values

In a prior article, we took a look at deferring operations (Go (golang) defer - A Better finally). Part of that included an example of a function that returns multiple values. Let's start with the same function from that article.

The function is called "getPerson", and it fetches data from a web service. You can see this function on GitHub in the "CodeTour: Go for the C# Developer" repository: https://github.com/jeremybytes/go-for-csharp-dev/blob/main/async/main.go (starting on line 27).


This function returns multiple values, and inside the body, it calls a function that returns multiple values.

Declaration
Here's the function declaration:

Go
    func getPerson(id int) (person, error) {
       ...
    }

The declaration starts with "func" followed by the name of the function ("getPerson").

Next is the parameter set (enclosed in parentheses). In this case, we have a parameter named "id" which is an integer.

Next, we have the return types (also enclosed in parentheses). This indicates that this function returns 2 values: one of type "person" (a struct) and one of type "error" (a built-in type).

Note: There are several ways for declaring return types. If we only have a single return type, we do not need the parentheses. In addition, we can name the return values (similar to how the parameters are named). For more information, take a look at the "Function declarations" steps in the Code Tour (available on GitHub: https://github.com/jeremybytes/go-for-csharp-dev).
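To make the note above concrete, here is a small sketch of the three declaration styles. The function names ("double", "divide", "divmod") are made up for illustration and are not part of the sample repository.

```go
package main

import (
	"errors"
	"fmt"
)

// double has a single return type, so no parentheses are needed.
func double(n int) int {
	return n * 2
}

// divide returns two unnamed values: a result and an error.
func divide(a, b int) (int, error) {
	if b == 0 {
		return 0, errors.New("cannot divide by zero")
	}
	return a / b, nil
}

// divmod names its return values, similar to how parameters are named.
// The caller must pass a non-zero b.
func divmod(a, b int) (quotient, remainder int) {
	quotient = a / b
	remainder = a % b
	return quotient, remainder
}

func main() {
	fmt.Println(double(4))

	result, err := divide(7, 2)
	fmt.Println(result, err)

	q, r := divmod(7, 2)
	fmt.Println(q, r)
}
```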

Returning Multiple Values
There are several code paths that return multiple values. In each case either the "person" or the "error" is populated (but not both). This follows the error handling philosophy in Go. For more information, see the previous article: Go (golang) Error Handling - A Different Philosophy.

Here is an error path that has a return:

Go
    if err != nil {
      return person{}, fmt.Errorf("error fetching person: %v", err)
    }

This shows a "return" statement that returns 2 values (delimited by a comma). The first value ("person{}") is an empty struct. "person" is a struct that contains multiple data fields. Structs themselves cannot be nil (null in C#), so we use an empty struct here. In an empty struct, all of the fields have default values (0 for integers, "" for strings, etc.).
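The default values can be seen with a quick sketch. The fields on this "person" struct are assumptions for illustration; the real struct's fields are not shown in this article.

```go
package main

import "fmt"

// person is a hypothetical stand-in; the field names are assumed.
type person struct {
	ID   int
	Name string
}

func main() {
	p := person{} // an empty struct: every field gets its default value
	fmt.Printf("%d %q\n", p.ID, p.Name) // prints: 0 ""
}
```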

The second return value is an "error". For more information on error handling, see the article mentioned above.

As another example, let's look at the "happy path" -- the last few lines of the function.

Go
    var p person
    err = json.NewDecoder(resp.Body).Decode(&p)
    ... // skipping the error handling
    return p, nil

In this code, "p" is a variable of type "person". The "Decode" function parses a JSON result and populates the "p" variable.

The final return statement returns the variable "p" for the "person" and "nil" for the "error".

So we return multiple values from a function by separating the values with commas.

Using Multiple Return Values

This shows how we can declare a function that returns multiple values, but how do we use those values? We can see an example of this in the first 2 lines of our function:

Go
    url := fmt.Sprintf("http://localhost:9874/people/%d", id)
    resp, err := http.Get(url)

The first line creates a URL for a web service call.

Note: the ":=" operator both creates a variable and assigns a value to it. The type of the variable is inferred based on the assignment (similar to a "var" in C#). Here is the equivalent of the first line in C#:

C#
    var url = string.Format("http://localhost:9874/people/{0}", id);

The next line of our Go function is where things get more interesting.

Go
    resp, err := http.Get(url)

"http.Get()" sends a web request and returns 2 values: a pointer to a response (*Response) and an error (error). To capture both of these values, we can use the ":=" operator to create and assign variables. Since we have multiple return values, we have multiple variables on the left of the ":=" operator.

The end result is that "resp" contains the "*Response" value and "err" contains the "error" value.
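The same pattern works with any function that returns multiple values. Here is a sketch that does not need a web service: it uses "strconv.Atoi" from the standard library, which returns an "int" and an "error". The "parseAge" helper is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseAge is a hypothetical helper that captures two return values with ":=".
func parseAge(text string) (int, error) {
	age, err := strconv.Atoi(text) // Atoi returns (int, error)
	if err != nil {
		return 0, fmt.Errorf("error parsing age: %v", err)
	}
	return age, nil
}

func main() {
	age, err := parseAge("42")
	fmt.Println(age, err) // prints: 42 <nil>

	_, err = parseAge("forty-two")
	fmt.Println(err)
}
```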

Using the getPerson Function
To relate declaring and using a function with multiple return values, let's go back to the "getPerson" function. Here's the declaration again:

Go
    func getPerson(id int) (person, error) {
       ... // most of the function
       return p, nil
    }

This function is used later in the sample code. (You can find this on line 72 of the file in the GitHub repository: https://github.com/jeremybytes/go-for-csharp-dev/blob/main/async/main.go.)

Go
    p, err := getPerson(id)

This calls the "getPerson" function and captures the "person" and "error" values in variables named "p" and "err", respectively.

Different from Tuples

We can mimic something similar to this in C# by using tuples, but tuples are a bit different. To see the difference, let's look at a similar function declaration in C#.

Go
    func getPerson(id int) (person, error) {
       ... // most of the function
       return p, nil
    }

C#
    static (Person, Error) GetPerson(int id)
    {
      ... // most of the function
      return (p, err);
    }

The C# version looks fairly similar to the Go version. The difference is that the Go function returns 2 separate values; the C# function returns a tuple -- a single item that contains multiple values.

Looks the Same...
We can mimic the different ways to get values from the function:

Go
    p, err := getPerson(id)

C#
    var (p, err) = GetPerson(id);

In C#, this creates 2 variables named "p" and "err" that are "Person" and "Error" types, respectively. So this looks pretty much the same as Go.

We can even discard values that we do not want to use:

Go
    p, _ := getPerson(id)

C#
    (var p, _) = GetPerson(id);

In both of these cases, the "error" return value is discarded.

...But Ultimately a Tuple is Different
Even though these look the same, there is something we can do in C# that we cannot do in Go.

C#
    var result = GetPerson(id);

In this case, "result" is a tuple that has the type "(Person, Error)". So a tuple is really a single "thing" that contains multiple elements. In Go, the return values are completely separate.
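If we want something like a single "result" in Go, we have to build it ourselves. The sketch below bundles the two values into a struct. The "personResult" type, the "fetch" function, and the in-memory "getPerson" are all hypothetical and only here to show the idea.

```go
package main

import (
	"errors"
	"fmt"
)

// person is a hypothetical stand-in for the struct in the sample code.
type person struct {
	ID   int
	Name string
}

// getPerson is a hypothetical in-memory version of the article's function.
func getPerson(id int) (person, error) {
	if id <= 0 {
		return person{}, errors.New("invalid id")
	}
	return person{ID: id, Name: "Sample"}, nil
}

// personResult is a hypothetical struct that holds both values as one item.
type personResult struct {
	Person person
	Err    error
}

// fetch captures the two separate return values and packs them into a struct.
func fetch(id int) personResult {
	// result := getPerson(id) // does not compile: the call returns 2 values
	p, err := getPerson(id)
	return personResult{Person: p, Err: err}
}

func main() {
	result := fetch(1)
	fmt.Println(result.Person.Name, result.Err)
}
```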

Tuples are interesting in C# because they let us put multiple values together as a single entity. And C# keeps adding different ways to deconstruct tuples into the various elements. Tuples are becoming more important as time goes on.

Final Thoughts

I find multiple return values to be quite interesting. I really like how Go has baked it into the language. In C#, tuples feel a bit awkward to me -- probably because of the different ways they can be used and deconstructed. I like the clean way that Go lets us easily return multiple values.

This feature makes the error handling philosophy possible. And that's something I find really interesting. (Again, see the prior article for more information: "Go (golang) Error Handling -- A Different Philosophy".)

Along these same lines, F# supports a different construct: Discriminated Unions. This allows us to return a single value, but the type of the value can vary. In our example, we could have a discriminated union that returns a "person" or an "error". I'll leave further exploration about this up to the reader.

The great thing about exploring different languages is that they take different approaches to problems. Along the way, we can also find out the various pros and cons to each approach. Ultimately, this gives us more options in our toolbox, and that makes it easier to pick the right one for a particular scenario.

Keep exploring.

Happy Coding!

Tuesday, January 12, 2021

Go (golang) Error Handling - A Different Philosophy

In looking at Go (golang) as someone who has spent quite a bit of time in C#, I'm really intrigued by the approach to error handling.
Go has "error" to represent problems that can potentially be handled and "panic" for problems that force an application to exit.
This is significantly different from exceptions and the exception handling mechanism that we see in C#.

Exceptions in C#

In C#, we are used to exceptions for error handling. An exception is a complex error object that contains a specific type, message, (potentially) an inner exception, a call stack, and other things.

In addition, we rely on an exception handling system that takes care of things for us. When an exception is thrown, it walks up the call stack looking for a "catch" that handles the exception. If it doesn't find one (i.e., the exception is unhandled), the application exits -- usually ungracefully.

As programmers, we can try to catch an exception using a try / catch block. If we think we can do something with the exception, we handle it. Otherwise, we let it go up the call stack to see if someone else can handle it.

We're used to letting the infrastructure do much of the work for us (not that there's anything wrong with that).

Go takes a different approach.

"error" in Go

In contrast, Go uses an "error" (an interface type) that is, at its heart, a wrapper around a string. "error" is a common return type for a function. And this really leaves it up to the programmer to deal with errors a bit more directly. There is no infrastructure for us to tap into.
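To show what "a wrapper around a string" looks like in practice, here is a small sketch. The built-in "error" interface has a single method, "Error() string". The "notFoundError" type and the "lookup" function are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// notFoundError is a hypothetical custom error type. Any type with an
// Error() string method satisfies the built-in error interface.
type notFoundError struct {
	id int
}

func (e notFoundError) Error() string {
	return fmt.Sprintf("id %d not found", e.id)
}

// lookup is a hypothetical function that always fails,
// so that we can see both kinds of error value.
func lookup(id int) error {
	if id < 0 {
		return errors.New("id must not be negative") // simplest error: just a message
	}
	return notFoundError{id: id}
}

func main() {
	fmt.Println(lookup(-1)) // prints: id must not be negative
	fmt.Println(lookup(7))  // prints: id 7 not found
}
```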

Example
Here's a sample function. We'll break down bits of it to see the conventions for error handling in Go. You can see this function on GitHub in the "CodeTour: Go for the C# Developer" repository: https://github.com/jeremybytes/go-for-csharp-dev/blob/main/async/main.go (starting on line 13).


Conventional Return Values
Let's look at line 14 for a typical way that errors are provided.

Go
    resp, err := http.Get("http://localhost:9874/people/ids")

The "Get" function returns 2 values (multiple return values are supported in Go). In this case, "Get" returns a pointer to an HTTP response (*Response) and an error (error). These are captured in the variables "resp" and "err", respectively.
Convention: A function returns both data and an error. A non-nil error value means that an error occurred.
This is a common pattern: to return data and an error. If the function needs to return more than one value, then the error is the last return value in the list. This is all by convention; there is nothing in the compiler that enforces this (although a good linting tool will help you out quite a bit -- I use the Go extension with Visual Studio Code).

Checking for Errors
After calling a function that returns an error, the next step is to see if the error is populated. Here is line 14 repeated along with the following lines:

Go
    resp, err := http.Get("http://localhost:9874/people/ids")
    if err != nil {
        return nil, fmt.Errorf("error fetching ids: %v", err)
    }

This checks to see if the error is not nil (null in C#). If it is not nil, then the function short-circuits by returning -- in this case using "nil" for the data along with a populated error (we'll look at how the error is created in just a bit).

There are a couple of conventions to note here. First, we check that "error is not nil" before using the data. Next, the "if" does not have a corresponding "else"; instead, the "if" generally exits out of the function (often by returning its own error).
Convention: Check the error before using any data returned from a function. 
Convention: If there is an error, short-circuit the rest of the function.
Note: these are not the only options. For web calls, we can always retry on error, and there are ways to dig deeper to see if we can figure out the source of the problem. But typically, we pass the error along.

Formatting Errors

In the sample above, we use the "fmt.Errorf" function to create an error by wrapping another error. Let's look at this a bit more closely.

Go
    if err != nil {
        return nil, fmt.Errorf("error fetching ids: %v", err)
    }

As mentioned above, an "error" is more or less a wrapper around a string. The "Errorf()" function returns an "error". The parameter is a string that contains whatever information we want to pass along.

Since we already have an error, we want to append information specific to our function and then pass along the error we received. "Errorf" helps us do that.

The parameter for "Errorf" uses placeholders (known as "verbs" in Go) that are similar to the placeholders in "string.Format()" in C#. The "%v" verb will be filled in with "err". ("%v" means the "natural format" in Go -- this is similar to "ToString" in C#.)

Prepending Errors
By convention, we prepend the incoming error with our message. In the above example, we use "error fetching ids: [incoming error]". This is all by convention.
Convention: Specific error messages are prepended to an existing error.
In addition, due to the conventions around error messages, capitalization and punctuation are important. Because we are creating a string of error messages, we should not include capitalization (implying a "new" thing). For the same reason, we should not include line breaks or other terminators (such as a period).
Convention: Error messages should start with a lower-case letter.
Convention: Error messages should not have line breaks or periods.
These conventions will make more sense when we look at some output.

Sample Error Message

Let's continue with our example to show where the "getIDs" function is called -- this is on line 68 of the previously linked file on GitHub: https://github.com/jeremybytes/go-for-csharp-dev/blob/main/async/main.go:

Go
    ids, err := getIDs()
    if err != nil {
      log.Fatalf("getIDs failed: %v", err)
    }

This calls the "getIDs" function and then checks for an error. If there is an error, we use the "log.Fatalf" function to log the error and exit the application.

log.Fatalf
"log.Fatalf" does 2 things. First, it prints a formatted string to the log (which goes to the console by default). This string includes a timestamp. Next, it exits the application (using "os.Exit(1)" to denote that the application exited with an error). Notice that just like with our "Errorf" call above, we prepend our own message before including the error.

As a reminder from a previous article, when we use "os.Exit()" or "log.Fatalf", deferred items do not run. See "Go (golang) defer - A Better finally" for more information on "defer".

Error Message
The function call to "getIDs" tries to get a slice of integers from a web service. If the web service is not available (for example, if the service is not running), then we get the errors that we have seen here.

Here is the resulting message:
2021/01/10 15:47:24 getIDs failed: error fetching ids: Get "http://localhost:9874/people/ids": dial tcp [::1]:9874: connectex: No connection could be made because the target machine actively refused it.

If we break this message down, we can see how the conventions come together.

2021/01/10 15:47:24 getIDs failed:

This part comes from the "log.Fatalf" function from the "main" function. We can see the timestamp along with our message "getIDs failed".

error fetching ids:

Next we see the message from the result of the "fmt.Errorf" function call in the "getIDs" function -- our custom message "error fetching ids".

Get "http://localhost:9874/people/ids": dial tcp [::1]:9874: connectex: No connection could be made because the target machine actively refused it.

The rest of the message is the error that comes from the "http.Get" function call. Based on the conventions, I would assume that the first message ("Get" with the URI) comes from the "http.Get" function, and the rest comes from the lower level functions that are called within "Get".

A Sort of Call Stack
What we see when we put this all together (using the conventions noted above) is a sort of call stack. The colons are delimiters as we walk down the stack to the original error message.

This is definitely not as formal as the call stack that is part of a C# exception, but it is still useful for tracking the ultimate source of error messages.
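Here is a small sketch of how the colon-delimited chain builds up, without needing a web service. The function names ("readConfig", "loadSettings", "startApp") and the messages are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// readConfig is a hypothetical lowest-level function that always fails.
func readConfig() error {
	return errors.New("file not found")
}

// loadSettings prepends its own message to the incoming error.
func loadSettings() error {
	if err := readConfig(); err != nil {
		return fmt.Errorf("error reading config: %v", err)
	}
	return nil
}

// startApp prepends one more message on the way up.
func startApp() error {
	if err := loadSettings(); err != nil {
		return fmt.Errorf("startup failed: %v", err)
	}
	return nil
}

func main() {
	fmt.Println(startApp())
	// prints: startup failed: error reading config: file not found
}
```

Because each message starts with a lower-case letter and has no trailing period, the pieces read as one continuous line.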

Error Handling Philosophy

In Go, handling errors is entirely up to the programmer. We need to check the errors that are returned from functions. If an error is populated, then we decide what to do with it. Even if we are not prepared to handle it directly, we should pass it along as a return value.

This is quite a bit different from C#. If a function throws an exception in C#, we can decide whether we want to handle it or not. If we leave it unhandled, then it walks up the call stack until it finds a handler. And whether we decide to "catch" it or not, the normal execution stops.

This means that in Go we need to be more aware of what functions return errors and how we want to deal with them. But it is entirely up to us to decide.

Ignoring Errors
One option is to ignore errors. There is nothing that forces us to capture an error value that comes back from a function. Here's an example:

Go
    resp, _ := http.Get("http://localhost:9874/people/ids")

This uses the same "http.Get" call from above. But instead of putting the error value into an "err" variable, we use a "blank identifier". This is a discard. The value of the error is not assigned to anything, and we have no visibility into it. The error is completely ignored.

This may work if we have a simple environment that we have full control over, but this is not very common. Instead, what is likely is that we will get a "panic".

"panic"

A "panic" in Go is most similar to an unhandled exception in C#. A "panic" occurs when an illegal operation is performed -- such as reading beyond the end of an array.

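When a "panic" happens and nothing intervenes, the application will exit. There is no "catch" block to "handle" a panic the way we would handle an exception. (Go does have a built-in "recover" function that a deferred function can call to stop a panic, but that is outside the scope of this article.) The good news is that any deferred operations will still run. See "Go (golang) defer - A Better finally" for more information on "defer".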

For the sample application, if we do ignore the error from the "http.Get" function, and the service is not running, we get a "panic". Here is what gets dumped to the console:

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x40 pc=0x64de30]

goroutine 1 [running]:
main.getIDs(0xbff756a8f5b21990, 0x2d864d, 0x8f99e0, 0x0, 0x0)
    C:/GoCodeTour/async/main.go:18 +0xa0
main.main()
    C:/GoCodeTour/async/main.go:60 +0x63

This shows the runtime error: "invalid memory address or nil pointer dereference". I would guess that the problem is a "nil pointer dereference". Above, we saw that "http.Get" returns a pointer to an HTTP response (*Response). This value will be nil due to the error.

So even though we can ignore errors, it is not a good idea.
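The following sketch shows the same situation without a web service. The "response" type and the "fetch" and "status" functions are hypothetical; "fetch" mimics what happens when the service is down by returning a nil pointer along with an error.

```go
package main

import (
	"errors"
	"fmt"
)

// response is a hypothetical stand-in for an HTTP response.
type response struct {
	Status string
}

// fetch is a hypothetical function that returns a nil pointer
// plus a non-nil error when the service is not available.
func fetch(available bool) (*response, error) {
	if !available {
		return nil, errors.New("connection refused")
	}
	return &response{Status: "200 OK"}, nil
}

// status checks the error before touching the pointer.
func status(available bool) (string, error) {
	resp, err := fetch(available)
	if err != nil {
		return "", fmt.Errorf("error fetching status: %v", err)
	}
	return resp.Status, nil
}

func main() {
	// Ignoring the error would lead to a panic:
	// resp, _ := fetch(false)
	// fmt.Println(resp.Status) // nil pointer dereference

	s, err := status(false)
	fmt.Println(s, err)
}
```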

Conventions

Let's do a quick review of the conventions:
  • A function returns both data and an error. A non-nil error value means that an error occurred.
  • Check the error before using any data returned from a function.
  • If there is an error, short-circuit the rest of the function.
  • Specific error messages are prepended to an existing error.
  • Error messages should start with a lower-case letter.
  • Error messages should not have line breaks or periods.
None of these conventions are required by the compiler, but we will have a much easier time if we follow them.

What I Find Interesting

I think this philosophy for error handling is quite interesting. There is a clear division between "things I can potentially deal with" (error) and "things I cannot deal with" (panic).

As much as I appreciate the exception handling mechanism that is built in to C# (and the ways that I have taken advantage of that in my own programming), I like the approach of returning an error value from a function. This feels like the programmer has a bit more control over the situation.

It can also be quite a bit more dangerous since it is possible to ignore errors and continue -- that's quite a bit harder with exceptions.

I do like the idea of a "panic". This means that I am truly in an invalid state (trying to dereference a nil pointer, trying to read beyond the end of an array). The "panic" ensures that the application exits, hopefully before any damage can be done.

This gives me a clear delineation between errors I can potentially deal with and those that I cannot.

Overall, there are things to be said for both approaches. But looking at a different paradigm gives me the opportunity to rethink how I program. Are there situations where I can use a more Go-like error handling mechanism? Maybe.

Ultimately, learning how other environments work makes our world a bit bigger and gives us other options to consider.

Happy Coding!

Monday, January 11, 2021

Go (golang) defer - A Better “finally”

My primary world has been C# for a while. There are several things in Go (golang) that I find interesting, particularly when looking at them from a C# perspective. One of those is the "defer" statement.
The "defer" statement lets us ensure that code runs before a function exits.
The closest thing to "defer" in C# is the "finally" part of a "try / finally" block. But “defer” has a characteristic that I really like: we can use it where we’re thinking about it. Let's take a look.

Example of "defer"

We'll start by looking at some code written in Go. Here's a function called "getPerson" that fetches data from a web service. You can see this function on GitHub in the "CodeTour: Go for the C# Developer" repository: https://github.com/jeremybytes/go-for-csharp-dev/blob/main/async/main.go (starting on line 27).


Here's a plain text representation of the same code (sorry there's no syntax highlighting here -- that's why I included the screenshot and link to the GitHub repo):

Go

27  func getPerson(id int) (person, error) {
28    url := fmt.Sprintf("http://localhost:9874/people/%d", id)
29    resp, err := http.Get(url)
30    if err != nil {
31      return person{}, fmt.Errorf("error fetching person: %v", err)
32    }
33    defer resp.Body.Close()
34    if resp.StatusCode != 200 {
35      return person{}, fmt.Errorf("error fetching ...[truncated])
36    }
37    var p person
38    err = json.NewDecoder(resp.Body).Decode(&p)
39    if err != nil {
40      return person{}, fmt.Errorf("error parsing person: %v", err)
41    }
42    return p, nil
43  }

We're specifically looking at line 33 here. This line uses the "defer" statement with "resp.Body.Close()".

Similar to IDisposable
"resp" is the response that comes back from an HTTP request (from line 29). The response has a "Body" with the payload of the response (a set of JSON data in this case). The "Body" has a "Close" method that needs to be called when we are done using the Body.

This is similar to an object that implements the IDisposable interface in C#. When we are done with the object, we need to call the "Dispose" method to make sure that everything is cleaned up appropriately. In C#, we often do this in a "finally" block (and we'll explore this more below) or with a "using" statement (which turns into a "finally" block when the code is compiled).

Deferring the Call
In this case, rather than needing to call a "Dispose" method, we need to call the "Close" method on the Body. 

Instead of using a "finally" block, we use the "defer" statement in Go. When we use "defer", the statement will run just before the function exits. The good thing about this is that we can put a "defer" almost anywhere in the code (in this case, anywhere after we have the response body available).

The function has multiple exit points. There is a "return" on line 35 if there is an error retrieving the data. There is a "return" on line 40 if there is an error parsing the data. And there is a "return" on line 42 if things run successfully.

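Regardless of which of these paths is followed, the "resp.Body.Close()" method will run before the function exits. (The early "return" on line 31 happens before the "defer" statement on line 33 is reached, so nothing is deferred on that path.)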
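Here is a self-contained sketch of the same idea. The "process" function is hypothetical; it records each step in a slice so that we can see the deferred "close" step run on both the error path and the success path.

```go
package main

import (
	"errors"
	"fmt"
)

// process is a hypothetical function with two exit points after the defer.
// The trace slice records the order in which the steps run.
func process(fail bool, trace *[]string) error {
	*trace = append(*trace, "open")
	defer func() { *trace = append(*trace, "close") }() // runs on every return below

	if fail {
		return errors.New("processing failed")
	}
	*trace = append(*trace, "work")
	return nil
}

func main() {
	var ok, failed []string
	_ = process(false, &ok)
	_ = process(true, &failed)
	fmt.Println(ok)     // prints: [open work close]
	fmt.Println(failed) // prints: [open close]
}
```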

Comparison to try/finally

Let's compare "defer" in Go to "try / finally" in C#. We'll do this with a bit of pseudo-code (since error handling is different in Go, the code will not be directly equivalent).

C#
  var resp = GetHttpResponse(url);
  try
  {
    var person = ParseResponse(resp.Body);
    return person;
  }
  catch (Exception ex)
  {
    log.LogException(ex);
  }
  finally
  {
    resp.Body.Close();
  } 

Go
  resp, _ := GetHttpResponse(url)
  defer resp.Body.Close()
  p, err := ParseResponse(resp.Body)
  if err != nil {
    return person{}, err
  }
  return p, nil

In the C# sample, if the "ParseResponse" call throws an exception, the "finally" block will run and the response body will be closed. If the "ParseResponse" call is successful, then the "finally" block will run and the response body will be closed.

In the Go sample, if the "ParseResponse" call returns an error (and the function returns "early"), the "defer" will ensure that the response body will be closed. If the "ParseResponse" call is successful, then the "defer" will ensure that the response body will be closed.

In both scenarios, the response body will be closed whether the operation was successful or whether it failed.

Note: In this simplified code, we are not checking to see if the "GetHttpResponse" was successful. In the C# code, this could result in an unhandled exception. In the Go code, this could result in a "panic" (which is similar to an unhandled exception). Error handling in Go has quite a different philosophy than C#, and we'll look at this in a subsequent article.

Multiple Deferred Items

What if there is more than one "defer" in a function? Well, they all run, but they run in the reverse of the order in which they are declared. Here's an example:

Go
  func main() {
    defer fmt.Println("Goodbye")
    defer fmt.Println("Farewell")
    fmt.Println("Hello, World!")
  }

And here's the output:
  Hello, World!
  Farewell
  Goodbye

We can see that the deferred items are run in the reverse order they were declared.

Error Handling

As mentioned above, Go has a different philosophy toward error handling. This is a topic that will be covered in a future article. But it does need to be mentioned here.

In Go, there are no "try / catch" blocks. There are no exceptions that can be handled. Instead, there are errors that can be dealt with (an "error" returned from a function), and errors that cannot be dealt with (a "panic" in Go).

Error
An "error" that is returned from a function can be dealt with in the code. The pattern we see above is common. A function returns an "error" value. If it is not nil, then we do something with it (often adding our own message and returning it). This works for things like a web service call that fails or parsing JSON that is not in the right format. We may or may not be able to recover from the scenario, and that is up to our code.

Panic
The other type of error state in Go is a "panic". A "panic" is caused by things like accessing an index beyond the end of an array or writing to a closed channel. These are invalid operations that potentially compromise the state of the application.

A "panic" in Go is most directly similar to an unhandled exception in C#. The "panic" works its way up the call stack and eventually the program exits with an error code.

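A big difference between Go and C# is that there is no "try / catch" for a panic. Unless a deferred function calls the built-in "recover" function (which is outside the scope of this article), the program will exit once a panic has occurred.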

"defer" and "panic" 
The reason this is important to us is that we need to know what happens to deferred items when a panic happens.

The good news is that "defer" still runs. If a function panics, any defers in that function will still run. If that function was called by another function, defers will run in that calling function. This goes all the way up to the main goroutine.

So if a function panics, all of the deferred items will still run.
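We can watch this happen with a small sketch. The "inner", "outer", and "run" functions are hypothetical. To let the sample report what happened instead of exiting, "run" uses Go's built-in "recover" function, which is not otherwise covered in this article.

```go
package main

import "fmt"

// inner records a deferred step, then panics by indexing past the end of a slice.
func inner(trace *[]string) {
	defer func() { *trace = append(*trace, "inner deferred") }()
	var values []int
	idx := 5
	fmt.Println(values[idx]) // panics: index out of range
}

// outer also defers a step before calling inner.
func outer(trace *[]string) {
	defer func() { *trace = append(*trace, "outer deferred") }()
	inner(trace)
}

// run calls outer and uses the built-in recover only so that
// the sample can report what happened instead of exiting.
func run() (trace []string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	outer(&trace)
	return trace, false
}

func main() {
	trace, panicked := run()
	fmt.Println(trace, panicked) // prints: [inner deferred outer deferred] true
}
```

The deferred item in the panicking function runs first, followed by the deferred item in its caller.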

Where Deferred Items are *Not* Run

We do need to be aware of a situation where deferred items are not run: when we use os.Exit().

The os.Exit() function exits from the application immediately -- without running "defer". We can see this with a small application.

Go
  func main() {
    defer fmt.Println("Goodbye")
    defer fmt.Println("Farewell")
    fmt.Println("Hello, World!")
    os.Exit(0)
  }

Here's the output:
  Hello, World!

Unlike the earlier example, the 2 deferred items are not run here. "os.Exit()" takes a parameter to tell whether the application exits successfully or with an error. A non-zero value indicates an error (this is often used to tell a batch processing system that an application errored). In the example above, we use "0" as a parameter (meaning the application completed successfully). But regardless of the error code, os.Exit() will bypass deferred items.

As a side note, this is also important in relation to the "log.Fatalf()" function (and its variants). "log.Fatalf()" calls "os.Exit(1)", so deferred items will be bypassed here as well. We'll look at "log.Fatalf()" a bit more in a future article about error handling.

What I Like about "defer"

The big thing that I really like about "defer" is that I can use it where I'm thinking about it. In these examples, I can use "defer" to close the response body right after I get the response. I don't have to remember to do it later.

I always like code that I can use in the spot nearest to where I'm thinking about it.

I find the philosophy of error handling in Go to be quite interesting. That's what we'll look at in the next article.

Happy Coding!

Sunday, January 10, 2021

Go (golang) Loops - A Unified "for"

My primary world has been C# for a while. I've been looking at the Go programming language (golang), and I wanted to write a few articles about things I find interesting about the language. Here is the first:
The "for" statement is used for all loops in Go.
In C#, we have quite a few ways to loop, including "for", "foreach", "while", and "do while". Go uses a single "for" statement to handle all of these situations.

Motivation: I have been using C# as a primary language for 15 years. Exploring other languages gives us insight into different ways of handling programming tasks. Even though Go (golang) is not my production language, I've found several features and conventions to be interesting. By looking at different paradigms, we can grow as programmers and become more effective in our primary language.

Resources: For links to a video walkthrough, CodeTour, and additional articles, see A Tour of Go for the C# Developer.

Looping with an Indexer

In C#, the "for" loop sets up an indexer. The Go "for" statement can be used this same way.

C#
    // Print the numbers 0 through 9
    for (int i = 0; i < 10; i++)
    {
        Console.WriteLine(i);
    }

Go
    // Print the numbers 0 through 9
    for i := 0; i < 10; i++ {
        fmt.Println(i)
    }

The syntaxes are quite similar. Some differences: 
  • Go does not have parentheses around the condition.
  • Go has the opening brace on the same line as the "for".
  • Go requires the braces even if the body only has 1 line of code.

Iteration

In C#, the "foreach" loop is handy to iterate over a collection (in reality, anything that implements the IEnumerable interface). Go uses a "for" statement with a range to do this same thing.

C#
    // "weekdays" is an array with the days of the week
    foreach(var day in weekdays)
    {
        Console.WriteLine(day);
    }

Go
    // "weekdays" is an array with the days of the week
    for index, day := range weekdays {
        fmt.Println(weekdays[index])
        fmt.Println(day)
    }

A big difference in Go is that "range" returns 2 values: the index and the item. If we use the index, then we can index into the collection to get a value out. But if that's what we're doing, we can instead use the value directly.

Side note about unused variables
In Go, if a variable is declared but not used, the result is a compiler error. With a "range", it is common to use the indexer *or* the item, but it is uncommon to use both. In that situation, we can use a "blank identifier" (the '_' character) to note that we are not going to use that value. This is similar to using a "discard" (also the '_' character) in C#.

Here is a "for" loop that is more equivalent to a "foreach" in C#:

Go
    // "weekdays" is an array with the days of the week
    for _, day := range weekdays {
        fmt.Println(day)
    }

Simple Iteration
One other option with a "range" is to capture neither the indexer nor the item. This will run the body of the loop one time for each item in the range, but it does not use the index or the item in the body.

Go
    // Run the loop body one time for each item in the range
    for range weekdays {
        // do something here
    }

This is limited because we do not have access to the thing that we are iterating, but it does let us perform an action a certain number of times (based on the length of the collection).

While Loops

The "for" statement in Go can also be used as a "while" loop.

C#
    while (command != "stop")
    {
        // do something with the command
        command = GetNextCommand();
    }

Go
    for command != "stop" {
        // do something with the command
        command = getNextCommand()
    }

The statements are similar between C# and Go. The condition is set with the "while" or "for", and the state that affects the condition is updated inside the loop.

do / while
There isn't really a direct way to do a "do / while" loop in Go. In C#, the body of a "do / while" loop will execute at least one time, and the condition is checked after the first iteration of the loop. Personally, I haven't seen a "do / while" loop in quite a while in C#. Generally a "do / while" can be refactored into a standard "while", so this is not a large issue.

Endless Loops

Sometimes we need an endless loop -- okay, generally not endless, but a loop that runs until an explicit "break".

C#
    while (true)
    {
        // do something interesting
        if (stopCondition)
            break;
    }

Go
    for {
        // do something interesting
        if stopCondition {
            break
        }
    }

The "for" statement with no condition sets up an endless loop.

Note: the above format is one way to refactor a "do / while" loop. The "do something interesting" section will run at least one time.
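As a small sketch of that refactoring, here is a function where the body has to run at least once. The "countDigits" function is a made-up example; the point is that the condition is checked at the bottom of the loop rather than the top.

```go
package main

import "fmt"

// countDigits returns the number of decimal digits in n (for n >= 0).
// The body must run at least once so that 0 counts as one digit,
// which is the classic "do / while" shape.
func countDigits(n int) int {
	digits := 0
	for {
		digits++
		n /= 10
		if n == 0 { // the "while" condition, checked after the body
			break
		}
	}
	return digits
}

func main() {
	fmt.Println(countDigits(0), countDigits(7), countDigits(12345)) // prints 1 1 5
}
```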

"break" and "continue"

The "break" and "continue" statements in Go work the same as in C#. The "continue" statement will stop the current iteration of the loop and go to the next iteration. The "break" statement will break out of the loop entirely.

Final Thoughts

There are a lot of interesting things about the Go language and the design decisions that were made. I think it's very interesting that the "for" statement can cover so many use cases -- things that require "for", "foreach", and "while" in C#. There are pros and cons to this approach.

One "pro" is that there's only one statement to worry about -- if you need a loop, you use "for".

The "con" that goes along with this is that there are many different things that can go in the condition, whether it is an indexer, a range, or an empty condition.

As a personal preference, I kind of like the single "for" statement.

Even if you don't end up using Go as a primary language (or even at all), it is interesting to look at how other languages approach certain problems. It can give us ideas of how we can approach things differently in our primary programming language.

Happy Coding!

Sunday, October 18, 2020

Abstract Classes vs. Interfaces in C# - What You Know is Probably Wrong

A little over a year ago, C# 8 changed a lot of things about interfaces. One effect is that the technical line between abstract classes and interfaces has been blurred. It's been a year, and I haven't seen much online about this, so let's take a closer look at how things have changed.

To show the changes, here is a slide I used to use to show the differences. 4 of the 5 differences no longer apply:

Side-by-side list of differences between interfaces and abstract classes with 4 of the 5 differences being crossed out.

A lot of folks are not aware of these changes, so let's go through what we used to know about the differences between abstract classes and interfaces and see if it still holds up today.

Note: For more information on the changes to interfaces in C# 8, you can take a look at the code and articles in this repository: https://github.com/jeremybytes/csharp-8-interfaces. There are also links to videos that demonstrate the new features (and some of the pitfalls).


Difference #1

An abstract class may contain implementation code. An interface may not have implementation code, only declarations.

Status: No Longer True

Interfaces in C# 8 can have implementation code -- referred to as "default implementation". For example, an interface method can have a body that provides functionality. This functionality is used by default if the implementing class does not provide its own implementation.

An interface can also provide default implementation for properties, but due to limitations with property setters, this only makes sense in specific circumstances.

More information: C# 8 Interfaces: Properties and Default Implementation.


Difference #2

A class may only inherit from a single abstract class, but a class may implement any number of interfaces.

Status: Still True

C# still has single inheritance when it comes to classes. But because a class can implement any number of interfaces, and those interfaces may have implementation code, there's a type of multiple inheritance that is available through interfaces.

When calling default implementation members, the caller must use the interface type (just like when a class has an explicit implementation). This avoids the "diamond problem", which is where two base classes provide the same method. Because the caller specifies the interface, the runtime does not have to decide which method to use.


Difference #3

Abstract class members can have access modifiers (public, private, protected, etc.). Interface members are automatically public and cannot be changed.

Status: No Longer True

In C# 8 interface members can have access modifiers. The default is public (so existing interfaces will still work as expected). But interfaces can now have private members that are only accessible from within the interface itself. These may be useful for breaking up larger methods that have default implementation.

Note: Protected members are also possible, but because of the weirdness of implementation, I have not found a practical use for them.

More information: C# 8 Interfaces: Public, Private, and Protected Members.


Difference #4

Abstract classes can contain fields, properties, constructors, destructors, methods, events, and indexers. Interfaces can contain properties, methods, events, and indexers (not fields, constructors, or destructors).

Status: Mostly True

This is still mostly true. When it comes to instance members, interfaces can contain only properties, methods, events, and indexers. Instance fields and constructors are still not allowed.

But interfaces can contain static members (see "Difference #5" below). This means that interfaces can have static fields and static constructors.


Difference #5

Abstract classes may contain static members. Interfaces cannot have static members.

Status: No Longer True

As noted above, interfaces in C# 8 can have static members. A static method in an interface works exactly the same way as a static method on a class. 

The same is true for static fields. Yes, fields are allowed in an interface, but only if they are static. As a reminder, a static field belongs to the type itself (the interface) and not any of the instances (meaning, instances of a class that implement the interface). This means that it is a shared value, so you want to use it with care. Note that this is exactly how static fields work on classes, so if you've used them there, you already know the quirks.

More information: C# 8 Interfaces: Static Members.

Also, having a "static Main()" method is now allowed in an interface. If you want to drive your co-workers crazy, here's one way to misuse this: Misusing C#: Multiple Main() Methods.


Technical Differences vs. Usage Differences

So if we look at the 5 technical differences between abstract classes and interfaces, 3 of them are no longer true, and 1 of them has some new caveats.

This really blurs the technical line between abstract classes and interfaces. When it comes to actually using abstract classes and interfaces, I'm currently sticking with my previous approach.

If there is *no* shared code between implementations...
I use an interface. Interfaces are a good place to declare a set of capabilities on an object. I like to use interfaces in this case (rather than a purely abstract class) because it leaves the inheritance slot open (meaning, the class can descend from a different base class if necessary).

If there *is* shared code between implementations...
I use an abstract class. An abstract class is a good place to put shared implementation code so that it does not need to be repeated in each of the descending classes.

Why not use interfaces with defaults for shared implementation? Well, there are a lot of interesting things that come up with default implementation, such as needing to refer explicitly to the interface and not the concrete type. Using the interface type is generally a good idea (and I recommend that), but when it is forced on you, it can be confusing.

Another limitation has to do with properties. Because interfaces cannot have instance fields, there is no way to set up real properties using default implementation. Abstract classes can have fields, so we can put full properties and automatic properties into an abstract class that are then inherited by descending classes.

My approach is a personal preference based on how interfaces and abstract classes have been used in the past. I'm all for language and technique evolving, but this is a slow process. And until certain practices become widespread, I like to use "the path of least surprise". In this case, it means keeping a logical separation between abstract classes and interfaces even though the technical separation has been blurred.

Mix Ins
One other way that default implementation can be used is for mix ins. This is where an interface *only* contains implemented members. This functionality can then be "mixed in" to a class by simply declaring that the class implements the interface. No other changes to the class are necessary.

This is an interesting idea, and I still have to explore it further. Personally, I wish this functionality got a new keyword. It muddies things up a bit when we try to define what an interface is.

Wrap Up

So if we look at the 5 differences between abstract classes and interfaces, we find that most of them are no longer true starting with C# 8.

If you'd like to see the implications of these changes, take a look at the articles available here: https://github.com/jeremybytes/csharp-8-interfaces or here: A Closer Look at C# 8 Interfaces

And if you'd like a video walkthrough with lots of code samples, you can take a look at this video on YouTube: What's New in C# 8 Interfaces (and how to use them effectively). This is from a talk I did for the Phoenix AZ area user groups.


Happy Coding!

Friday, September 11, 2020

"await Task.WhenAll" Shows One Exception - Here's How to See Them All

When we use "await" in C#, we can use the typical try/catch blocks that we're used to. That's a great thing, and it makes asynchronous programming a lot easier. 

One limitation is that "await" only shows a single exception on a faulted Task. But we can use a little extra code to see all of the exceptions.

Let's take a closer look at what happens when we "await" a faulted Task and how we can take more control if we need it. 

Background

Tasks in C# are very powerful. They can be chained together, run in parallel, and set up in parent/child relationships. With this power comes a bit of complexity. One part of that is that tasks generate AggregateExceptions -- basically a tree structure of exceptions.

When we "await" a Task, it unwraps the AggregateException to give us the inner exception. This makes our code easier to program, and often there will only be one inner exception. But when there are multiple exceptions, we lose visibility to them.

This topic came up in a virtual training class that I did this past week. We were looking at parallel processing that included exception handling. We didn't have time to dig into this in the class, so I did some experimentation after.

Note: the code for this article (and the rest of the virtual training class) is available on GitHub: https://github.com/jeremybytes/understanding-async-programming.

The projects that we'll use today are "People.Service" (the web service), "TaskAwait.Library" (a library with async methods), and "TaskException.UI.Console" (a console application).

If you're completely new to Task, you can check out the articles and videos listed here: I'll Get Back to You: Task, Await, and Asynchronous Methods in C#.

Parallel Code

For this sample code, we have a console application that calls a web service multiple times. It is called for each record we want to display (for this scenario, we might only want a handful of records rather than the entire collection).

Starting the Service

If you want to run the application yourself, you'll need to start the "People.Service" web service. To do this, navigate your favorite terminal to the "People.Service" folder and type "dotnet run".


This shows the service listening at "http://localhost:9874". The specific endpoint we're using here is "http://localhost:9874/people/3" (where "3" is the id of the record that we want).

Running in Parallel with Task

The "TaskException.UI.Console" project has a single "Program.cs" file. This is where we have our parallel code. Here is an excerpt from the "UseTaskAwaitException" method (starting on line 61 of the Program.cs file):

This section sets up a loop based on the IDs of the records we want to retrieve.

The loop starts with a call to the "GetPersonAsyncWithFailures" method that returns a task (we'll look at this method in a bit). Note the "WithFailures" is there because I created a separate method to throw arbitrary exceptions.

The resulting task is added to a task list that holds a reference to all of the tasks.

Then we set up a continuation on that task. When the task is complete, it will output the record to the console. Note that this is marked with "OnlyOnRanToCompletion", so this continuation only runs on success; it will not run if an exception is thrown.

As a last step in the loop, we add the continuation task to our list of tasks.

Because nothing is "await"ed in this block of code, all of the tasks (9 altogether) are generated very quickly without blocking. So these will run in parallel.

Outside of the loop, we use "await Task.WhenAll(taskList)" to wait for everything to complete.

This will wait for all 9 tasks to complete before continuing with the rest of the method.

The Happy Path

When no exceptions are thrown, this outputs 9 records to the console. Here is a screenshot from the "Parallel.UI.Console" project (which calls a method that does *not* throw exceptions):

This shows us all 9 records. Now let's see what happens when there are exceptions.

Throwing Arbitrary Exceptions

As mentioned above, the method we're calling for testing throws some arbitrary exceptions. Our method is called "GetPersonAsyncWithFailures". Here is an excerpt from that (starting on line 75 of the TaskAwait.Library/PersonReader.cs file):

This method has two "if" statements that throw exceptions. If the "id" parameter is "2", then we throw an InvalidOperationException. If the "id" parameter is "5", we throw a "NotImplementedException".

These exceptions are completely arbitrary. We just need multiple calls to this method to fail so we can look at the aggregated exceptions.

"await Task.WhenAll" Shows One Exception

As a reminder, here's the code that we're working with (starting on line 61 of the Program.cs file in the "TaskException.UI.Console" project):

As noted above, the "foreach" loop will run the async method 9 times (with 9 different values). Two of these values will throw exceptions.

We don't need to look for these exceptions in the continuation since we marked it with "OnlyOnRanToCompletion" -- the continuation only runs for successful tasks.

The good news is that when we "await" a faulted Task (meaning, a Task that throws an exception), the exception is automatically thrown in our current context so that we can use a normal "try/catch" block.

When we use "await Task.WhenAll(taskList)", it will throw an exception if any of the tasks are faulted.

Since we have 2 faulted tasks here, that's exactly what happens.

Here's a bit more of the "UseTaskAwaitException" method (starting on line 76 of the Program.cs file):

The top of this block shows the "await Task.WhenAll" call. Then we have catch blocks to handle cancellation or exceptions. The "OutputException" method shows several of the properties of the exception on the console (the details are in the same "Program.cs" file if you're interested).

Let's run this method to see what happens. To run the application, navigate (using your favorite terminal) to the "TaskException.UI.Console" folder and type "dotnet run".

This will bring up the console application menu:

The code we've seen so far is part of option #1 "Use Task (parallel - await Exception)". If we press "1", then we get the following output:

The output shows us 7 records (since 2 of the 9 failed). And it shows us the "InvalidOperationException" that is thrown for ID 2.

But it does *not* show us the other exception that was thrown.

We'll dig the exception out of the Task directly and look at its values. Then it will be a little more clear what "await" is doing here.

Getting All of the Exceptions

As mentioned at the beginning, Task is very powerful. Because of the various ways that we can chain tasks or run them in parallel, Task uses what's known as an "AggregateException". This contains all of the exceptions that happen in relation to the task.

To see this we will do something a little different with the "Task.WhenAll" method call.

"Task.WhenAll" returns a Task (and this is why we can "await" it). But we can also set up our own continuation to see the details of what happened.

We have a separate method where we've done just that. Let's look at the loop on the "UseTaskAggregateException" method (starting on line 106 of the Program.cs file):

The "foreach" loop is the same as the prior method, but the "await" is different.

The continuation that is set up here is marked as "OnlyOnFaulted", so it will only run if at least one of the Tasks throws an exception. The "task.Exception" property has the AggregateException that is mentioned above.

The "OutputException" method shows different parts of the exception. The details are in the "Program.cs" file if you're interested. Otherwise, we can look at the output to see what is in the exception.

Note: we are still using "await", but we are not awaiting the "WhenAll". Instead, we are awaiting the continuation task on "WhenAll". This will still pause our method until all of the tasks are complete (including this final continuation). It can get a bit confusing when mixing "await" with directly using Task.

If we re-run the console application and choose "2" this time (for "Use Task (parallel) - view AggregateException"), we get the following output:

The output shows the 7 successful records, but the exception is different from what we saw earlier.

Instead of the "InvalidOperationException", we get an "AggregateException". The exception message shows us that "One or more errors occurred", and it also concatenates the inner exception messages.

Also note that there is an "InnerException" property that has the "InvalidOperationException". This is the exception that is shown when we used "await" above.

But an AggregateException also has an "InnerExceptions" property (which is a collection). If we iterate over this, we find both of our exceptions: the "InvalidOperationException" and the "NotImplementedException".

When we use a continuation to check a Task exception directly, we get more details than when we use "await". The good news is that we can iterate over these inner exceptions and pass them along to our logging system.

Changing the Order

I was wondering if the exception that "await" shows for this code is deterministic. And it turns out that it is.

As an experiment, I reversed the order of the task list before passing it to "WhenAll". Here's the code for the "await" block:

With the reverse in place, we get a different output:

The output shows us the other exception that was generated - the NotImplementedException for ID 5.

We can also reverse the task list before sending it to the continuation:

The output shows us a bit better what is happening:

By looking directly at the AggregateException, we can see that the "InnerException" is now the one related to ID 5 (which is shown when we "await" things above).

Looking at the "InnerExceptions" collection, we see that the order is reversed. The ID 5 exception is first, and the ID 2 exception is second.

So when it comes to awaiting "Task.WhenAll", it appears that the exception we ultimately get is based on the order of the Tasks that are passed to the "WhenAll" method. Fun, huh?

Is This Important?

So how much do we need to care about this? The typical answer I would give is "probably not much". Tasks are extremely powerful, but we often do not need that power. And that's why "await" exists.

"await" gives us the 95% scenario. A common scenario I have is that I need to call an asynchronous method, wait for it to finish, do something with the result, and handle exceptions. If I only need to wait for one method at a time, then "await" works perfectly for this.

When we need to step outside of this scenario, it's good to understand how to use Task more directly. By understanding the process of setting up continuations and unwrapping AggregateExceptions, we can take more control of the situation when we need it.

Often we only care that "an exception happened". This lets us know that our application is in a potentially unstable state, and we need to handle things accordingly. We do not care about all of the details of what went wrong, and "await" is all we need. Whether we need to dig deeper depends on the needs of the users and the application.

If you want more information about using Task and await, you can check out the articles and video series on my website: I'll Get Back to You: Task, Await, and Asynchronous Methods in C#.

Happy Coding!