Monday, December 20, 2021

Cancelling IAsyncEnumerable in C#

IAsyncEnumerable combines the power of IEnumerable (which lets us "foreach" through items) with the joys of async code (that we can "await"). Like many async methods, we can pass in a CancellationToken to short-circuit the process. But because of the way that we use IAsyncEnumerable, passing in a cancellation token is a bit different from what we are used to.

Short Version:
The "WithCancellation" extension method on IAsyncEnumerable lets us pass in a cancellation token.
Let's take a quick look at how to use IAsyncEnumerable, and then we'll look at cancellation.

Using IAsyncEnumerable

I started using IAsyncEnumerable when I was exploring channels in C#. Here's an example method that uses "await foreach" with an IAsyncEnumerable (taken from the Program.cs file of this GitHub repository: https://github.com/jeremybytes/csharp-channels-presentation):
    private static async Task ShowData(ChannelReader<Person> reader)
    {
        await foreach(Person person in reader.ReadAllAsync())
        {
            DisplayPerson(person);
        }
    }
The ChannelReader type has a "ReadAllAsync" method that returns an IAsyncEnumerable. What this means is that someone else can be writing to the channel while we read from it. The IAsyncEnumerable means that if there is not another item ready to read, we can wait for it. So, we can pull things off the channel as they are added, even if there are delays between each item getting added.

Waiting for items can cause a problem, and that's where the "async" part comes in. Since this is asynchronous, we can "await" each item. This gives us the advantages that we are used to with "await", meaning that we are not blocking processes and threads while we wait.

By using "await foreach" on an IAsyncEnumerable, we get this combined functionality: getting the next item in the enumeration and waiting asynchronously if the next item isn't ready yet.

For more information on channels, check the repository for list of articles and a recorded presentation: https://github.com/jeremybytes/csharp-channels-presentation.

The IAsyncEnumerable Interface

Things got a little more interesting when I was building my own classes that implement IAsyncEnumerable. The interface itself only has one method: GetAsyncEnumerator. An implementation could look something like this (we'll call this our "Processor" since it will process some custom data for us):
    public async IAsyncEnumerator<int> GetAsyncEnumerator(
        CancellationToken cancellationToken = default)
    {
        while (...)
        {
            // interesting async stuff here
            yield return nextValue;
        }
    }
Like with IEnumerable, we can use "yield return" to return the next value from the enumeration. Unlike IEnumerable, we can also put asynchronous code in here, whether it's an asynchronous service call or waiting for a record to finish a complex process. (That goes in the "// interesting async stuff here" section; we won't look at that today.)

CancellationToken Parameter

An interesting thing about the GetAsyncEnumerator method is that it has a CancellationToken parameter. Notice that the CancellationToken has a "default" value set. This means that the cancellation token is optional. If we do not pass in a token, the code will still work.

If we wanted to check the cancellation token in the code, it could look something like this:
    public async IAsyncEnumerator<int> GetAsyncEnumerator(
        CancellationToken cancellationToken = default)
    {
        while (...)
        {
            cancellationToken.ThrowIfCancellationRequested();
            // interesting stuff here
            yield return nextValue;
        }
    }
Each time through the "while" loop, the cancellation token will be checked. If cancellation is requested, then this will throw an OperationCanceledException.

Passing a Cancellation Token

The next part is where things get interesting. How do we actually pass a cancellation token to the IAsyncEnumerable? If we are using the "Processor" that we created above, that call could look something like this:
    await foreach (int currentItem in processor)
    {
        DisplayItem(currentItem);
    }
When we use "foreach", we do not call "GetAsyncEnumerator" directly. That also means that we cannot pass in a cancellation token directly.

But there is an extension method available on IAsyncEnumerable that helps us out: WithCancellation.

Here's that same foreach loop with a cancellation token passed in:
    await foreach (int currentItem in processor.WithCancellation(tokenSource.Token))
    {
        DisplayItem(currentItem);
    }
This assumes that we have a CancellationTokenSource (called "tokenSource") elsewhere in our code.

For more information on Cancellation, you can refer to Task and Await: Basic Cancellation. The article was written a while back. Updated code samples (.NET 6) are available here: https://github.com/jeremybytes/using-task-dotnet6.

ConfigureAwait

As a side note, there is another extension method on IAsyncEnumerable: ConfigureAwait. This lets us use "ConfigureAwait(false)" in areas that we need it. ConfigureAwait isn't needed in ASP.NET Core anymore, but it can still be useful if we are doing desktop or other types of programming.

Wrap Up

IAsyncEnumerable gives us some pretty interesting abilities. I've been exploring it quite a bit lately. In some code comparisons, I was able to move async code into a library that made using it quite a bit easier. Once that sample code is ready, I'll be sharing it on GitHub.

Until then, keep exploring. Sometimes the answers are not obvious. It's okay if it takes some time to figure things out.

Happy Coding!

Thursday, September 30, 2021

Coding Practice: Learning Rust with Fibonacci Numbers

In my exploration of Rust, I built an application that calculates Fibonacci numbers (this was a suggestion from the end of Chapter 3 of The Rust Programming Language by Steve Klabnik and Carol Nichols).

It helped me learn a bit more about the language and environment, including:
  • for loops
  • Statements vs. expressions
  • Function returns (expressions)
  • checked_add to prevent overflow
  • Option enum (returned from checked_add)
  • Pattern matching on Option
  • Result enum (to return error rather than panic)
  • .expect with Result
  • Pattern matching on Result
So let's walk through this project.

The code is available on GitHub: https://github.com/jeremybytes/fibonacci-rust, and branches are set up for each step along the way. We will only be looking at the "main.rs" file in each branch, so all of the links will be directly to this file.

Fibonacci Numbers

The task is to calculate the nth Fibonacci number. The Fibonacci sequence is made by adding the 2 previous numbers in the sequence. So the sequence starts: 1, 1, 2, 3, 5, 8, 13, 21, 34. The 7th Fibonacci number (13) is the sum of the previous 2 numbers (5 and 8).

For our application, we will create a function to generate the nth Fibonacci number based on an input parameter. We'll call this function with multiple values and output the results.

Step 1: Creating the Project

Branch: 01-creation
The first step is to create the project. We can do this by typing the following in a terminal:
    cargo new fib
This will create a new folder called "fib" along with the Rust project file, a "src" folder to hold the code, and a "main.rs" file (in src) which is where we will be putting our code.

The "main.rs" file has placeholder code:
    fn main() {
        println!("Hello, world!");
    }
But we can use "cargo run" to make sure that everything is working in our environment.
    C:\rustlang\fib> cargo run
       Compiling fib v0.1.0 (C:\rustlang\fib)
        Finished dev [unoptimized + debuginfo] target(s) in 0.63s
         Running `target\debug\fib.exe`
    Hello, world!
Going forward, I'll just show the application output (without the compiling and running output).

Step 2: Basic Fibonacci

Branch: 02-basic
Now that we have the shell, let's create a function to return a Fibonacci number. Here's the completed function:
    fn fib(n: u8) -> u64 {
        let mut prev: u64 = 0;
        let mut curr: u64 = 1;
        for _ in 1..n {
            let next = prev + curr;
            prev = curr;
            curr = next;
        }
        curr
    }
There are several interesting bits here. Let's walk through them.

Declaring a Function
Let's start with the function declaration:
    fn fib(n: u8) -> u64 {

    }
The "fn" denotes that this is a function. "fib" is the function name. "n: u8" declares a parameter called "n" that is an unsigned 8-bit integer. And the "u64" after the arrow declares that this function returns an unsigned 64-bit integer.

When declaring parameters and return values for functions, the types are required. Rust does use type inference in some places (as we'll see), but function declarations need to have explicit types.

Declaring Variables
Next, we have some variables declared and assigned:
    let mut prev: u64 = 0;
    let mut curr: u64 = 1;
"let" declares a variable.

By default, variables are immutable. This means that once we assign a value, we cannot change it. For these variables, we use "mut" to denote that they are mutable. So we will be able to change the values later.

The variable names are "prev" and "curr". These will hold the "previous number" and the "current number" in the sequence.

The ": u64" declares these as unsigned 64-bit integer values. Fibonacci numbers tend to overflow very quickly, so I used a fairly large integer type.

Finally, we assign initial values of 0 and 1, respectively.
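As a quick aside (this snippet is mine, not part of the fib project), here is a minimal sketch of how immutability-by-default plays out:

```rust
fn main() {
    let x = 5;
    // x = 6; // error[E0384]: cannot assign twice to immutable variable
    println!("x = {}", x);

    let mut y = 5; // "mut" makes the binding mutable
    y = y + 1;     // fine: y can be updated
    println!("y = {}", y);
}
```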

Looping with "for"
There are several ways to handle the loop required to calculate the Fibonacci number. I opted for a "for" loop:
    for _ in 1..n {

    }
"1..n" represents a range from 1 to the value of the incoming function argument. So if the argument is "3", this represents the range: 1, 2, 3.

The "for" statement will loop once for each value in the range. In this case the "_" denotes that we are not using the actual range value inside the loop. All we really need here is to run the loop 3 times. All of the calculation is done inside the loop itself.

Implicit Typing
Inside the "for" loop we do our calculations:
    let next = prev + curr;
    prev = curr;
    curr = next;
This creates a new variable called "next" inside the loop and assigns it the sum of "prev" and "curr". A couple of things to note. First, this variable is immutable (so we do not have the "mut" keyword). The value is assigned here and then it is not changed. Second, the "next" variable is implicitly typed. Instead of having a type declaration, it is set based on what is assigned to it. Since we are assigning the sum of two u64 values, "next" will also be a u64.

The next two lines update the "prev" and "curr" values. We needed to mark them as mutable when we declared them so that we could update them here.

This is a fairly naïve way of calculating Fibonacci numbers. If you'd like to see more details on how the calculation works, you can take a look at this article: TDDing into a Fibonacci Sequence with C#.

Statements vs. Expressions
The last line of the function is a bit interesting:
    curr
This returns the current value ("curr") from the function.

Rust does not use a "return" keyword to return a value. Instead, the last expression in a function is what is returned. (As a side note, this is similar to how F# works.)

What's the difference between a statement and an expression? A statement does some type of work; an expression returns a value.

In Rust, a statement ends with a semi-colon, and an expression does not end with a semi-colon. To make things more interesting, these are often combined.

Let's take a look back at a line of code:
    let next = prev + curr;
Overall, this is a statement: it declares and assigns a value to a variable called "next". And it ends with a semi-colon.
    prev + curr
"prev + curr" is an expression that returns the result of adding 2 values. So we really have a statement that includes an expression. (We can technically break this down further, but we won't do that here.)

So, let's get back to the return value of the function. The "fib" function returns a u64 value. The last expression in the function is:
    curr
It is important to note that this line does not end with a semi-colon. Because of this, the value of the "curr" variable (which is a u64) is returned for this function.

Because of my coding history, I'm used to putting semi-colons at the end of lines. So I'm sure that I'll mess this up many times before I get used to it. If you get an error that says a function is returning "()" instead of a particular type, it probably means that there's a semi-colon at the end of the expression you meant to return.
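To make the semicolon rule concrete, here is a small illustrative sketch (the function and names are mine, not from the article):

```rust
// The last expression in a function body (no semicolon) is the return value.
fn double(n: u64) -> u64 {
    n * 2 // no trailing semicolon: this is the returned expression
}

fn main() {
    // "if"/"else" blocks are expressions too, so they can be assigned:
    let label = if double(21) == 42 { "correct" } else { "wrong" };
    println!("double(21) is {}", label);
}
```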

Here's the full function:
    fn fib(n: u8) -> u64 {
        let mut prev: u64 = 0;
        let mut curr: u64 = 1;
        for _ in 1..n {
            let next = prev + curr;
            prev = curr;
            curr = next;
        }
        curr
    }
Using the "fib" Function
Now that we have a function that returns a Fibonacci number, it's time to update the "main" function to use it.
    fn main() {
        println!("Fibonacci 1st = {}", fib(1));
        println!("Fibonacci 2nd = {}", fib(2));
        println!("Fibonacci 3rd = {}", fib(3));
        println!("Fibonacci 4th = {}", fib(4));
        println!("Fibonacci 5th = {}", fib(5));
    }
This uses the "println!" macro to output a string to the standard output. On each line, the set of curly braces represents a placeholder in the string. So in the first statement, the curly braces will be replaced by the value that comes back from calling "fib(1)".

So let's run and check the output:
    Fibonacci 1st = 1
    Fibonacci 2nd = 1
    Fibonacci 3rd = 2
    Fibonacci 4th = 3
    Fibonacci 5th = 5
It works!

Well, it mostly works. We'll see a shortcoming in a bit.

Step 3: Testing More Values

Branch: 03-mainloop
Before looking at where we have a problem in the "fib" function, let's make it easier to test for different values. For this, we'll add an array of numbers to test, and then loop through them.

Here's an updated "main" function:
    fn main() {
        let nths = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

        for nth in nths {
            println!("Fibonacci {} = {}", nth, fib(nth));
        }
    }
Creating an Array
The first line sets up an array called "nths" and initializes it with the values of 1 through 10. I use this name because our original task was to calculate the "nth" Fibonacci number. This is a collection of all the ones we want to calculate.

We're using type inference to let the compiler pick the type for "nths". In this case, it determines that it is an array with 10 elements of type u8. It decides on u8 because the values are used as arguments for the "fib" function, and that takes a u8.

As an interesting note, if you comment out the "println!" statement, the "nths" variable is an array with 10 elements of type i32 (a signed 32-bit integer). This is the default integer type.

Type inference works as long as it can be determined at compile time. If it cannot be determined at compile time, then an explicit type needs to be added.
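A hedged sketch of the default-type behavior described above (the variable names are mine):

```rust
fn main() {
    // With no constraint from usage, integer literals fall back to i32:
    let defaulted = [1, 2, 3];
    // An explicit annotation (or use as a u8 argument) picks a different type:
    let explicit: [u8; 3] = [1, 2, 3];
    println!("{} {}", defaulted[0], explicit[0]);
}
```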

Another "for" Loop
We use a "for" loop to go through the array. Instead of discarding the value from the "for" loop (like we did above), we capture it in the "nth" variable.

Inside the loop, we have a "println!" with 2 placeholders, one for the loop value and one for the result of the "fib" function.

Here's what that output looks like:
    Fibonacci 1 = 1
    Fibonacci 2 = 1
    Fibonacci 3 = 2
    Fibonacci 4 = 3
    Fibonacci 5 = 5
    Fibonacci 6 = 8
    Fibonacci 7 = 13
    Fibonacci 8 = 21
    Fibonacci 9 = 34
    Fibonacci 10 = 55
And now we can more easily test values by adding to the array.

Overflow!
As I noted at the beginning, Fibonacci sequences tend to overflow pretty quickly (each number is roughly 1.6 times the one before it). We can see this by adding a "100" to our array.
    let nths = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100];
Here's the output when we run with these values:
    Fibonacci 1 = 1
    Fibonacci 2 = 1
    Fibonacci 3 = 2
    Fibonacci 4 = 3
    Fibonacci 5 = 5
    Fibonacci 6 = 8
    Fibonacci 7 = 13
    Fibonacci 8 = 21
    Fibonacci 9 = 34
    Fibonacci 10 = 55
    thread 'main' panicked at 'attempt to add with overflow', src\main.rs:13:20
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    error: process didn't exit successfully: `target\debug\fib.exe` (exit code: 101)
This creates a "panic" in our application, and it exits with an error.

With a little experimentation, we will find that the 93rd Fibonacci number is fine, but the 94th will overflow the u64 value.

Checking for Overflow

Branch: 04-checkedadd
Now we can work on fixing the overflow. Our problem is with this line:
    let next = prev + curr;
If "prev" and "curr" are near the upper limits of the u64 range, then adding them together will go past that upper limit.

Some other languages will "wrap" the value (starting over again at 0). Rust will generate an error instead (in the form of a panic). If you do want to wrap the value, Rust does offer a "wrapping_add" function that does just that.
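A quick sketch of that wrapping behavior, using the standard library's wrapping_add:

```rust
fn main() {
    // Instead of panicking, wrapping_add wraps past the maximum back to 0:
    assert_eq!(u64::MAX.wrapping_add(1), 0);
    assert_eq!(u64::MAX.wrapping_add(2), 1);
    // Normal values are unaffected:
    assert_eq!(5u64.wrapping_add(3), 8);
    println!("wrapping checks passed");
}
```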

But we do not want to wrap, we would like to catch the error and give our users a better experience.

checked_add
Instead of using the default "+" operator, we can use the "checked_add" function. Here is that code:
    let result = prev.checked_add(curr);
"checked_add" does not panic if the value overflows. This is because it uses the Option enum.

Option Enum
The Option enum lets us return either a valid value or no value. "Some(T)" is used if there is a valid value; otherwise "None" is used.

For example, let's say that "prev" is 1 and "curr" is 2. The "result" would be "Some(3)".

If "prev" and "curr" are big enough to cause an overflow when added together, then "result" would be "None".

Pattern Matching
The great thing about having an Option as a return type is that we can use pattern matching with it.

Here is the inside of the updated "for" loop:
    let result = prev.checked_add(curr);
    match result {
        Some(next) => {
            prev = curr;
            curr = next;
        }
        None => {
            curr = 0;
            break;
        }
    }
The "match" keyword sets up the pattern matching for us.

The first "arm" has "Some(next)" as the pattern. The "next" part lets us assign a name to the value that we can use inside the block. In this case, "next" will hold the same value that it did in the earlier version ("prev" + "curr"), so inside the block, we can assign the "prev" and "curr" values like we did before.

The second "arm" has "None" as the pattern. This will be used if there is an overflow. If there is an overflow, then we set the "curr" variable to 0 and then break out of the "for" loop.

Here is the updated "fib" function:
    fn fib(n: u8) -> u64 {
        let mut prev: u64 = 0;
        let mut curr: u64 = 1;
        for _ in 1..n {
            let result = prev.checked_add(curr);
            match result {
                Some(next) => {
                    prev = curr;
                    curr = next;
                }
                None => {
                    curr = 0;
                    break;
                }
            }
        }
        curr
    }
Here's an updated array to test valid values and overflow values:
    let nths = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 90, 91, 92, 93, 94, 95, 96];
And here's the output:
    Fibonacci 1 = 1
    Fibonacci 2 = 1
    Fibonacci 3 = 2
    Fibonacci 4 = 3
    Fibonacci 5 = 5
    Fibonacci 6 = 8
    Fibonacci 7 = 13
    Fibonacci 8 = 21
    Fibonacci 9 = 34
    Fibonacci 10 = 55
    Fibonacci 90 = 2880067194370816120
    Fibonacci 91 = 4660046610375530309
    Fibonacci 92 = 7540113804746346429
    Fibonacci 93 = 12200160415121876738
    Fibonacci 94 = 0
    Fibonacci 95 = 0
    Fibonacci 96 = 0
Setting "curr" to 0 is not a great way to handle our error state. For now, it is fine because it gets rid of the panic, and our application keeps running.

Next up, we'll work on getting a real error that we can handle.

Using the Result Enum

Branch: 05-result
In the last article, I wrote a bit about my first impressions of error handling in Rust: Initial Impressions of Rust. A big part of that involves the Result enum.

Similar to the Option enum, the Result enum represents exclusive states. For Result, the options are "Ok" and "Err". Each of these can have their own type.
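As a standalone sketch of a Result-returning function (this helper is mine, not part of the fib project):

```rust
// Ok and Err can carry different types: u64 on success, a string slice on failure.
fn half(n: u64) -> Result<u64, &'static str> {
    match n % 2 {
        0 => Ok(n / 2),
        _ => Err("not an even number"),
    }
}

fn main() {
    assert_eq!(half(10), Ok(5));
    assert_eq!(half(7), Err("not an even number"));
    println!("Result checks passed");
}
```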

To update our "fib" function to return a Result, we'll need to make 2 updates.

Updating the Function Declaration
First, we'll need to update the signature of the function to return a Result. Here's the new signature:
    fn fib(n: u8) -> Result<u64, &'static str> {

    }
The "Result" enum has 2 generic type parameters. The first represents the type for the "Ok" value; the second represents the type for the "Err".

In this case, the "Ok" will be a u64.

The "Err" is a bit more confusing. We want to return a string, but if we try to use just "str", we get an error that the compiler cannot determine the size at compile time. And as we've seen, Rust needs to be able to determine things at compile time.

Instead of using "str", we can use "&str" to use the address of a string (Rust does use pointers; we won't talk too much about them today). The address is a fixed size, so that gets rid of the previous error. But we get a new error that there is a "missing lifetime specifier".

UPDATE Technical Note: '&str' is a string slice. This allows the Err to borrow the value of the string without taking ownership of it. (I've learned more about ownership since I wrote this article. It's pretty interesting.)

The good news is that the error also gives you a hint to consider using the 'static lifetime, complete with an example.

I'm using Visual Studio Code with the Rust extension, so I get these errors and hints in the editor. But these same messages show up if you build using "cargo build".

Returning a Result
Now that we've updated the function signature, we need to actually return a Result. We can do this with some pattern matching.

Replace the previous expression at the end of the function:
    curr
with a "match":
    match curr == 0 {
        false => Ok(curr),
        true => Err("Calculation overflow")
    }
This looks at the value of the "curr" variable and compares it to 0. (Again, this isn't the best way to handle this, but we'll fix it a bit later).

If "curr" is not 0 (meaning there is a valid value), then we hit the "false" arm and return an "Ok" with the value.

If "curr" is 0 (meaning there was an overflow), then we hit the "true" arm and return an "Err" with an appropriate message.

Here's the updated "fib" function:
    fn fib(n: u8) -> Result<u64, &'static str> {
        let mut prev: u64 = 0;
        let mut curr: u64 = 1;
        for _ in 1..n {
            let result = prev.checked_add(curr);
            match result {
                Some(next) => {
                    prev = curr;
                    curr = next;
                }
                None => {
                    curr = 0;
                    break;
                }
            }
        }
        match curr == 0 {
            false => Ok(curr),
            true => Err("Calculation overflow")
        }
    }
Side Note: In thinking about this later, I could have done the pattern matching more elegantly. I started with an if/else block (which is why I'm matching on a boolean value). But we could also write the pattern matching to use the "curr" value directly:
    match curr {
        0 => Err("Calculation overflow"),
        _ => Ok(curr),
    }
This is more direct (but not really less confusing since 0 is a magic number here). Now the match is on the "curr" value itself. If the value is "0", then we return the Err. For the second arm, the underscore represents a catch-all. So if the value is anything other than "0", we return "Ok". Notice that I did have to reverse the order of the arms. The first match wins with pattern matching, so the default case needs to be at the end.

Both of these matches produce the same results. We won't worry about them too much because we'll be replacing this entirely in just a bit.

But since we changed the return type, our calling code needs to be updated.

Using ".expect"
One way that we can deal with the Result enum is to use the "expect()" function.

Here is the updated code from the "main" function:
    println!("Fibonacci {} = {}", nth, fib(nth).expect("Fibonacci calculation failed"));
After the call to "fib(nth)", we add an ".expect()" call and pass in a message.

"expect()" works on a Result enum. If the Result is "Ok", then it pulls out the value and returns it. So if there is no overflow, then the expected Fibonacci number is used for the placeholder.

But if Result is "Err", then "expect" will panic. That's not exactly what we want here, but this gets us one step closer.

With the "expect" in place, here is our output:
    Fibonacci 1 = 1
    Fibonacci 2 = 1
    Fibonacci 3 = 2
    Fibonacci 4 = 3
    Fibonacci 5 = 5
    Fibonacci 6 = 8
    Fibonacci 7 = 13
    Fibonacci 8 = 21
    Fibonacci 9 = 34
    Fibonacci 10 = 55
    Fibonacci 90 = 2880067194370816120
    Fibonacci 91 = 4660046610375530309
    Fibonacci 92 = 7540113804746346429
    Fibonacci 93 = 12200160415121876738
    thread 'main' panicked at 'Fibonacci calculation failed: "Calculation overflow"', src\main.rs:5:53
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    error: process didn't exit successfully: `target\debug\fib.exe` (exit code: 101)
Because we get a panic when trying to fetch #94, the application halts and does not process the rest of the values (95 and 96).

If we look at the output, we see both of the messages that we added. "Fibonacci calculation failed" is what we put into the "expect", and "Calculation overflow" is what we put into the "Err" Result.

But I'd like to get rid of the panic. And we can do that with more pattern matching.

Matching on Result

Branch: 06-matchresult
Just like we used pattern matching with Option in the "fib" function, we can use pattern matching with Result in the "main" function.

Pattern Matching
Here's the updated "for" loop from the "main" function:
    for nth in nths {
        match fib(nth) {
            Ok(result) => println!("Fibonacci {} = {}", nth, result),
            Err(e) => println!("Error at Fibonacci {}: {}", nth, e),
        }
    }
Inside the "for" loop, we match on the result of "fib(nth)".

If the Result is "Ok", then we use "println!" with the same string that we had before.

If the Result is "Err", then we output an error message.

Adding an Overflow Flag
The last thing I want to do is get rid of the "curr = 0" that denotes an overflow. Even though this works, it's a bit unclear. (And it can cause problems since some implementations of Fibonacci consider "0" to be a valid value.)

For this, we'll add a new variable called "overflow" to the "fib" function. Here's the completed function with "overflow" in place:
    fn fib(n: u8) -> Result<u64, &'static str> {
        let mut prev: u64 = 0;
        let mut curr: u64 = 1;
        let mut overflow = false;
        for _ in 1..n {
            let result = prev.checked_add(curr);
            match result {
                Some(next) => {
                    prev = curr;
                    curr = next;
                }
                None => {
                    overflow = true;
                    break;
                }
            }
        }
        match overflow {
            false => Ok(curr),
            true => Err("Calculation overflow")
        }
    }
A new mutable "overflow" variable is created and set to "false". Then if there is an overflow, it is set to "true". Finally, "overflow" is used in the final pattern matching to determine whether to return "Ok" or "Err".

Final Output
With these changes in place, here is our final output:
    Fibonacci 1 = 1
    Fibonacci 2 = 1
    Fibonacci 3 = 2
    Fibonacci 4 = 3
    Fibonacci 5 = 5
    Fibonacci 6 = 8
    Fibonacci 7 = 13
    Fibonacci 8 = 21
    Fibonacci 9 = 34
    Fibonacci 10 = 55
    Fibonacci 90 = 2880067194370816120
    Fibonacci 91 = 4660046610375530309
    Fibonacci 92 = 7540113804746346429
    Fibonacci 93 = 12200160415121876738
    Error at Fibonacci 94: Calculation overflow
    Error at Fibonacci 95: Calculation overflow
    Error at Fibonacci 96: Calculation overflow
This version no longer panics if there is an overflow. If we do have an overflow, it gives us an error message. And all of the values will get calculated, even if an overflow occurs in the middle.

Wrap Up

Calculating Fibonacci numbers is not a very complex task. But in this walkthrough, we got to use several features of Rust and understand a bit more about how the language works.
  • for loops
  • Statements vs. expressions
  • Function returns (expressions)
  • checked_add to prevent overflow
  • Option enum (returned from checked_add)
  • Pattern matching on Option
  • Result enum (to return error rather than panic)
  • .expect with Result
  • Pattern matching on Result
Quite honestly, I didn't expect to get this much out of this exercise. I've calculated Fibonacci sequences lots of times before. I was surprised by what I learned.

It's okay to do "simple" exercises. And it's okay to be surprised when they don't go quite as you expected.

Happy Coding!

Sunday, September 26, 2021

Initial Impressions of Rust

I experimented a little with Rust this past week. I haven't gone very deep at this point, but there are a few things I found interesting. To point some of these out, I'm using a number guessing game (details on the sample and where I got it are at the end of the article). The code can be viewed here: https://github.com/jeremybytes/guessing-game-rust.

Disclaimer: these are pretty raw impressions. My opinions are subject to change as I dig in further.

The code is for a "guess the number" game. Here is a completed game (including both input and output):

    Guess the number!
    Please input your guess.
    23
    You guessed: 23
    Too small!
    Please input your guess.
    78
    You guessed: 78
    Too big!
    Please input your guess.
    45
    You guessed: 45
    Too small!
    Please input your guess.
    66
    You guessed: 66
    Too small!
    Please input your guess.
    71
    You guessed: 71
    You win!
So let's get on to some of the language and environment features that I find interesting.

Automatic Git

Cargo is Rust's build system and package manager. The following command will create a new project:

cargo new guessing-game

Part of the project creation is a new Git repository (and a .gitignore file). So there are no excuses about not having source control.

Pattern Matching

Here's a pattern matching sample (from the main.rs file in the repository mentioned above):

    match guess.cmp(&secret_number) {
        Ordering::Less => println!("Too small!"),
        Ordering::Greater => println!("Too big!"),
        Ordering::Equal => {
            println!("You win!");
            break;
        },
    }

In this block "guess.cmp(&secret_number)" compares the number the user guessed to the actual number. "cmp" returns an Ordering enumeration. So "Ordering::Less" denotes the "less than" value of the enumeration.

Each "arm" of the match expression has the desired functionality: either printing "Too small!", "Too big!", or "You win!".

As a couple of side notes: the "println!" (with an exclamation point) is a macro that prints to the standard output. I haven't looked into macros yet, so that will be interesting to see how this expands. Also the "break" in the last arm breaks out of the game loop. We won't look at looping in this article.
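The comparison logic can be seen in a self-contained sketch, with fixed values standing in for the user's guess and the secret number (and without the game loop, so no "break"):

```rust
use std::cmp::Ordering;

fn main() {
    let guess: u32 = 45;
    let secret_number: u32 = 71;

    // cmp returns an Ordering: Less, Greater, or Equal.
    match guess.cmp(&secret_number) {
        Ordering::Less => println!("Too small!"),
        Ordering::Greater => println!("Too big!"),
        Ordering::Equal => println!("You win!"),
    }
}
```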

Error Handling

I like looking at different ways of approaching error handling. Earlier this year, I wrote about the approach that Go takes: Go (golang) Error Handling - A Different Philosophy. Go differs quite a bit from C#. While C# uses exceptions and try/catch blocks, Go uses error values returned alongside results - it's up to the programmer to explicitly check for those errors.

Rust takes an approach that is somewhere in between by using a Result enumeration. It is common to return a Result which will provide an "Ok" with the value or an "Err".

Let's look at 2 approaches. For this, we'll look at the part of the program that converts the input value (a string) into a number.

Using "expect"
Let's look at the following code (also from the main.rs file):

    let guess: u32 = guess.trim().parse()
        .expect("invalid string (not a number)");
This code parses the "guess" string and assigns it to the "guess" number (an unsigned 32-bit integer). We'll talk about why there are two things called "guess" in a bit.

The incoming "guess" string is trimmed and then parsed to a number. This is done by chaining functions together. But after the parse, there is another function: expect.

The "parse" function returns a Result enumeration. If we try to assign this directly to the "guess" integer, we will get a type mismatch. The "expect" function does 2 things for us. (1) If "parse" is successful (meaning Result is "Ok"), then it returns the value that we can assign to the variable. (2) If "parse" fails (meaning Result is "Err"), then our message is added to the resulting error.

Here's a sample output:

    Guess the number!
    Please input your guess.
    bad number
    thread 'main' panicked at 'invalid string (not a number): ParseIntError { kind: InvalidDigit }', src\main.rs:19:14
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    error: process didn't exit successfully: `target\debug\guessing_game.exe` (exit code: 101)
I typed "bad number", and the parsing failed. This tells us that "'main' panicked", which means that we have an error that caused the application to exit. And the message has our string along with the actual parsing error.

Pattern Matching
Another way of dealing with errors is to look at the Result directly. And for this, we can use pattern matching.

Here is the same code as above, but we're handling the error instead:

    let guess: u32 = match guess.trim().parse() {
        Ok(num) => num,
        Err(msg) => {
            println!("{}", msg);
            continue;
        },
    };
Notice that we have a "match" here. This sets up a match expression to use the Result that comes back from the "parse" function. This works similarly to the match expression that we saw above.

If Result is "Ok", then it returns the value from the function ("num") which gets assigned to the "guess" integer.

If Result is "Err", then it prints out the error message and then continues. Here "continue" tells the containing loop to go to its next iteration.

Side note: The "println!" macro uses placeholders. The curly braces within the string are replaced with the value of "msg" when this is printed.

Here is some sample output:

    Guess the number!
    Please input your guess.
    bad number
    invalid digit found in string
    Please input your guess.
    23
    You guessed: 23
    Too small!
This time, when I type "bad number" it prints out the error message and then goes to the next iteration of the loop (which asks for another guess).

Overall, this is an interesting approach to error handling. It is more structured than Go and its error values but also a lot lighter than C# and its exceptions. I'm looking forward to learning more about this and seeing what works well and where it might be lacking.

Immutability

Another feature of Rust is that variables are immutable by default. Variables must be made explicitly mutable. 

Here is the variable that is used to get the input from the console (from the same main.rs file):

    let mut guess = String::new();

    io::stdin().read_line(&mut guess)
        .expect("failed to read line");
The first line creates a mutable string called "guess". The next line reads a line from the standard input (the console in this case) into the "guess" variable.

A couple more notes: You'll notice the "&mut" in the "read_line" argument. This passes a mutable reference so that the function can modify "guess". Also, the double colon "::" denotes an associated function (similar to a static method in C#). So "new" is an associated function on the "String" type.

Immutable by default is interesting since it forces us into a different mindset where we assume that variables cannot be changed. If we want to be able to change them, we need to be explicit about it.
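Here is a small standalone sketch (my own, not from the game) of the default immutability and the explicit "mut" opt-in; the "add_twice" helper is hypothetical:

```rust
// Mutation requires an explicit `mut` declaration.
fn add_twice(start: u32) -> u32 {
    let mut n = start; // explicit opt-in to mutation
    n += 1;
    n += 1;
    n
}

fn main() {
    let total = 5; // immutable by default
    // total = 6; // compile error: cannot assign twice to immutable variable
    println!("total = {}, add_twice(0) = {}", total, add_twice(0));
}
```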

Variable Shadowing

The last feature that we'll look at today is how we can "shadow" a variable. In the code that we've seen already, there are two "guess" variables:

    let mut guess = String::new();
This is a string, and it is used to hold the value typed in on the console.

    let guess: u32 = guess.trim().parse()
This is a 32-bit unsigned integer. It is used to compare against the actual number that we are trying to guess.

These can have the same name because the second "guess" (the number) "shadows" the first "guess" (the string). This means that after the second "guess" is created, all references to "guess" will refer to the number (not the string).

At first, this seems like it could be confusing. But the explanation that goes along with this code helps it make sense (from The Rust Programming Language, Chapter 2):
We create a variable named guess. But wait, doesn’t the program already have a variable named guess? It does, but Rust allows us to shadow the previous value of guess with a new one. This feature is often used in situations in which you want to convert a value from one type to another type. Shadowing lets us reuse the guess variable name rather than forcing us to create two unique variables, such as guess_str and guess for example. 
I often use intermediate variables in my code -- often to help make debugging easier. Instead of having to come up with unique names for variables that represent the same thing but with different types, I can use the same name.

I'm still a bit on the fence about this. I'm sure that it can be misused (like most language features), but it seems to make a lot of sense in this particular example.
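To make the shadowing concrete, here is a small standalone sketch; the "parse_guess" helper is hypothetical (the game reads from stdin and loops rather than returning an Option):

```rust
// Convert console-style input to a number by shadowing:
// the second `guess` (a u32) shadows the first (a string slice).
fn parse_guess(input: &str) -> Option<u32> {
    let guess = input; // the "string" version of guess
    let guess: u32 = match guess.trim().parse() {
        Ok(num) => num, // from here on, `guess` refers to the number
        Err(_) => return None,
    };
    Some(guess)
}

fn main() {
    println!("{:?}", parse_guess(" 42 \n"));
    println!("{:?}", parse_guess("bad number"));
}
```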

Rust Installation

One other thing I wanted to mention was the installation process on Windows. Rust needs a C++ compiler, so it recommends that you install the "Visual Studio C++ Build Tools". This process was not as straightforward as I would have liked. Microsoft does have a stand-alone installer if you do not have Visual Studio, but it starts up the Visual Studio Installer. I ended up just going to my regular Visual Studio Installer and checking the "C++" workload. I'm sure that this installed way more than I needed (5 GB worth), but I got it working on both of my day-to-day machines.

Other than the C++ build tools, the rest of the installation was pretty uneventful.

I'm assuming that installation is a bit easier on Unix-y OSes (like Linux and macOS) since a C++ compiler (such as gcc or clang) is usually already installed.

There is also a way to run Rust in the browser: https://play.rust-lang.org/. I haven't played much with this, so I'm not sure what the capabilities and limitations are.

For local editing, I've been using Visual Studio Code with the Rust extension.

Documentation and Resources

The sample code is taken from The Rust Programming Language by Steve Klabnik and Carol Nichols. It is available online (for free) or in printed form.

I have only gone as far as Chapter 2 at this point. I really liked Chapter 2 because it walks through building this guessing game project. With each piece of code, various features were showcased, and I found it to be a really good way to get a quick feel for how the language works and some of the basic paradigms.


Wrap Up

I'm not sure how far I'll go with Rust. There are definitely some interesting concepts (that's why I wrote this article). Recently, I converted one of my applications to Go to get a better feel for the language and stretch my understanding a bit (https://github.com/jeremybytes/digit-display-golang). I may end up doing the same thing with Rust.

Exploring other languages can help us expand our ways of thinking and how we approach different programming tasks. And this is useful whether or not we end up using the language in our day-to-day coding. Keep expanding, and keep learning.

Happy Coding!

Friday, July 9, 2021

A Collection of 2020 Recorded Presentations

2020 was "interesting". One good thing that came out of it is that I had the chance to speak remotely for some user groups and conferences that I would not normally get to attend in person. Many of those presentations were recorded. Here's a list for anyone who is interested

Abstract Art: Getting Abstraction "Just Right" (Apr 2020)

Modern Devs Charlotte

Abstraction is awesome. And abstraction is awful. Too little, and our applications are difficult to extend and maintain. Too much, and our applications are difficult to extend and maintain. Finding the balance is the key to success. The first step is to identify your natural tendency as an under-abstractor or an over-abstractor. Once we know that, we can work on real-world techniques to dial in the level of abstraction that is "just right" for our applications.


I'll Get Back to You: Task, Await, and Asynchronous Methods in C# (Apr 2020)

Enterprise Developers Guild

There's a lot of confusion about async/await, Task/TPL, and asynchronous and parallel programming in general. So let's start with the basics and look at how we can consume asynchronous methods using Task and then see how the "await" operator can make things easier for us. Along the way, we’ll look at continuations, cancellation, and exception handling.


A Banjo Made Me a Better Developer (May 2020)

Dev Around the Sun

What does a banjo have to do with software development? They both require learning. And picking up a banjo later in life showed me 3 things that I've brought into my developer life. (1) You can learn; a growth mindset removes blockages. (2) You don't have to be afraid to ask for help; experienced banjoists/developers can point you in the right direction. (3) You don't have to be perfect before you share what you've learned; it's okay to show what you have "in progress". In combination, these have made me a better banjo player and a better developer.


I'll Get Back to You: Task, Await, and Asynchronous Methods in C# (Jun 2020)

Houston .NET User Group

There's a lot of confusion about async/await, Task/TPL, and asynchronous and parallel programming in general. So let's start with the basics and look at how we can consume asynchronous methods using Task and then see how the "await" operator can make things easier for us. Along the way, we’ll look at continuations, cancellation, and exception handling.


What's New in C# 8 Interfaces (and how to use them effectively) (Jun 2020)

Southeast Valley .NET User Group
North West Valley .NET User Group

C# 8 brings new features to interfaces, including default implementation, access modifiers, and static members. We'll look at these new features and see where they are useful and where they should be avoided. With some practical tips, "gotchas", and plenty of examples, we'll see how to use these features effectively in our code.


What's New in C# 8 Interfaces (and how to use them effectively) (Jul 2020)

Tulsa Developers Association

C# 8 brings new features to interfaces, including default implementation, access modifiers, and static members. We'll look at these new features and see where they are useful and where they should be avoided. With some practical tips, "gotchas", and plenty of examples, we'll see how to use these features effectively in our code.


I'll Get Back to You: Task, Await, and Asynchronous Methods in C# (Aug 2020)

Code PaLOUsa

There's a lot of confusion about async/await, Task/TPL, and asynchronous and parallel programming in general. So let's start with the basics and look at how we can consume asynchronous methods using Task and then see how the "await" operator can make things easier for us. Along the way, we’ll look at continuations, cancellation, and exception handling.


Get Func-y: Understanding Delegates in C# (Oct 2020)

St Pete .NET

Delegates are the gateway to functional programming. So let's understand delegates and how we can change the way we program by using functions as parameters, return types, variables, and properties. In addition, we'll see how the built-in delegate types, Func and Action, are waiting to make our lives easier. By the time we're done, we'll see how delegates can add elegance, extensibility, and safety to our programming. And as a bonus, you'll have a few functional programming concepts to take with you.


More to Come!

I have a few more recordings from 2021 presentations, and I'll post those soon. As always, if you'd like me to visit your user group, drop me a line. I'll see what I can do about being there online or in person (once in-person stuff is a bit more back to normal).

Happy Coding!

Thursday, June 24, 2021

New and Updated: "C# Interfaces" Course on Pluralsight

An updated version of my "C# Interfaces" course is now available on Pluralsight.


You can click on the link and watch a short promo for the course. If you are not a Pluralsight subscriber, you can sign up for a free trial and watch the entire course.

"C# Interfaces" title slide with a photo of the author, Jeremy Clark

Description

Code that is easy to maintain, extend, and test is key to applications that can move quickly to meet your users’ needs. In this course, C# Interfaces, you’ll learn to use interfaces to add flexibility to your code. First, you’ll explore the mechanics (“What are interfaces?”) and why you want to use them. Next, you’ll discover how to create your own interfaces to make it easy to change or swap out functionality. Finally, you’ll learn about default member implementation and how to avoid some common stumbling blocks. When you’re finished with this course, you’ll have the skills and knowledge of C# interfaces needed to write application code that is easy to maintain, extend, and test.

What's Different

For those who have seen a previous version of this course, here are a few key things that have been added or updated:
  • New section on default implementation
  • A new comparison between interfaces and abstract classes
  • New demo on using configuration and compile-time binding
  • Updated content for dynamic loading of assemblies
  • All code samples are cross-platform friendly (Windows, Linux, macOS)
For a full list of changes, you can check out the change log: "C# Interfaces" Change Log.

Additional Resources

I have added an "Additional Resources" repository on GitHub that includes:
  • The latest code samples (to be updated when .NET 6 drops later this year)
  • Instructions on running the samples with Visual Studio 2019 and Visual Studio Code
  • A list of relevant files for each code sample
  • Articles and links for further study

There are also a couple of samples that I was not able to fit into the course. So stay tuned to this blog; I'll be posting those in the near future.

Happy Coding!

Sunday, February 28, 2021

Recorded Presentation: ASP.NET MVC for Absolute Beginners - Your Secret Decoder Ring

ASP.NET MVC has a lot of conventions around routing, controllers, actions, views, and parameters. Coding-by-convention has both good and bad parts. A good part is that it can be fairly light in configuration. A bad part is that it is easy to get lost if you don't know the conventions.

In January, I had the opportunity to share with the Central Ohio .NET Developer's Group (Meetup link).

In this presentation, we create a new MVC project and look at the default routes, controllers, and views to get an initial understanding of some of the conventions. With that information, we can take the next steps of adding our own code -- collecting parameters and generating mazes.

The parameter collection lets us pick a maze size, the color, and a maze-generation algorithm to use.


Clicking the "Generate Maze" buttons gets us a result:


If this sounds interesting, take a look at the GitHub repo with the code and the video on YouTube.



Happy Coding!

Tuesday, February 9, 2021

What's the Difference between Channel and ConcurrentQueue in C#?

In response to the previous article (Introduction to Channels in C#), I've received several questions from folks who have been using ConcurrentQueue<T> (or another type from Collections.Concurrent). These questions boil down to: What's the difference between Channel<T> and ConcurrentQueue<T>?
Short answer: Specialization and optimization.
If you're interested in the long answer, keep reading.

Specialization: Separate Reader and Writer Properties

In the last article, we saw that Channel<T> has separate Reader and Writer properties (which are ChannelReader<T> and ChannelWriter<T>, respectively). But we didn't really take advantage of that in the introductory demo.

Since Channel<T> has separate properties for reading and writing, we can create functions that can only read from a channel or only write to a channel.

As an example, we'll look at some code from a naïve machine learning project of mine: https://github.com/jeremybytes/digit-display. This recognizes hand-written digits from bitmaps. The process is a bit processor intensive (some operations take about 1/3 of a second). The code uses concurrent operations to take advantage of all of the cores of my CPU. This greatly speeds things up when processing hundreds of digits.

Example: Separate Writer
We'll start with a function that produces results from the input. This is taken from the "digit-console-channel" project in the repository mentioned above. I won't include line numbers in my references because that code is currently in flux.

Here is the "Produce" function (from the Program.cs file in the digit-console-channel project):

C#
    private static async Task Produce(ChannelWriter<Prediction> writer,
    string[] rawData, FSharpFunc<int[], Observation> classifier)
    {
      await Task.Run(() =>
      {
        Parallel.ForEach(rawData, imageString =>
        {
          // some bits skipped here.

          var result = Recognizers.predict(ints, classifier);

          var prediction = new Prediction 
          {
            // some bits skipped here, too.
          };

          writer.WriteAsync(prediction);
        });
      });

      writer.Complete();
    }

I won't go through all of the details of this code (that's for another time), but I will point out a couple of things.

First, the function takes "ChannelWriter<Prediction>" as a parameter. "Prediction" is a custom type that contains things like the predicted value, the actual value, and the original image.

The important thing about the parameter is that it indicates that this function will only write to the channel. It will not read from the channel. It does not have access to read from the channel.

Here is how this method is called in the code (from the Program.cs file in the digit-console-channel project):

C#
    var channel = Channel.CreateUnbounded<Prediction>();

    var listener = Listen(channel.Reader, log);
    var producer = Produce(channel.Writer, rawValidation, classifier);

    await producer;
    await listener;

In the above code, the first line creates an unbounded channel. Then in the third line, we call the "Produce" function and pass in "channel.Writer". Instead of passing in the entire channel, we pass in only the part that we need -- the "write" functionality.

Example: Separate Reader
Just like having a function that takes a ChannelWriter as a parameter, we have a function that takes a ChannelReader as a parameter.

Here is the "Listen" function (from the Program.cs file in the digit-console-channel project):

C#
    private static async Task Listen(ChannelReader<Prediction> reader,
    List<Prediction> log)
    {
      await foreach (Prediction prediction in reader.ReadAllAsync())
      {
        // Display the result
        Console.SetCursorPosition(0, 0);
        WriteOutput(prediction);

        // logging bits have been skipped
      }
    }

This function takes a "ChannelReader<Prediction>" as a parameter. This indicates that this function will only read from the channel. It does not have the ability to write to the channel.

To repeat the code from above, here is where the "Listen" function is called:

C#
    var channel = Channel.CreateUnbounded<Prediction>();

    var listener = Listen(channel.Reader, log);
    var producer = Produce(channel.Writer, rawValidation, classifier);

    await producer;
    await listener;

The call to "Listen" uses the "channel.Reader" property. Rather than passing in the entire channel, we just use the functionality that we need -- the ability to read from the channel.

Specialization: Closing the Channel

Another specialization that Channel<T> has is the ability to close the channel. I should be a bit more specific here: the terminology of "closing the channel" comes from my work with channels in Go (golang). In Go, closing a channel indicates that no new items will be written to a channel (writing to a closed channel results in a "panic" -- equivalent to an unhandled exception). Items can still be read from a closed channel until the channel is empty.

If you're curious about channels in Go, see "Go (golang) Channels - Moving Data Between Concurrent Processes".

In C#, we don't "close the channel", we mark the "Writer" as complete -- meaning, there will be no new items written to the channel. We can still read from the channel until all of the items have been read.
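Rust's standard library channels behave the same way as Go here, and they make a compact illustration of this "complete, then drain" behavior. The sketch below is an analogy using Rust's std::sync::mpsc (not the C# API): dropping the Sender plays the role of writer.Complete(), and iteration over the Receiver ends once the channel is closed and empty:

```rust
use std::sync::mpsc;
use std::thread;

// Send five items, then drop the sender to "close" the channel --
// the Rust analog of calling writer.Complete() in C#.
fn produce_and_drain() -> Vec<i32> {
    let (tx, rx) = mpsc::channel();
    let producer = thread::spawn(move || {
        for i in 0..5 {
            tx.send(i).expect("receiver dropped");
        }
        // `tx` goes out of scope here; the channel is now closed.
    });
    // Iterating the receiver blocks waiting for each item and ends once
    // the channel is both empty and closed -- like a reader finishing
    // after Complete() is called and the remaining items are read.
    let received: Vec<i32> = rx.iter().collect();
    producer.join().unwrap();
    received
}

fn main() {
    println!("{:?}", produce_and_drain());
}
```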

Marking the Writer as Complete
The ChannelWriter<T> class has a "Complete" method. This is included in the "Produce" function that we saw earlier.

C#
    private static async Task Produce(ChannelWriter<Prediction> writer,
    string[] rawData, FSharpFunc<int[], Observation> classifier)
    {
      await Task.Run(() =>
      {
        Parallel.ForEach(rawData, imageString =>
        {
          // some bits skipped here.

          var result = Recognizers.predict(ints, classifier);

          var prediction = new Prediction 
          {
            // some bits skipped here, too.
          };

          writer.WriteAsync(prediction);
        });
      });

      writer.Complete();
    }

The last line of this function calls "writer.Complete()". This happens outside of the "Parallel.ForEach" loop -- after all of the items have been processed and we are done writing to the channel.

Signaling the Reader
The important aspect of marking the Writer as complete is that it lets the Reader know that there will be no new items.

Let's look at the "Listen" function again:

C#
    private static async Task Listen(ChannelReader<Prediction> reader,
    List<Prediction> log)
    {
      await foreach (Prediction prediction in reader.ReadAllAsync())
      {
        // Display the result
        Console.SetCursorPosition(0, 0);
        WriteOutput(prediction);

        // logging bits have been skipped
      }
    }

The core of this function is an "await foreach" loop that iterates using the result from "reader.ReadAllAsync()". As we saw in the previous article, this is an IAsyncEnumerable<T>.

At some point, we need to exit the "foreach" loop. That's why "Complete()" is important. Once the channel writer indicates that there will be no new items, the "foreach" exits after all of the remaining items have been read. (There are more technical bits to how IAsyncEnumerable works, but that's not important to understand this bit of code.)

Optimization: Channel Options

Another difference between Channel<T> and ConcurrentQueue<T> is that Channel<T> has some features to tune the operation.

The "Channel.CreateBounded<T>" and "Channel.CreateUnbounded<T>" functions take an optional parameter ("BoundedChannelOptions" or "UnboundedChannelOptions"). These options give the compiler a chance to optimize the operations.

Here's a link to the Microsoft documentation for "BoundedChannelOptions": BoundedChannelOptions Class. This gives a bit more detail on what is available.

Let's look at a few of these options.

Option: SingleWriter
The boolean SingleWriter property lets us specify whether we have a single process writing to the channel or if we have multiple processes (the default is "false", meaning multiple).

If we have multiple writers, there is more overhead to make sure that the capacity of a bounded channel is not exceeded and that writers using "WaitToWriteAsync" or "TryWrite" are managed appropriately.

If we only have a single writer, then the channel implementation can optimize the code since it does not need to worry about some of those complexities.

For our example code, we need to allow multiple writers. The "Parallel.ForEach" loop (in the "Produce" function) spins up concurrent operations for each item in the loop. The result is that we have multiple processes writing to the channel.

Option: SingleReader
Like SingleWriter, the boolean SingleReader property lets us specify whether we have a single process reading from the channel or if we have multiple processes (the default is "false", meaning multiple).

With multiple readers, there is more overhead when readers are using "WaitToReadAsync" or "TryRead".

If we only have a single reader, then the channel implementation can optimize the code.

For our example code, we only have 1 reader (in the "Listen" function). We could set this option to "true" for our code.

Option: AllowSynchronousContinuations
Another option is "AllowSynchronousContinuations". I won't pretend to understand the specifics behind this option.

Setting this option to true can provide measurable throughput improvements by avoiding scheduling additional work items. However, it may come at the cost of reduced parallelism, as for example a producer may then be the one to execute work associated with a consumer, and if not done thoughtfully, this can lead to unexpected interactions. The default is false.
I'm not exactly sure what scenarios this option is useful for. But it looks like we can get increased throughput at the cost of decreased parallelism.

The main point is that Channel<T> has some tuning that is not available in concurrent collections.

Channel<T> is not an Enumerable

Another thing to be aware of when looking at the differences between Channel<T> and ConcurrentQueue<T> (or other concurrent collections) is that Channel<T> is not an IEnumerable<T>. It also does not implement any other collection interfaces (such as ICollection<T>, IReadOnlyCollection<T>, or IList<T>).

IEnumerable<T> is important for 2 reasons.

First, this is important because of LINQ (Language Integrated Query). LINQ provides a set of extension methods on IEnumerable<T> (and IQueryable<T> and some other types) that let us filter, sort, aggregate, group, and transform data in collections.

ConcurrentQueue<T> does implement IEnumerable<T>, so it has all of these extension methods available.

Channel<T> does not implement IEnumerable<T> (the same is true of ChannelReader<T>), so these extension methods are not available.

Second, IEnumerable<T> is important because it allows us to "foreach" through a collection.

ConcurrentQueue<T> does support "foreach".

Channel<T> does not directly support "foreach", but we have seen that ChannelReader<T> has a "ReadAllAsync" method that returns IAsyncEnumerable<T>. So this allows us to use "await foreach" on a channel reader, but it's less direct than using a concurrent collection.

There is no Peek

A functional difference between Channel<T> and ConcurrentQueue<T> is that Channel<T> does not have a "Peek" function. In a queue, "Peek" allows us to look at the next item in the queue without actually removing it from the queue.

In a Channel (more specifically ChannelReader<T>), there is no "Peek" functionality. When we read an item from the channel, it also removes it from the channel.

Update: ChannelReader<T> now has a "TryPeek" method (along with a "CanPeek" property). This lets you look at the next item (if there is one) without removing it from the channel.

Which to Use?

So the final question is "Should I use Channel<T> instead of ConcurrentQueue<T>?"

Personally, I like the specialization of Channel<T>, since this is what I generally need -- the ability to produce items in one concurrent operation (or set of operations) and read the output in another concurrent operation.

I like how I can be explicit about the processes that are writing to the channel as opposed to the processes that are reading from the channel. I also like the ability to mark the Writer as "Complete" so that I can stop reading. (This is doable with a ConcurrentQueue, but we would need to write our own signaling code for that.)

If ConcurrentQueue<T> is working for you, there is no reason to rip it out and change it to Channel<T> unless there are specific features (such as the optimizations) that you need. But for new code, I would recommend considering if Channel<T> will work for you.

I'm still exploring Channel<T> and figuring out how I can adjust my concurrent/parallel patterns to take advantage of it. Be sure to check back. I'll have more articles on what I've learned.

Happy Coding!

Monday, February 8, 2021

An Introduction to Channels in C#

Channels give us a way to communicate between concurrent (async) operations in .NET. Channel<T> was included in .NET Core 3.0 (prior to that, it was available as a separate NuGet package). I first encountered channels in Go (golang) several years ago, so I was really interested when I found out that they are available to use in C#.
Channels give us a way to communicate between concurrent (async) operations in .NET.
We can think of a channel as a sort of thread-safe queue. This means that we can add items to a channel in one concurrent operation and read those items in another. (We can also have multiple readers and multiple writers.)

As an introduction to channels, we'll look at an example that mirrors some code from the Go programming language (golang). This will give us an idea of the syntax and how channels work. In a future article, we'll look at a more real-world example.

The Go Example: I have been exploring Go for a while and looking at the similarities and differences compared to C#. You can look at the Go example (and accompanying article) here: Go (golang) Channels - Moving Data Between Concurrent Processes. The article also mentions a little bit about C# channels.

What is Channel<T>?

The purpose of a channel is to let us communicate between 2 (or more) concurrent operations. Like a thread-safe queue, we can add an item to the queue, and we can remove an item from the queue. Since channels are thread-safe, we can have multiple processes writing to and reading from the channel without worrying about missed writes or duplicate reads.

Update: I've had several questions regarding the difference between Channel<T> and ConcurrentQueue<T> (or other concurrent collections). The short answer: specialization and optimization. For a longer answer, here's an article: What's the Difference between Channel and ConcurrentQueue in C#?

An Example
Let's imagine that we need to transform 1,000 records and save the resulting data. Due to the nature of the operation, the transformation takes several seconds to complete. In this scenario, we can set up several concurrent processes that transform records. When a transformation is complete, it puts the record onto the channel.

In a separate operation, we have a process that listens to that channel. When an item is available, it takes it off of the channel and saves it to the data store. Since this operation is faster (only taking a few milliseconds), we can have a single operation that is responsible for this.

This is an implementation of the Observer pattern -- at its core, it is a publish-subscribe relationship. The channel takes results from the publisher (the transformation process) and provides them to the subscriber (the saving process).
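The scenario above can be sketched in a few lines. This is not the sample project code; the "transform" and "save" steps are simulated with a delay and a list, and the names are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

// Several "transform" producers write to one channel; a single "save"
// consumer reads from it. The transform and save steps are simulated.
var channel = Channel.CreateUnbounded<string>();

var producers = Enumerable.Range(1, 4).Select(id => Task.Run(async () =>
{
    await Task.Delay(50); // simulate a slow transformation
    await channel.Writer.WriteAsync($"record {id}");
})).ToArray();

// close the channel once every producer has finished
_ = Task.Run(async () =>
{
    await Task.WhenAll(producers);
    channel.Writer.Complete();
});

var saved = new List<string>();
await foreach (string record in channel.Reader.ReadAllAsync())
    saved.Add(record); // the "save" step

Console.WriteLine($"saved {saved.Count} records");
```

This is the same shape as the sample project we'll walk through below, just with the service call and console output stripped away.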

Channel Operations
There are a number of operations available with Channel<T>. The primary operations that we'll look at today include the following: 
  1. Creating a channel
  2. Writing to a channel
  3. Closing a channel when we are done writing to it
  4. Reading from a channel
Channel<T> gives us quite a few ways of accomplishing each of these functions. What we'll see today will give us a taste of what is available, but there is more to explore.

Parallel Operations

The sample project is available on GitHub: https://github.com/jeremybytes/csharp-channels. This project has a service with several end points. One end point provides a single "Person" record based on an ID. This operation has a built-in 1 second delay (to simulate a bit of lag).

Our code is in a console application that gets data -- the goal is to pull out 9 records from the service. If we do this sequentially, then the operation takes 9 seconds. But if we do this concurrently (i.e., in parallel), the operation takes a little over 1 second.

The concurrent operations that call the service put their results onto a channel. Then we have a listener that reads values off of the channel and displays them in the console.

Creating a Channel

To create a channel, we use static methods on the "Channel" class. The two methods we can choose from are "CreateBounded" and "CreateUnbounded". A bounded channel has a constraint on how many items the channel can hold at one time. If a bounded channel is full, a writer can wait until there is space available in the channel.

Whether we have a bounded or an unbounded channel, we must also specify a type (using a generic type parameter). A channel can only hold one type of object.

Here is the code from our sample (line 61 in the Program.cs file in the project mentioned above):

C#
    var channel = Channel.CreateBounded<Person>(10);

The above code creates a bounded channel that can hold up to 10 "Person" objects. Our sample data only has 9 items, so we will not need to worry about the channel reaching capacity for this example.
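To see the capacity limit in action, here is a small sketch (not from the sample project) using "TryWrite", which returns "false" when a bounded channel is full (with the default options, where writers wait for space).

```csharp
using System;
using System.Threading.Channels;

// A bounded channel rejects TryWrite once it reaches capacity.
var channel = Channel.CreateBounded<int>(2);

bool first = channel.Writer.TryWrite(1);  // true
bool second = channel.Writer.TryWrite(2); // true
bool third = channel.Writer.TryWrite(3);  // false -- the channel is full

Console.WriteLine($"{first} {second} {third}"); // True True False
```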

Writing to a Channel

A channel has 2 primary properties: a Reader and a Writer. For our channel, the Writer property is a ChannelWriter<Person>. A channel writer has several methods; we'll use the "WriteAsync" method.

Here's the "WriteAsync" method call (line 72 in the Program.cs file):

C#
    await channel.Writer.WriteAsync(person);

Like many things in C#, this method is asynchronous. The "await" will pause the operation until it is complete.

Let's look at the "WriteAsync" method in context. Here is the enclosing section (lines 64 to 81 in the Program.cs file):

C#
    // Write person objects to channel asynchronously
    foreach (var id in ids)
    {
      async Task func(int id)
      {
        try
        {
          var person = await GetPersonAsync(id);
          await channel.Writer.WriteAsync(person);
        }
        catch (HttpRequestException ex)
        {
          Console.WriteLine($"ERROR: ID {id}: {ex.Message}");
        }
      }
      asyncCalls.Add(func(id));
    }

The "foreach" loops through a collection of "ids" -- the ID values used to request each "Person" from the service (these show up as the "ID" property on the "Person" objects).

Inside the "foreach" loop, we have a local function (named "func" -- not a great name, but it makes the code similar to the anonymous function in the Go example).

The "func" function calls "GetPersonAsync". "GetPersonAsync" calls the service and returns the resulting "person".

After getting the "person", we put that object onto the channel using "Writer.WriteAsync".

The catch block is looking for exceptions that might be thrown from the service -- for example if the service is not available or the service returns an empty object.

The last line of the "foreach" loop does a couple of things. It calls the local "func" function using the "id" value from the loop. "func" is asynchronous and returns a Task.

"asyncCalls" is a "List<Task>". By calling "asyncCalls.Add()", we add the Task from the "func" call to the list. We'll see how this is used later.

There are no blocking operations inside the "foreach" loop. Instead, each iteration of the loop kicks off a concurrent (async) operation. This means that the loop will complete very quickly, and we will have 9 operations running concurrently.

Closing a Channel

When we are done writing to a channel, we should close it. This tells anyone reading from the channel that there will be no more values, and they can stop reading. We'll see why this is important when we look at reading from a channel.

One issue that we have to deal with in our code is that we do not want to close the channel until all 9 of the concurrent operations are complete. This is why we have the "asyncCalls" list. It has a list of all of the operations (Tasks), and we can wait until they are complete.

Here's the code for that (lines 83 to 88 in the Program.cs file):

C#
    // Wait for the async tasks to complete & close the channel
    _ = Task.Run(async () =>
        {
          await Task.WhenAll(asyncCalls);
          channel.Writer.Complete();
        });

"await Task.WhenAll(asyncCalls)" will pause the operation of this section of code until all of the concurrent tasks are complete. When the "await" is done, we know that it is safe to close the channel.

The next line -- "channel.Writer.Complete()" -- closes the channel. This indicates that no new items will be written to the channel.

These 2 lines of code are wrapped in a task (using "Task.Run"). The reason for this is that we want the application to continue without blocking at this point. The "await Task.WhenAll" will wait, but it waits inside a separate concurrent (asynchronous) operation, so the main flow of the application keeps moving.

This means that the read operation (in the next section) can start running before all of the concurrent tasks are complete.

If running this section asynchronously is a bit hard to grasp, that's okay. It took me a while to wrap my head around it. For our limited sample (with only 9 records), it does not make any visible difference. But it's a good pattern to get used to for larger, more complex scenarios.

Reading from a Channel

Our last step is to read from a channel -- this uses the "Reader" property from the channel.  For our example, "Reader" is a "ChannelReader<Person>".

There are multiple ways that we can read from the channel. The sample code has an "Option 1" and "Option 2" listed (with one of the options commented out). We'll look at "Option 2" first.

Option: WaitToReadAsync / TryRead
"Option 2" uses a couple of "while" loops to read from the channel. Here is the code (lines 97 to 105 in the Program.cs file):

C#
    // Read person objects from the channel until the channel is closed
    // OPTION 2: Using WaitToReadAsync / TryRead
    while (await channel.Reader.WaitToReadAsync())
    {
      while (channel.Reader.TryRead(out Person person))
      {
        Console.WriteLine($"{person.ID}: {person}");
      }
    }

The first "while" loop makes a call to "WaitToReadAsync" on the channel "Reader" property. When we "await" the "WaitToReadAsync" method, the code will pause until an item is available to read from the channel *or* until the channel is closed.

When an item is available to read, "WaitToReadAsync" returns "true" (so the "while" loop will continue). If the channel is closed, then "WaitToReadAsync" returns "false", and the "while" loop will exit.

So far, this only tells us whether there is an item available to read. We have not done any actual reading. For that, we have the next "while" loop.

The inner "while" loop makes a call to "TryRead" on the channel "Reader" property. As with most "Try" methods, this uses an "out" parameter to hold the value. If the read is successful, then the "person" variable is populated, and the value is written to the console.

"TryRead" will return "false" if there are no items available. If that's the case, then it will exit the inner "while" loop and return to the outer "while" loop to wait for the next available item (or the closing of the channel).

Note: After the channel is closed (meaning the Writer is marked Complete), the Reader will continue to read from the channel until the channel is empty. Once a closed channel is empty, the "WaitToReadAsync" will return false and the loop will exit.
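We can see this drain-after-close behavior in a small sketch (not from the sample project): items written before "Complete" are still readable afterward.

```csharp
using System;
using System.Threading.Channels;

// Items written before Complete() can still be read afterward.
var channel = Channel.CreateUnbounded<string>();
channel.Writer.TryWrite("first");
channel.Writer.TryWrite("second");
channel.Writer.Complete();

int count = 0;
while (await channel.Reader.WaitToReadAsync()) // true until closed AND empty
{
    while (channel.Reader.TryRead(out string? item))
    {
        Console.WriteLine(item);
        count++;
    }
}
Console.WriteLine($"drained {count} items"); // drained 2 items
```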

Option: ReadAllAsync
Another option ("Option 1" in the code) is to use "ReadAllAsync". This code is nice because it has fewer steps. But it also uses an IAsyncEnumerable. If you are not familiar with IAsyncEnumerable, then it can be a bit difficult to understand. That's why both options are included.

Let's look at the code that uses "ReadAllAsync" (lines 90 to 95 in the Program.cs file):

C#
    // Read person objects from the channel until the channel is closed
    // OPTION 1: Using ReadAllAsync (IAsyncEnumerable)
    await foreach (Person person in channel.Reader.ReadAllAsync())
    {
        Console.WriteLine($"{person.ID}: {person}");
    }

"ReadAllAsync" is a method on the channel "Reader" property. This returns an "IAsyncEnumerable<Person>". This combines the power of enumeration ("foreach") with the concurrency of "async".

The "foreach" loop condition looks pretty normal -- we iterate on the result of "ReadAllAsync" and use the "Person" object from that iteration.

What is a little different is that there is an "await" before the "foreach". This means that the "foreach" loop may pause between iterations if it needs to wait for an item to be populated in the channel.

The effect is that the "foreach" will continue to provide values until the channel is closed and empty. Inside the loop, we write the value to the console.

At a high level, this code is a bit easier to understand (since it does not have nested loops and uses a familiar "foreach" loop). But when we look more closely, we see that we "await" the loop, and what returns from "ReadAllAsync" may not be a complete set of data -- our loop may wait for items to be populated in the channel.

I would recommend learning more and getting comfortable with IAsyncEnumerable (which was added with C# 8). It's pretty powerful and, as we've seen here, it can simplify our code. But it only simplifies the code if we understand what it's doing.
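IAsyncEnumerable is not limited to channels. As a quick sketch, we can produce one ourselves with an async iterator method -- "yield return" combined with "await". The method name and delay here are just for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// An async iterator: "yield return" plus "await" produces IAsyncEnumerable<T>.
static async IAsyncEnumerable<int> SlowNumbersAsync()
{
    for (int i = 1; i <= 3; i++)
    {
        await Task.Delay(100); // simulate waiting for the next item
        yield return i;
    }
}

var total = 0;
await foreach (int n in SlowNumbersAsync())
    total += n; // each iteration may pause until the next item is ready

Console.WriteLine(total); // 6
```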

Application Output

To run the application, first start the service in the "net-people-service" folder. Navigate to the folder from the command line (I'm using PowerShell here), and type "dotnet run". This starts the service and hosts it at "http://localhost:9874".

Console
    PS C:\Channels\net-people-service> dotnet run
    info: Microsoft.Hosting.Lifetime[0]
        Now listening on: http://localhost:9874
    info: Microsoft.Hosting.Lifetime[0]
        Application started. Press Ctrl+C to shut down.
    info: Microsoft.Hosting.Lifetime[0]
        Hosting environment: Development
    info: Microsoft.Hosting.Lifetime[0]
        Content root path: C:\Channels\net-people-service

To run the console application, navigate to the "async" folder (from a new command line) and type "dotnet run".

Console
    PS C:\Channels\async> dotnet run
    [1 2 3 4 5 6 7 8 9]
    9: Isaac Gampu
    3: Turanga Leela
    7: John Sheridan
    4: John Crichton
    6: Laura Roslin
    5: Dave Lister
    2: Dylan Hunt
    1: John Koenig
    8: Dante Montana
    Total Time: 00:00:01.3324302
    PS C:\Channels\async>

There are a couple of things to note about this output. First, notice the "Total Time" at the bottom. This takes a bit over 1 second to complete. As mentioned earlier, each call to the service takes at least 1 second (due to a delay in the service). This shows us that all 9 of these service calls happen concurrently.

Next, notice the order of the output. It is non-deterministic. Even though we have a "foreach" loop that goes through the ids in sequence (1, 2, 3, 4, 5, 6, 7, 8, 9), the output is not in the same order. This is one of the joys of running things in parallel -- the order things complete is not the same as the order things were started.

If we run this application multiple times, the items will be in a different order each time. (Okay, not different *each* time since we are only dealing with 9 values, but we cannot guarantee the order of the results.)

There's Much More

In this article, we've looked at some of the functions of Channel<T>, including creating, writing to, closing, and reading from channels. But Channel<T> has much more functionality.

When we create a channel (using "CreateBounded" or "CreateUnbounded"), we can also include a set of options. These include "SingleReader" and "SingleWriter" to indicate that we will only have one read or write operation at a time. This allows the channel implementation to optimize for that usage. In addition, there is an "AllowSynchronousContinuations" option. This can increase throughput by reducing parallelism.
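Here is one possible configuration as a sketch, using "BoundedChannelOptions" (the specific option values are just for illustration):

```csharp
using System;
using System.Threading.Channels;

// One possible configuration: capacity 10, a single reader, multiple
// writers, and writers that wait when the channel is full.
var options = new BoundedChannelOptions(10)
{
    SingleReader = true,
    SingleWriter = false,
    FullMode = BoundedChannelFullMode.Wait,
    AllowSynchronousContinuations = false
};

var channel = Channel.CreateBounded<int>(options);
Console.WriteLine(channel.Writer.TryWrite(1)); // True
```

"FullMode" controls what happens to writers when a bounded channel is full -- waiting is one option; others drop items instead.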

ChannelWriter<T> has several methods in addition to "WriteAsync". These include "WaitToWriteAsync" and "TryWrite", which we may need in scenarios where we have bounded channels and we exceed the capacity. There is also a "TryComplete" method that tries to close the channel.
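The write side mirrors the read side we saw earlier: wait until there is space, then try to write. Here is a sketch of that pattern (the helper name "WriteWhenReadyAsync" is my own, not part of the API):

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Wait for space, then try to write. Returns true if the value was
// written, false if the channel was closed before we could write.
static async Task<bool> WriteWhenReadyAsync(ChannelWriter<int> writer, int value)
{
    while (await writer.WaitToWriteAsync()) // false once the channel is closed
    {
        if (writer.TryWrite(value))
            return true;
    }
    return false;
}

var channel = Channel.CreateBounded<int>(1);
bool wrote = await WriteWhenReadyAsync(channel.Writer, 42);
Console.WriteLine(wrote); // True

channel.Writer.Complete();
bool wroteAfterClose = await WriteWhenReadyAsync(channel.Writer, 43);
Console.WriteLine(wroteAfterClose); // False
```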

ChannelReader<T> has many methods as well. Today, we saw "ReadAllAsync", "WaitToReadAsync", and "TryRead". But there is also a "ReadAsync". We can mix and match these methods in different ways depending on what our needs are.
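For completeness, here is a sketch of "ReadAsync". It returns one item at a time, and -- unlike "TryRead" or "WaitToReadAsync" -- it throws a "ChannelClosedException" once the channel is both closed and empty.

```csharp
using System;
using System.Threading.Channels;

// ReadAsync returns one item at a time; it throws ChannelClosedException
// once the channel is both closed and empty.
var channel = Channel.CreateUnbounded<int>();
channel.Writer.TryWrite(42);
channel.Writer.Complete();

int items = 0;
try
{
    while (true)
    {
        int value = await channel.Reader.ReadAsync();
        Console.WriteLine(value); // 42
        items++;
    }
}
catch (ChannelClosedException)
{
    Console.WriteLine($"done after {items} item(s)");
}
```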

Wrap Up

Channel<T> is really interesting to me. When I started looking at Go (golang) a few years back, I found channels to be clean and easy to use. I wondered if it would make sense to have something similar in C#. Apparently, I was not the only one. Channel<T> uses many of the same concepts as channels in Go, but with a bit more power and functionality. This adds complexity as well (but I've learned to accept complexity in the C# world).

In today's example, we saw how Channel<T> lets us communicate between concurrent operations. In our case, we have 9 concurrent operations that all write their results to a channel. Then we have a reader that pulls the results from the channel and displays them on the console.

This allowed us to see some basics of creating, writing to, closing, and reading from channels. But the real world can get a bit more complicated.

I have been experimenting with channels in my digit recognition project (https://github.com/jeremybytes/digit-display). This uses machine learning to recognize hand-written digits from bitmap data. These operations are computationally expensive, so I've been using tasks with continuations to run them in parallel. But I've also created versions that use channels. This simplifies some things (I don't need to be as concerned about getting back to the UI thread in the desktop application). But it creates issues in others (using a Parallel.ForEach can end up starving the main application thread). I've got a bit more experimenting to do. The solution that I have right now works, but it doesn't feel quite right.

As always, there is still more to learn.

Happy Coding!