Jeremy Bytes

Wednesday, April 24, 2024

Thoughts on Primary Constructors in C#

Primary constructors seems to be one of those polarizing features in C#. Some folks really love them; some folks really hate them. As in most cases, there is probably a majority of developers who are in the "meh, whatever" group. I have been experimenting and reading a bunch of articles to try to figure out where/if I will use them in my own code.

I have not reached any conclusions yet. I have a few examples of where I would not use them, and a few examples of where I would. And I've come across some scenarios where I can't use them (which affects usage in other areas because I tend to write for consistency).

Getting Started

To get started understanding primary constructors, I recommend a couple of articles:

Refactor your C# code with primary constructors by David Pine
Dark side of the primary constructors in C# 12 by Marek Sirkovský

The first article is on Microsoft devblogs, and David walks through changing existing code to use primary constructors (including using Visual Studio's refactoring tools). This is a good overview of what they are and how to use them.

In the second article, Marek takes a look at primary constructors and adds some thoughts. The "Initialization vs. capture" section was particularly interesting to me. This is mentioned in David's article as well, but Marek takes it a step further and shows what happens if you mix approaches. We'll look at this more closely below.

Very Short Introduction

Primary constructors allow us to put parameters on a class declaration itself. The parameters become available to the class, and we can use this feature to eliminate constructors that only take parameters and assign them to local fields.

Here's an example of a class that does not use a primary constructor:


  public class HomeController : Controller
{
    private readonly ILogger<HomeController> _logger;

    public HomeController(ILogger<HomeController> logger)
    {
        _logger = logger;
    }
    
    public IActionResult Index()
    {
        _logger.Log(LogLevel.Trace, "Index action called");
        return View();
    }
    // other code
}

In a new ASP.NET Core MVC project, the HomeController has a _logger field that is initialized by a constructor parameter. I have updated the code to use the _logger field in the Index action method.

The code can be shorted by using a primary constructor. Here is one way to do that:


  public class HomeController(ILogger<HomeController> logger) 
    : Controller
{
    private readonly ILogger<HomeController> _logger = logger;
    
    public IActionResult Index()
    {
        _logger.Log(LogLevel.Trace, "Index action called");
        return View();
    }
    // other code
}

Notice that the class declaration now has a parameter (logger), and the separate constructor is gone. The logger parameter can be used throughout the class. In this case, it is used to initialize the _logger field, and we use the _logger in the Index action method just like before. (There is another way to handle this that we'll see a bit later).

So by using a primary constructor, we no longer need a separate constructor, and the code is a bit more compact.

Fields or no Fields?

As David notes in his article, Visual Studio 2022 offers 2 different refactoring options: (1) "Use primary constructor" and (2) "Use primary constructor (and remove fields)". (Sorry, I couldn't get a screen shot of that here because the Windows Snipping Tool was fighting me.)

The example above shows what happens if we choose the first option. Here is what we get if we chose the second (and remove the fields):


  public class HomeController(ILogger<HomeController> logger) 
    : Controller
{
    public IActionResult Index()
    {
        logger.Log(LogLevel.Trace, "Index action called");
        return View();
    }
    // other code
}

In this code, the separate _logger field has been removed. Instead, the parameter on the primary constructor (logger) is used directly in the class. A parameter on the primary constructor is available throughout the class (similar to a class-level field), so we can use it anywhere in the class.

Guidance?

Here's where I hit my first question: fields or no fields? The main difference between the 2 options is that a primary constructor parameter is mutable.

This means that if we use the parameter directly (option 2), then it is possible to assign a new value to the "logger" parameter. This is not likely in this scenario, but it is possible.

If we maintain a separate field (option 1), then we can set the field readonly. Now the value cannot be changed after it is initialized.

One problem I ran into is that there is not a lot of guidance in the Microsoft documentation or tutorials. They present both options and note the mutability, but don't really suggest a preference for one over the other.

The Real Difference

Reading Marek's article opened up the real difference between these two approaches. (Go read the section on "Initialization vs. capture" for his explanation.)

The compiler treats the 2 options differently. With option 1 (use the parameter to initialize a field), the parameter is discarded after initialization. With option 2 (use the parameter directly in the class), the parameter is captured (similar to a captured variable in a lambda expression or anonymous delegate) and can be used throughout the class.

As Marek notes, this is fine unless you try to mix the 2 approaches with the same parameter. For example, if we use the parameter to initialize a field and then also use the parameter directly in our code, we end up in a bit of a strange state. The field is initialized, but if we change the value of the parameter later, then the field and parameter will have different values.

If you try the mix and match approach in Visual Studio 2022, you will get a compiler warning. In our example, if we assign the logger parameter to a field and then use the logger parameter directly in the Index action method, we get the following warning:

Warning CS9124 Parameter 'ILogger<HomeController> logger' is captured into the state of the enclosing type and its value is also used to initialize a field, property, or event.

So this tells us about the danger, but it is a warning -- meaning the code will still compile and run. I would rather see this treated as an error. I pay pretty close attention to warnings in my code, but a lot of folks do not.

Fields or No Fields?

I honestly have not landed on which way I'll tend to go with this one. There may be other factors involved. I look at consistency and criticality below -- these will help me make my decisions.

Consistency and Dependency Injection

A primary use case for primary constructors is dependency injection -- specifically constructor injection. Often our constructors will take the constructor parameters (the dependencies) and assign them to local fields. This is what we've seen with the example using ILogger.

So, when I first started experimenting with primary constructors, I used my dependency injection sample code. Let's look at a few samples. You can find code samples on GitHub: https://github.com/jeremybytes/sdd-2024/tree/main/04-dependency-injection/completed, and I'll provide links to specific files.

Primary Constructor with a View Model

In a desktop application sample, I use constructor injection to get a data reader into a view model. This happens in the "PeopleViewModel" class in the "PeopleViewer.Presentation" project (link to the PeopleViewModel.cs file).


  public class PeopleViewModel : INotifyPropertyChanged
{
    protected IPersonReader DataReader;

    public PeopleViewModel(IPersonReader reader)
    {
        DataReader = reader;
    }

    public async Task RefreshPeople()
    {
        People = await DataReader.GetPeople();
    }
    // other code
}

This class has a constructor parameter (IPersonReader) that is assigned to a protected field.

We can use a primary constructor to reduce the code a bit:


  public class PeopleViewModel(IPersonReader reader) 
    : INotifyPropertyChanged
{
    protected IPersonReader DataReader = reader;

    public async Task RefreshPeople()
    {
        People = await DataReader.GetPeople();
    }
    // other code
}

This moves the IPersonReader parameter to a primary constructor.

Using a primary constructor here could be seen as a plus as it reduces the code.

As a side note: since the field is protected (and not private), Visual Studio 2022 refactoring does not offer the option to "remove fields".

Primary Constructor with a View?

For this application, the View needs to have a View Model injected. This happens in the "PeopleViewerWindow" class of the "PeopleViewer.Desktop" project (link to the PeopleViewerWindow.xaml.cs file).


  public partial class PeopleViewerWindow : Window
{
    PeopleViewModel viewModel { get; init; }

    public PeopleViewerWindow(PeopleViewModel peopleViewModel)
    {
        InitializeComponent();
        viewModel = peopleViewModel;
        this.DataContext = viewModel;
    }

    private async void FetchButton_Click(object sender, RoutedEventArgs e)
    {
        await viewModel.RefreshPeople();
        ShowRepositoryType();
    }
    // other code
}

The constructor for the PeopleViewerWindow class does a bit more than just assign constructor parameters to fields. We need to call InitializeComponent because this is a WPF Window. In addition, we set a view model field and also set the DataContext for the window.

Because of this additional code, this class is not a good candidate for a primary constructor. As far as I can tell, it is not possible to use a primary constructor here. I have tried several approaches, but I have not found a way to do it.

Consistency

I place consistency pretty high on my list of qualities I want in my code. I learned this from a team that I spent many years with. We had about a dozen developers who built and supported a hundred applications of various sizes. Because we had a very consistent approach to our code, it was easy to open up any project and get to work. The consistency between the applications made things familiar, and you could find the bits that were important to that specific application.

I still lean towards consistency because humans are really good at recognizing patterns and registering things as "familiar". I want to take advantage of that.

So, this particular application, my tendency would be to not use primary constructors. Constructor injection is used throughout the application, and I would like it to look similar between the classes. I emphasize "this particular application" because my view could be different if primary constructors could be used across the DI bits.

Incidental vs. Critical Parameters

When it comes to deciding whether to have a class-level field, I've also been thinking about the difference between incidental parameters and critical parameters.

Let me explain with an example. This is an ASP.NET Core MVC controller that has 2 constructor parameters:


  public class PeopleController : Controller
{
    private readonly ILogger<PeopleController> logger;
    private readonly IPersonReader reader;

    public PeopleController(ILogger<PeopleController> logger, 
        IPersonReader personReader)
    {
        this.logger = logger;
        reader = personReader;
    }

    public async Task<IActionResult> UseConstructorReader()
    {
        logger.Log(LogLevel.Information, "UseContructorReader action called");
        ViewData["Title"] = "Using Constructor Reader";
        ViewData["ReaderType"] = reader.GetTypeName();

        IEnumerable<Person> people = await reader.GetPeople();
        return View("Index", people.ToList());
    }
    // other code
}

The constructor has 2 parameters: an ILogger and an IPersonReader. In my mind, one of these is more important than the other.

IPersonReader is a critical parameter. This is the data source for the controller, and none of the controller actions will work without a valid IPersonReader.

ILogger is an incidental parameter. Yes, logging is important. But if the logger is not working, my controller could still operate as usual.

In this case, I might do something strange: use the ILogger parameter directly, and assign the IPersonReader to a field.

Here's what that code would look like:


  public class PeopleController(ILogger<PeopleController> logger,
    IPersonReader personReader) : Controller
{
    private readonly IPersonReader reader = personReader;

    public async Task<IActionResult> UseConstructorReader()
    {
        logger.Log(LogLevel.Information, "UseContructorReader action called");
        ViewData["Title"] = "Using Constructor Reader";
        ViewData["ReaderType"] = reader.GetTypeName();

        IEnumerable<Person> people = await reader.GetPeople();
        return View("Index", people.ToList());
    }
    // other code
}

I'm not sure how much I like this code. The logger parameter is used directly (as noted above, this is a captured value that can be used throughout the class). The personReader parameter is used for initialization (assigned to the reader field).

Note: Even though it looks like we are mixing initialization and capture, we are not. The logger is captured, and the personReader is used for initialization. We are okay here because we have made a decision (initialize or capture) for each parameter, so we will not get the compiler warning here.

To me (emphasizing personal choice here), this makes the IPersonReader more obvious -- it has a separate field right at the top of the class. The assignment is very intentional.

In contrast, the ILogger is just there. It is available for use, but it is not directly called out.

These are just thoughts at this point. It does conflict a bit with consistency, but everything we do involves trade-offs.

Thoughts on Primary Constructors and C#

I have spent a lot of time thinking about this and experimenting with different bits of code. But my thoughts are not fully formed yet. If you ask me about this in 6 months, I may have a completely different view.

I do not find typing to be onerous. When it comes to code reduction, I do not think about it from the initial creation standpoint; I think about it from the readability standpoint. How readable is this code? Is it simply unfamiliar right now? Will it get more comfortable with use?

These are all questions that we need to deal with when we write code.

Idiomatic C#?

In Marek's article, he worries that we are near the end of idiomatic C# -- meaning a standard / accepted / recommended way to write C# code (see the section "End of idiomatic C#?"). Personally, I think that we are already past that point. C# is a language of options that are different syntactically without being different substantially.

Expression-Bodied Members

For example, whether we use expression-bodied members or block-bodied members does not matter to the compiler. They mean the same thing. But they are visually different to the developer, and human minds tend to think of them as being different.

var

People get too attached to whether to use var or not. I have given my personal views in the past (Demystifying the "var" Keyword in C), but I don't worry whether anyone else adopts them. This has been treated as a syntax difference without a compiler difference. But that changed with the introduction of nullable reference types: var is now automatically nullable (C# "var" with a Reference Type is Always Nullable). There is still a question of whether developers need to care about this or not. I've only heard about one edge case so far.

Target-typed new

Another example is target-typed new expressions. Do you prefer "List<int>? ids = new()" or "var ids = new List<int>()"? The compiler does not care; the result is the same. But humans do care. (And your choice may be determined by how attached to "var" you are.)

No More Idiomatic C#

These are just 3 examples. Others include the various ways to declare properties, top level statements, and whether to use collection initialization syntax (and on that last one, Visual Studio has given me some really weird suggestions where "ToList()" is fine.)

In the end, we need to decide how we write C# moving forward. I have my preferences based on my history, experience, competence, approach to code, personal definition of readability, etc. And when we work with other developers, we don't just have to take our preferences into account, but also the preferences and abilities of the rest of the team. Ultimately, we each need to pick a path and move foreward.

Happy Coding!

Monday, April 22, 2024

Event Spotlight: Software Design & Development 2024 (May 2024)

Software Design & Development 2024

Dates: May 13-17, 2024

Location: Barbican Centre, London

Website: https://sddconf.com/

In just a few weeks, this will be my 6th time at SDD, and I'm looking forward to a great week. Register by Friday, April 26 to save a bit. https://sddconf.com/register

My Sessions

I will be presenting 5 breakout sessions, plus a full-day workshop. There's a theme to my talks this year: techniques of abstraction in C#, and how to (hopefully) get it right.

Abstract Art: Getting Abstraction "Just Right"

Abstraction is awesome, and abstraction is awful. Let's look at some techniques to get it right for each application.

IEnumerable, ISaveable, IDontGetIt: Understanding C# Interfaces

Interfaces (one type of abstraction) are a way to make our code more flexible, maintainable, and testable. We'll use lots of code to explore these benefits.

Practical Reflection in C#

Reflection has lots of uses (some safe and some not so safe). We'll look at the dangers, and also some specific use cases that allow us to extend applications without recompiling.

DI Why? Getting a Grip on Dependency Injection

Dependency Injection (and the associated patterns) let us write loosely coupled code. And this can make our applications easier to maintain, extend, and test. We'll see how this is important by looking at lots of code.

LINQ - It's not Just for Databases

Language Integrated Query (LINQ) is one of my favorite things in C#. Most people think it's mainly for databases, but I use it all the time to filter, sort, and group data in my applications without going back to the data source.

Take Your C# Skills to the Next Level -- Full-day Workshop

Let's make our software easy to extend, change, and debug. Interfaces, dependency injection, delegates, and unit testing can get us there. You'll leave with practical experience you can apply to your own projects.

Get more information on Software Design & Development 2024 here: https://sddconf.com/

And don't forget to register by Apr 26 for a discount: https://sddconf.com/register

Happy Coding!

Wednesday, February 28, 2024

Continue Processing with Parallel.ForEachAsync (even when exceptions are thrown)

Parallel.ForEachAsync is a very useful tool for running code in parallel. Recently, we have been exploring what happens when an exception is thrown inside the loop:

If we "await" ForEachAsync, then we get a single exception (even if exceptions are thrown in multiple iterations of the loop).
The loop short-circuits -- meaning not all items are processed.

In this series of articles, we look at these issues and how to deal with them.

Parallel.ForEachAsync and Exceptions
Getting Multiple Exceptions from Parallel.ForEachAsync
Continue Processing with Parallel.ForEachAsync (even when exceptions are thrown) this article

Code samples for all articles are available here: https://github.com/jeremybytes/foreachasync-exception.

In the last article, we saw how to deal with behavior #1 by getting all of the available exceptions. In this article, we will look at behavior #2 -- short-circuiting. We can eliminate the short-circuiting of the loop so that all of the items are processed, and we can collect the exceptions along the way.

Short Version:

Handle exceptions inside the body of ForEachAsync

For slides and code samples on Parallel.ForEachAsync (and other parallel approaches), you can take a look at the materials from my full-day workshop on asynchronous programming: https://github.com/jeremybytes/async-workshop-2022. (These materials use .NET 6.0. Updates for .NET 8.0 are coming in a few months.) For announcements on public workshops, check here: https://jeremybytes.blogspot.com/p/workshops.html.

Prevent Short-Circuiting

Parallel.ForEachAsync will stop processing if it encounters an unhandled exception. This is the behavior that we've seen in our other examples. The result is that we only process 17 of the 100 items in our loop.

Since an unhandled exception is the cause of the short-circuit, we can continue processing by eliminating that unhandled exception.

And an easy way to eliminate an unhandled exception is to handle it.

Try/Catch inside the ForEachAsync Body

Here is the updated code (from the "doesnt-stop/Program.cs" file):


      await Parallel.ForEachAsync(Enumerable.Range(1, 100),
        new ParallelOptions() { MaxDegreeOfParallelism = 10 },
        async (i, _) =>
        {
            try
            {
                Console.WriteLine($"Processing item: {i}");
                await Task.Delay(10); // simulate async task
                MightThrowException(i);
                Interlocked.Increment(ref TotalProcessed);
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Caught in Loop: {ex.Message}");
            }
        });

Here we have a try/catch block inside the body of ForEachAsync. If an exception is thrown, it is handled inside of the loop. From ForEachAsync's perspective, there are no unhandled exceptions, so it continues processing.

Output

In the output, all 100 items processed -- or are at least attempted to be processed. Here is the last part of the output:

Caught in Loop: Bad thing happened inside loop (87)

  Processing item: 92

  Processing item: 93

  Processing item: 94

  Processing item: 95

  Caught in Loop: Bad thing happened inside loop (84)

  Processing item: 96

  Caught in Loop: Bad thing happened inside loop (81)

  Processing item: 97

  Processing item: 98

  Processing item: 99

  Processing item: 100

  Caught in Loop: Bad thing happened inside loop (99)

  Caught in Loop: Bad thing happened inside loop (93)

  Caught in Loop: Bad thing happened inside loop (96)

  

  Total Processed: 67

  Total Exceptions: 33

  Done (Doesn't Stop for Exceptions)

All 100 items were processed. 67 were successful and 33 of them failed -- this is what we expect based on our method that throws exceptions.

Observations

With this approach, we do not have to deal with AggregateException. Instead, we handle the individual exceptions as they occur. This could include logging or retrying the operation.

Because we have a standard try/catch block, we get the full exception (including stack trace and other information). We can log this information if we need to investigate further.

Since the loop does not stop, we do not need to worry about where the loop left in a short-circuit. All of the items have a chance to be processed.

We do need to worry about concurrency. The body of the catch block could be running for multiple items at the same time. So we need to ensure that our logging methods and any other processing in the catch block is thread-safe.

Wrap Up

Ultimately, the approach we take depends on the specific needs of the process. But we do need to keep the default behavior of Parallel.ForEachAsync in mind:

If we "await" ForEachAsync, then we get a single exception (even if exceptions are thrown in multiple iterations of the loop).
The loop short-circuits -- meaning not all items are processed.

If getting an exception in the ForEachAsync loop is truly exceptional (meaning it really should never happen), then the default behavior may be okay. An exception is a catastrophic error that stops the process and lets us know that (at least) one item failed. In this case, default behavior may be fine (as we saw in the first article).

It may be that short-circuiting is okay (because we need to restart the loop from the beginning in case of failure), but we still want more information about what happened. We can get all of the available exceptions by either using a continuation or by using ConfigureAwaitOptions (as we saw in the last article).

If we want to continue processing even if some of the items fail, then we can take the approach from this article -- put a try/catch block inside the body of ForEachAsync (as we saw in this article).

Whatever approach we take, it is always good to know that there are options. Each application has its own needs. Our job as programmers is to pick an option that works well for the application. So keep looking at options; you never know when you'll need a particular one.

Happy Coding!

Tuesday, February 27, 2024

Getting Multiple Exceptions from Parallel.ForEachAsync

Parallel.ForEachAsync is a very useful tool for running code in parallel. Last time, we looked at what happens when an exception is thrown inside of the loop:

If we "await" ForEachAsync, then we get a single exception (even if exceptions are thrown in multiple iterations of the loop).
The loop short-circuits -- meaning not all items are processed.

Depending on what we are doing in the parallel loop, these items may not be a concern. But there are situations where I would like to get all of the exceptions back; and there are times when I would like to capture the exceptions and continue processing.

In this series of articles, we look at these issues and how to deal with them.

Parallel.ForEachAsync and Exceptions
Getting Multiple Exceptions from Parallel.ForEachAsync this article
Continue Processing with Parallel.ForEachAsync (even when exceptions are thrown)

Code samples for all articles are available here: https://github.com/jeremybytes/foreachasync-exception.

In this article, we will take on the first issue: how to get all available exceptions from the loop. The number of exceptions that we get depends on the second issue. Since the loop short-circuits, we end up with some exceptions, but not as many as if the loop were to complete normally. (The third article in the series shows how we can keep the loop from short-circuiting.)

2 Approaches

We will look at 2 approaches to getting the available exceptions.

Using a Continuation
With this approach, we look at the exceptions in a continuation instead of letting them bubble up through the ForEachAsync method/Task.
Using ConfigureAwaitOptions.SuppressThrowing
This uses a feature added in .NET 8: ConfigureAwaitOptions. With this approach, we suppress the exceptions in the loop and then use a "Wait" to show the aggregate exception. I came across this approach in Gérald Barré's article "Getting all exceptions thrown from Parallel.ForEachAsync".

The approaches both give access to the entire set of exceptions. The differences are subtle, and we'll look at those after we've seen both approaches.

A Reminder

As a reminder, here is how we left our code in the previous article (from the original-code/Program.cs file):


      try
    {
        await Parallel.ForEachAsync(
            Enumerable.Range(1, 100),
            new ParallelOptions() { MaxDegreeOfParallelism = 10 },
            async (i, _) =>
            {
                Console.WriteLine($"Processing item: {i}");
                await Task.Delay(10); // simulate async task
                MightThrowException(i);
                Interlocked.Increment(ref TotalProcessed);
            });
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception: {ex.Message}");
    }

Every 3rd iteration of the loop throws an exception. This causes the ForEachAsync loop to short-circuit (after 17 items, generally). Because of the "await" on ForEachAsync, only 1 of the exceptions is shown (out of 5, generally).

Using A Continuation

For the first approach, we will use a continuation. Instead of letting the exceptions bubble up through the ForEachAsync method call, we add a continuation that runs some code if there is an exception. From there, we can gracefully handle the exception(s).

Here is the code for that (from the "continuation/Program.cs" file):


      try
    {
        await Parallel.ForEachAsync(Enumerable.Range(1, 100),
            new ParallelOptions() { MaxDegreeOfParallelism = 10 },
            async (i, _) =>
            {
                Console.WriteLine($"Processing item: {i}");
                await Task.Delay(10); // simulate async task
                MightThrowException(i);
                Interlocked.Increment(ref TotalProcessed);
            })
            .ContinueWith(task =>
            {
                if (task.IsFaulted)
                    Console.WriteLine($"Exception: {task.Exception!.Message}");
            });
    }

After calling "Parallel.ForEachAsync", we add a continuation by calling ".ContinueWith". This specifies code that we want to run after a Task has completed.

The parameter of "ContinueWith" is a delegate that has the code we want to run when the Task is complete. The "task" parameter represents the ForEachAsync task itself.

Because we have access to the this task, we can check the "IsFaulted" property. A task is faulted if an exception is thrown in the task (which is exactly what we are expecting here). If an exception is thrown, then we output it to the console.

Output

When we run the code, we get the following output:

Processing item: 9

  Processing item: 4

  Processing item: 1

  Processing item: 6

  Processing item: 5

  Processing item: 8

  Processing item: 3

  Processing item: 10

  Processing item: 2

  Processing item: 7

  Processing item: 11

  Processing item: 16

  Processing item: 12

  Processing item: 13

  Processing item: 14

  Processing item: 15

  Processing item: 17

  Exception: One or more errors occurred. (Bad thing happened inside loop (6))
  (Bad thing happened inside loop (3)) (Bad thing happened inside loop (9)) (Bad
  thing happened inside loop (15)) (Bad thing happened inside loop (12))

  

  Total Processed: 12

  Total Exceptions: 5

  Done (AggregateException from Continuation)

This shows us the 17 items that were processed, and the Exception message "One or more errors occurred" along with additional information. This is a typical message from an AggregateException.

For more specifics on AggregateException, you can take a look at "Task and Await: Basic Exception Handling" and "'await Task.WhenAll' Shows One Exception - Here's How to See Them All".

An AggregateException contains all of the exceptions that were thrown in a Task. It has a tree structure of inner exceptions that let it handle various complexities of Task (including concurrent tasks and child tasks).

The Original Catch Block Does Nothing

The call to Parallel.ForEachAsync is inside a "try" block (just like in our original code). There is still a "catch" block, but it is not used in this scenario.

Here's the updated "catch" block:


      catch (Exception)
    {
        // You never get to this code. Exceptions are handled
        // inside the ForEachAsync loop.

        // But just in case, rethrow the exception
        Console.WriteLine("You shouldn't get here");
        throw;
    }

As the comment says, we should never get to this catch block. But just in case we do, we rethrow the exception. In this case, it would result in an unhandled exception and the application would crash.

But why won't we get to this "catch" block?
Our code has changed in a subtle way. In the original code, we used "await" on the ForEachAsync method/Task. When we "await" a faulted Task, the AggregateException is not thrown; instead, one of the inner exceptions is thrown. (This is what we saw in the previous article.)

The difference here is that we no longer "await" the ForEachAsync method. Instead, we "await" the continuation task returned from "ContinueWith". The code in our continuation is not likely to throw an exception, so we will not hit the "catch" block.

Displaying the Inner Exceptions

So we now have access to the AggregateException. Let's display the inner exceptions. Here is a method to do that (from the "continuation/Program.cs" file):


      private static void ShowAggregateException(AggregateException ex)
    {
        StringBuilder builder = new();

        var innerExceptions = ex.Flatten().InnerExceptions;
        builder.AppendLine("======================");
        builder.AppendLine($"Aggregate Exception: Count {innerExceptions.Count}");

        foreach (var inner in innerExceptions)
            builder.AppendLine($"Continuation Exception: {inner!.Message}");
        builder.Append("======================");

        Console.WriteLine(builder.ToString());
    }

This flattens the inner exceptions (to get rid of the tree structure), loops through each exception, and builds a string to display on the console.

We just need to adjust our continuation to call this new method:


      .ContinueWith(task =>
    {
        if (task.IsFaulted)
            ShowAggregateException(task.Exception);
    });

Now we get the following output:


  Processing item: 9
Processing item: 2
Processing item: 6
Processing item: 1
Processing item: 3
Processing item: 8
Processing item: 7
Processing item: 4
Processing item: 10
Processing item: 5
Processing item: 11
Processing item: 12
Processing item: 16
Processing item: 14
Processing item: 15
Processing item: 13
Processing item: 17
======================
Aggregate Exception: Count 5
Continuation Exception: Bad thing happened inside loop (6)
Continuation Exception: Bad thing happened inside loop (3)
Continuation Exception: Bad thing happened inside loop (9)
Continuation Exception: Bad thing happened inside loop (12)
Continuation Exception: Bad thing happened inside loop (15)
======================

Total Processed: 12
Total Exceptions: 5
Done (AggregateException from Continuation)

Now we can see all 5 of the exceptions that were thrown in our loop. In this case, we are simply putting the exception message on the console. But we do have access to each of the full exceptions, including the stack traces. So we have the access to the same information as if we were to "catch" them.

This shows all of the exceptions that were thrown by using a continuation. Now let's look at another way to get the exceptions.

Using ConfigureAwaitOptions.SuppressThrowing

I came across this approach in Gérald Barré's article "Getting all exceptions thrown from Parallel.ForEachAsync". You can read his original article for more details.

This code is in the "configure-await-options" project in the code repository.

An Extension Method

This approach uses an extension method to suppress throwing exceptions on the awaited task (to keep the AggregateException from getting unwrapped). Then it waits on the Task to get the AggregateException directly.

Note: This uses a feature added in .NET 8: ConfigureAwaitOptions. So, this approach will not work for earlier versions of .NET.

Here is the extension method (from the "configure-await-options/Program.cs" file):


  public static class Aggregate
{
    internal static async Task WithAggregateException(this Task task)
    {
        await task.ConfigureAwait(ConfigureAwaitOptions.SuppressThrowing);
        task.Wait();
    }
}

This extension method takes a Task as a parameter and returns a Task. So we can think of this as a decorator of sorts -- it modifies the functionality of a Task in a fairly transparent way to the caller.

At the heart of this is the "ConfigureAwait" call. "ConfigureAwait" got a new parameter in .NET 8: ConfigureAwaitOptions. If you want the details, I would recommend reading Stephen Cleary's article: ConfigureAwait in .NET 8.

In this code, "SuppressThrowing" will keep the Task from throwing an exception when it is awaited. If we "await" a task that has this option set, an exception will not be thrown at the "await".

The next line of the method calls "Wait" on the task. Generally, we want to avoid "Wait" because it is a blocking operation. But since we call "Wait" on a task that has already been awaited, we do not need to worry about this. The task is already complete.

But when we call "Wait" on a faulted task, the exception is thrown. And in this case, it is the full AggregateException (rather than the unwrapped inner exception that we normally get with "await").

Using the Extension Method

To use the extension method, we tack it onto the end of the Parallel.ForEachAsync call (also in the "configure-await-options/Program.cs" file):


      try
    {
        await Parallel.ForEachAsync(Enumerable.Range(1, 100),
            new ParallelOptions() { MaxDegreeOfParallelism = 10 },
            async (i, _) =>
            {
                Console.WriteLine($"Processing item: {i}");
                await Task.Delay(10); // simulate async task
                MightThrowException(i);
                Interlocked.Increment(ref TotalProcessed);
            }).WithAggregateException();
    }
    catch (AggregateException ex)
    {
        ShowAggregateException(ex);
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception: {ex.Message}");
    }

Notice that "WithAggregateException" is called on the "ForEachAsync" task. This means that the Task that is awaited at the top is the Task that has been modified by the extension method.

So instead of throwing an individual exception, this throws an AggregateException.

You can see that we have an additional "catch" block for AggregateException, and this uses the same "ShowAggregateException" method from the other example.

Output

The output is similar to the other example:


  Processing item: 10
Processing item: 7
Processing item: 9
Processing item: 2
Processing item: 4
Processing item: 6
Processing item: 8
Processing item: 5
Processing item: 1
Processing item: 3
Processing item: 11
Processing item: 15
Processing item: 12
Processing item: 13
Processing item: 14
Processing item: 16
Processing item: 17
======================
Aggregate Exception: Count 5
Continuation Exception: Bad thing happened inside loop (6)
Continuation Exception: Bad thing happened inside loop (9)
Continuation Exception: Bad thing happened inside loop (3)
Continuation Exception: Bad thing happened inside loop (12)
Continuation Exception: Bad thing happened inside loop (15)
======================

Total Processed: 12
Total Exceptions: 5
Done (AggregateException from ConfigureAwaitOptions)

Now we can see all 5 of the exceptions that were thrown in our loop. As with the first example, we have access to each of the full exceptions, including the stack traces.

Differences in the Approaches

These approaches both accomplish the same thing. They give us access to all of the exceptions that were thrown inside the ForEachAsync loop. Here are some things to keep in mind.

Availability

Using a Continuation is available back to .NET 6 (technically, it goes back further, but ForEachAsync only goes back to .NET 6, so that's really what we care about).

ConfigureAwaitOptions is new in .NET 8. So this will work great going forward, but does not work with prior versions of .NET.

Which approach you prefer depends on what version of .NET you use. If your code is .NET 8, then either approach will work.

Throwing Exceptions

Using a Continuation does not throw the AggregateException. Instead, the AggregateException is cracked open and examined in the continuation. This is also why we do not hit the "catch" block that we have in the code.
Note: With this approach, the outer AggregateException is never thrown, so we do not get the stack trace and some other elements that are filled in when an exception is thrown. All of the inner exceptions (the ones we care about here) *have* been thrown, so they do have the stack trace and other information.

ConfigureAwaitOptions does throw the AggregateException. When "task.Wait()" is called, the AggregateException is thrown. This is caught in the appropriate "catch" block where we can examine it.

Which approach you prefer will depend on your philosophy when it comes to exceptions and exception handling.

Throwing an exception is fairly expensive computationally. So some developers try to avoid throwing exceptions if processing can be handled another way.

On the other hand, some developers prefer to throw exceptions so that they can follow a consistent try/catch pattern throughout their code. This gives consistency and takes advantage of the chain-of-responsibility set up by the exception handling mechanism.

A Missing Feature

Both of these approaches give us access to all of the exceptions that were thrown. However, they do not address the short-circuiting problem. Even though our loop is set up to process 100 items, it stops processing after 17. As noted in the previous article, this is due to the way that Parallel.ForEachAsync is designed.

This missing feature may or may not be an issue, depending on what your code is trying to do.

Bottom Line

Both of these approaches accomplish our mission of getting all of the exceptions thrown in Parallel.ForEachAsync.

Wrap Up

Once we start diving into exception handling, we find that there are different options and different approaches. This includes addressing the default behavior of Parallel.ForEachAsync:

If we "await" ForEachAsync, then we get a single exception (even if exceptions are thrown in multiple iterations of the loop).
The loop short-circuits -- meaning not all items are processed.

With both of the approaches shown in this article, we have solved for behavior #1: we get access to the AggregateException of the ForEachAsync task. With this information, we can find all of the errors that occurred in the loop.

If this is all that we are concerned about, then we are good to go. But if we are concerned about item #2 (the loop short-circuiting), we need to look at another approach. And that is what we will do in the next article.

Happy Coding!

Monday, February 26, 2024

Parallel.ForEachAsync and Exceptions

Parallel.ForEachAsync is a very useful tool for running code in parallel. But what happens when an exception is thrown inside the loop?

By default, the following things happen:

If we "await" ForEachAsync, then we get a single exception (even if exceptions are thrown in multiple iterations of the loop).
The loop short-circuits -- meaning not all items are processed.

Over the next several articles, we will look at these issues and how to deal with them.

Parallel.ForEachAsync and Exceptions this article
Getting Multiple Exceptions from Parallel.ForEachAsync
Continue Processing with Parallel.ForEachAsync (even when exceptions are thrown)

Code samples for all articles are available here: https://github.com/jeremybytes/foreachasync-exception.

This article shows the basics of using Parallel.ForEachAsync and what happens when exceptions are thrown.

Parallel.ForEachAsync Basics

The code for today's article is in the "original-code" folder of the GitHub repository. The early code is work-in progress; only the repository only contains the finished code at the end of the article.

We'll start by looking at the basics of Parallel.ForEachAsync. Here's a bit of code that sets up a non-parallel loop:


      static async Task Main(string[] args)
    {
        Console.Clear();

        foreach (int i in Enumerable.Range(1, 100))
        {
            Console.WriteLine($"Processing item: {i}");
            await Task.Delay(10); // simulate async task
        }

        Console.WriteLine("Done (Original Code)");
    }

This code uses a regular foreach loop to iterate from 1 to 100. Inside the loop, we output to the console and simulate some async work with "Task.Delay(10)". This will delay processing for 10 milliseconds. Since this code is running sequentially, it will take about 1 second for the entire loop to complete.

Here is what the output look like:

An animation of console output showing "Processing Item: 1" "Processing Item: 2" all the way to "Processing Item: 100". It takes about 1 second to complete the list. At the end "Done (Original Code)" is output.

Using Parallel.ForEachAsync

The next step is to change this to a parallel loop:


      static async Task Main(string[] args)
    {
        Console.Clear();

        await Parallel.ForEachAsync(
            Enumerable.Range(1, 100),
            async (i, _) =>
            {
                Console.WriteLine($"Processing item: {i}");
                await Task.Delay(10); // simulate async task
            });

        Console.WriteLine("Done (Original Code)");
    }

Here are a couple of notes on how this code works:

First, notice that we "await" the Parallel.ForEachAsync method. The loop runs asynchronously, so if we do not "await" here, then the Main method would keep going. Because of the "await", the last line (writing "Done" to the console) will not run until after all iterations of the loop are complete.

Next, let's look at the parameters for "ForEachAsync".

The first parameter (Enumerable.Range(1, 100)) is the IEnumerable to iterate through. This is the same as the "in" part of the non-parallel foreach loop.

The second parameter is a delegate that has the work we want to run in parallel.

Delegate Parameter
This delegate has 2 parameters (which we have as (i, _) here). The "i" parameter is the item in the current iteration of the loop. This is equivalent to the "i" in the foreach loop. We can use "i" inside the delegate body just like we can use "i" inside the body of the foreach loop.

The second parameter of the delegate is a CancellationToken. Since we are not dealing with cancellation here, we use a discard "_" to represent this parameter.

The body of the delegate has the actual work. This is the same as the contents of the foreach loop above. We output a line to the console and then simulate some work with await Task.Delay(10).

Because we have "await" in the body of the delegate, the delegate itself is also marked with the "async" modifier (before the parameters).

Output
Because our code is now running in parallel, it completes much faster. Here is what the output looks like (it is too fast to see well):

The speed will depend on how many virtual cores are available to do the processing. Parallel.ForEachAsync normally figures out how many resources to use on its own. We'll add some hints to it later on so we can get more consistent results.

One thing to note about the output is that "100" prints out before "98". This is one of the joys of parallel programming -- order is non-deterministic.

Now let's move on to see what happens when one or more of these items throws an exception.

Throwing an Exception

Here's a method that sometimes throws an exception:


      private static void MightThrowException(int item)
    {
        if (item % 3 == 0)
        {
            Interlocked.Increment(ref TotalExceptions);
            throw new Exception($"Bad thing happened inside loop ({item})");
        }
    }

This will throw an exception for every 3rd item. (We could hook this up to a random number generator, but this gives us some predictability while we look at results.)

Interlocked.Increment
There may be a line of code that does not look familiar here:

Interlocked.Increment(ref TotalExceptions);

In this case "TotalExceptions" is a static integer field at the top of our class. This lets us keep track of how many exceptions are thrown.

"Interlocked.Increment" is a thread-safe way to increment a shared integer. Using the "++" operator is not thread-safe, and may result in incorrect values.

Exceptions in the ForEachAsync Loop
Now we'll update the code to call "MightThrowException" inside our loop. Since we do not want an unhandled exception here, we will wrap the whole thing in a try/catch block:


      static async Task Main(string[] args)
    {
        Console.Clear();
        try
        {
            await Parallel.ForEachAsync(
                Enumerable.Range(1, 100),
                async (i, _) =>
                {
                    Console.WriteLine($"Processing item: {i}");
                    await Task.Delay(10); // simulate async task
                    MightThrowException(i);
                    Interlocked.Increment(ref TotalProcessed);
                });
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Exception: {ex.Message}");
        }

        Console.WriteLine($"\nTotal Processed: {TotalProcessed}");
        Console.WriteLine($"Total Exceptions: {TotalExceptions}");
        Console.WriteLine("Done (Original Code)");
    }

We've changed quite a few things.

First, we have wrapped the entire "ForEachAsync" call in a try/catch block. This is to make sure we do not have an unhandled exception.

Next, we have added the "MightThrowException" call inside of our loop. This will throw an exception for every 3rd item.

Next, we added "Interlocked.Increment(ref TotalProcessed);". This is after the point an exception might be thrown. So if an exception is not thrown, we increment a "TotalProcessed" field (similar to the "TotalExceptions" field). This will give us a count of the items that were processed successfully.

In the "catch" block, we output the exception message.

Finally, we have console output for the total number of items processed successfully and the total number of exceptions.

Output
Here is the output for this code (note: this output is not animated):


  Processing item: 15
Processing item: 16
Processing item: 17
Processing item: 26
Processing item: 18
Processing item: 19
Processing item: 20
Processing item: 21
Processing item: 22
Processing item: 23
Processing item: 24
Processing item: 25
Processing item: 27
Exception: Bad thing happened inside loop (6)

Total Processed: 18
Total Exceptions: 9
Done (Original Code)

This is just the last part of the output, but it tells us enough of about what is happening.

The Issues

The output shows us the 2 issues from the start of this article. These may or may not concern us, depending on what we are doing. But we do need to be aware of them.

Hidden Exceptions

1. If we "await" ForEachAsync, then we get a single exception (even if exceptions are thrown in multiple iterations of the loop).

When an exception is thrown in a Task, it gets wrapped in an AggregateException. This is because Tasks can be complex (with concurrent and child tasks). An AggregateException wraps up all of the exceptions that happen into a single exception.

But when we "await" a Task, the AggregateException gets unwrapped for us. This can be good because we now have a "real" exception and do not have to deal with an AggregateException. But it can be bad because it hides the number of exceptions that actually occur.

Since we "await" the ForEachAsync method, we only see one exception: "Exception: Bad thing happened inside loop (6)". So this is only showing the exception for item #6.

But we can see in the "Total Exceptions" that 9 exceptions were thrown. The other exceptions are hidden from us here.

Short-Circuited Loop

2. The loop short-circuits -- meaning not all items are processed.

The other thing to notice about the output is that the loop stops processing part way through. Only 27 of the 100 iterations of the loop ran. This is the nature of the ForEachAsync method. If a task throws an exception, the loop stops processing.

Depending on our scenario, we may want the loop to continue even if one of the iterations throws an exception.

We deal with both of these items in the next 2 articles.

A Little Consistency

Before leaving this code, let's add a little bit of consistency.

One of the problems with parallel code is that the decision of how many items to run at the same time is left up to the parallel infrastructure. If we have a lot of resources available, then there will be more items run in parallel.

But this also means that output will vary depending on what machine we are running on (and how many resources that machine has at the time). In this code, my desktop and laptop produce different results. The desktop generally stops after 27 items, the laptop will stop after 17 (sometimes fewer, depending on what else is going on).

Parallel.ForEachAsync has an optional parameter where we can set the maximum degrees of parallelism. This will limit the number of items run concurrently, and if we set this to a value lower than our machine resources, will also add some consistency to the output.

Here is our loop with the additional parameter. (This is the final state of our "original-code" project and can be found in the original-code/Program.cs file.)


      await Parallel.ForEachAsync(
        Enumerable.Range(1, 100),
        new ParallelOptions() { MaxDegreeOfParallelism = 10 },
        async (i, _) =>
        {
            Console.WriteLine($"Processing item: {i}");
            await Task.Delay(10); // simulate async task
            MightThrowException(i);
            Interlocked.Increment(ref TotalProcessed);
        });

This second parameter is a ParallelOptions object that sets the MaxDegreesOfParallelism property to 10. This means that a maximum of 10 items run concurrently. (It may be fewer items if there are not enough resources available.)

This gives me a consistency between my mid-range machines.


  Processing item: 6
Processing item: 9
Processing item: 5
Processing item: 1
Processing item: 3
Processing item: 8
Processing item: 11
Processing item: 16
Processing item: 12
Processing item: 13
Processing item: 14
Processing item: 15
Processing item: 17
Exception: Bad thing happened inside loop (9)

Total Processed: 12
Total Exceptions: 5
Done (Original Code)

Now I get a fairly consistent 17 items processed. I want the consistency here so that we can more readily compare results when we look at different ways of handling issues.

Wrap Up

So to recap, here are 2 things that we need to be aware of when we use Parallel.ForEachAsync:

If we "await" ForEachAsync, then we get a single exception (even if exceptions are thrown in multiple iterations of the loop).
The loop short-circuits -- meaning not all items are processed.

This may be fine for the work that we are doing, but we may want to go beyond that. We will tackle the first item in the next article. We'll look at 2 different ways to get all of the exceptions from ForEachAsync, and why we might choose one rather than the other.

In the 3rd article, we will tackle the issue of short-circuiting. So be sure to check back.

Happy Coding!

Friday, February 23, 2024

Have you lived more than 5 days in the last 4 months?

Last night (more correctly, very early this morning), I learned something new about myself:

I haven't been living every day.

I read 1970s science fiction. I have over 600 physical books in my collection, and I've read about half so far. My collection started about 7 years ago (I'm still not sure why I ended up with this category). Here's a recent picture of my collection:

2 bookshelves full of mostly paperback books.

Some of it is good; some is bad; some is amazing. They are rarely life-changing, but I was floored by a very timely passage from an otherwise mediocre book: The Earth Tripper by Leo P. Kelley.

The Earth Tripper

"When I was a child, I never lived in the present. I was always waiting for--oh, for Saturday to come, because on Saturday I had been promised a new bicycle. So Tuesday and all the other days didn't matter at all while I waited for Saturday to come. It came. So did the new bicycle and I was happy. But then it was Sunday all of a sudden and the bicycle wasn't quite so new anymore. I heard that the circus was coming to our town. I watched the men put up the posters all over town, and I got my sister--she was three years older than I was--to read the date to me--you know, the date the circus would arrive. I went home and made a red circle around the day on our kitchen calendar. I waited and waited, and at last the circus came. For a time, the lions and tigers, the bareback riders, and all the handsome aerialists filled my world. But next day, they were gone.

"One day I took the calendar down from the kitchen wall and I counted all the days that I had circled in red. There were five of them from January to May. Then I counted all the ordinary days and I realized that I had been cheating myself. My mother made me stay in my room alone that night while the family ate dinner without me. She couldn't understand why I had torn the calendar to shreds."

"Didn't you explain to her what you had discovered?"

"I didn't know how to explain. I was ashamed. How could I tell anyone--how could I dare tell anyone that I had really lived only five days in four months?"

[pp 139-140, The Earth Tripper, Leo P. Kelley, Coronet Books, 1974, Emphasis mine]

This was particularly impactful to me because I wasn't expecting it. It was an otherwise mediocre book. I only had about 20 pages left. And I was reading it at 1:30 in the morning (because I couldn't sleep). I was just trying to finish up this book.

But when I got to this passage, I realized that I have been living for the red circles.

And how could I dare tell anyone that I had really lived only five days in four months?

Red Circles

For me, the red circle days are the days when I get to help other developers, when I get to make a difficult topic understandable, when I get to show someone something they didn't know. Historically I have done this is a number of ways: blog articles, videos, online courses, speaking engagements, workshops, and one-on-one mentoring.

But I have been limiting my red circle days even further by only including the days I help someone actively rather than passively.

Active vs. Passive

Active help is when I get feedback while I am doing it. An example of this is giving a conference talk. While giving the talk, I can watch the lightbulbs go on -- I can see when someone is really understanding the topic and is excited to make use of what they have learned. It is also the conversations that happen after the talk: helping folks with their specific questions, clarifying a point, or going deeper into a specific area that the talk does not allow for.

These are the times that I feel most useful. And these are the times that I know that my particular skillset allows me to make complex topics more approachable. And these are the times when I know that I am exactly where I need to be.

I know that these active events do not reach everyone in the room. About 10% get enough out of it to leave a glowing rating or a comment. About 5% find it so not useful to them that they leave a comment. And I assume everyone else gets something out of it (probably not life-changing, but maybe useful some day) -- at minimum, they didn't hate it. But I am able to help some people (and see it), and that is enough for me to keep going.

Passive help is when I do not get to see the impact of my work. For example, on this blog. I have written over 500 articles (over 15 years). Some of these are more useful than others. I don't get a huge amount of traffic, but I do get about 20,000 - 30,000 views per month. If I make an assumption that 1% of those views are actually useful to someone, that means that I help 6 - 10 people a day.

But I don't usually think about these 6 - 10 people because I don't get to actually see them.

So my red circles show up on the active help days but not the passive help days.

Too Few Circles

I'm not sure when I started relying on red circle days. For most of my speaking career (career isn't the right word, but I'll use it here), I have averaged about 1 event every 3 weeks. And during my peak year, it was a lot more frequent than that.

But things change. When I moved out of Southern California, I lost access to the local user groups, community events, and local-ish events (within a half-day drive). With COVID, most events did not happen (some went online). And in the past couple of years, several of my favorite events have gone away.

This year, I am confirmed for 2 events so far. I have another 3 or 4 potentials. The hardest part is that there were several events that I was relying on that did not select me. I have been having quite a bit of trouble with those (for a variety of reasons).

The circles are very far apart. Right now, I am in a spot with 6 months between circles.

Finding Life on the In-Between Days

Recently, I have really been feeling a big gap in my life -- like I am merely passing time until the next big thing.

The passage from the book made me realize that I have been living for the red circle days. And I have been getting away with it for a long time because the circles were fairly close together. Somewhere along the way, I forgot about all of the days in between.

The point is not to figure out how to get more red circles. The point is to figure out how to find life on all of the other days.

Honestly, I'm not quite sure how I am going to do this yet. But that's okay.

Rough Days

The last couple months have been kind of rough for me -- and not because of the red circle days. I have discovered some things about myself that have impacted how I look at the world, how I see myself, and how I interact with other people. These discoveries are difficult to work through but are ultimately good.

I have been able to put names and ideas on things that I have recognized in myself and my life. It has been hard because a lot of things I thought I knew about myself had to be reframed (and a bit of that reframing is still happening).

And now I am aware of the impact of red circle days. I knew that there was something wrong, but I didn't know what. Now that I have identified an issue, I can go about changing things.

Are you living just for red circle days? If so, I challenge you to find the life on the days in between.

I am going to find that life, and I am going to start living it.

Happy Living!