Sunday, January 27, 2013

BackgroundWorker Component Compared to .NET Tasks

There are a lot of questions as to whether the BackgroundWorker component should be used for new applications.  If you've read my previous articles (such as this one), you know my opinion on the subject.  To summarize, the BackgroundWorker is an easy-to-use component that works very well for putting a single method into the background while keeping the UI responsive.

In .NET 4.0, we got Tasks.  Tasks are much more powerful than the BackgroundWorker (orders of magnitude more powerful) and have much more flexibility as well.  In .NET 4.5, we got a way to report progress in Tasks (through the IProgress interface).

A Comparison
Personally, I've been working with Tasks a bit more in my code.  Based on this, I thought I would revisit my BackgroundWorker sample and rewrite it (with the same functionality) using Task instead.  The result is almost the same amount of code (just a few lines different).  I purposefully wrote comparable code so that we could compare the two approaches.

The Initial Code
Our baseline application will be the BackgroundWorker w/ MVVM project that was reviewed a while back.  This application was chosen as a baseline because all of the BackgroundWorker functionality is confined to the ViewModel (in ProcessViewModel.cs).  This makes it easier to swap out the functionality using Tasks.  I'm only going to cover the differences in the ViewModel here; if you want a better idea of the entire project, see the article mentioned above.  The source code for both projects can be downloaded in a single solution here: http://www.jeremybytes.com/Demos.aspx#KYUIR.

I did make a few updates to the UI.  This is primarily because my development machine is using Windows 8, and the previous color scheme didn't look quite right.  Here is the application in action:


As a reminder, we want to support cancellation and progress (progress includes both a percentage for the progress bar and a message for the output textbox).

So, let's start comparing the code!

Fields and Properties
The Fields of the ViewModel are almost the same.  First, the BackgroundWorker fields:

Now, the Task fields:


As we can see, the difference is that the BackgroundWorker field (_worker) has been removed and a CancellationTokenSource (_cancelTokenSource) has been added.  We need the CancellationTokenSource in order to support cancellation of our Task, and we'll talk about it in more detail in the Cancellation section below.

The Properties are the same in both projects.  Our properties are used for data binding in the UI (the goo that makes the MVVM pattern work).  Since our UI hasn't changed, the public interface of the ViewModel does not need to change either.

The Constructor
Next, we'll take a look at the constructors.  In our BackgroundWorker project, the constructor sets up our component:


But in the Task project, we don't need this initialization.  We've actually moved some of this functionality down a bit.  But we're left with an empty constructor:


Starting the Process
Now that we've got our basic pieces in place, let's take a look at starting our long-running process.  The StartProcess method is fired when clicking the Start button in the UI.  This kicks things off.

Here is the StartProcess method from the BackgroundWorker project:


And here is the StartProcess from the Task project:


There are several similarities between these.  First, we see that the Output property is cleared out.  Then we see that the StartEnabled and CancelEnabled properties are set -- these properties are data-bound to the IsEnabled properties of the buttons.  We also instantiate the ProcessModel object (_model) if it has not already been created.  As a reminder, the ProcessModel object contains our long-running process and is the Model part of our Model-View-ViewModel.

Now the differences.  First, notice that we need to instantiate a CancellationTokenSource.  This is the private field that we noted above, and we'll be using this to cancel the process.  We'll talk about cancellation a bit more below.

Next, rather than starting the BackgroundWorker (with RunWorkerAsync), we call a method called DoWorkAsync.  This method returns a task.  We can see that it takes our _model as a parameter (just like the BackgroundWorker), but it also takes a CancellationToken as well as a Progress object.  The Progress object simply implements the IProgress<T> interface, and we'll talk a bit more about this later.

The last piece of our StartProcess method is to add a ContinueWith to our task.  This is the equivalent of hooking up an event to the BackgroundWorker's RunWorkerCompleted event.  When the task has completed, the TaskComplete method will run (and we'll see this below as well).
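
As a rough illustration of the ContinueWith idea (not the project's actual StartProcess method), here is a minimal sketch.  The names ContinuationDemo, Start, and Describe are hypothetical, and DoWorkAsync is reduced to a stand-in that simply returns 42.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    // Hypothetical stand-in for the long-running work: returns a Task<int>.
    public static Task<int> DoWorkAsync()
    {
        return Task.Factory.StartNew(() => 42);
    }

    // Plays the role of TaskComplete: receives the finished task.
    public static string Describe(Task<int> task)
    {
        return "Result: " + task.Result;
    }

    // Starts the work and attaches the continuation, much like hooking
    // up RunWorkerCompleted on a BackgroundWorker.
    public static Task<string> Start()
    {
        Task<int> work = DoWorkAsync();
        return work.ContinueWith(t => Describe(t));
    }
}
```

In a UI application, we would typically also pass TaskScheduler.FromCurrentSynchronizationContext() to ContinueWith so that the continuation runs on the UI thread; that is left out here so the sketch can run in a console application.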

Cancellation
To actually cancel the process, we use the CancelProcess method.  This does not immediately stop any running processes; it simply sets a flag to indicate that things should be cancelled (if possible).

For the BackgroundWorker, we just call the CancelAsync method on the component itself:


For Tasks, we call Cancel on the CancellationTokenSource:


So, let's talk a bit about cancellation.  For the BackgroundWorker, cancellation is baked into the component, and we just need to call the CancelAsync method.  For Tasks, we need to create a CancellationToken and pass it to the task itself (we'll see how it's used in just a bit).  One of the interesting things about a CancellationToken is that a token we create directly (with "new") can never be switched to a cancelled state later on.  This is why we have an internal field that is a CancellationTokenSource.  The CancellationTokenSource manages a CancellationToken.  We can use the Token property to get this token (which is exactly what we do in our call to DoWorkAsync above).  We can also set that token to a cancelled state by calling the Cancel method on the CancellationTokenSource.  Notice that we are not setting any properties on the token itself; in fact, we're not allowed to directly update the state of the token.
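
The relationship between the source and its token can be sketched in a few lines.  This is only an illustration; the class and method names (CancellationTokenDemo, CancelViaSource) are hypothetical and are not part of the sample project.

```csharp
using System;
using System.Threading;

public static class CancellationTokenDemo
{
    // Returns the token's state before and after Cancel is called on its source.
    public static Tuple<bool, bool> CancelViaSource()
    {
        using (var source = new CancellationTokenSource())
        {
            // The token is a read-only view that we hand to the task.
            CancellationToken token = source.Token;
            bool before = token.IsCancellationRequested;

            // Only the source can change the state; the token has no setter.
            source.Cancel();

            bool after = token.IsCancellationRequested;
            return Tuple.Create(before, after);
        }
    }
}
```

Calling CancelViaSource gives us false for the first value and true for the second: the token we grabbed before the call to Cancel reflects the new state, even though we never touched the token itself.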

The Long-Running Process
Now we'll take a look at our long-running process.  This happens in the DoWork event of the BackgroundWorker:


As a quick reminder, this method loops through the Model (which implements IEnumerable).  First, it handles whether the process should be cancelled.  Then it calculates the progress percentage and returns a string for progress display.  Finally, if it gets to the end of the loop without cancellation, it returns the current iteration value.
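
To make those steps concrete, here is a hedged sketch of what a DoWork handler of this shape might look like.  It is not the project's actual code: the model is assumed to be a simple list of integers passed in through e.Argument, and the progress calculation and message format are made up for illustration.

```csharp
using System.Collections.Generic;
using System.ComponentModel;

public static class WorkerHandlers
{
    // Assumed shape of a DoWork handler; e.Argument carries the model.
    public static void DoWork(object sender, DoWorkEventArgs e)
    {
        var worker = (BackgroundWorker)sender;
        var model = (IList<int>)e.Argument;
        int iteration = 0;

        foreach (int current in model)
        {
            // Step 1: check the cancellation flag raised by CancelAsync.
            if (worker.CancellationPending)
            {
                e.Cancel = true;
                return;
            }

            // Step 2: calculate the percentage and report it with a message.
            iteration = current;
            int percent = iteration * 100 / model.Count;
            worker.ReportProgress(percent, "Iteration " + iteration + " of " + model.Count);
        }

        // Step 3: no cancellation, so return the last iteration value.
        e.Result = iteration;
    }
}
```

For this handler to work, the BackgroundWorker needs both WorkerReportsProgress and WorkerSupportsCancellation set to true; otherwise ReportProgress and CancelAsync throw an InvalidOperationException.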

We'll see that we have very similar code in the DoWorkAsync method in our Task project:


The first thing to note is that we have a custom ProgressObject that contains 2 properties.  This will allow us to report both a percentage complete as well as a message.  Notice that our DoWorkAsync takes a parameter of type IProgress<ProgressObject>.  This lets us know what type of object to expect when the progress is reported.  Note that IProgress<T> is available in .NET 4.5; if you are using .NET 4.0, then you need to report progress manually.

So, let's walk through the DoWorkAsync method.  First, notice that it returns Task<int>.  Because it returns a task, we can use ContinueWith to determine what to do after the task completes -- and this is exactly what we did in the StartProcess method above.

For parameters, DoWorkAsync takes a ProcessModel (our model for doing work), a CancellationToken, and an IProgress<T>.

Since we need to return a Task, we need to create one.  This is done through the Task.Factory.StartNew() method.  The version of the method that we're using takes 2 parameters: a Func<int> (since integer is our ultimate return type) and a CancellationToken.  For the first parameter, we're just using a lambda expression to in-line the code.  We could have made this a separate method, but I included it here so that it would look more similar to the BackgroundWorker version.

Inside the lambda expression, we are doing the same thing as in the BackgroundWorker.DoWork event: we loop through the model.  Inside the loop, we first check for cancellation.  Since we are using a CancellationToken, we just need to call ThrowIfCancellationRequested.  This will throw an OperationCanceledException if IsCancellationRequested is true on the token.  Since we passed the token as part of the StartNew method, the exception will be automatically handled and the IsCanceled property of the Task will be set to true.

Next, we calculate the progress.  The calculation is the same as the BackgroundWorker method.  The difference is that to report the progress, we need to create a new ProgressObject (our custom object to hold the percentage and the message), and then call the Report method on our progress object.

And finally (at the end of the lambda expression), we return the value of the last iteration (an integer).
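
Putting the pieces of this walkthrough together, a DoWorkAsync method along these lines might look like the sketch below.  This is an approximation under stated assumptions rather than the project's code: the model is replaced by a plain IEnumerable<int> plus a total count, and the property names on ProgressObject (PercentComplete and Message) are guesses, since all we know is that the type has two properties.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical shape: a percentage and a message.
public class ProgressObject
{
    public int PercentComplete { get; set; }
    public string Message { get; set; }
}

public static class TaskWorker
{
    public static Task<int> DoWorkAsync(IEnumerable<int> model, int total,
        CancellationToken cancelToken, IProgress<ProgressObject> progress)
    {
        // StartNew takes a Func<int> (our lambda) and the CancellationToken.
        return Task.Factory.StartNew(() =>
        {
            int iteration = 0;
            foreach (int current in model)
            {
                // Throws OperationCanceledException if cancellation was requested.
                cancelToken.ThrowIfCancellationRequested();

                iteration = current;
                progress.Report(new ProgressObject
                {
                    PercentComplete = iteration * 100 / total,
                    Message = "Iteration " + iteration + " of " + total
                });
            }

            // The value of the last iteration is the task's result.
            return iteration;
        }, cancelToken);
    }
}
```

Passing the same token to StartNew is what lets the Task infrastructure recognize the OperationCanceledException as a cancellation (IsCanceled) rather than a failure (IsFaulted).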

So, we can see that this part is just a bit different.  It's not really more complicated, but we do need to understand Tasks well in order to get all of these pieces to fit together.

Reporting Progress
Updating the progress bar and message for the UI is similar between the projects.  Here is the UpdateProgress method from the BackgroundWorker project:


As a reminder, the ProgressChanged event fires on the BackgroundWorker whenever the ReportProgress method is called.  The event arguments for the event include the ProgressPercentage (which is an integer) as well as the UserState (an object).  Since the UserState is of type object, we can put whatever we like into it.  In this case, we put a string that is displayed in the output box, but we can put a more complex object in there if we like.
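
A ProgressChanged handler in this style could be sketched as follows.  The class name ProgressViewState is hypothetical, and a real ViewModel would also raise property change notifications, which are omitted here to keep the example short.

```csharp
using System.ComponentModel;

// Hypothetical, stripped-down stand-in for the ViewModel's progress properties.
public class ProgressViewState
{
    public int ProgressPercentage { get; private set; }
    public string Output { get; private set; }

    // Handler for BackgroundWorker.ProgressChanged.
    public void UpdateProgress(object sender, ProgressChangedEventArgs e)
    {
        ProgressPercentage = e.ProgressPercentage;

        // UserState is typed as object, so we have to cast it back to
        // whatever was passed to ReportProgress (a string in this case).
        Output = (string)e.UserState;
    }
}
```

The cast is the part to watch: if ReportProgress is ever called with something other than a string, this handler compiles fine but fails at runtime.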

The UpdateProgress from the Task project is similar:


Here we can see that our custom ProgressObject is used as our parameter.  This method is called whenever the Report method is called on the Progress object.  This callback was hooked up when we originally created the Progress object in our StartProcess method:


Notice that when we "new" up the Progress object, we pass the UpdateProgress method as a parameter.  This acts as the event handler whenever progress is reported.

We do get some extra flexibility and type-safety with this methodology.  First, our Progress uses a generic type.  This type is ProgressObject in our case, but if we only wanted a percentage, we could have specified Progress<int> (and not worry about a custom object type).  Because we have a generic type, we don't have to worry about casting and will get compile-time errors if we try to use the types incorrectly.  (For more advantages to using Generics, you can look up T, Earl Grey, Hot: Generics in .NET.)
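
Here is a small sketch of the percentage-only variation using Progress<int>.  The names ProgressDemo, CreatePercentReporter, and ReportHalfway are hypothetical and only serve the illustration.

```csharp
using System;

public static class ProgressDemo
{
    // Wraps a callback in Progress<int>; the generic argument fixes the
    // type that can be passed to Report.
    public static IProgress<int> CreatePercentReporter(Action<int> updateProgress)
    {
        return new Progress<int>(updateProgress);
    }

    public static void ReportHalfway(IProgress<int> progress)
    {
        progress.Report(50);
        // progress.Report("50%");  // would not compile: a string is not an int
    }
}
```

One thing to be aware of is that Progress<T> captures the current SynchronizationContext when it is constructed and invokes the callback there.  If we create it on the UI thread, the callback runs on the UI thread; in a console application (with no context), the callback runs on a thread pool thread shortly after Report returns.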

Completing the Process
Our final step is to determine what happens after our long-running process has completed.  Here is the code from the BackgroundWorker project:


Here, we check for an error condition, check the cancelled state, and have our "success" code which puts the result into our Output box and resets the progress bar.  Whatever the completion state, we reset the enabled properties of our buttons.

And from the Task project:


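We can see that this code is almost identical.  Our parameter is a Task<int>.  Since it is a Task, we can check the IsFaulted state (to see if there were any exceptions), check the IsCanceled property, and then use the IsCompleted property for our "success" state.  The order matters: IsCompleted is also true for faulted and canceled tasks, so it only means "success" because we check it after the other two.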
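
The three checks can be sketched like this.  It is a simplified illustration that returns a string instead of updating ViewModel properties, and the names CompletionDemo and Describe are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class CompletionDemo
{
    // Maps the final state of a task to a message for the output box.
    public static string Describe(Task<int> task)
    {
        if (task.IsFaulted)
        {
            // Task.Exception is an AggregateException wrapping the real error.
            return "Error: " + task.Exception.InnerException.Message;
        }

        if (task.IsCanceled)
        {
            return "Canceled";
        }

        // Only reached when the task ran to completion, so Result is safe to read.
        return "Result: " + task.Result;
    }
}
```

Checking the failure and cancellation states first also keeps us from touching task.Result on a task that did not succeed, which would throw.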

Should I Use BackgroundWorker or Task?
So, we've seen some similarities and some differences between using the BackgroundWorker component and using a Task.  When we look at the total amount of code, the files are very similar (197 lines of code in the BackgroundWorker and 194 lines of code in the Task).

In writing the Task project, I purposely tried to line up the methods with the BackgroundWorker project to facilitate a side-by-side comparison.  In either project, we could reduce the amount of code with lambda expressions and other in-line coding techniques, but the goal wasn't to create the most compact code; it was to compare techniques.

Advantage BackgroundWorker
The BackgroundWorker component has several advantages for this scenario (and I want to emphasize "for this scenario").  First, the BackgroundWorker is easy for a developer to pick up.  As has been mentioned in previous articles, since the BackgroundWorker has a limited number of properties, methods, and events, it is very approachable.  Also, most developers have been working with events already, and so the programming style will seem familiar.

In contrast, there is a steeper learning curve regarding Task.  We need to understand how to construct a Task (the static Factory is just one way to do this) and how to pass in a CancellationToken and IProgress object.  These are both non-obvious (meaning, I never would have guessed that I needed a CancellationTokenSource -- my instinct was to try to use the CancellationToken directly).  We also need to understand what it means when we pass Tasks as parameters and return values.  So, a bit more effort is required.

One other thing to consider is the cancellation process.  With the BackgroundWorker, we raise a flag and then set the Cancel property on the event argument.  With the Task, the cancellation process throws an exception, and exceptions are relatively expensive.  Unfortunately, throwing an exception is the standard way of making sure the Task's IsCanceled property is set to true.  Since IsCanceled is read-only, it cannot be set directly.  So, we get a bit of a performance hit with the cancellation process for Task.  It's probably not enough for us to worry about in most cases (definitely not in this scenario), but it is something to be aware of.
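
To see why the exception matters, here is a small experiment that compares leaving the loop with a plain "break" against calling ThrowIfCancellationRequested.  The class and method names are hypothetical, and the Thread.Sleep call just stands in for real work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BreakVersusThrow
{
    // Runs a loop until cancellation is requested and returns the final task status.
    public static TaskStatus Run(bool throwOnCancel)
    {
        using (var source = new CancellationTokenSource())
        using (var started = new ManualResetEventSlim())
        {
            CancellationToken token = source.Token;

            Task task = Task.Factory.StartNew(() =>
            {
                started.Set();
                while (true)
                {
                    if (throwOnCancel)
                        token.ThrowIfCancellationRequested();
                    else if (token.IsCancellationRequested)
                        break;

                    Thread.Sleep(1);  // stand-in for real work
                }
            }, token);

            // Make sure the task is actually running before we cancel it.
            started.Wait();
            source.Cancel();

            try { task.Wait(); }
            catch (AggregateException) { /* expected for the canceled task */ }

            return task.Status;
        }
    }
}
```

With throwOnCancel set to false, the loop stops but the task ends up as RanToCompletion, so IsCanceled stays false.  With throwOnCancel set to true, the task ends up as Canceled.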

Advantage Task
Tasks are extremely flexible and powerful.  We only touched on a very small part of what Task can be used for; there is much more (such as parallel operations).  And once we start using Task more frequently we can better understand how to take advantage of the async and await keywords that we have in .NET 4.5.

[Update Jan 2015: If you want to take a closer look at Task, be sure to check out the article series here: Exploring Task, Await, and Asynchronous Methods]

The Verdict
The BackgroundWorker is a specialized component.  It is very good at taking a single process and moving it off of the UI thread.  It is also very easy to take advantage of cancellation and progress reporting.

For this scenario, I would lean toward using the BackgroundWorker component -- this is because I like to use the most precise tool that I can.  And this scenario is just what the BackgroundWorker was designed for.

With that said, I am using Task more and more in my code in various scenarios.  It is extremely powerful and flexible.  I am a bit disappointed to see that Task is not 100% compatible with WinRT (although there are fairly easy ways to go back and forth between Task and IAsyncOperation).  But then again, the BackgroundWorker doesn't even exist in the WinRT world.

Wrap Up
I have a bit of a soft spot for the BackgroundWorker component.  I have found it incredibly useful in many projects that I've done.  And if you are dealing with a situation where the BackgroundWorker fits in, I would still encourage its use.  (If you want to review those uses, check here: BackgroundWorkerComponent: I'm Not Dead Yet.)

But, if you're dealing with a situation where the BackgroundWorker does not fit, then don't try to force it.  Instead, look at how Task can be used to make things work.  And as we use async and await more frequently, understanding Task becomes a critical part to writing understandable and reliable code.

Happy Coding!

9 comments:

  1. Thank you Jeremy for providing your blog and comparison examples.

    In the TaskWithProcess project, clicking on the Cancel button causes an unhandled exception to be thrown.

    How do we get the Task's status to equal "IsCanceled"? (Task in a cancelled state)

    I was able to provide a work-around or hack by referencing the "Token" value, but that requires changes to your original design.

    Snippet:
    if (_cancelTokenSource.Token.IsCancellationRequested || task.IsCanceled)
    {
    Output = "Canceled";
    }
    else
    if (task.IsFaulted)
    {
    // Handle exception state
    }


    Regards,
    Michael

    Replies
    1. Hi, Michael,

      It's not quite correct that clicking the Cancel button causes an unhandled exception. Yes, when we call "cancelToken.ThrowIfCancellationRequested()" it does throw an exception. But this exception is automatically handled by the Task infrastructure, so we do not need to handle this exception ourselves.

      If we run the sample with debugging, then the debugger stops on the exception, but we can continue running the application from there without problems. And if we run the sample *without* debugging, we see that there is no unhandled exception since Task takes care of it for us. The reason that we call this "Throw.." method is because this is what sets "Task.IsCanceled" to true.

      So, yes it does throw an exception, but it is not an unhandled exception. (I don't really like the idea of throwing an exception here, either. But this is the way that the Task Parallel Library was written.)

      -Jeremy

    2. Thanks Jeremy,

      Yes, I re-ran your example "as is", not in debugging mode, and the Cancel request worked as expected.

      -Michael

  2. why can't you do this in your for loop

    if(cancelToken.IsCancellationRequested)
    break;

    Replies
    1. This code would stop the processing; however, it would not put the Task into a canceled state. This is important because we usually treat a canceled task differently than a completed task. When we use "ThrowIfCancellationRequested", then we can check the state of the task and continue our application as appropriate. For more information, see "Task and Await: Basic Cancellation" (http://jeremybytes.blogspot.com/2015/01/task-and-await-basic-cancellation.html).

  3. Jeremy,

    Is there a reason the cancelToken.ThrowIfCancellationRequested(); is inside the foreach loop in the DoWorkAsync()? Seems to me it could go above the foreach and only get executed once unless there is something with threading I haven't grasped yet.

    dbl

    Replies
    1. Yes, there is a very important reason that "ThrowIfCancellationRequested" is inside the foreach loop. When we call "Cancel" on our CancellationTokenSource, it sets the token to a "canceled" state, but this is merely a flag. It is up to us to check the status of the token.

      When we call "ThrowIfCancellationRequested", this checks the state of our token. If the token is in a canceled state, it throws the exception; otherwise this method does nothing. By doing this inside our foreach block, we check the token on each iteration of our loop. If it finds the token in a canceled state, then it will throw the exception and short-circuit the process.

      If you want a little more about canceling tasks (but without a loop), you can check out this article: Task and Await: Basic Cancellation.

  4. How is the performance? Is a Task something I create every time I need it, or can I keep it around? A BackgroundWorker I can reuse over and over. What is the performance cost of one versus the other? For example, for an event that I get every second, should I use Task.Run(() => MyProcess()); or RunWorkerAsync? Which is faster?

    Replies
    1. I'm not sure what the differences are in performance. The BackgroundWorker component is more-or-less a facade around a thread pool thread while using Task creates a state machine. So there are bound to be some differences there. I would not be concerned about creation vs. re-use unless you have 100s or 1000s of instances that you're working with.

      One thing to keep in mind is that the BackgroundWorker component can only do one thing at a time. So if you have an event that calls RunWorkerAsync before the previous process has finished, you'll get an exception. Task is much more powerful and flexible. BackgroundWorker is easy to use. Each has its pros and cons.

      For more information on Tasks, you can check here for some articles as well as a video series: http://www.jeremybytes.com/Downloads.aspx#Tasks
