Friday, December 22, 2017

Your Ideas Needed: Other Ways to Run Code in Parallel

My last article, How Does Task in C# Affect Performance?, drew quite a few suggestions on how to improve the performance. If you'd like to contribute your code, now's your chance!

As mentioned in the prior article, generating a bunch of tasks is a brute-force approach. I like it because it gives me the UI behavior that I really want. The goal is machine learning for recognizing hand-written digits, but the visualization is the whole point of this application.

Since the process of making a prediction takes a bit of time, I want the UI to update as each item is predicted. Here's an animation of the application running the current code (available on GitHub: https://github.com/jeremybytes/digit-display):


The point of this application is visualization (back to the first iteration): I want to see different algorithms side-by-side to understand what each one gets right and wrong. If they vary in accuracy, are they making the same errors or completely different ones? The speed is also part of that.

Here are the things I like about this particular implementation:
  1. It's easy to tell by the animation above that the Manhattan Classifier runs faster than the Euclidean Classifier.
  2. We don't have to wait for complete results to start analyzing the data.
  3. It gives me an idea of the progress and how much longer the process will take.
These are things that the brute-force method accomplishes. You can look at the previous article to see the code that runs this.

A Different Attempt
Before I did the manual Tasks, I tried to use a Parallel.ForEach. It was a while back, and I remember that I couldn't get it to update the UI the way that I wanted.

I thought I would take another stab at it. Unfortunately, I ended up with an application that went into a "Not Responding" state and updated the UI in a block:


Instead of showing two different algorithms, this shows two different methods of running the tasks in parallel.

On the left, the "Parallel Manhattan Classifier" runs in the "ParallelRecognizerControl". This is a user control that uses a Parallel.ForEach. On the right, the "Manhattan Classifier" runs in the "RecognizerControl". This is a user control that uses the brute-force approach described previously.

A couple things to note:
  1. The application goes into a "Not Responding" state. This means that we have locked our UI thread.
  2. The results all appear at once. This means that we are not putting things into the UI until after all the processes have completed.
This code is available in the "Parallel" branch of the GitHub project. Specifically, we can look in the ParallelRecognizerControl.xaml.cs file.


This uses the "Parallel.ForEach" to loop through our data. Then it calls the long-running process: "Recognizer.predict()".

After getting the prediction, we call "CreateUIElements" to put the results into our UI. The challenge is that we need to run this on the UI thread. If we try to run this directly, we run into threading issues. The "Task.Factory.StartNew()" allows us to specify a TaskScheduler that we can use to get back to the UI thread (more on this here: Task, Await, and Asynchronous Methods).

But as we can see from the animation, this does not produce the desired results.
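One plausible explanation for both symptoms can be reproduced without WPF. The sketch below is not the code from the project: it assumes the loop is started directly from the UI thread, and it uses a hypothetical "QueueContext" class as a stand-in for the UI message loop, with "Thread.Sleep" standing in for "Recognizer.predict()". "Parallel.ForEach" does not return until every iteration has finished, so the thread that called it cannot run any of the updates that were scheduled back onto it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a UI message loop: posted callbacks run one at a
// time on the thread that calls RunLoop.
sealed class QueueContext : SynchronizationContext
{
    private readonly BlockingCollection<(SendOrPostCallback Callback, object State)> queue =
        new BlockingCollection<(SendOrPostCallback Callback, object State)>();

    public override void Post(SendOrPostCallback d, object state) => queue.Add((d, state));

    public void Complete() => queue.CompleteAdding();

    public void RunLoop()
    {
        foreach (var item in queue.GetConsumingEnumerable())
            item.Callback(item.State);
    }
}

static class BlockedLoopDemo
{
    public static List<string> Run()
    {
        var log = new List<string>(); // only touched on the loop thread
        var previous = SynchronizationContext.Current;
        var context = new QueueContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            var uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();

            // Plays the role of the UI event handler that starts the work.
            context.Post(_ =>
            {
                Parallel.ForEach(Enumerable.Range(0, 4), i =>
                {
                    Thread.Sleep(20); // stand-in for Recognizer.predict()
                    // Schedule the "UI update" back onto the loop thread.
                    Task.Factory.StartNew(
                        () => log.Add("update " + i),
                        CancellationToken.None,
                        TaskCreationOptions.None,
                        uiScheduler);
                });
                // Parallel.ForEach has only now returned to the loop thread.
                log.Add("loop finished");
                context.Complete();
            }, null);

            context.RunLoop();
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
        return log;
    }
}
```

Calling "BlockedLoopDemo.Run()" from a console app returns a log in which "loop finished" comes first and the four updates follow in a batch (their relative order can vary). That is the same shape as the behavior in the animation: the thread that owns the queue is busy until the whole loop is done.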

I tried a few approaches, including a concurrent queue (part of that is shown in the commented code). That got pretty complicated pretty quickly, so I didn't take it too far.
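For anyone who wants to explore the queue idea, here is a minimal console sketch of its general shape. It is not the commented-out code from the project. "QueueDemo" and "SlowSquare" are hypothetical names, "SlowSquare" stands in for "Recognizer.predict()", and the polling loop stands in for something like a UI timer tick that drains whatever results are ready.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class QueueDemo
{
    // Hypothetical stand-in for a slow prediction call.
    static int SlowSquare(int n)
    {
        Thread.Sleep(10);
        return n * n;
    }

    public static List<int> Run(int count)
    {
        var pending = new ConcurrentQueue<int>();
        var shown = new List<int>(); // only touched by the calling thread

        // Producer: the parallel loop runs on the thread pool, so the
        // calling thread is never blocked inside Parallel.ForEach.
        var producer = Task.Run(() =>
            Parallel.ForEach(Enumerable.Range(1, count),
                n => pending.Enqueue(SlowSquare(n))));

        // Consumer: each pass stands in for one timer tick that shows
        // whatever has finished so far.
        while (!producer.IsCompleted || !pending.IsEmpty)
        {
            while (pending.TryDequeue(out var value))
                shown.Add(value);
            Thread.Sleep(5);
        }

        producer.Wait(); // rethrows any exception from the loop
        return shown;
    }
}
```

"QueueDemo.Run(8)" returns all eight squares, in whatever order the workers finished. In a real UI the "Thread.Sleep" polling would have to be replaced by something that does not block the UI thread, which is likely where the extra complexity comes from.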

How You Can Help
If you're up for a challenge, here's what you can do.
The application is already configured to run the "ParallelRecognizerControl" in the left panel and the "RecognizerControl" in the right panel. So you should only have to modify the one file.

If you come up with something good (or simply fun, interesting, or elegant), submit a pull request, and we'll take a look at the different options in future articles.

If you don't want to hash out the code yourself, leave a comment describing your idea and approach.

Remember: We're looking for something that gives us a UI that updates as the items are processed. There are much faster ways that we can approach this without the visualization. But the visualization is why this application exists.

Happy Coding!

2 comments:

  1. This will work:

    private void PopulatePanel(string[] input)
    {
        startTime = DateTime.Now;
        // Capture the UI thread's context before moving to the thread pool.
        var uiContext = SynchronizationContext.Current;
        Task.Run(() => Parallel.ForEach(input, data =>
        {
            var stringInts = data.Split(',');
            var act = stringInts[0];
            var ints = stringInts.Skip(1).Select(int.Parse).ToArray();
            var result = Recognizer.predict(ints, classifier);
            // Queue the UI update back onto the UI thread.
            uiContext.Post(_ => CreateUIElements(result, act, data, DigitsBox), null);
        }));
    }

  2. Getting rid of LINQ will make it run 10-15% faster:

    private void PopulatePanel(string[] input)
    {
        startTime = DateTime.Now;
        var uiContext = SynchronizationContext.Current;
        Task.Run(() => Parallel.ForEach(input, data =>
        {
            var stringInts = data.Split(',');
            var act = stringInts[0];
            var ints = GetNumbers(stringInts);
            var result = Recognizer.predict(ints, classifier);
            uiContext.Post(_ => CreateUIElements(result, act, data, DigitsBox), null);
        }));
    }

    // NumberStyles requires: using System.Globalization;
    private static int[] GetNumbers(string[] stringInts)
    {
        // Skip element 0 (the actual digit label) and parse the rest.
        var result = new int[stringInts.Length - 1];
        for (int intIdx = 0, strIdx = 1; intIdx < result.Length; intIdx++, strIdx++)
        {
            result[intIdx] = int.Parse(stringInts[strIdx], NumberStyles.Integer);
        }

        return result;
    }
