Thursday, April 28, 2016

More Fun with Machine Learning - Recognizing Digits

I've been interested in machine learning for a long time, but I've been moving very slowly. I got curious recently and decided to put two of my projects together to see what would happen.

Back in June 2014, I came across a machine learning competition to recognize hand-written digits. I wasn't up for the machine-learning parts, but I did write a program to figure out how to display the datasets: Coding Practice: Displaying Bitmaps from Pixel Data


Then in July 2015, I got really excited about Mathias Brandewinder's book Machine Learning Projects for .NET Developers -- mainly because Chapter 1 was all about recognizing hand-written digits using this same training set:


I went through the code, first with the C# sample and then with F#, and working through the process step-by-step was very helpful. Ultimately, though, the result was just an accuracy percentage. The initial evaluator, which used Manhattan Distance, came out with an accuracy of 93.4%.
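The idea behind that first evaluator is simple nearest-neighbor matching. Here's a rough sketch of it in C# (my own illustration with made-up type names, not the code from the book):

```csharp
using System;
using System.Linq;

// A labeled training image: the known digit plus its pixel values.
public record Observation(int Label, int[] Pixels);

public static class BasicRecognizer
{
    // Manhattan Distance: the sum of the absolute differences
    // between corresponding pixels.
    public static int ManhattanDistance(int[] a, int[] b) =>
        a.Zip(b, (x, y) => Math.Abs(x - y)).Sum();

    // Predict by finding the closest training image and returning
    // its known label (1-nearest-neighbor classification).
    public static int Predict(Observation[] trainingSet, int[] pixels) =>
        trainingSet
            .OrderBy(obs => ManhattanDistance(obs.Pixels, pixels))
            .First()
            .Label;
}
```

That's really all there is to it: whichever training image looks most like the unknown image "wins", and its label becomes the prediction.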

Then Mathias goes on to show other algorithms that push that accuracy even higher.

A Visual Display of Accuracy
Fast forward to a couple weeks ago: I got it into my head that I should combine these two things. Rather than just showing an accuracy percentage, why don't I display what the computer thinks the digit is next to the bitmap of the digit itself?

And that's exactly what I did.

You can get the code from GitHub. I put it into the same "digit-display" project that I used previously, but I added a new branch to hold the combined code: GitHub: jeremybytes/digit-display - Recognizer Branch.

Just a note: if you download and run the code as it is right now, it takes about 3 minutes to process 1,000 records. I'm working on improving the speed, but I'm just happy to get the results right now.

Here's the output:


This is a *large* image, so you might want to click on it to see the full size (non-blurry) version.

Before analyzing the results, let's take a look at how I modified the code.

The Original Project
The original project had a WPF application and a separate project that loaded the data (which is a string of pixel brightness values) and turned them into bitmaps.


The New Project
I added the C# code from Mathias' book in a separate project (named "Recognizer"):


I know I should be using the F# code here, but most of that is in a script. A future step will be to put that F# code into a library that I can bring into this solution.

Updates to the WPF Application
I took the path of least resistance to get this to work. In my WPF application, I have most of the code in the code-behind of the MainWindow.

I added a new method that would initialize the digit recognizer (GitHub: MainWindow.xaml.cs):


This uses the "BasicClassifier" (as described in Mathias' book). I load up the data from the "training" set and use that to train the classifier.
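The loading-and-training part looks roughly like this (a sketch with approximate names -- check the repository for the actual code). The training file is a CSV with the known digit in the first column and the pixel values after it:

```csharp
using System.IO;
using System.Linq;

// A labeled training image (sketch type, not the book's exact class).
public record Observation(int Label, int[] Pixels);

public static class TrainingLoader
{
    // Each line looks like "5,0,0,128,255,...": label first, pixels after.
    public static Observation ParseLine(string line)
    {
        var fields = line.Split(',');
        return new Observation(
            int.Parse(fields[0]),
            fields.Skip(1).Select(int.Parse).ToArray());
    }

    public static Observation[] LoadTrainingSet(string path) =>
        File.ReadAllLines(path)
            .Skip(1) // skip the CSV header row
            .Select(ParseLine)
            .ToArray();
}
```

The resulting array of labeled observations is what gets handed to the classifier's training method.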

In the previous code, I loaded up data from the training set to display. But I changed the code a bit to display the "test" set.

The difference between the "training" set and the "test" set is that the training set has a field that tells what the hand-written digit is supposed to be. This gives our classifier both the bitmap data and the expected output.

The "test" set only has the bitmap data. It does not have a field that tells what the number is supposed to be. Instead, we'll use our eyes to pick out the wrong ones.

In the App.config file, we have both files referenced so we can load them in the right places:
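The entries look something like this (the key names here are my guesses to show the shape of it; the actual keys are in the repository):

```xml
<configuration>
  <appSettings>
    <!-- hypothetical keys: the training set (labeled) and the test set (unlabeled) -->
    <add key="TrainingDataFile" value="train.csv" />
    <add key="TestDataFile" value="test.csv" />
  </appSettings>
</configuration>
```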


Displaying the Digits
In the old code, I created an Image control and loaded it into a WPF wrap panel. In the new code, I add the Image, and then I add a TextBlock that holds the result of our classifier.


The "imageString.Split..." code takes our original input (which is a string of comma-separated values) and turns it into an integer array -- the format needed by our classifier.

Then we pass that to the "classifier.Predict()" method, and it gives us back the recognized digit.
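The parsing half of that is the interesting bit, so here's a standalone sketch of it (the WPF control wiring is in the repository):

```csharp
using System.Linq;

public static class DigitParser
{
    // Turn the comma-separated pixel string (e.g. "0,0,128,255,...")
    // into the integer array that the classifier expects.
    public static int[] ParsePixels(string imageString) =>
        imageString.Split(',')
            .Select(int.Parse)
            .ToArray();
}
```

The resulting array goes to the classifier's prediction method, and the returned digit goes into the TextBlock that sits next to the Image control.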

And that gives us the output (again, it takes about 3 minutes to process 1,000 records):


Analyzing the Results
This is where things get interesting. Now we (as humans) can look through the results to find the pairs of numbers that do not match. Here are just a few.



When looking through the numbers that are wrong, it's easy to see why the classifier made the choices it did: there are similarities in the shapes. And those similarities become even clearer when comparing against all the numbers the classifier got *right*.

My next steps are to convert this to use the F# code for the machine learning bits and also figure out how to speed things up -- probably by parallelizing a bunch of the code.
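For the parallelization, one likely candidate is PLINQ. Here's a sketch of the idea (not in the repository yet), keeping the results in order so the predictions still line up with the displayed bitmaps:

```csharp
using System;
using System.Linq;

public static class ParallelRecognizer
{
    // Classify all of the records in parallel. AsOrdered() preserves the
    // original record order; the WPF controls themselves would still need
    // to be created back on the UI thread.
    public static int[] PredictAll(int[][] records, Func<int[], int> predict) =>
        records.AsParallel()
               .AsOrdered()
               .Select(predict)
               .ToArray();
}
```

Since each digit is classified independently, this is an "embarrassingly parallel" problem, which is a good sign that a fairly simple change could help the 3-minute processing time.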

Humans are Awesome
Looking at the hand-written digits compared to the numbers predicted by the classifier really gives me an appreciation for how difficult this problem really is.

As a human, I have no trouble interpreting the hand-written digits (except for a couple that are really ambiguous). And it's interesting to think about the things that go on in our brains that allow us to recognize things so quickly -- without conscious analysis. It all happens in a moment without having to think about it.

Before you leave, scan through the results to see how it gets some "hard" ones right, and some "easy" ones wrong. Our brains are pretty amazing:

Click for full-size image
Teaching a computer how to do that is pretty impressive. And honestly, I'm surprised that such a simple algorithm (remember, this is the "step 1" algorithm that we're using here) can get the accuracy as high as it does. This really taught me that we should start out simple and only get more complex as we need to.

And it also gives us a lot of new places to explore.

Happy Coding!

Monday, April 25, 2016

Integrating NUnit into Visual Studio -- Update for NUnit 3

Overview: This article talks about using the NUnit Test Adapter to integrate the NUnit test runner with the Visual Studio Test Explorer. In particular, we need to ensure that we're using the right Test Adapter package for the version of NUnit that we're using.

I like using the Visual Studio Test Explorer. This is integrated into my environment, and I can always undock the window and move it to a different monitor. In particular, I love this button:


This is the "Run Tests After Build" button (it's currently only available in the expensive version of Visual Studio). When this button is toggled down, impacted unit tests are automatically run every time I build. This gives me immediate feedback when I break something.

Integrating NUnit with the Test Adapter
I've started using NUnit more and more (check out the "Why NUnit?" articles here: http://www.jeremybytes.com/Demos.aspx#UTMMF). In addition to the NUnit framework, there is also a Test Adapter package available from NuGet that integrates the NUnit test runner with the Visual Studio test explorer.

I've been using this functionality for quite some time now: Integrating NUnit into Visual Studio Test Explorer.

The problem is that we've had a bit of a version mismatch between NUnit and the Test Adapter since NUnit 3 came out last November. The good news is that last week (April 19th, 2016), the Test Adapter that's compatible with NUnit 3 finally hit release.

NUnit Version 3 and the NUnit3TestAdapter
To use NUnit and the Test Adapter with a particular project in Visual Studio, we just need to use NuGet to grab the appropriate packages. But we need to pay attention to the packages that we're pulling.

For NUnit 3, we want the following 2 packages:


Since we're using NUnit version 3 (specifically v3.2.1), we need to use the "NUnit3TestAdapter" package.
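From the NuGet Package Manager Console, that means something like this (version numbers current as of this writing):

```
PM> Install-Package NUnit -Version 3.2.1
PM> Install-Package NUnit3TestAdapter
```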

Note: This is a completely different package from the old test adapter. It is not merely the old package with a new version.

NUnit Version 2 and the NUnitTestAdapter
If you're still using NUnit version 2, we need to use a completely different test adapter package:


Here, we have NUnit version 2 (specifically v2.6.4), so we need to use the "NUnitTestAdapter" package.

Note: This is a different package from the new test adapter.

Wrap Up
The moral of the story is that we need to be careful about the packages that we pull down when using NUnit.

When using NUnit 3, we need to use the NUnit3TestAdapter package.
When using NUnit 2, we need to use the NUnitTestAdapter package.

Even though the NUnit framework packages are the same (with different versions), the test adapter packages are different packages. If we get a mismatch, then we won't see our tests in the test explorer.

But when we get the right packages together, things work great. We get to see our tests inside Visual Studio, and we can interact with them easily in the integrated environment.

Happy Coding!

Saturday, April 2, 2016

April 2016 Speaking Engagements

I have four events scheduled for April. Be sure to stop by if you can.

Tuesday, April 5, 2016
San Diego .NET User Group
San Diego, CA
Meetup Event
o Unit Testing Makes Me Faster: Convincing Your Boss, Your Co-Workers, and Yourself

Saturday, April 16, 2016
Utah Code Camp
Salt Lake City, UT
Event Site
o DI Why? Getting a Grip on Dependency Injection
o Learn to Love Lambdas (and LINQ, Too!)

Wednesday, April 20, 2016
Agile SoCal
Irvine, CA
Group Site
o Unit Testing Makes Me Faster: Convincing Your Boss, Your Co-Workers, and Yourself

Thursday, April 21, 2016
Central California .NET User Group
Fresno, CA
Meetup Event
o I'll Get Back to You: Task, Await, and Asynchronous Methods

A Look Back
Early in March, I had a chance to speak at LA DOT NET. I had a great time talking about Task and Await. Time was a bit short, and I thank everyone who stuck it out with me to the end.

I also went up to Berkeley, CA to speak at EastBay.NET. I talked about Task and Await (which has been quite a popular topic for me lately), and I also had a chance to share some views toward learning. We're always new at something, and it's okay not to be perfect at it.

I just got back from a trip to Kentucky for Code PaLOUsa. I had a great time. I got to catch up with some of my friends, and I talked to tons of people while I was there. It was great to talk to folks from Kentucky, Tennessee, Indiana, Michigan, Ohio, Nebraska, and Missouri.

I was given a chance to share "Becoming a Social Developer" before the keynote on Tuesday. Several people came up to talk to me about it, and I hope that it encouraged lots more conversations during the event.

On Tuesday afternoon, I got to share one of my favorite topics: C# Interfaces. A big thanks to the people who were willing to stand up during the presentation. I had a great time (as usual).

Jeremy at Code PaLOUsa 2016

A Look Ahead
I'm looking forward to several events coming up, including Visual Studio LIVE! in Austin, TX (May), NDC Oslo in Norway (June), KCDC in Kansas City, MO (June), and the Silicon Valley Code Camp in San Jose, CA (October).

I also have a number of pending proposals out there for events in Tennessee, Missouri, Wisconsin, California, Florida, and Sweden. Keep an eye on my website where new events are posted as soon as they are confirmed: www.jeremybytes.com.

In addition to the larger events, I'm sure I'll have things pop up for user groups and local events. If you'd like me to come to your event, just drop me a note and we'll see what we can work out.

Hopefully I'll see you at an event soon. If so, be sure to stop by and say "Hi".

Happy Coding!