Jeremy Bytes: January 2017

Sunday, January 29, 2017

A Look Back at January 2017

This is normally where I would put my speaking engagements for February, but I don't have any. I'll be taking a bit of time off instead.

So let's take a look back at January.

A Look Back
I had the great opportunity to go back to NDC London this year. I had an amazing time. It was really great to catch up with friends from all over the world and also make some new friends.

I gave 2 talks. The first was Get Func<>-y: Delegates in .NET. I had a good group in attendance, and it seemed to be well received.

[Update 04/2017: A recording of this presentation is available here: Get Func-y: Delegates in .NET.]

Here's what @jesbach87 said (direct link):

Thanks for the great picture (although it does make me look like a bit of an evil genius).

My friend Dave Fancher (@davefancher) was also there (direct link):

I also got some nice compliments on the talk:

I also got some great comments a week later:

My second talk was Focus on the User: Making the World a Better Place. These were the brave folks who came to the first presentation of the day, the day after the event party. So I knew they were serious.

[Update 04/2017: A recording of this presentation is available here: Focus on the User: Making the World a Better Place.]

Thanks to my friend Jessica Engstrom (@grytlappen) for the great picture (direct link):

Liam Westley (@westleyl) was nice enough to point out that I tend to stand like a teapot. Based on the pictures from Twitter, it's kind of hard to deny that.

And thanks to Andy (@Pondidum) for this nice comment (direct link):

Recordings from both of these talks will be posted in the coming months. I'll be sure to post the links when they're available.

In addition to the presentations, I had a ton of great conversations. I got to catch up with some folks who I haven't seen for a while. I got to see some of my friends from Europe. And I met some new folks. I had some really great conversations on a huge variety of topics. That's one of the best parts of going to any event.

Sightseeing in London
I also had 3 days to spend in the city. London is one of my favorite places. There is history everywhere, you can walk around and find interesting stuff without even trying, and I really like riding the underground.

Here are a few pictures:

"Sophie" (at the Natural History Museum) is the most complete stegosaurus ever found. Considering that I've also met "Sue" (the most complete T-Rex), I think I'm starting to build a collection.

The Victoria and Albert Museum was amazing. And it included this sculpture. How do you even do that?!?

I spent time in Regent's Park, the London Zoo, and Hyde Park. One of the highlights was visiting Speakers' Corner in Hyde Park.

This is a place where people get on a soapbox and talk about whatever they'd like. This was a really interesting experience. The topics were varied, the people were interesting, and the dialogues made you think. There was also a bit of humor as people listening threw in their own comments.

I had many other adventures in London (and took over 500 photos). I'm looking forward to going back in May for SDD 2017.

If you're interested in seeing all the photos, you can view the gallery here: Jeremy goes to London (Jan 2017).

A Look Ahead
Even though I don't have anything scheduled for February, I do have a few events in March. On the 18th, I'll be at the Boise Code Camp in Boise ID. Details are still pending. I've heard great things about this event, and I've wanted to go for the last 2 years. So I'm really glad that I have the chance to get there this year.

At the end of the month, I'll be in Las Vegas NV for dotNet Group.org. I've been to this group several times, and I'm glad for the chance to go back. Keep an eye on my website for more details.

You can always check my website to see what's coming up.

Remember, if you'd like me to be at your event this year, be sure to drop me a note. I already have 16 tentative events planned. I'm not sure how many I'll be able to fit in this year.

Happy Coding!

Wednesday, January 25, 2017

Machine Learning Digit Recognizer: Automatically Recognizing Errors

As an ongoing project, I've been working on an application that lets us visualize the results of recognizing hand-written digits (a machine learning project). To see a history, check out the "Machine Learning (sort of)" section in my Functional Programming Articles. You can grab the code from GitHub: jeremybytes/digit-display.

One thing that has kept me from working with this project is that looking for mistakes is tedious. I had to scan through the results which included the bitmap and the computer prediction and determine if it was correct. I had a bit of a brainstorm on how I could automate this process, and that's what this is about.

Here's the result:

All of the "red" marks are done automatically. No more manual error counting.

Using a Single File
The key to getting this to work was to use a single data file instead of two data files.

Originally, I used a separate training set (approximately 42,000 records) and validation set (approximately 28,000 records). These files came from the Kaggle challenge that I talked about way back in my original article (Wow, have I really been looking at this for 2-1/2 years?). Both files contain the bitmap data for the hand-written digits. But the training set also includes a field for the actual value represented.

Rather than using both files, I decided to use the training set for both purposes. This way, I could check the actual value to see if the prediction was correct.

There is a bit of a problem, though. If I used the same records for both training and validation, I would end up with 100% accuracy because the records are exactly the same. So the trick was to take a single file and cut out the bit that I wanted to use for the validation set and exclude it from the training set.

Here's an example. Let's say that I had a training set with 21 values:

5, 3, 6, 7, 1, 8, 2, 9, 2, 6, 0, 3, 3, 4, 2, 1, 7, 0, 7, 2, 5

What we can do is carve up the set so that we use some of it for training and some for validation. So, we can use 5 numbers for validation starting at an offset of 2:

5, 3, [6, 7, 1, 8, 2,] 9, 2, 6, 0, 3, 3, 4, 2, 1, 7, 0, 7, 2, 5

This leaves us with 2 separate sets:

Training: 5, 3, 9, 2, 6, 0, 3, 3, 4, 2, 1, 7, 0, 7, 2, 5
Validation: 6, 7, 1, 8, 2

This is obviously a simplified example. In our real file of 42,000 records, we'll be carving out a set of 325 records that we can work with. This still leaves us with lots of training data.
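
The carve-out can be sketched in a few lines. This is only an illustration in Python (the project itself is F# and C#); the function name "carve_out" and its parameters are made up for the example.

```python
def carve_out(values, offset, count):
    """Split values into (training, validation).

    The validation set is the `count` items starting at `offset`;
    the training set is everything else, in the original order.
    """
    validation = values[offset:offset + count]
    training = values[:offset] + values[offset + count:]
    return training, validation


# The sample values from the example above.
values = [5, 3, 6, 7, 1, 8, 2, 9, 2, 6, 0, 3, 3, 4, 2, 1, 7, 0, 7, 2, 5]
training, validation = carve_out(values, offset=2, count=5)

print(validation)  # [6, 7, 1, 8, 2]
print(training)    # [5, 3, 9, 2, 6, 0, 3, 3, 4, 2, 1, 7, 0, 7, 2, 5]
```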

Note: This code is available in the "AutomaticErrorDetection" branch in the GitHub project: AutomaticErrorDetection branch.

Configuration Values
To hold the values, I opted to use the App.config file. I'm not very happy with this solution, but it works. I would much rather be able to select the values in the application itself, but that was troublesome. I'll come back to talk about this a bit later.

Here's the App Settings section of the configuration file:

This shows that our training file and data file now point to the same thing ("train.csv"). It also shows that we want to use 325 records for prediction, and that we want to start with record number 1,000 in the file.

Loading Training Data
This means that when we load up the records to use as a training set, we want to take the first 1,000 records, then skip 325 records, and then take the rest.

Here is the original function to load up the training set (in "FunRecognizer.fs"):

Original Loader - All Records

This just loaded up all of the records in the file.

Here are the updated functions to load up just the parts of the file we want:

New Loader - Skip Over the Records to be Validated

First we pull some values out of the configuration file, including the file name, the offset, and the record count. One thing I like here is that we can pipe the values for "offset" and "recordCount" to "Int32.Parse" to convert them from string values to integer values really easily.

Then we load up the data. By using "Array.concat", we can take two separate arrays and combine them into a single array. In this case, we're looking at the data in the file. The first part resolves to "data.[1..1000]" which would give us the first 1000 records. The second part resolves to "data.[1000+325+1..]" which is "data.[1326..]". Since we don't have a second number, this will start at record 1326 and just read to the end of the array (file).

The effect is that when we load up the training set, we skip over 325 records that we can then use as our validation set.
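
To double-check the index arithmetic, here is a small Python sketch (not the project's F# code). It assumes the first line of the file is a header row, which would explain why the F# slices start at index 1. The names "training_lines", "offset", and "record_count" are illustrative. F# slices such as "data.[1..1000]" include both ends, while Python slices exclude the end index, so the upper bounds shift by one.

```python
def training_lines(lines, offset, record_count):
    """Return the data lines used for training.

    lines[0] is assumed to be a header row. Mirrors the F# idea of
    concatenating data.[1..offset] and data.[offset + record_count + 1..].
    """
    first_part = lines[1:offset + 1]                 # like F# data.[1..offset]
    second_part = lines[offset + record_count + 1:]  # like F# data.[offset+record_count+1..]
    return first_part + second_part


# Fake file contents: a header plus 2,000 numbered records.
lines = ["header"] + [f"record {i}" for i in range(1, 2001)]
training = training_lines(lines, offset=1000, record_count=325)

print(len(training))   # 1675 (2,000 records minus the 325 held back)
print(training[999])   # record 1000 (last record before the gap)
print(training[1000])  # record 1326 (first record after the gap)
```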

Loading Validation Data
We do something similar on the validation side. When we load up the data, we'll use the same file, and we'll pull out just the 325 records that we want.

Here's the method for that (in "FileLoader.cs"):

This starts by grabbing the values from configuration.

Then using a bit of LINQ, we Skip the first 1,000 records (in the case of the configuration shown above), then we Take 325 records (also based on the configuration). The effect is that we get the 325 records that were *not* used in the training set above.

By doing this, we can use the same file, but we don't have to be concerned that we're using the same records.
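
The Skip/Take idea carries over to other languages. As a rough Python analogue (again, not the code from "FileLoader.cs"), "itertools.islice" plays the role of Skip followed by Take. The sketch assumes there is a header line that has to be skipped as well, and "validation_lines" is a hypothetical name.

```python
from itertools import islice


def validation_lines(lines, offset, record_count):
    """Skip the header and `offset` records, then take `record_count` records."""
    start = 1 + offset  # the extra 1 skips the assumed header row
    return list(islice(lines, start, start + record_count))


# Fake file contents: a header plus 2,000 numbered records.
lines = ["header"] + [f"record {i}" for i in range(1, 2001)]
validation = validation_lines(lines, offset=1000, record_count=325)

print(len(validation))  # 325
print(validation[0])    # record 1001
print(validation[-1])   # record 1325
```

Records 1001 through 1325 are exactly the gap between "data.[1..1000]" and "data.[1326..]" on the training side.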

Marking Errors
There were a few more changes to allow the marking of errors. We can pull out the actual value from the data in the file and use that to see if our prediction is correct.

I added a parameter to the method that creates all of the UI elements (in "RecognizerControl.xaml.cs"):

The new parameter is "string actual". I'm using a string here rather than an integer because the prediction coming from our digit recognizer is a string.

Then in the body of this method, we just see if the prediction and actual are the same:

If they don't match, then we'll set the background color and increment the number of errors. There is a little more code here, but this gives enough to show what direction we're headed.
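
As a language-neutral sketch of that check (the real code lives in "RecognizerControl.xaml.cs" and works with UI elements), the following Python compares prediction and actual as strings and counts the mismatches. The function name and the sample data are invented for illustration.

```python
def mark_errors(results):
    """Return (flags, error_count) for a list of (prediction, actual) string pairs.

    flags[i] is True when item i should be highlighted as an error.
    """
    flags = [prediction != actual for prediction, actual in results]
    return flags, sum(flags)


# Made-up sample: the third digit (actually a "1") was predicted as a "7".
results = [("5", "5"), ("3", "3"), ("7", "1"), ("8", "8")]
flags, errors = mark_errors(results)

print(flags)   # [False, False, True, False]
print(errors)  # 1
```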

The result is that the errors are marked automatically. This saves a huge amount of time (and tedium) since we don't have to mark them manually. (Plus, I was never confident that I caught them all.)

Exploring Data
I wanted the values to be configurable because it's really easy to tune algorithms to work with a particular set of data. I wanted to be able to easily try different sets of data. Even with the simple algorithms that we have in this code, we can see differences.

If we pick an offset of 1,000, we get these results:

But if we pick an offset of 10,000, we get these results:

With the first set of data, the Euclidean Classifier looks a lot more accurate. But with the second set of data, the Manhattan Classifier looks to be more accurate. So I want to be able to try different validation sets to make sure that I'm not tuning things just for specific values.
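
The names suggest that the two classifiers differ in the distance measure they use to compare bitmaps. As a rough sketch under that assumption, and further assuming a nearest-neighbor approach (pick the training record closest to the input), here is how the two measures can disagree on the same input. This is Python with made-up 4-pixel "images", not the project's F# code.

```python
import math


def manhattan(a, b):
    """Sum of absolute pixel differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the sum of squared pixel differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(training, pixels, distance):
    """Return the label of the training record closest to `pixels`."""
    label, _ = min(training, key=lambda record: distance(record[1], pixels))
    return label


# Made-up training data: (label, pixels).
training = [
    ("0", [0, 0, 0, 0]),
    ("1", [8, 4, 4, 4]),  # one large difference from the sample
    ("7", [6, 6, 5, 4]),  # several small differences from the sample
]
sample = [4, 4, 4, 4]

by_manhattan = classify(training, sample, manhattan)
by_euclidean = classify(training, sample, euclidean)
print(by_manhattan)  # 1
print(by_euclidean)  # 7
```

A single large pixel difference costs more under the Euclidean measure than several small ones, so which classifier does better depends on the data. That is one more reason to try different validation sets.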

I also do like the side-by-side comparison. This shows if the errors are the same or different.

Easily Changing Values
In earlier versions of this application, the "Record Count" and "Offset" values on the screen were editable values. This made it really easy to change the values and click "Go" to see the new results. But that's not possible when we're using the configuration file. So why the change?

On my first attempt, I tried to figure out how to get the values from the screen to the relevant parts of the application. This was pretty easy to do in the validation set code, but it got a bit trickier to get it into the training set code.

The training set code is nested several levels deep in the functional code. This meant adding some parameters to the "reader" function that is shown above. But because this is called within a function that is called within a function that is called within a function, how could I get those values in there?

I tried adding a parameter and then bubbling that parameter up. This became problematic from a syntactical standpoint, but I also wasn't happy exposing those values in that way. It seemed very "leaky".

So an easy way to fix this was to store the values in a central location that everyone could access. And that's why I created the configuration file.

Future Updates
Now that I have this working, I'm going to do a bit more experimentation. I would like to have the ability to change the values without having to restart the application. So, I'm going to put back the editable text boxes and see if I can work with a separate object to hold these values.

This would have to be in a separate project to prevent circular dependencies. And I would also want to figure out how to use the same immutable object for the training code and validation code. This will ensure that both are using the same values. If they use different values, then things will fall apart pretty quickly.
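
One possible shape for that shared object, sketched in Python rather than F# or C# and purely as a guess at where this might go: a small immutable settings record that is created once (say, from the text boxes) and handed to both loaders. All names here are hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class SplitSettings:
    """Immutable values shared by the training and validation code."""
    file_name: str
    offset: int
    record_count: int


def validation_records(settings):
    """Record numbers (1-based) held back for validation."""
    start = settings.offset + 1
    return list(range(start, start + settings.record_count))


def training_records(settings, total_records):
    """Record numbers (1-based) left over for training."""
    held_back = set(validation_records(settings))
    return [n for n in range(1, total_records + 1) if n not in held_back]


settings = SplitSettings("train.csv", offset=1000, record_count=325)
validation = validation_records(settings)
training = training_records(settings, total_records=2000)
print(validation[0], validation[-1], len(training))  # 1001 1325 1675

# A frozen dataclass rejects changes, so both sides always see the same values.
try:
    settings.offset = 10_000
    changed = True
except FrozenInstanceError:
    changed = False
print(changed)  # False
```

To try a different validation set, the application would build a new settings object and pass it to both sides again.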

Wrap Up
It's always fun to play and experiment with different things. Now that I've made this application easier to work with (and much less tedious), I'm more likely to explore different algorithms. I've done a little bit of experimentation in the past, but it will be easier to see results now.

As I continue, I'll look at converting more of this code to F# (I still need more practice with that). And we'll see if we can teach the computer to get better at recognizing those hand-written digits.

Happy Coding!

Tuesday, January 10, 2017

Update: Low Risk vs. High Risk

Back in November, I wrote about a potentially life-changing opportunity that involved a high risk decision: Low Risk vs. High Risk. I've gotten a couple of interesting reactions to that article.

Reactions
First, some people took encouragement from my series of low-risk decisions that (eventually) got me to a good place in my career with an interesting future. I'm glad that this was helpful, and I hope that no one ever feels stuck. You can make progress with small steps.

Second, many people sent their good wishes to me as I tried to make the decision. And a few have contacted me since then to get an update. Thank you to everyone who was rooting for me to succeed, whatever I did.

So now it's time for an update.

An Unexpected Path
If you did take the time to read through the previous article, or have heard me talk about my career path, you'll know that things don't always go according to plan. And generally, I've ended up in a better situation than I was headed for. I really shouldn't be surprised that the same thing happened here.

Even after my head cleared from travel, illness, and cold medicine, I was still having trouble with the high-risk decision. I was heading toward it, but I was conflicted. Something didn't seem right, but I also felt like I needed to explore things further.

Then things changed. As I was headed in that direction, another path opened up. This is a path that was previously blocked, so I was a bit surprised to see it as an option.

This new path is still high risk. The life-changing aspects are similar to the original. There are fewer unknowns, but that makes things a bit scarier in some ways. Importantly, I'll have the help of a good friend which makes the chances of success much better.

I don't like jumping into a game without knowing all the rules. But that's not possible here. I have some fear, but I think the possibilities are worth the risk.

"Courage is not the absence of fear but rather the judgment that something else is more important than fear. The brave may not live forever, but the cautious do not live at all."
[From The Princess Diaries which borrowed from Franklin D. Roosevelt]

I apologize for being intentionally vague here. I'm not comfortable sharing details until it's a bit more clear whether this will be a success or a failure. But I am taking the risk. For the curious, you'll have to wait a bit longer.

So as usual, I'm headed in a general direction and not quite sure exactly where I'll end up. But at this point, I'm at peace with the decision to take the new path. And it's hard to ask for more than that.

[Update August 2017: I took the high-risk path: Taking the Risk.]

Friday, January 6, 2017

Does X Make You Successful?

I was recently asked to complete a survey about Test Driven Development. After looking through the questions, I found that I couldn't really answer them.

In how many projects did you use TDD?
Was TDD successful in at least 50% of the projects?
Did TDD delay the release date of the projects?
Describe the advantages you realized after using TDD.
Describe the disadvantages you realized after using TDD.
Did the software maintenance decrease after using TDD?

The questions are fair questions, and the survey was put together by an engineering student (and I love that he is reaching out to people to ask about their experiences). The problem I had in trying to answer the questions is that it's hard to take a single practice (such as TDD) in isolation and credit success/failure/maintenance costs to that one thing.

Disclaimer
I'm not much of a TDDer myself. I'm a huge believer in unit testing, and I'm convinced that Unit Testing Makes Me Faster. But I'm more of a "test along side" developer, where I'm writing code and writing tests more-or-less together. I don't strictly follow the red-green-refactor cycle, and I don't mandate 100% code coverage in my projects.

Note: If you're curious about my unit testing talk, you can see a recording from Visual Studio Live! from last May.

You might ask why I have videos showing people how to do TDD if I don't use it widely myself. That's mostly environmental. Based on the types of applications I've been building and the environments I've been working in, TDD hasn't been the best fit (although I'm sure there are those who disagree with me). But I have seen TDD be an extremely useful tool in a lot of circumstances, so I want to encourage people to explore it and help them get over some of the roadblocks that might stop them.

Other Factors
The problem with trying to isolate success to any one practice (whether Agile, Scrum, TDD, CI/CD) is that there are always other factors that influence success or failure.

Failures
Specifically with regard to TDD, I've seen teams fail horribly using it. I was a bit outside of these groups, and TDD was not the cause of their issues. There were issues with the management not trusting the developers. There were issues of mandating tools and processes that the developers did not believe in. There were issues around team dynamics and trust.

So there were projects that failed while using TDD. But I would not attribute the failure to TDD.

Successes
On the other side, I have a good friend who is a huge TDD proponent. He has been very successful using it, and he helps other people understand it and be successful with it.

I also know a company with a very successful development department. They have several teams that all build code using TDD. But they also have good team dynamics, trust, and a learning mindset. They are always looking for ways to do things better, and they are not afraid to discard things that don't work in their environment.

Isolating Success
The gist of this is that it's really hard to isolate what makes us successful.

I've heard people say, "Once we went to CI/CD, we saw X improvement [in speed / cost / maintenance]." But it's really hard to credit that to Continuous Integration/Continuous Delivery only. That's because most teams are not ready to simply flip the CI/CD switch.

To get to the point where we can be successful with CI/CD, we need to have good automated testing in place, we need to have good source control, and we need to have good branch/merge practices. Then we can get to automated deployments. So even if we can make our users happier once we have CI/CD in place, our success is really attributable to the other factors as well.

Continuous Improvement
One thing that I emphasize when I'm encouraging people to include unit testing in their environment is that it takes time to learn something new. It's not something that we will be instantly productive with.

With any process, framework, library, or language, we go through 3 phases:

1. Learning the technical bits
This is where we get the basics about how to install tools, what commands are available, and how to get things working from a technical standpoint.

2. Learning the best practices
This is where we look for experience and advice from other people who have used this tool. We can see what worked for them and what didn't work. And this gives us a good place to start in our environment.

3. Learning how things fit in our environment
This is where we see what works in our own world. The best practices that we picked up from other developers were things that worked well in their environment, but that doesn't mean they will work for us.

Once we get through phase 3, we can be really productive with this tool. We've figured out how it can really help us, and we're comfortable using it effectively. (And we may not get to phase 3 if we find that the tool really doesn't fit in our environment.)

There is No Silver Bullet
There is no single tool or practice that will make us successful. I've seen teams using Agile fail and I've seen teams using Agile succeed (and I won't get into the "you're doing it wrong" discussion here). I've seen teams using TDD fail, and I've seen teams using TDD succeed.

My biggest frustration was watching a group that was really broken. The management didn't trust their developers and so they tried to come up with the one process that would ensure that every project would be successful. But there is no silver bullet. And every 6 months, they would give up on what they were doing and try another process to ensure success. Over the course of years, I saw each of these processes fail.

There was nothing wrong with the practice or process they chose. And the practice was not the cause of the failure. We need to look beyond any particular practice and talk about what makes up a productive team.

Asking the Right Questions
Programming practices come and go. Programming languages come and go. Programming frameworks come and go. Each of these can be useful tools in the hands of good developers. And they can also be used to create complete disasters.

We need to think about the questions that we ask about any of these tools.

What problem is this tool designed to solve?
Do I have this problem?

There was a time in my career where I did an analysis of the MVVM design pattern, and determined that it was not appropriate for our environment. Of the 3 problems it was designed to solve, we had already solved 2 of those problems another way, and we didn't have the 3rd problem. Since then, I have used MVVM quite successfully in a lot of other environments. But we do need to stop and ask those questions.

So rather than asking if a particular tool or practice makes us successful, we should be asking "What problem is this tool designed to solve?" And of course, "Do I have this problem?"

Happy Coding!

Wednesday, January 4, 2017

Trying to Get Foq Working with NUnit Test Runners (RESOLVED)

I've been exploring mocking using F# (Simple Mocking in F#... & More Simple Mocking...). At some point, mocking gets more complex, and it makes sense to use a mocking library. I've been looking at Foq but have had some issues getting things to work. It turns out the problem seems to lie with the NUnit test runners.

I'm looking for advice on how to get this working, so comments are very welcome here. (And I have tried searching on Google, but the keywords I've been using have not returned anything relevant.)

*** Update (Resolution) ***
When I tweeted out this article, I got lots of responses. That's one thing I really like about the F# community: they are always ready to help. It turns out the issue was easy to resolve. I just needed to change the F# Runtime version from 4.0 to 3.0 in the project settings:

After that, the tests ran successfully. Feel free to read the rest of this article if you're curious, but it turns out the solution was pretty simple (once you know about it).

If you'd like to grab the project, you can get it on GitHub: jeremybytes/mocking-tests.

The Problem
When I started exploring testing using FsUnit, I got error messages when I tried to use Foq. To narrow things down, I eliminated as many variables as possible. That meant using the sample from the "Foq.Usage.fsx" file that came down when grabbing the package from NuGet.

To set up the project, I created a new F# library. Then from NuGet, I got the following packages: "Foq" (v1.7.1), "NUnit" (v3.5.0), and "NUnit3TestAdapter" (v3.6.0).

The Sample Test
Here's the sample test copied from the usage file into the code file (in Library1.fs):

This test should pass, but it fails in the Test Explorer:

Here's the error message:

Method not found: 'Foq.ResultBuilder`2<!0,!!0> Foq.Mock`1.Setup(Microsoft.FSharp.Core.FSharpFunc`2<!0,Microsoft.FSharp.Quotations.FSharpExpr`1<!!0>>)'.

Unfortunately, I'm not familiar enough with code quotations in F# (and parsing expressions in general) to be able to understand this message. So I did some more work to try to narrow things down.

Verifying NUnit
First, I verified that NUnit and the NUnit3TestAdapter actually work with the F# library. For this, I simply added another test (in Library1.fs):

And the test runner has no problems with this. Here's the passing test and the failing test output:

Working Foq Test in F# Interactive
The reason I think the test runner might be the problem is that I was able to successfully run the test in F# interactive. To try to eliminate variables, I created a new script file with just the following (in Script.fsx):

When executing this script, I get the following output:

This shows me that we're not getting any errors just by running this script with Foq and the code quotation.

Testing for Failure
Just to make sure that things were working, I decided to invert the test. I changed the mock object so that it would return "false" instead of "true", but I left the assertion unchanged:

When executing this script, we get the NUnit test failure message:

So that is working as expected.

Trying the Console Runner
My next step to try to isolate this to the test runners was to try the NUnit Console Runner. For this, I grabbed the "NUnit.ConsoleRunner" (v3.5.0) package.

When running the library tests from the command line, I got the same result: 1 passed, 1 failed:

And digging through the output file, I came across the same message as when running the tests in Visual Studio.

Any Ideas?
So the question for the F# gurus out there: Is there a way to get the NUnit test runners to work with this Foq example?

Unfortunately, I'm not at the point where I can parse the error message, so I don't know if it's a simple fix, or something that won't work because of the way that the NUnit test runners are implemented.

*** Update (Resolution) ***
When I tweeted out this article, I got lots of responses. That's one thing I really like about the F# community: they are always ready to help. It turns out the issue was easy to resolve. I just needed to change the F# Runtime version from 4.0 to 3.0 in the project settings.

Why I Care
I know that some folks are saying, "Well just run your tests as a script." And that would work for some scenarios.

What I'm exploring is how to use F# for testing C# code. For that, I want the test runner experience to be as seamless as possible. I'm currently using NUnit, and I prefer to have the integrated test runner in Visual Studio. So that's why I'm looking into this further.

I'll be sure to post solutions because I'd really like to get this working, and I'm sure I'm not the only one.

Happy Coding!

Monday, January 2, 2017

January 2017 Speaking Engagements

It's the start of a new year, and it's time to get back to speaking. January is usually pretty busy for me, but this year, the local user group leaders haven't hit me up. So, I've just got one event scheduled. But it's a good one.

Mon - Fri, Jan 16 - 20, 2017
NDC London
London, England
Conference Site
Topics:
o Get Func-y: Delegates in .NET
o Focus on the User: Making the World a Better Place

I feel very fortunate to be going back to NDC London (and it's only 2 weeks away!). Last year, NDC London was my first international speaking engagement. It was a great experience, and I met a lot of great people, including Anthony Chu (@nthonyChu) from Vancouver, Canada who I got to see again in November at the Microsoft MVP Summit, and also Scott Nimrod (@Bizmonger) from Miami, FL who I got to spend some time talking with in November.

In addition, I got to spend time with some of my speaker friends, and I also had the chance to meet Mark Seemann (@ploeh) whose Dependency Injection book was very influential for my development, and it helped give me the tools I needed to help lots of other folks. I'm looking forward to seeing him again this year.

I'm looking forward to my talks. I've been talking about delegates for quite a while. They've been very useful in my career. And as I've gotten into functional programming, I understand how they are a bit of a gateway to that world in C#.

In addition, I get to talk about making the world a better place. As a developer, that's my goal. Sometimes it's as simple as making one person's job easier to do. And even though that seems small, it does change the world for the better. In this talk, I tell a lot of stories from my career and show that I've been most successful when I've really understood my user. On further reflection, I see that this actually started before I was a professional developer, and it's been extremely useful in my career.

A Look Back
In December, I spoke at Visual Studio Live!, part of the Live! 360 conference in Orlando, FL. I had a really good time and got a really amazing response. I was scheduled to give 3 talks, and I was a bit concerned when I saw the room that they put me in:

Yikes! That's a big room

It turned out I didn't need to be concerned. I had over 200 people show up for my first talk. And attendance for the others was good as well.

What really amazed me was the response. Usually when I do a talk, I get a few mentions in Twitter. But when I came out of one of my talks, I had 54 notifications!

This was a combination of tweets, likes, and retweets. Here are a few that I picked out. The most complimentary came from Londovir (Link to @Londovir's tweet). (Sorry Londo, I don't have your real name):

From Casey Vlastuin (direct link):

It's always interesting to see what people pick up on. From Derek Jacobs (direct link):

I met Aaron Van Wieren at Live! 360 last year, and I got to spend a bit of time with him this year as well. (If you want to be impressed with endurance, he regularly runs 50 mile races.) He's a great guy to hang out with (direct link):

And from Carla Lewis (direct link):

And finally, it looks like I made Jose Cunha into a believer in unit testing. I'm really happy to see this since it's the goal of that talk (direct link):

Lots of other folks were nice enough to tweet about my presentations as well. Unfortunately, I can't include all of them here. A big thank you to all the folks who came to my talks. And I'm really glad that so many people were able to leave with useful information.

At the airport on my way home, I got a bit of conflicting information regarding how long Orlando has been around:

Anyone know which one it is?

Share Your Development Experience
One of the reasons that I speak is that it allows me to multiply my impact. If I write an application for 10 people, I can have a positive effect on those 10 people. But if I can influence 10 developers, and they each have a positive effect on 10 people, I've multiplied my influence.

So whether you're speaking, blogging, podcasting, streaming, answering questions on StackOverflow, or simply talking with the other developers on your team, make sure you share your experiences. We all learn from each other, and we can multiply our impact to make the world a better place.

Happy Coding!