I was reminded of this yesterday as I went through the drive-thru of a restaurant. So, it's time to take a closer look at the real problem of "metrics as a goal".
A Good Metric Becomes Useless
Yesterday, I was in the drive-thru of a local fast food restaurant. After I paid at the window, the employee asked me to pull around to the front of the building, saying they would bring my food out to me there. They've asked me to do this three times over the last couple of months, so it stuck out a bit more this time.
Here's the problem: The employees at the drive-thru were being judged on the length of each transaction. The restaurant has sensors set up to measure how long each car sits at the window (and the shorter, the better). To get me off the sensor, they asked me to drive around to the front of the restaurant. At that point, an employee has to walk around the counter and come outside to bring me my food.
This sounds like a good metric to track ("how long does it take to serve each customer?"). But the metric became the goal. The result is that the employees were actually working *harder* to hit the number: walking food out to the front of the restaurant takes more effort than handing it through the window (and it's completely unnecessary work). It also means that it takes longer to actually serve the customer.
Because the metric became the goal, the employees were working harder to meet the metric, and the actual numbers lost their value -- the restaurant no longer knows how long it *really* takes to serve each customer.
Code Coverage as a Goal
Bosses love metrics because they are something to grab on to. This is especially true in the programming world where we try to get a handle on subjective things like "quality" and "maintainability".
Code coverage is one of the metrics that can easily get misused. Having our code 100% covered by unit tests (meaning each line of code is executed by at least one test) sounds like a really good quality to have in our projects. But when the number becomes the goal, we run into problems.
I worked with a group that believed if they had 100% code coverage, they would have 0 defects in the code. Because of this, they mandated that all projects would have to have 100% coverage.
And that's where we run into a problem.
100% Coverage, 0% Useful
As a quick example, let's look at a method that I use in my presentation "Unit Testing Makes Me Faster" (you can get code samples and other info on that talk on my website). The project contains a method called "PassesLuhnCheck" that we want to test.
As a little background, the Luhn algorithm is a way to sanity-check a credit card number. It's designed to catch common data-entry errors (like mistyping a digit or swapping two adjacent digits) when people type in numbers manually. You can read more about it on Wikipedia: Luhn Algorithm.
So let's write a test:
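Something like this (a sketch assuming NUnit; the "LuhnChecker" class name and the sample card number are just placeholders):

```csharp
using NUnit.Framework;

[TestFixture]
public class LuhnCheckerTests
{
    [Test]
    public void LuhnCheckTest()
    {
        // Calls the method under test, but never checks the result.
        LuhnChecker.PassesLuhnCheck("4111111111111111");
    }
}
```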
This test is (almost) 100% useless. It calls our "PassesLuhnCheck" method, but there are no assertions -- meaning, it doesn't check the results.
The bad part is that this is a passing test:
This doesn't really "pass", but most unit testing frameworks are looking for failures. If something doesn't fail, then it's considered a "pass".
Note: I said that this test is *almost* useless because if the "PassesLuhnCheck" method throws an exception, then this test will fail.
Analyzing Code Coverage
Things get a bit worse when we run our code coverage metrics against this code. By using the built-in Visual Studio "Analyze Code Coverage" tool, we get this result:
This says that with this one test, we get 92% code coverage! It's a bit easier to see when we turn on the coverage coloring and look at the method itself:
Note: I didn't write this method, I took it from this article: Extremely Fast Luhn Function for C# (Credit Card Validation).
The blue represents the code that the tool says is "covered" by the tests. The red shows the code that is not covered. (My colors are a bit more obnoxious than the defaults -- I picked bold colors that show up well on a projector when I'm showing this during presentations.)
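To give an idea of the method's shape, here is an illustrative sketch (not the actual code from that article): the digits are summed right to left with every second digit doubled, and any parsing failure falls into a catch block.

```csharp
// Illustration only -- not the code from the linked article.
// The general shape of a Luhn check whose catch block the assertion-free test never reaches.
public static class LuhnChecker
{
    public static bool PassesLuhnCheck(string value)
    {
        try
        {
            int sum = 0;
            bool doubleDigit = false;

            // Walk the digits right to left, doubling every second one.
            for (int i = value.Length - 1; i >= 0; i--)
            {
                int digit = int.Parse(value[i].ToString());
                if (doubleDigit)
                {
                    digit *= 2;
                    if (digit > 9) digit -= 9;
                }
                sum += digit;
                doubleDigit = !doubleDigit;
            }

            // A valid number sums to a multiple of 10.
            return sum % 10 == 0;
        }
        catch (FormatException)
        {
            // Non-numeric input ends up here -- the uncovered block in the coverage report.
            return false;
        }
    }
}
```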
So this shows that everything is covered except for a catch block. Can we fix that?
From my experience with this method, I know that if we pass a non-numeric parameter, it will throw an exception. So all we have to do is add another method call to our "test":
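Something like this, in the same test class as before (again a sketch; the non-numeric string is just an illustration):

```csharp
[Test]
public void LuhnCheckTest()
{
    // Still no assertions -- the extra call just exercises the catch block.
    LuhnChecker.PassesLuhnCheck("4111111111111111");
    LuhnChecker.PassesLuhnCheck("not-a-number");
}
```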
This test also passes (since it does not throw an unhandled exception). And our code coverage gets a bit better:
We now have 100% code coverage. Success! Except, our number means absolutely nothing.
When a number becomes the goal rather than a guide, that number can easily become useless.

Code coverage is a very useful metric. It can tell us that we're headed in the right direction. If we have 0% code coverage, then we know that we don't have any useful tests. As that number gets higher (assuming that we care about the tests and not just the number), we know that we have more and more useful tests. We just have to be careful that the number doesn't become the goal.
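For contrast, a test that actually checks results might look something like this (again just a sketch, using the same placeholder names as above):

```csharp
[Test]
public void PassesLuhnCheck_ReturnsTrueForValidNumber()
{
    // A number that satisfies the Luhn check.
    bool result = LuhnChecker.PassesLuhnCheck("4111111111111111");
    Assert.That(result, Is.True);
}

[Test]
public void PassesLuhnCheck_ReturnsFalseForInvalidNumber()
{
    // Same number with the last digit changed, so the check should fail.
    bool result = LuhnChecker.PassesLuhnCheck("4111111111111112");
    Assert.That(result, Is.False);
}
```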
Overly Cynical?
Some people might think that I'm overly cynical when it comes to this topic. But I've unfortunately seen this behavior in a couple different situations. In addition to the restaurant employees that I mentioned above, I ran into someone who worked only for the metric -- to the detriment of the customer.
Many years ago, I worked in a call center that took hotel reservations. The manager of that department *loved* metrics. Every day, she would print out the reports from the previous day and hang them up outside her office with the name at the top of the list highlighted.
There were 2 key metrics on that report: number of calls answered and average talk time. "Number of calls answered" meant exactly what you'd expect: how many calls a particular agent answered in an hour. "Average talk time" tracked how long the agent was on the phone with each customer.
There was a particular agent who was consistently at the top of the report whenever she had a shift. But there was something that the manager didn't know: the agent was working strictly for the metrics.
This agent always took a desk at the far end of the room (away from the manager's office). Then she would only really take every third call -- meaning she would answer and then immediately hang up on 2 out of 3 customers. This got the "number of calls answered" number up -- she was answering three times as many calls as she otherwise would. It also got the "average talk time" number down -- 2 out of 3 calls had "0" talk time, so the average dropped. And since her numbers looked great, she could take extra breaks and no one would notice.
Not The Goal
So maybe I am overly cynical when it comes to metrics. But I have seen them horribly misused. We can have "short" drive-thru times while making the experience longer for the customer. We can have "100%" code coverage without actually having useful tests. We can have "short" talk time because we hang up on our customers.
Measuring is important. This is how we can objectively track progress. But when the measurement becomes the goal, we only care about that number and not the reason we're tracking it.
Use those numbers wisely.
Happy Coding!