Friday, September 7, 2018

Code Samples: Simple or Correct?

Should code samples be "simple" or "correct"? Ideally, they should be both, but I'm running into a dilemma while updating my sample applications for "IEnumerable, ISaveable, IDontGetIt: Understanding .NET Interfaces" and "DI Why? Getting a Grip on Dependency Injection": asynchronous libraries.

These samples use the Repository Pattern to demonstrate the usefulness of interfaces and DI. The problem is that modern data access libraries are asynchronous by default (in particular HttpClient calls into a WebAPI service, and to a lesser-degree Entity Framework Core).

Choosing an Interface
The issue that I'm having revolves around whether the interfaces will be asynchronous or not. Here are the options:

"Simple" repository interface:

A "simple" interface

"Correct" repository interface:

A "correct" interface

The Problem
A lot of developers in the C# world aren't entirely comfortable with async programming. My introduction to Task and await in C# has been one of my more popular topics (a recording from a user group has over 18,000 views). That presentation deals with getting comfortable with some of the basic concepts that have scared developers away from the topic.

I'm really afraid that if I use the "correct" version of the interface that the "Task<T>" in the method signatures will add complexity that will be a distraction from the underlying topic (interfaces and/or DI).

When explaining complex topics, I strive to keep things as simple as possible so that we can focus on the thing that we're trying to learn. This means that I'll often put code in the code-behind for a form: a location where I normally wouldn't put things. I'm okay with that  particular compromise since it's not necessarily incorrect, it just has some drawbacks that we don't normally want in our code -- and it may even be just fine for small applications.

But async is different. When doing things the "simple" way, we end up with incorrect underlying code.

For example, here is an implementation for the "simple" interface using HttpClient:

"Simple" and incorrect use of async mehods

This uses "GetAwaiter().GetResult()" on the Tasks -- which is not really any better than simply using ".Result" on the Task (more on this in Stephen Cleary's article: A Tour of Task Part 6: Results).

The thing that makes this worse is that in my Task & await materials, I specifically say to not use ".Result" like this. It breaks the asynchrony and can lead to deadlocks and other nastiness. It seems very wrong to include code that I tell people to never ever use.

The "correct" implementation "await"s the Tasks and lets the Task bubble up into the function's return:

Correct use of async methods

Calling the Interface Methods
In addition to the interface method signatures, we also need to consider how the interface implementations are used.

Here is a View Model that uses the "simple" repository:

Calling the "simple" repository

This code is pretty easy to follow. The interface implementation comes in through the constructor, and the "GetPeople" function is used in the "RefreshPeople" method. Since "GetPeople" returns an enumerable of Person objects, we can assign it directly to the property on the class.

Here is a View Model that uses the "correct" repository:

Calling the "correct" repository

This code is not much more complex. The difference is that we need to "await" the call into the repository function. When developers are new to async, there is often confusion because "GetPeopleAsync" returns a "Task<IEnumerable<Person>>", and "await" magically lets us assign that to an "IEnumerable<Person>" property. ("await" is a bit magical, which is why it's so awesome.)

This also changes the signature of the "RefreshPeople" method which is now "async" and returns a "Task" instead of "void". So the asynchronous code propagates up. This is pretty common in real applications.

One Thing at a Time
Interfaces in themselves are confusing enough to new learners. I know because I struggled with them quite a bit before I finally "got" them. That struggle is the primary reason I like to teach this topics to others -- to help them get over some of the hurdles that I really struggled with.

But many of the people who are new to interfaces are also fairly new to programming and/or C# in general. This means that putting asynchronous programming into the mix is a bit of a jump.

In addition, being comfortable with interfaces is one of my recommended pre-requisites to learning about "Task" (as well as generics, delegates, lambdas, and a few other language constructs). By including async in the C# Interfaces materials, it leads to a circular dependency.

If the repository code was completely hidden from the learner, then I'd be more comfortable putting "bad" code in there. But it's not. The source code is readily available (and I'd like to keep it that way).

Pros & Cons
Here's a summary of the pros and cons of each approach.

"Simple" Pros:
  • Easier method signatures.
  • Keeps focus on the topic at hand.
  • Don't have to worry about Task as a distraction.
"Simple" Cons:
  • Implementations are bad code (really bad).
  • Implementations require big "DO NOT DO THIS" comment blocks.
"Correct" Pros:
  • Code is correct. 
  • Implementations can be used as-is (with a bit more hardening such as guard clauses and authorization).
"Correct" Cons:
  • "Task" complication distracts from the topic at hand.
  • "Task" may be too big of a hurdle (at this point) and may inhibit learning.
  • May require a bit of a detour to explain a bit about "Task".

Your Help Requested
I have a lot of friends in the development & teaching communities. I'd love to hear your thoughts on this. If you have opinions one way or another, please leave a comment.

Ideally, presentation code samples should be correct and complete -- code that comes from real, production applications. The problem is that these applications generally have a lot of complexity that takes a long time to explain, and even then the learners will probably get a bit lost (Trust me, I've been there).

Please leave a comment with your thoughts. I need to make a decision on how to move forward, and I'd like it to be the most effective way possible.

Thanks for your help!

2 comments:

  1. I prefer Simple code. Especially if you're talking about people new to programming. With the async example, I'd make it synchronous all the way - no Result or GetAwaiter/GetResult.

    I like to keep the comments minimal, something like "// In production code, this would be asynchronous" or "// Modern real-world code would use HttpClient instead of WebClient here". Avoid explaining "why" - they'll learn that when they're ready. You might refer back to it at the point they hook it up to a VM: "You might notice that our application pauses when it loads the people. In real-world production code, we would use asynchrony and HttpClient to avoid this pause. We'll learn more about those advanced topics later."

    ReplyDelete
  2. I agree with Stephen. I'd go synchronous all the way using WebClient with a comment saying modern code would be async and use HttpClient. This will keep your samples focused on what you want to convey.

    ReplyDelete