Monday, January 26, 2015

Fixing Problems with a (Specific) Fluent API

I'm still working on my home automation project. Recently, I've been dealing with the code that gets data for sunrise and sunset times -- and it's been getting a lot better. But in my last update to the code, I put in some extension methods so that we code using a fluent API.

I've since thought about this, and based on the problems with the current implementation, it needs to be updated. There are 2 directions we can take with this, so I created some branches and coded up both of them. After further consideration, I'll make a decision on which one will be merged back into the master branch.

The code (and articles) for this project are collected here: Rewriting a Legacy Application. The code is on GitHub: jeremybytes/house-control. And we'll be looking at several branches here.

The Problem Code
The problem code is in the FluentAPI-Original branch (we're specifically looking at the "SunriseSunsetOrg.cs" file in the "HouseControl.Sunset" project). The main idea behind this code is that we have a set of extension methods on the "string" class:

If you're not familiar with extension methods, check out Quick Byte: Extension Methods.

We're paying attention to the first parameter here (the one with the "this" keyword). Because the "GetSunsetString" takes a string and returns a string, and the "GetLocalTime" takes a string and returns a DateTime, we can write code like this (where "cacheData" is a string):

I like this syntax (and I've always liked fluent APIs such as LINQ). And this code works fine in isolation. But if we step back and take a broader look at what we've done with this code, it starts to look a little less brilliant.

Specifically, the extension methods extend the "string" class. This means that the methods really should work with *any* string that we give them. But the reality is that they only work with very specific strings: a JSON object and a date/time string.

But because these are extension methods on "string", there's nothing to stop us from using them on other string objects:

This is really bad. If we have a string that contains "Hello", we can call "GetLocalTime" on that. And the code completion in Visual Studio is almost encouraging us to do that. But it would just result in an error.

Falling into the Pit of Success
I really don't like code that lets developers get themselves into trouble so easily. When I create a library, I really want the developers to fall into the pit of success. This means that I need to remove any dangers that I can. This code is too easy to use incorrectly.

And there are really 2 directions that we can go with this. First, we can create more specific types that we use for our extension methods. Instead of using "string", we would use a custom type that contains the JSON data or the date/time data as appropriate.

The other option is to remove the extension methods (and along with it the fluent API). We'll explore both of these options.

Option #1: An Updated Fluent API
If I decide that I really like the fluent API, then we can use the code that makes the API a bit safer. This code is in the FluentAPI-Updated branch.

The extension methods have been updated to work with custom types (instead of string):

We have 2 new types -- "ResponseContentString" and "UTCTimeString" -- which are designed to hold the JSON result of our service and a string that represents a UTC time, respectively.

Notice that "GetSunriseString" and "GetSunsetString" both take a "ResponseContentString" as a parameter and return a "UTCTimeString". Then we can use that "UTCTimeString" with the "GetLocalTime" method to return the local date/time.

The custom classes themselves are pretty simple. In fact, I just added them to the top of the same "SunriseSunsetOrg.cs" file.

These just have a single property to hold a string value. But because they are custom types, we can target our methods a bit better.

When we have these in place, we can still use the same fluent syntax:

But in order to do this, we've had to change a few of our variables. Notice that the "cacheData" field is now a "ResponseContentString".

And if we look at the methods on this type, we see that they are a bit more focused:

So we see that we have "GetSunriseString" and "GetSunsetString" available to us (but not "GetLocalTime").

And if we tried to make a call on a "Hello" string (like we did above), we would not see any of our extension methods since they now operate on these custom types.

Note: As another alternative, we could move the extension methods so that they are part of the custom classes themselves. But before doing that, we need to think about whether it confuses the responsibility of each class. I could argue either way in this case. We'll stick with the separate extension methods here.

Advantages & Disadvantages
The biggest advantage to this approach is that we get to keep the fluent API. The biggest disadvantage is that we now have more custom types floating around our code.

So, we're making our code a bit more complex in order to make using the API a bit simpler. Like everything else we deal with when making decisions about our code, everything is a bit of give and take.

Now let's take a look at another option.

Option #2: Remove the Fluent API
Our other option is to get rid of the fluent API by removing the extension methods. This code is in the FluentAPI-Removed branch.

We're not removing the functionality that was in the extension methods, we're just turning them into normal methods:

These methods are now in the "SunriseSunsetOrg" class instead of the "SunriseSunsetOrgHelpers" class. And they are now just static methods. Notice that the "this" keywords have been removed from the parameters.

These have the same functionality, but we end up calling them a bit differently. Here's our calling method:

Notice that our fluent syntax is gone. Now we're calling methods with parameters and passing results as parameters to the next method. We could tighten up this code a little bit (by removing some of the intermediate variables), but it gets more difficult to read and debug.

For a look at why we may or may not want the intermediate variables, check out I Can Write That Method with One Line of Code.

Advantages  & Disadvantages
The biggest advantage to this approach is that we keep things simple. We don't add any new custom types (notice that the "cacheData" field is just a "string" like it was before).

The biggest disadvantage is that we lose the fluent syntax. Again, I like fluent APIs (where they make sense). And I'm okay with losing access to the fluent nature of the API if it's not appropriate here.

Which Would You Choose (and Why)?
I am going to merge one of these options back into the master branch of this project. So the question is, which one should I choose? And is there a good reason to choose one over the other?

The branches are all on GitHub:

I'm planning on leaving these branches in the repository even after I've merged the branch that I want. That way, the code will stay there to go along with this article.

So, check out the FluentAPI-Original branch to see the code in its problem state (with the extension methods on "string").

The FluentAPI-Updated branch has the code for Option #1 where we maintain the fluent API by adding custom types.

The FluentAPI-Removed branch has the code for Option #2 where we remove the extension methods and do not add any custom types.

[Update 01/28/2015: Based on the comments on this article, I've explored some options a little further: Reviewing Options: Building on the Experience of Other Developers.]

Wrap Up
I'll be considering the pros and cons of these options over the next few days. Neither one of these affect the operation of the application (in fact, the "problem" code is running just fine in production). But I'm using this application to experiment with different techniques. This lets me practice my skills so that I can more readily use them in other environments.

Coding practice is important, and it's something that many developers never do. When we go through exercises like this one, we think through techniques and approaches, and we find out what works well for us and what doesn't. That way, when we get into "crunch time" in a real application, we know exactly what approach will work best for us.

Happy Coding!


  1. Option #2 seems to be the better choice, but that's because I don't always strive for a fluent-chainable api. Option #1 would be acceptable, but only if the additional types had enough utility (other than a fluent api) to warrant the additional types. If they were only used in a couple places, I feel like it would be cognitive overhead with no real benefit.

    1. I agree with you, Mackenzie. You are spot on when you say the new types with Option #1 don't have enough utility. And there really isn't a need to expand those types, so they only exist to hold a single string value. Option #2 seems like the better choice here.

    2. At the same time there is something to be said for semantic typing ( Your GetSunsetDate doesn't accept "Hello", nor does it accept "Sometime after noon." It explicitly needs a properly formatted UTCTimeString.

      Rather than passing in just a raw string, you are faced with the awful extra step of calling the UTCTimeString constructor, which does support taking in a random string. However, it takes the random string and can perform validation to ensure that it is, in fact, a proper UTC time.

      You've lifted the concern of confirming the 'true' type of the string off of the function, have a better place to handle validation and preserve type safety throughout.

      We've implemented a few of these semantic types in our system that involves lots of specially formatted strings. It's helped to alleviate a few concerns in different places.

    3. Good points, Cliff. I was thinking about adding validation to the custom types in Option #1. Thanks for the link to the semantic typing article. It looks like I'll have to code it up to see what it looks like for this project. I suspect it's more that I really need here, but it will be good practice.