Monday, September 23, 2013

Applying the Interface Segregation Principle to a Repository

Repositories appear in many of my example applications because the repository pattern is fairly easy to describe and understand. Last month, we took a look at whether we really need one in our applications: Do I Really Need a Repository?

My friend, Jim, replied with a suggestion: split the repository into an ILoader and an ISaver.
This is a great point, so let's take a closer look.

A Generic Repository
Before looking at the Interface Segregation Principle, let's update the existing interface to use generic parameters.  Here's our original interface from the last article:
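As a rough sketch (the member names and the Person properties here are assumptions based on the description in this article, not a verbatim copy of the original listing), a Person-specific CRUD interface with a string key could look like this:

```csharp
using System.Collections.Generic;

// Hypothetical Person type; the property names are assumptions.
public class Person
{
    public string LastName { get; set; } = "";
    public string FirstName { get; set; } = "";
}

// Sketch of a Person-specific CRUD repository keyed by a string.
public interface IPersonRepository
{
    IEnumerable<Person> GetPeople();                           // Read (all)
    Person GetPerson(string lastName);                         // Read (one)
    void AddPerson(Person newPerson);                          // Create
    void UpdatePerson(string lastName, Person updatedPerson);  // Update
    void DeletePerson(string lastName);                        // Delete
}
```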


This is an example of a CRUD Repository -- where CRUD stands for Create, Read, Update, and Delete. (Another common repository type is CQRS -- Command Query Responsibility Segregation -- we'll talk about this in a bit.)

If we add generic parameters to this, we can get this interface to work with more than just Person objects. Everywhere we see "Person", we'll replace it with "T", and everywhere we see "string" (which is the type for the primary key), we'll replace it with "TKey". Then we just update the method names so that they make sense for any type.

Here's our updated generic repository interface:
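A minimal sketch of the generic version follows. GetItems() and GetItem() are the names used later in this article; AddItem, UpdateItem, and DeleteItem are assumed names that follow the same pattern.

```csharp
using System.Collections.Generic;

// Sketch of a generic CRUD repository: T is the item type, TKey is the key type.
public interface IRepository<T, TKey>
{
    IEnumerable<T> GetItems();                 // Read (all)
    T GetItem(TKey key);                       // Read (one)
    void AddItem(T newItem);                   // Create (assumed name)
    void UpdateItem(TKey key, T updatedItem);  // Update (assumed name)
    void DeleteItem(TKey key);                 // Delete (assumed name)
}
```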


This interface will work with any types we have (Customers, Orders, Products, etc.) as well as whatever key types we have (int, string, GUID, etc.).

Now let's look at how we can apply the Interface Segregation Principle to this.

The Interface Segregation Principle
The Interface Segregation Principle is one of the S.O.L.I.D. principles of Object-Oriented Design (specifically, the "I"). Here's what the principle states:
Clients should not be forced to depend upon methods that they do not use. Interfaces belong to clients, not to hierarchies.
What this means is that we should make our interfaces granular so that we only get the methods that we actually need.

When this repository interface is used in the sample applications, we generally only use the "Read" portions. Because of this, we really should not saddle our client object with the update methods (Create, Update, Delete) since it does not use them.

To apply the Interface Segregation Principle, we just need to segregate out the Read operations into a different interface.

Here's what that could look like:
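A sketch of the read-only interface, assuming the same generic parameters as the full repository; it contains only the two read methods:

```csharp
using System.Collections.Generic;

// Only the "Read" operations: a client that depends on this
// interface cannot call any method that modifies data.
public interface IReadOnlyRepository<T, TKey>
{
    IEnumerable<T> GetItems();
    T GetItem(TKey key);
}
```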


This way, our application that only does Read operations uses this more granular interface. And it only needs to depend on the methods it actually uses.
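To make that concrete, here is a small sketch of a client that depends only on the read-only interface. ItemCounter and InMemoryReadOnlyRepository are hypothetical names invented for this illustration; they are not from the sample applications.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IReadOnlyRepository<T, TKey>
{
    IEnumerable<T> GetItems();
    T GetItem(TKey key);
}

// Hypothetical client: it can only see (and only depends on) the read operations.
public class ItemCounter<T, TKey>
{
    private readonly IReadOnlyRepository<T, TKey> _repository;

    public ItemCounter(IReadOnlyRepository<T, TKey> repository)
    {
        _repository = repository;
    }

    public int CountItems() => _repository.GetItems().Count();
}

// Hypothetical in-memory implementation, handy as a test fake.
public class InMemoryReadOnlyRepository<T, TKey> : IReadOnlyRepository<T, TKey>
{
    private readonly IDictionary<TKey, T> _items;

    public InMemoryReadOnlyRepository(IDictionary<TKey, T> items)
    {
        _items = items;
    }

    public IEnumerable<T> GetItems() => _items.Values;
    public T GetItem(TKey key) => _items[key];
}
```

A side benefit of the smaller interface is that a test fake like this only has to implement two methods instead of five.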

If we do need a full CRUD repository, then we can simply inherit from this interface and add the other methods:
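A sketch of that inheritance, with the read-only interface repeated so the example stands alone. The names of the added methods (AddItem, UpdateItem, DeleteItem) are assumptions.

```csharp
using System.Collections.Generic;

public interface IReadOnlyRepository<T, TKey>
{
    IEnumerable<T> GetItems();
    T GetItem(TKey key);
}

// The full CRUD repository inherits the read methods
// and adds the operations that modify data.
public interface IRepository<T, TKey> : IReadOnlyRepository<T, TKey>
{
    void AddItem(T newItem);
    void UpdateItem(TKey key, T updatedItem);
    void DeleteItem(TKey key);
}
```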


Because IRepository inherits IReadOnlyRepository, it includes the GetItems() and GetItem() methods, and then adds the other operations. This lets us be selective about the specific interface, and we only need to depend on the methods we actually use.

A CQRS Repository
But what about Jim's suggestion of having an ILoader and ISaver? This has a simplified set of methods. Here's what those interfaces look like:
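As a sketch only: the article confirms the interface names and that the saver has a single Save() method, but the exact signatures below (including the Load() name and the return types) are guesses for illustration.

```csharp
using System.Collections.Generic;

// Query side: reads data. The Load() name and return type are assumptions.
public interface ILoader<T>
{
    IEnumerable<T> Load();
}

// Command side: a single Save() method covers every kind of change.
// Returning the saved item is an assumption.
public interface ISaver<T>
{
    T Save(T item);
}
```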


We can easily imagine that the Loader would work very similarly to our read-only repository. This would handle our "Read" (or "Query") operation. The Saver would handle the operations that update the data (the "Command" operations).

This actually follows the pattern of a CQRS repository. As mentioned above, CQRS stands for Command Query Responsibility Segregation. This means that we keep the Command operations (the Saver) separate from the Query operations (the Loader).

What About Implementation?
One thing to notice is that the Saver has a much simpler interface. With the CRUD repository, we had separate methods for Create, Update, and Delete operations. With the ISaver interface, we have a single Save() method.

So, how do we know what operations we need to perform on the actual data store: Insert, Update, or Delete? That all depends on how we're storing our data.

One possibility is to only have Insert operations on our data store. We always insert the most recent record into the data store (with a current timestamp). If we have a deleted record, we insert the most recent record with a delete flag set to "true". When pulling data out, we make sure that we're fetching the most current record, but we have the full history in the database. This is fairly easy to implement in a document database. It can also be used with a relational database, although the pattern isn't quite as common there.
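Here's a minimal in-memory sketch of the insert-only idea. AppendOnlyStore and RecordVersion are hypothetical names, and a plain list stands in for the real data store; a production version would need a real database and a more robust way to order versions than a wall-clock timestamp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One immutable version of a record; every change appends a new version.
public record RecordVersion<T>(string Key, T Value, DateTime Timestamp, bool IsDeleted);

public class AppendOnlyStore<T>
{
    private readonly List<RecordVersion<T>> _log = new();

    // Insert and update are the same operation: append a new version.
    public void Save(string key, T value) =>
        _log.Add(new RecordVersion<T>(key, value, DateTime.UtcNow, false));

    // "Delete" appends a copy of the latest version with the delete flag set.
    public void Delete(string key)
    {
        var latest = Latest(key);
        if (latest != null)
            _log.Add(latest with { Timestamp = DateTime.UtcNow, IsDeleted = true });
    }

    // Returns false when the key is unknown or its latest version is deleted.
    public bool TryGet(string key, out T value)
    {
        var latest = Latest(key);
        if (latest == null || latest.IsDeleted)
        {
            value = default;
            return false;
        }
        value = latest.Value;
        return true;
    }

    // The full history stays available.
    public int VersionCount(string key) => _log.Count(v => v.Key == key);

    // OrderBy is a stable sort, so equal timestamps keep insertion order.
    private RecordVersion<T> Latest(string key) =>
        _log.Where(v => v.Key == key).OrderBy(v => v.Timestamp).LastOrDefault();
}
```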

If we are using a typical RDBMS data store technique, we actually want to make Insert, Update, or Delete calls as appropriate (along with potentially logging the changes in separate change-tracking tables). But how do we know which operation we need to call on a particular record? That's where things get a bit more complex.

One solution that I've used is to use a business object framework that includes change tracking on the application objects. I've used CSLA.NET very successfully (http://www.cslanet.com/). This is a business object framework that works very well for the types of apps that we often build for companies (get data out of a database, let the user change it, put it back into the database). CSLA includes tons of useful features including change tracking, validation, authorization, and it works on a variety of platforms (web, desktop, mobile).

Using a framework such as this, we simply call "Save" on the business object, and the framework figures out which of the actual update operations to call depending on whether the record is new, changed, or deleted. (As a side note, this is a good example of the Facade design pattern where a complex API is hidden behind a simpler interface.)
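To show the general idea, here is a hand-rolled sketch of a Save() that picks the operation from change-tracking flags. ITrackable, TrackingSaver, TrackedItem, and LoggingSaver are hypothetical types written for this illustration; this is not CSLA's actual API, and a real framework handles far more cases.

```csharp
using System.Collections.Generic;

// Hypothetical change-tracking contract for a business object.
public interface ITrackable
{
    bool IsNew { get; }      // never stored in the data store
    bool IsDirty { get; }    // changed since it was loaded
    bool IsDeleted { get; }  // marked for deletion
}

public interface ISaver<T>
{
    T Save(T item);
}

// Facade: one Save() call, routed to the right data store operation.
public abstract class TrackingSaver<T> : ISaver<T> where T : ITrackable
{
    public T Save(T item)
    {
        if (item.IsDeleted)
        {
            // A new item that was deleted before saving was never stored.
            if (!item.IsNew) Delete(item);
        }
        else if (item.IsNew) Insert(item);
        else if (item.IsDirty) Update(item);
        // Unchanged items need no data store call.
        return item;
    }

    protected abstract void Insert(T item);
    protected abstract void Update(T item);
    protected abstract void Delete(T item);
}

// Simple tracked object and a saver that records which operation ran.
public class TrackedItem : ITrackable
{
    public bool IsNew { get; set; }
    public bool IsDirty { get; set; }
    public bool IsDeleted { get; set; }
}

public class LoggingSaver : TrackingSaver<TrackedItem>
{
    public List<string> Calls { get; } = new List<string>();

    protected override void Insert(TrackedItem item) => Calls.Add("Insert");
    protected override void Update(TrackedItem item) => Calls.Add("Update");
    protected override void Delete(TrackedItem item) => Calls.Add("Delete");
}
```

The caller only ever sees Save(); the Insert, Update, and Delete decisions stay hidden behind it.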

Implementing this type of functionality yourself is not trivial, which is why we should look to see if there's an existing implementation that we can take advantage of.

Wrap Up
So, if we determine that we do need a repository (Do I Really Need a Repository?), our job is not necessarily done. We need to consider what type of repository will work best in our situation (CRUD vs. CQRS), and we also need to keep in mind whether we're including more methods in our interface than we actually need. If we find that we are not using all of the methods (or we find that we're using some of the methods in one class and other methods in others), we should think about splitting up our interface so that we have the right level of granularity. This is what the Interface Segregation Principle is all about.

This gives us quite a bit to think about. As always, we need to weigh the options and come up with the best solution for our particular environment.

Happy Coding!

2 comments:

  1. Nice write-up. In my experience the loader is usually sufficient for all your reading needs, and saver works great for insert and update. I usually don't end up inserting and updating from the same client, so the saver usually ends up implementing one or the other, but it is not hard to implement both. I don't delete records very often. But I think it would be weird to write a saver that implemented delete. So, IDeleter might be in order to round out the family of interfaces. It would probably be like the saver, but wouldn't return anything, or return a bool might be an option.
