February has been a busy month for me, and it looks like the first few weeks in March will be just as busy.
Tuesday, March 3, 2015
San Diego .NET User Group
San Diego, CA
Meetup Link
Topic: Abstract Art: Getting Things "Just Right"
I always have a lot of fun at the San Diego .NET User Group. I'm looking forward to seeing some folks I haven't seen for a while. We'll look at abstraction and how to figure out if we're getting the right level for our applications.
Saturday & Sunday, March 7 & 8, 2015
So Cal Code Camp
Fullerton, CA
Website
Lots of Topics:
o Clean Code: Homicidal Maniacs Read Code, Too!
o DI Why? Getting a Grip on Dependency Injection
o Get Func-y: Delegates in .NET
o IEnumerable, ISaveable, IDontGetIt: Understanding .NET Interfaces
o Learn the Lingo: Design Patterns
o Learn to Love Lambdas (and LINQ, Too!)
This is my local-est code camp (only a few miles from where I live). The Orange County edition of the code camp hasn't happened for a few years, so I'm really looking forward to having this back. 2 full days' worth of great information and networking opportunities -- all for FREE!
Tuesday, March 10, 2015 -- LATE ADDITION
Ontario C#, .NET, Web & Mobile Developers Group
Ontario, CA
www.NerdsLikeYou.org
Meetup Link
Topic: DI Why? Getting a Grip on Dependency Injection
This is a BRAND NEW user group in So Cal. I feel honored to speak at the inaugural meeting. I love to see the developer community grow and thrive, and this is a great opportunity for folks in San Bernardino County to interact with other devs nearby.
Thursday, March 12, 2015
EastBay.NET
Berkeley, CA
Meetup Link
Topic: Abstract Art: Getting Things "Just Right"
I'm heading back up to the San Francisco Bay Area for EastBay.NET. I'm looking forward to seeing some of my Bay Area friends that I haven't seen for a while and also meeting new people. This gives me another chance to talk about abstraction and help you figure out if you are an over-abstractor or an under-abstractor by nature.
Thursday - Saturday, March 19 - 21, 2015
Nebraska.Code()
Lincoln, NE
Website
Topic: Clean Code: Homicidal Maniacs Read Code, Too!
Nebraska.Code() is a brand new regional event that grew from a local code camp. I'm really looking forward to this event. Lots of great speakers, and I'm looking forward to meeting lots of new people since this is my first chance to speak in this part of the country.
If any of these events are nearby for you, be sure to check them out. And also be sure to stop by and say "Hi". Meeting new people is one of the big reasons to attend these events.
Happy Coding!
Thursday, February 26, 2015
Wednesday, February 25, 2015
The LINQ Join Method: Deciphering the Parameters
In the last few articles, I've been answering questions that have come up during my presentations of lambda expressions and LINQ, including whether we should use the methods with or without a predicate and how the OfType method actually works. To continue, we'll take a look at the "Join" method.
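I'm going to spend a bit of time with "Join" because it's come up twice in the last few weeks -- once in an email and once during my presentation at the Las Vegas Code Camp last weekend. When I do my presentations, I use the fluent syntax of LINQ (also known as the method syntax where we "dot" methods together). This leads to the following question:
How do you use "Join" using the fluent syntax?
The answer is that we just need to parse the parameters of the method. When we do this, we may find ourselves using 3 different lambda expressions as parameters. So let's take things step by step.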
The completed code for this can be found in the "BONUSJoinSyntax" branch of the lambdas-and-linq repository on GitHub: BONUSJoinSyntax branch. The sample is in the "JoinSyntax" project.
[Update 4/24/2015: If you'd like to see a walkthrough of this on video, check out Deciphering the Join Method]
The Sample Data
When we use "Join", we combine 2 separate but related pieces of data. For this, we'll use a collection of "Person" objects and a collection of "Order" objects. Let's take a look at the objects:
This shows our objects. Notice that the "Order" class has a "CustomerId" property. This matches up with the "Id" property of the "Person" class. So these are the properties we will use to join our data.
The data itself comes from the "People" and "Orders" classes. Notice that each of these has a single static method that returns an IEnumerable. This is just a hard-coded list of objects that we can use in our application.
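Since the original listings aren't reproduced here, the following is a minimal sketch of what these types might look like. The class, property, and method names ("Person", "Order", "Id", "CustomerId", "OrderDate", "GetPeople", "GetOrders") come from the text; the specific people and dates are made up for illustration and are not the data from the repository.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Order
{
    // Matches up with Person.Id
    public int CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class People
{
    // Hard-coded sample data (illustrative values only)
    public static IEnumerable<Person> GetPeople() => new List<Person>
    {
        new Person { Id = 1, FirstName = "Ada", LastName = "Lovelace" },
        new Person { Id = 2, FirstName = "Alan", LastName = "Turing" },
        new Person { Id = 3, FirstName = "Grace", LastName = "Hopper" },
    };
}

public static class Orders
{
    // Hard-coded sample data (illustrative values only)
    public static IEnumerable<Order> GetOrders() => new List<Order>
    {
        new Order { CustomerId = 2, OrderDate = new DateTime(2014, 12, 15) },
        new Order { CustomerId = 1, OrderDate = new DateTime(2014, 11, 3) },
        new Order { CustomerId = 1, OrderDate = new DateTime(2014, 12, 1) },
        new Order { CustomerId = 2, OrderDate = new DateTime(2015, 1, 9) },
    };
}
```

Notice that the third person has no orders in this made-up data, so that person will not show up in the joined results.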
Using "Join" With the Query Syntax
We'll start by looking at how we use "Join" with the query syntax -- it's a bit easier to understand. This code is in the "OrderReports.cs" file. The idea is that we have some methods to generate reports based on our data. Then we have a console application where we display the data.
Here's a simple LINQ query that joins data from our People and Orders:
Notice that this method returns an "IEnumerable<dynamic>". We'll come back to why we're using this in a bit. Before we get to that, let's look at the body of the method.
First we call "GetPeople" and "GetOrders" to get some data to work with.
Next we create a LINQ query using the query syntax. "from p in people" says that we want to get data from the "people" collection and give it the alias "p". Then we have the join: "join o in orders" says that we want to add data from the "orders" collection, and we'll give it an alias of "o". Then we have the "on" clause. This specifies how the data is related. In this case, we say that the "Id" property of a "Person" is the same as the "CustomerId" property of an "Order" (the query syntax uses the "equals" keyword for this rather than "=="). (And this is exactly what we noted above.)
Finally, we have a "select" statement. For this we are using an anonymous type, which is why we have the "new" keyword without a specific type name. Inside the curly braces, we specify that we want our anonymous type to include the "LastName", "FirstName", and "OrderDate" properties.
The "var" keyword was created just for this purpose: so that we could have types without names. This is still strongly-typed, but the type only has an internal name. We can see this by putting our mouse over the "var" next to "orderDates".
This shows that "orderDates" is an "IEnumerable<T>". Right in the middle we see that "T is 'a". "'a" is the internal name that the compiler gives to our anonymous type. And at the very bottom, we can see the "'a" consists of 3 properties: LastName, FirstName, and OrderDate. (Also notice that the types for the properties are filled in as well.)
Back to the return type for the method: "IEnumerable<dynamic>". The reason I used "dynamic" is so that we can use the anonymous type in our output. Using "dynamic" is not ideal (as we'll see in just a bit), and I'd probably make this a named type in a reporting library -- but we won't worry about this today.
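To make the description concrete, here is a sketch of what such a method could look like. The method name "OrderDatesByCustomer1" and the "IEnumerable<dynamic>" return type come from the text; the "OrderReports" class name is an assumption based on the file name, and the sample data is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Order
{
    public int CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class People
{
    // Illustrative data only
    public static IEnumerable<Person> GetPeople() => new List<Person>
    {
        new Person { Id = 1, FirstName = "Ada", LastName = "Lovelace" },
        new Person { Id = 2, FirstName = "Alan", LastName = "Turing" },
    };
}

public static class Orders
{
    // Illustrative data only
    public static IEnumerable<Order> GetOrders() => new List<Order>
    {
        new Order { CustomerId = 2, OrderDate = new DateTime(2014, 12, 15) },
        new Order { CustomerId = 1, OrderDate = new DateTime(2014, 11, 3) },
        new Order { CustomerId = 1, OrderDate = new DateTime(2014, 12, 1) },
    };
}

public static class OrderReports
{
    public static IEnumerable<dynamic> OrderDatesByCustomer1()
    {
        var people = People.GetPeople();
        var orders = Orders.GetOrders();

        // "orderDates" is an IEnumerable of an anonymous type
        var orderDates = from p in people
                         join o in orders on p.Id equals o.CustomerId
                         select new { p.LastName, p.FirstName, o.OrderDate };

        // The anonymous type has no name we can write, so it goes out as dynamic
        return orderDates;
    }
}
```

A caller can loop over the result with "foreach (var item in OrderReports.OrderDatesByCustomer1())" and read "item.LastName", but since each item is "dynamic", the property names are only checked at runtime.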
Running the Report
Now that we have a query, let's use it to output some data. For this, we have a simple console application in our project: "Program.cs".
Here's the code:
This calls the "OrderDatesByCustomer1" method that we just created. Then it iterates through each item and outputs it to the console.
If we put the mouse over the "var" with "item", we'll see that this is a "dynamic" object:
This means that we get absolutely zero IntelliSense on this. So inside the "foreach" loop, when we type "item." we don't get any help from Visual Studio. We have to type in "OrderDate", "FirstName", and "LastName" ourselves (and spelling and capitalization count). This is the reason why I would not use "dynamic" in a production reporting library -- it is non-discoverable, meaning we have to have intimate knowledge of the data that is coming back. (I'll update this code in a future article, but let's not let it distract us from looking at the LINQ queries.)
Here's the output when we run the console application:
We have about 20 records, and the first thing I notice is that they would benefit from being sorted. If you're curious about where this order came from, the records are in the order of the people first (which is the "outer" part of our join) and then in date order secondarily (which is the "inner" part of our join). And this is simply the order in which the data comes out of the methods (not alphabetical or by date).
Let's do a few things to make this data easier to look at.
Adding Sorting and Filtering
Let's start by putting our results into order by date. That makes most sense based on the data we get back from this report. For this, we'll add an "orderby" statement to our query:
This brings some order to our results:
Next we'll add a date range filter. For this, we'll add a "where" statement:
Notice that we added "startDate" and "endDate" parameters to our method. We use these in the query with a new "where" statement to filter our data.
We need to modify our console application a bit to add the parameters. In this case, we're looking for items in the month of December 2014:
And here's the output:
Now we have a smaller set of records to work with. And this gives us a result set for us to try to match by using the fluent syntax.
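Here is a sketch of how the query might look once the "where" and "orderby" are in place. The "startDate" and "endDate" parameter names come from the text; treating both ends of the range as inclusive is an assumption, as are the "OrderReports" class name and the made-up data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Order
{
    public int CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class People
{
    // Illustrative data only
    public static IEnumerable<Person> GetPeople() => new List<Person>
    {
        new Person { Id = 1, FirstName = "Ada", LastName = "Lovelace" },
        new Person { Id = 2, FirstName = "Alan", LastName = "Turing" },
    };
}

public static class Orders
{
    // Illustrative data only
    public static IEnumerable<Order> GetOrders() => new List<Order>
    {
        new Order { CustomerId = 2, OrderDate = new DateTime(2014, 12, 15) },
        new Order { CustomerId = 1, OrderDate = new DateTime(2014, 11, 3) },
        new Order { CustomerId = 1, OrderDate = new DateTime(2014, 12, 1) },
        new Order { CustomerId = 2, OrderDate = new DateTime(2015, 1, 9) },
    };
}

public static class OrderReports
{
    public static IEnumerable<dynamic> OrderDatesByCustomer1(
        DateTime startDate, DateTime endDate)
    {
        var people = People.GetPeople();
        var orders = Orders.GetOrders();

        var orderDates = from p in people
                         join o in orders on p.Id equals o.CustomerId
                         // Assumption: both ends of the range are inclusive
                         where o.OrderDate >= startDate && o.OrderDate <= endDate
                         orderby o.OrderDate
                         select new { p.LastName, p.FirstName, o.OrderDate };
        return orderDates;
    }
}
```

Calling this with "new DateTime(2014, 12, 1)" and "new DateTime(2014, 12, 31)" keeps only the December 2014 orders from the sample data, sorted by date.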
Using "Join" With the Fluent Syntax
Let's try to do this same thing using the fluent LINQ syntax. In my presentation for "Learn to Love Lambdas (and LINQ, Too)", I show how to read the method signatures for several LINQ methods, including "Where", "OrderBy", and "SingleOrDefault". I won't repeat all of that here because you can see that in the downloadable materials (and in a video series that I'll be publishing in the next few weeks).
But things get a bit confusing when we look at the method signature for "Join":
This has a lot of parameters. But they are related. The first 2 parameters describe the data collections that we are joining, the next 2 parameters describe how those collections are related, and the last parameter lets us transform the data into our resulting records.
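To see how the 5 parameters fit together, it can help to look at a hand-rolled version. "SimpleJoin" below is a hypothetical method written just for illustration: its parameter list mirrors the one on "Enumerable.Join", but the body is a simplified sketch and not the framework's actual implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinSketch
{
    // Same parameter list as Enumerable.Join; the body is a simplified sketch
    public static IEnumerable<TResult> SimpleJoin<TOuter, TInner, TKey, TResult>(
        this IEnumerable<TOuter> outer,
        IEnumerable<TInner> inner,
        Func<TOuter, TKey> outerKeySelector,
        Func<TInner, TKey> innerKeySelector,
        Func<TOuter, TInner, TResult> resultSelector)
    {
        // Group the inner items by key so each outer item can find its matches
        ILookup<TKey, TInner> innerByKey = inner.ToLookup(innerKeySelector);

        foreach (TOuter outerItem in outer)
        {
            // Asking the lookup for a key it doesn't have gives an empty sequence
            foreach (TInner innerItem in innerByKey[outerKeySelector(outerItem)])
            {
                yield return resultSelector(outerItem, innerItem);
            }
        }
    }
}
```

The sketch walks the outer collection first and then the matching inner items, which is why joined results come out in "outer" order first and "inner" order second.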
Let's step through them one at a time.
IEnumerable<TOuter> outer
This is one of our collections of data. In our case, this will be our collection of "Person" objects. So wherever we see the generic type "TOuter", we can replace it with "Person". In addition, since this is an extension method, this will actually be the type that we extend. (For more information on Extension Methods, see "Quick Byte: Extension Methods".)
Note: I'm using the word "collection" in the general sense. These are technically enumerations which can also represent calculated sequences.
IEnumerable<TInner> inner
This is the other collection that we are joining to. In our case, this will be our collection of "Order" objects. So wherever we see the generic type "TInner", we can replace it with "Order".
Key Selectors
Next, we have the key selectors. These are one of the confusing parts. But this just gives us a way to specify which properties are related. So for our "Person" (remember, this is "TOuter"), we want to use the "Id" property, and for our "Order" (the "TInner"), we want to use the "CustomerId" property.
Func<TOuter, TKey> outerKeySelector
The syntax for this is a bit strange: "Func<Person, TKey>". From working with Func, we know that this is a method that should take a "Person" as a parameter and return a "TKey". What does this mean? It means that we just need to provide a property that we can use in "equals" comparisons.
Note: If you aren't familiar with Func<T>, then you can take a look at "Get Func<>-y: Delegates in .NET".
In technical terms, the keys are compared using the default equality comparer for "TKey", which uses the type's "IEquatable<T>" implementation when there is one (and falls back to "Equals" and "GetHashCode" otherwise). If we're using standard .NET types (such as int, string, DateTime, etc.), we don't have to do anything special; this is all built in. But if we're using a custom type as the key, then we may need to implement the interface or pass a separate equality comparer (an "IEqualityComparer<TKey>") to an overload of "Join". But this is usually beyond what we need to do.
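As a small illustration of the "separate equality provider" case: "Join" has an overload that takes an "IEqualityComparer<TKey>" as a final parameter. The sketch below uses the built-in "StringComparer.OrdinalIgnoreCase" to match string keys that differ only by case. The customer and invoice data (and the class and method names) are hypothetical and not part of the sample project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComparerJoinExample
{
    public static List<string> JoinIgnoringCase()
    {
        // Hypothetical data: the keys differ only in casing
        var customers = new[]
        {
            new { Code = "abc", Name = "First Customer" },
            new { Code = "XYZ", Name = "Second Customer" },
        };
        var invoices = new[]
        {
            new { CustomerCode = "ABC" },
            new { CustomerCode = "xyz" },
        };

        return customers.Join(
                invoices,
                c => c.Code,                       // outer key selector
                i => i.CustomerCode,               // inner key selector
                (c, i) => c.Name + " -> " + i.CustomerCode,
                StringComparer.OrdinalIgnoreCase)  // how the keys are compared
            .ToList();
    }
}
```

Without the comparer, the default string equality is case-sensitive, so neither pair of keys would match.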
Since our "Person" class is keyed off of the "Id" property (which is an integer), the actual method signature will be "Func<Person, int>" -- a method that takes a "Person" as a parameter and returns an "integer". We'll see this in a bit.
Func<TInner, TKey> innerKeySelector
To associate our "Order" class with our "Person" class, we use the "CustomerId" property (an integer) of the Order. This means that our actual method signature for this is "Func<Order, int>" -- a method that takes an "Order" as a parameter and returns an "integer".
One thing to notice is that the generic type parameter for both of our key selectors is "TKey". This means that we must use the same type (in our case, it is an integer). This makes sense since we need to compare these based on equality. If they were different types, then they couldn't be equal.
Func<TOuter, TInner, TResult> resultSelector
This last parameter is also a bit confusing. Its purpose is to specify what we want our output to look like, just like the "select" statement in the LINQ query. And we'll use an anonymous type for our output just like we did above.
If we fill in our generic types, this means that the method signature will be "Func<Person, Order, 'a>" -- a method that takes a "Person" and an "Order" as parameters and returns an anonymous type.
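One way to see these filled-in signatures is to declare each parameter as a variable with an explicit "Func" type before passing it to "Join". This is just a sketch: the "SelectorExample" class and its method are hypothetical, and since we can't write the name of an anonymous type in a "Func<...>", a string stands in for the result type here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Order
{
    public int CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class SelectorExample
{
    public static List<string> JoinWithNamedDelegates(
        IEnumerable<Person> people, IEnumerable<Order> orders)
    {
        // Func<TOuter, TKey> with TOuter = Person and TKey = int
        Func<Person, int> outerKeySelector = p => p.Id;

        // Func<TInner, TKey> with TInner = Order and the same TKey = int
        Func<Order, int> innerKeySelector = o => o.CustomerId;

        // Func<TOuter, TInner, TResult>; string stands in for the anonymous type
        Func<Person, Order, string> resultSelector = (p, o) =>
            string.Format(CultureInfo.InvariantCulture,
                "{0}, {1}: {2:yyyy-MM-dd}", p.LastName, p.FirstName, o.OrderDate);

        return people
            .Join(orders, outerKeySelector, innerKeySelector, resultSelector)
            .ToList();
    }
}
```

In practice we would usually write the lambda expressions inline, but spelling out the types this way shows how each generic parameter gets filled in.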
Putting This All Together
Here's our initial LINQ statement using the fluent syntax:
Let's go through these parameters one at a time.
people
The first parameter is the "people" object. This is the "IEnumerable<Person>" (a.k.a. "IEnumerable<TOuter>"). Since "Join" is an extension method, our first parameter has been moved to the front.
orders
The next parameter (first inside the parentheses) is the "orders" object. This is the "IEnumerable<Order>" (a.k.a. "IEnumerable<TInner>").
p => p.Id
Next we have our first lambda expression. Remember, this is a "Func<Person, int>" in our case. So we have a lambda expression that takes a "Person" as a parameter (our "p") and returns an integer (the "Id" property).
With lambda expressions, we can name our parameters whatever we like; however, it is very common to use single character parameter names to keep lambdas short. I use "p" to remind me that this is a "Person" object.
This syntax is very similar to what we see when we use the "OrderBy" method to specify a property for sorting. (See "Learn to Love Lambdas" for details on this.)
o => o.CustomerId
Our next lambda expression is similar to the first lambda. This one is a "Func<Order, int>". So we have a lambda expression that takes an "Order" as a parameter (our "o") and returns an integer (the "CustomerId" property).
By returning these two properties ("Id" and "CustomerId"), we let our "Join" method know how to match up the records in the different collections.
(p, o) => new { p.LastName, p.FirstName, o.OrderDate }
Our last lambda expression is what we want our resulting data to look like. As noted above, this needs to match the signature "Func<Person, Order, 'a>" in this case. So we have 2 parameters: Person and Order, and I've named these "p" and "o" respectively.
The body of the lambda is the type that we want to return. And just like when we used the "select" statement above, we use the "new" keyword to generate an anonymous type. And this type specifies that we get the "LastName" and "FirstName" properties from the "Person", and the "OrderDate" property from the "Order".
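Putting the pieces from this walkthrough into a single method might look like the sketch below. The method name "OrderDatesByCustomer2" and the "OrderReports" class name are assumptions (the text only names the first method), and the sample data is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Order
{
    public int CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class People
{
    // Illustrative data only
    public static IEnumerable<Person> GetPeople() => new List<Person>
    {
        new Person { Id = 1, FirstName = "Ada", LastName = "Lovelace" },
        new Person { Id = 2, FirstName = "Alan", LastName = "Turing" },
    };
}

public static class Orders
{
    // Illustrative data only
    public static IEnumerable<Order> GetOrders() => new List<Order>
    {
        new Order { CustomerId = 2, OrderDate = new DateTime(2014, 12, 15) },
        new Order { CustomerId = 1, OrderDate = new DateTime(2014, 11, 3) },
        new Order { CustomerId = 1, OrderDate = new DateTime(2014, 12, 1) },
    };
}

public static class OrderReports
{
    // Hypothetical name for the fluent-syntax version of the report
    public static IEnumerable<dynamic> OrderDatesByCustomer2()
    {
        var people = People.GetPeople();   // IEnumerable<TOuter>
        var orders = Orders.GetOrders();   // IEnumerable<TInner>

        return people.Join(orders,
            p => p.Id,                      // outerKeySelector
            o => o.CustomerId,              // innerKeySelector
            (p, o) => new { p.LastName, p.FirstName, o.OrderDate }); // resultSelector
    }
}
```

With this sample data, the method returns the same 3 rows (in the same order) that the equivalent "from ... join ... on ... equals ... select" query would.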
A Lot of Pieces
So, it seems that we have a lot of pieces, but if we compare this to the query syntax, we see that we have the same information, just in a bit of a different format.
Once you get comfortable with lambda expressions, the fluent syntax becomes very readable. This is one reason why I teach people about lambda expressions and encourage them to use them so that they become familiar.
Using the Method
Next we'll update our console application to use the new method:
This is much the same as we have with the other method. And our output is the same as our initial output:
Let's fix this up a bit like we did with our query syntax example.
Adding Sorting and Filtering
We'll add sorting to get things into date order. One nice thing about using the fluent syntax is that we simply "dot" our methods together.
So to add sorting, we just add an "OrderBy" method call:
The "OrderBy" method needs to return a property that we can sort on. In this case, we use the "OrderDate" property. Notice that the syntax for this lambda expression is similar to what we saw with the key selectors.
This makes sense when we look at the signature for "OrderBy":
This has a "keySelector" parameter which is a "Func<TSource, TKey>" -- just like the key selectors in the "Join" parameters. The difference is that this key does a comparison (testing whether something is "greater", "lesser", or "equal") instead of an equality comparison like we have with "Join".
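Here is a small sketch of "OrderBy" on its own. The signature in the comment is the standard one from "System.Linq"; the anonymous objects stand in for the results of our "Join", and the class name, method name, and data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderByExample
{
    // Signature of the method we are calling:
    //   IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(
    //       this IEnumerable<TSource> source,
    //       Func<TSource, TKey> keySelector)
    public static List<DateTime> SortedDates()
    {
        // Stand-ins for the anonymous-type results of a Join (illustrative data)
        var results = new[]
        {
            new { LastName = "Turing", OrderDate = new DateTime(2014, 12, 15) },
            new { LastName = "Lovelace", OrderDate = new DateTime(2014, 11, 3) },
            new { LastName = "Lovelace", OrderDate = new DateTime(2014, 12, 1) },
        };

        // TSource is the anonymous type; TKey is DateTime
        return results
            .OrderBy(r => r.OrderDate)
            .Select(r => r.OrderDate)
            .ToList();
    }
}
```

Because "DateTime" can be compared for "greater" or "lesser", it works as a sort key without any extra effort on our part.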
As a side note, I used the parameter name "r" to stand for "result" -- this is the anonymous type that is returned from the "Join" method.
When we add the "OrderBy", our results are sorted:
Finally, we'll add a filter. For this, we'll use the "Where" method. This takes a predicate like we saw when we were looking at various LINQ methods and predicates earlier.
So we'll add some parameters to our method and use them in our "Where":
The predicate for "Where" is a "Func<TSource, bool>", meaning that we take a particular type as a parameter and return a true/false value. In our case, the "TSource" is the anonymous type that is returned from our "Join".
Again, I use "r" here to stand for "result" (which is our anonymous type), and the lambda expression body returns true or false depending on whether the "OrderDate" property falls within the specified date range.
The Fluent Syntax
Once we get several of these methods together, we start to see the fluent syntax. Our call ends up as "Join(...).Where(...).OrderBy(...)".
We end up "piping" the results from one method to another. "Join" returns an IEnumerable<T>, which we use as the input for "Where". "Where" returns an IEnumerable<T>, which we use as the input for "OrderBy". And "OrderBy" returns an IOrderedEnumerable<T> which we use as our result.
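The whole chain might look something like the sketch below. As before, the "OrderDatesByCustomer2" method name and "OrderReports" class name are assumptions, the date range is treated as inclusive on both ends (also an assumption), and the data is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Order
{
    public int CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class People
{
    // Illustrative data only
    public static IEnumerable<Person> GetPeople() => new List<Person>
    {
        new Person { Id = 1, FirstName = "Ada", LastName = "Lovelace" },
        new Person { Id = 2, FirstName = "Alan", LastName = "Turing" },
    };
}

public static class Orders
{
    // Illustrative data only
    public static IEnumerable<Order> GetOrders() => new List<Order>
    {
        new Order { CustomerId = 2, OrderDate = new DateTime(2014, 12, 15) },
        new Order { CustomerId = 1, OrderDate = new DateTime(2014, 11, 3) },
        new Order { CustomerId = 1, OrderDate = new DateTime(2014, 12, 1) },
        new Order { CustomerId = 2, OrderDate = new DateTime(2015, 1, 9) },
    };
}

public static class OrderReports
{
    // Hypothetical name for the fluent-syntax report with a date filter
    public static IEnumerable<dynamic> OrderDatesByCustomer2(
        DateTime startDate, DateTime endDate)
    {
        var people = People.GetPeople();
        var orders = Orders.GetOrders();

        return people
            // Join returns an IEnumerable<T> of the anonymous type...
            .Join(orders, p => p.Id, o => o.CustomerId,
                  (p, o) => new { p.LastName, p.FirstName, o.OrderDate })
            // ...which Where filters (assumption: inclusive range)...
            .Where(r => r.OrderDate >= startDate && r.OrderDate <= endDate)
            // ...and OrderBy sorts, giving us an IOrderedEnumerable<T>
            .OrderBy(r => r.OrderDate);
    }
}
```

Each method picks up where the previous one left off, so adding another step is just a matter of "dotting" one more call onto the end.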
As we need more functionality (such as grouping or aggregation), we simply "dot" more methods together in the order that we need.
Updated Console Application
If we update our console application to pass in the date parameters, we can run our new method. The final console application runs both of our methods: the query syntax and the fluent syntax. This lets us see the results side-by-side (sort of):
And we can see that our results are the same.
As a reminder, the code is available on GitHub: BONUSJoinSyntax branch of jeremybytes/lambdas-and-linq.
Query Syntax or Fluent Syntax?
So the question you probably have is whether you should use the query syntax or the fluent syntax. The answer is that it is entirely up to you. As mentioned previously, it's most important that you are consistent.
My personal preference has been to use the fluent syntax. This is primarily because there are many, many LINQ methods to choose from that are not available in the query syntax. This means that we need to mix query syntax with fluent syntax to get full advantage of LINQ. Because of this, I have a tendency to use the fluent syntax. This lets me stay in the mindset of "Func" and lambda expressions. And since I really love lambdas, this isn't very hard for me.
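As a quick illustration of that mixing, "Count" is one of the methods that has no query syntax keyword in C#, so we end up wrapping the query in parentheses and "dotting" the method on the end. The class and method names below are hypothetical, and the inclusive date range is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MixedSyntaxExample
{
    // Mixed: query syntax for the filter, fluent syntax for Count
    public static int CountInRangeMixed(
        IEnumerable<DateTime> orderDates, DateTime startDate, DateTime endDate)
    {
        return (from d in orderDates
                where d >= startDate && d <= endDate
                select d).Count();
    }

    // Fluent only: Count has an overload that takes the predicate directly
    public static int CountInRangeFluent(
        IEnumerable<DateTime> orderDates, DateTime startDate, DateTime endDate)
    {
        return orderDates.Count(d => d >= startDate && d <= endDate);
    }
}
```

Both methods give the same answer; which one reads better is a matter of preference.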
Wrap Up
The good news is that "Join" is one of the more complex LINQ methods out there. That means if you can understand how to parse the parameters for "Join", you're probably all set to parse the parameters for all of the other LINQ methods as well.
"Join" does have a lot of parameters. But as we've seen, if we take things one step at a time (and don't get frightened off), it's not very hard to implement. Although there are 5 parameters, the first 2 are the collections that we are joining, the next 2 are lambda expressions to tell how the collections are related, and the last parameter is the data transformation to the output that we want.
With this under our belt, we're well on our way to using LINQ methods effectively in our code.
Happy Coding!
I'm going to spend a bit of time with "Join" because it's come up twice in the last few weeks -- once in an email and once during my presentation at the Las Vegas Code Camp last weekend. When I do my presentations, I use the fluent syntax of LINQ (also known as the method syntax where we "dot" methods together). This leads to the following question:
How do you use "Join" using the fluent syntax?The answer is that we just need to parse the parameters of the method. When we do this, we may find ourselves using 3 different lambda expressions as parameters. So lets take things step by step.
The completed code for this can be found in the "BONUSJoinSyntax" branch of the lambdas-and-linq repository on GitHub: BONUSJoinSyntax branch. The sample is in the "JoinSyntax" project.
[Update 4/24/2015: If you'd like to see a walkthough of this on video, check out Deciphering the Join Method]
The Sample Data
When we use "Join", we combine 2 separate but related pieces of data. For this, we'll use a collection of "Person" objects and a collection of "Order" objects. Let's take a look at the objects:
From People.cs |
From Orders.cs |
This shows our objects. Notice that the "Order" class has a "CustomerId" property. This matches up with the "Id" property of the "Person" class. So these are the properties we will use to join our data.
The data itself comes from the "People" and "Orders" classes. Notice that each of these has a single static method that returns an IEnumerable. This is just a hard-coded list of objects that we can use in our application.
Using "Join" With the Query Syntax
We'll start by looking at how we use "Join" with the query syntax -- it's a bit easier to understand. This code is in the "OrderReports.cs" file. The idea is that we have some methods to generate reports based on our data. Then we have a console application where we display the data.
Here's a simple LINQ query that joins data from our People and Orders:
Notice that this method returns an "IEnumerable<dynamic>". We'll come back to why we're using this in a bit. Before we get to that, let's look at the body of the method.
First we call "GetPeople" and "GetOrders" to get some data to work with.
Next we create a LINQ query using the query syntax. "from p in people" says that we want to get data from the "people" collection and give it the alias "p". Then we have the join: "join o in orders" says that we want to add data from the "orders" collection, and we'll give it an alias of "o". Then we have the "on" statement. This specifies how the data is related. In this case, we say that the "Id" property of a "Person" is the same as the "CustomerId" property of an "Order". (And this is exactly what we noted above).
Finally, we have a "select" statement. For this we are using an anonymous type, which is why we have the "new" keyword without a specific type name. Inside the curly braces, we specify that we want our anonymous type to include the "LastName", "FirstName", and "OrderDate" properties.
The "var" keyword was created just for this purpose: so that we could have types without names. This is still strongly-typed, but the type only has an internal name. We can see this by putting our mouse over the "var" next to "orderDates".
This shows that "orderDates" is an "IEnumerable<T>". Right in the middle we see that "T is 'a". "'a" is the internal name that the compiler gives to our anonymous type. And at the very bottom, we can see the "'a" consists of 3 properties: LastName, FirstName, and OrderDate. (Also notice that the types for the properties are filled in as well.)
Back to the return type for the method: "IEnumerable<dynamic>". The reason I used "dynamic" is so that we can use the anonymous type in our output. Using "dynamic" is not ideal (as we'll see in just a bit), and I'd probably make this a named type in a reporting library -- but we won't worry about this today.
Running the Report
Now that we have a query, let's use it to output some data. For this, we have a simple console application in our project: "Program.cs".
Here's the code:
This calls the "OrderDatesByCustomer1" method that we just created. Then it iterates through each item and outputs it to the console.
If we put the mouse over the "var" with "item", we'll see that this is a "dynamic" object:
This means that we get absolutely zero IntelliSense on this. So inside the "foreach" loop, when we type "item." we don't get any help from Visual Studio. We have to type in "OrderDate", "FirstName", and "LastName" ourselves (and spelling and capitalization count). This is the reason why I would not use "dynamic" in a production reporting library -- it is non-discoverable, meaning we have to have intimate knowledge of the data that is coming back. (I'll update this code in a future article, but let's not let it distract us from looking at the LINQ queries.)
Here's the output when we run the console application:
We have about 20 records, and the first thing I notice is that they would benefit from being sorted. If you're curious about where this order came from, the records are in the order of the people first (which is the "outer" part of our join) and then in date order secondarily (which is the "inner" part of our join). And these are simply the order that the data comes out of the methods (not alphabetical or by date).
Let's do a few things to make this data easier to look at.
Adding Sorting and Filtering
Let's start by putting our results into order by date. That makes most sense based on the data we get back from this report. For this, we'll add an "orderby" statement to our query:
This brings some order to our results:
Next we'll add a date range filter. For this, we'll add a "where" statement:
Notice that we added "startDate" and "endDate" parameters to our method. We use these in the query with a new "where" statement to filter our data.
We need to modify our console application a bit to add the parameters. In this case, we're looking for items in the month of December 2014:
And here's the output:
Now we have a smaller set of records to work with. And this gives us a result set for us to try to match by using the fluent syntax.
Using "Join" With the Fluent Syntax
Let's try to do this same thing using the fluent LINQ syntax. In my presentation for "Learn to Love Lambdas (and LINQ, Too)", I show how to read the method signatures for several LINQ methods, including "Where", "OrderBy", and "SingleOrDefault". I won't repeat all of that here because you can see that in the downloadable materials (and in a video series that I'll be publishing in the new few weeks).
But things get a bit confusing when we look at the method signature for "Join":
This has a lot of parameters. But they are related. The first 2 parameters describe the data collections that we are joining, the next 2 parameters describe how those collections are related, and the last parameter lets us transform the data into our resulting records.
Let's step through them one at a time.
IEnumerable<TOuter> outer
This is one of our collections of data. In our case, this will be our collection of "Person" objects. So wherever we see the generic type "TOuter", we can replace it with "Person". In addition, since this is an extension method, this will actually be the type that we extend. (For more information on Extension Methods, see "Quick Byte: Extension Methods".)
Note: I'm using the word "collection" in the general sense. These are technically enumerations which can also represent calculated sequences.
IEnumerable<TInner> inner
This is the other collection that we are joining to. In our case, this will be our collection of "Order" objects. So wherever we see the generic type "TInner", we can replace it with "Order".
Key Selectors
Next, we have the key selectors. These are one of the confusing parts. But this just gives us a way to specify which properties are related. So for our "Person" (remember, this is "TOuter"), we want to use the "Id" property, and for our "Order" (the "TInner"), we want to use the "CustomerId" property.
Func<TOuter, TKey> outerKeySelector
The syntax for this is a bit strange: "Func<Person, TKey>". From working with Func, we know that this is a method that should take a "Person" as a parameter and return a "TKey". What does this mean? It means that we just need to provide a property that we can use in "equals" comparisons.
Note: If you aren't familiar with Func<T>, then you can take a look at "Get Func<>-y: Delegates in .NET".
In technical terms, this means that we need something that implements the "IEquatable" interface. If we're using standard .NET types (such as int, string, date, etc.), we don't have to do anything special; this is all built in. But if we're using a custom type, then we may need to include the interface or provide a separate equality provider. But this is usually beyond what we need to do.
Since our "Person" class is keyed off of the "Id" property (which is an integer), the actual method signature will be "Func<Person, int>" -- a method that takes a "Person" as a parameter and returns an "integer". We'll see this in a bit.
Func<TInner, TKey> innerKeySelector
To associate our "Order" class with our "Person" class, we use the "CustomerId" property (an integer) of the Order. This means that our actual method signature for this is "Func<Order, int>" -- a method that takes an "Order" as a parameter and returns an "integer".
One thing to notice is that the generic type parameter for both of our key selectors is "TKey". This means that we must use the same type (in our case, it is an integer). This makes sense since we need to compare these based on equality. If they are different types, they they couldn't be equal.
Func<TOuter, TInner, TResult> resultSelector
This last parameter is also a bit confusing. The purpose of this is to specify what we want our output to look like. Just like with our "select" statement in the LINQ query, we need to specify what we want our output to look like. And we'll use an anonymous type for our output just like we did above.
If we fill in our generic types, this means that the method signature will be "Func<Person, Order, 'a>" -- a method that takes a "Person" and an "Order" as parameters and returns an anonymous type.
Putting This All Together
Here's our initial LINQ statement using the fluent syntax:
Let's go through these parameters one at a time.
people
The first parameter is the "people" object. This is the "IEnumerable<Person>" (a.k.a. "IEnumerable<TOuter>"). Since "Join" is an extension method, our first parameter has been moved to the front.
orders
The next parameter (first inside the parentheses) is the "orders" object. This is the "IEnumerable<Orders>" (a.k.a. "IEnumerable<TInner>").
p => p.Id
Next we have our first lambda expression. Remember, this is a "Func<Person, int>" in our case. So we have a lambda expression that takes a "Person" as a parameter (our "p") and returns an integer (the "Id" property).
With lambda expressions, we can name our parameters whatever we like; however, it is very common to use single character parameter names to keep lambdas short. I use "p" to remind me that this is a "Person" object.
This syntax is very similar to what we see when we use the "OrderBy" method to specify a property for sorting. (See "Learn to Love Lambdas" for details on this.)
o => o.CustomerId
Our next lambda expression is similar to the first lambda. This one is a "Func<Order, int>". So we have a lambda expression that takes an "Order" as a parameter (our "o") and returns an integer (the "CustomerId" property).
By returning these two properties ("Id" and "CustomerId"), we let our "Join" method know how to match up the records in the different collections.
(p, o) => new { p.LastName, p.FirstName, o.OrderDate }
Our last lambda expression is what we want our resulting data to look like. As noted above, this needs to match the signature "Func<Person, Order, 'a>" in this case. So we have 2 parameters: Person and Order, and I've named these "p" and "o" respectively.
The body of the lambda is the type that we want to return. And just like when we used the "select" statement above, we use the "new" keyword to generate an anonymous type. And this type specifies that we get the "LastName" and "FirstName" properties from the "Person", and the "OrderDate" property from the "Order".
A Lot of Pieces
So, it seems that we have a lot of pieces, but if we compare this to the query syntax, we see that we have the same information, just in a bit of a different format.
Query Syntax |
Fluent Syntax |
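Since the comparison is easier to follow in code, here is a rough sketch of the two forms next to each other. The class, method, and property names are hypothetical stand-ins, not the exact code from the sample project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the sample's data types.
public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
}

public class Order
{
    public int CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class SyntaxComparison
{
    public static List<string> QuerySyntax(List<Person> people, List<Order> orders)
    {
        var result = from p in people
                     join o in orders on p.Id equals o.CustomerId
                     select new { p.LastName, p.FirstName, o.OrderDate };

        return result.Select(r => r.LastName + ", " + r.FirstName).ToList();
    }

    public static List<string> FluentSyntax(List<Person> people, List<Order> orders)
    {
        var result = people.Join(orders,
                                 p => p.Id,
                                 o => o.CustomerId,
                                 (p, o) => new { p.LastName, p.FirstName, o.OrderDate });

        return result.Select(r => r.LastName + ", " + r.FirstName).ToList();
    }
}
```

The same pieces appear in both: the two collections, the two keys ("p.Id" and "o.CustomerId"), and the shape of the output.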
Once you get comfortable with lambda expressions, the fluent syntax becomes very readable. This is one reason why I teach people about lambda expressions and encourage them to use them so that they become familiar.
Using the Method
Next we'll update our console application to use the new method:
This is much the same as we have with the other method. And our output is the same as our initial output:
Let's fix this up a bit like we did with our query syntax example.
Adding Sorting and Filtering
We'll add sorting to get things into date order. One nice thing about using the fluent syntax is that we simply "dot" our methods together.
So to add sorting, we just add an "OrderBy" method call:
The "OrderBy" method needs to return a property that we can sort on. In this case, we use the "OrderDate" property. Notice that the syntax for this lambda expression is similar to what we saw with the key selectors.
This makes sense when we look at the signature for "OrderBy":
This has a "keySelector" parameter which is a "Func<TSource, TKey>" -- just like the key selectors in the "Join" parameters. The difference is that "OrderBy" uses this key for an ordering comparison (testing whether one value is "greater than", "less than", or "equal to" another) instead of the equality comparison that "Join" uses to match records.
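As a rough sketch, the sorted version might look like the following. The signature in the comment is the "OrderBy" overload from the documentation; the "Person" and "Order" classes and the "SortedJoin" class are hypothetical stand-ins, and the final "Select" is only there so the method has a simple return type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the sample's data types.
public class Person
{
    public int Id { get; set; }
    public string LastName { get; set; } = "";
}

public class Order
{
    public int CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class SortedJoin
{
    // public static IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(
    //     this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    public static List<DateTime> OrderDatesSorted(List<Person> people, List<Order> orders)
    {
        return people
            .Join(orders,
                  p => p.Id,
                  o => o.CustomerId,
                  (p, o) => new { p.LastName, o.OrderDate })
            .OrderBy(r => r.OrderDate)   // "r" is the anonymous type from Join
            .Select(r => r.OrderDate)
            .ToList();
    }
}
```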
As a side note, I used the parameter name "r" to stand for "result" -- this is the anonymous type that is returned from the "Join" method.
When we add the "OrderBy", our results are sorted:
Finally, we'll add a filter. For this, we'll use the "Where" method. This takes a predicate like we saw when we were looking at various LINQ methods and predicates earlier.
So we'll add some parameters to our method and use them in our "Where":
The predicate for "Where" is a "Func<TSource, bool>", meaning that we take a particular type as a parameter and return a true/false value. In our case, the "TSource" is the anonymous type that is returned from our "Join".
Again, I use "r" here to stand for "result" (which is our anonymous type), and the lambda expression body returns true or false depending on whether the "OrderDate" property falls within the specified date range.
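Putting the filter and the sort together, a sketch of the whole chain could look like this. The parameter names "startDate" and "endDate" are assumptions, as is treating both ends of the range as inclusive; the data classes are hypothetical stand-ins as before.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the sample's data types.
public class Person
{
    public int Id { get; set; }
    public string LastName { get; set; } = "";
}

public class Order
{
    public int CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class FilteredJoin
{
    public static List<DateTime> OrderDatesInRange(
        List<Person> people, List<Order> orders,
        DateTime startDate, DateTime endDate)
    {
        return people
            .Join(orders,
                  p => p.Id,
                  o => o.CustomerId,
                  (p, o) => new { p.LastName, o.OrderDate })
            // Predicate: Func<'a, bool>, true when the date is in range (inclusive).
            .Where(r => r.OrderDate >= startDate && r.OrderDate <= endDate)
            .OrderBy(r => r.OrderDate)
            .Select(r => r.OrderDate)
            .ToList();
    }
}
```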
The Fluent Syntax
Once we get several of these methods together, we start to see the fluent syntax. Our call ends up as "Join(...).Where(...).OrderBy(...)".
We end up "piping" the results from one method to another. "Join" returns an IEnumerable<T>, which we use as the input for "Where". "Where" returns an IEnumerable<T>, which we use as the input for "OrderBy". And "OrderBy" returns an IOrderedEnumerable<T> which we use as our result.
As we need more functionality (such as grouping or aggregation), we simply "dot" more methods together in the order that we need.
Updated Console Application
If we update our console application to pass in the date parameters, we can run our new method. The final console application runs both of our methods: the query syntax and the fluent syntax. This lets us see the results side-by-side (sort of):
And we can see that our results are the same.
As a reminder, the code is available on GitHub: BONUSJoinSyntax branch of jeremybytes/lambdas-and-linq.
Query Syntax or Fluent Syntax?
So the question you probably have is whether you should use the query syntax or the fluent syntax. The answer is that it is entirely up to you. As mentioned previously, it's most important that you are consistent.
My personal preference has been to use the fluent syntax. This is primarily because there are many, many LINQ methods to choose from that are not available in the query syntax. This means that we need to mix query syntax with fluent syntax to get full advantage of LINQ. Because of this, I have a tendency to use the fluent syntax. This lets me stay in the mindset of "Func" and lambda expressions. And since I really love lambdas, this isn't very hard for me.
Wrap Up
The good news is that "Join" is one of the more complex LINQ methods out there. That means if you can understand how to parse the parameters for "Join", you're probably all set to parse the parameters for all of the other LINQ methods as well.
"Join" does have a lot of parameters. But as we've seen, if we take things one step at a time (and don't get frightened off), it's not very hard to implement. Although there are 5 parameters, the first 2 are the collections that we are joining, the next 2 are lambda expressions to tell how the collections are related, and the last parameter is the data transformation to the output that we want.
With this under our belt, we're well on our way to using LINQ methods effectively in our code.
Happy Coding!
Tuesday, February 24, 2015
The LINQ OfType<TResult> Method: Cast or Filter?
So, we're continuing a bit more with LINQ questions that I've had come up recently. Last time, we looked at the difference between using a LINQ method with a predicate or without. Today, we move on to another useful filtering tool: OfType<TResult>.
I use this method in one of my samples and have had some questions come up:
When you use OfType, does it do a cast? What happens if the type doesn't match?
Let's take a closer look at OfType, what it really does, and where I find it useful.
OfType in Action
Last time, we looked at the SingleOrDefault method, and we used the following sample code that saves off a selected item in a list box and then re-assigns it after the list box is repopulated:
Saving the Selected Item |
Re-assigning the Selection |
Notice that we're using "OfType<Person>()" on the contents of the list box. This is because the "Items" property is not strongly typed.
Let's look at the signature of "OfType":
This is an extension method (just like every other LINQ method). And if we take a closer look, we see this takes an "IEnumerable" as a parameter. Notice that the parameter does *not* have a generic type parameter. So this is an IEnumerable that contains "object" elements. The return value is an IEnumerable *with* a generic type parameter. So the result is strongly typed.
Cast or Filter?
If we just look at the method signature, we might be inclined to believe that this method is doing a cast -- coercing the contained items from "object" to the type of our choice. But this is not what is happening.
OfType<TResult>() is a Filter (not a cast).
The reality is that the "OfType" method is simply a filter, just like many other LINQ methods.
No Exceptions
This also means that if the type does not match the object in the IEnumerable, it does *not* throw an exception. Instead, it simply does not return that record.
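A small sketch makes the difference easy to see. Here we run "OfType<string>()" and "Cast<string>()" against the same mixed, object-based collection. The "ArrayList" contents and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class OfTypeSketch
{
    // A non-generic IEnumerable with mixed element types (sample data).
    public static IEnumerable MixedItems()
    {
        return new ArrayList { "apple", 42, "banana", 3.5 };
    }

    // OfType filters: elements that are not strings are simply skipped.
    public static List<string> OnlyStrings()
    {
        return MixedItems().OfType<string>().ToList();
    }

    // Cast converts every element, so it throws on the first non-string.
    public static bool CastThrows()
    {
        try
        {
            MixedItems().Cast<string>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

With this data, "OnlyStrings" gives back just the two strings, while "CastThrows" reports that "Cast" failed when it reached the integer.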
A Useful Tool
I find the "OfType<TResult>" method to be a useful little tool when I want to convert an object-based collection into a strongly-typed collection. (I'm using "collection" in the general sense here. Technically, we're dealing with iterators (IEnumerable), and these do not need to represent a collection of objects; they could just as easily be a sequence that is generated on-the-fly as the elements are requested.)
In our case, I want to use LINQ methods against the "Items" property of a WPF list box. If we look at the "Items" property, we see that it is an "ItemCollection":
And if we look at "ItemCollection", we see that it implements "IEnumerable" (which represents items of type "object"), but not "IEnumerable<T>" (a strongly-typed enumeration):
Pretty much all of the LINQ methods work on "IEnumerable<T>", like we saw with our "SingleOrDefault" method last time:
This means that we need to turn our "IEnumerable" into a strongly-typed "IEnumerable<T>". And that's exactly what "OfType<TResult>" does for us.
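We can't spin up a WPF list box in a small snippet, so the sketch below uses a plain non-generic "IEnumerable" as a stand-in for the "Items" property. The "Person" class, the "SelectionSketch" class, and the "FindByFirstName" method are assumptions for illustration only.

```csharp
using System;
using System.Collections;
using System.Linq;

// Hypothetical stand-in for the sample's data type.
public class Person
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
}

public static class SelectionSketch
{
    // "items" plays the role of listBox.Items: IEnumerable, not IEnumerable<T>.
    public static Person FindByFirstName(IEnumerable items, string firstName)
    {
        // items.SingleOrDefault(...) would not compile here,
        // so we go through OfType<Person>() first.
        return items.OfType<Person>()
                    .SingleOrDefault(p => p.FirstName == firstName);
    }
}
```

If nothing matches, the method returns null, which lines up with how we used "SingleOrDefault" last time.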
Where I Use OfType<TResult>
I find myself using the "OfType" method when dealing with UI controls, like in the examples we saw today. This is because the collection controls are designed to handle pretty much anything we can throw at them -- and that's one of the really cool things about XAML controls and XAML data binding.
But this also means that if we need to pull data out of a UI control, we need to cast it in order to work with it more easily. And that's exactly what "OfType" lets us do.
Again, this isn't doing a cast, but the outcome is pretty much the same. We put an IEnumerable of object in one end and get a strongly-typed IEnumerable<T> out the other end.
When we're using data binding, we often don't need to worry about this. If we have a strongly-typed collection in our view model, we can data bind it to the UI control with no problem at all. And the data binding infrastructure takes care of any type changes for us.
But there are times when we need to pull things out manually in order to work with them. In those situations, the "OfType" method makes things much easier.
Wrap Up
I tell people all the time that they need to take a look at all of the LINQ extension methods that are available to us. Go take another look right now: IEnumerable<T> Extension Methods. This list contains lots of useful methods. And there are a few gems in there that you may not know exist.
LINQ is one of my favorite things. That's probably because I deal with collections of in-memory objects all the time. I get data from a data source, and then I need to manipulate it a little bit for the UI. LINQ lets me do that very easily -- and the end result is happier users.
Next time, we'll take a look at the "Join" method. This is a little more complex, but it's not too hard to understand the parameters when we take them one at a time. So stay tuned for more on LINQ.
Happy Coding!
Monday, February 23, 2015
LINQ Methods With or Without Predicates: What's the Difference?
I really like lambda expressions, so I talk about them quite a bit in my live presentations, including Learn to Love Lambdas (and LINQ, Too!). I often get questions about different parts of LINQ, so it's time to put some articles together about them.
For the first question:
What's the difference between calling Single with a parameter and calling Where and then Single without a parameter?
To answer this, let's first look at the code that inspired this.
The Declarative Programming Approach
In my sample code, I show the difference between imperative programming (where we tell the computer how to do something) and declarative programming (where we tell the computer what we want and let it figure out how to do it). You can see an explanation of this here: LINQ and the Functional Approach.
In this block of code, we want to save off the currently selected item of a list box so that we can re-assign the selection after reloading data in the list box. Here are the relevant bits of code:
Saving the Selected Item |
Re-assigning the Selection |
In this code, we use the overload of "SingleOrDefault" that takes a predicate as a parameter. Here's the method signature from the documentation:
The "predicate" parameter is a "Func<TSource, bool>". This means that we need to provide a delegate that takes a "Person" as a parameter (since our collection has "Person" objects in it), and it needs to return a "bool" value: true or false. In our example, we use a lambda expression (because lambdas are really awesome).
So, this is a filter on our collection. The "SingleOrDefault" method expects to find a single matching item or no matching item. If it finds a single item, it returns that; if it finds no matching items, it returns the default value for the type ("null" for reference types and bitwise 0 for value types). If it happens to find more than one matching item, it throws an exception.
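Here is a small sketch of those three outcomes. The data and the class and method names are made up for illustration; the exception type ("InvalidOperationException") is what the documentation lists for more than one match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleOrDefaultSketch
{
    // Sample data: "Alan" appears twice on purpose.
    private static readonly List<string> Names =
        new List<string> { "Ada", "Alan", "Grace", "Alan" };

    // Exactly one match: returns that item.
    public static string OneMatch()
    {
        return Names.SingleOrDefault(n => n == "Ada");
    }

    // No match on a reference type: returns null.
    public static string NoMatch()
    {
        return Names.SingleOrDefault(n => n == "Linus");
    }

    // No match on a value type: returns default(int), which is 0.
    public static int NoMatchValueType()
    {
        return new[] { 1, 2, 3 }.SingleOrDefault(n => n > 10);
    }

    // More than one match: throws.
    public static bool ManyMatchesThrow()
    {
        try
        {
            Names.SingleOrDefault(n => n == "Alan");
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```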
Another Way to Call SingleOrDefault
Now, there is another overload for "SingleOrDefault" -- in this case, it takes no parameters at all. (Well, technically one parameter since it's an extension method; but no additional parameters.) Here's the method signature:
On its own, this isn't very useful, but it's often used in conjunction with the "Where" method. Here's our example from above re-written in this format:
We call "Where" using the same lambda expression as we used above. But notice that after we call "Where", we make a call to "SingleOrDefault" with no parameters.
Here's the method signature for "Where":
As we see, this uses the same "Func<TSource, bool>" as "SingleOrDefault". That means we can use the same lambda expression with the "Where" method, and we'll end up with the same result.
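A quick sketch of the two forms side by side, using a hypothetical "Person" class with an "Id" property (the class and method names are assumptions, not the sample's exact code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the sample's data type.
public class Person
{
    public int Id { get; set; }
    public string LastName { get; set; } = "";
}

public static class PredicatePlacement
{
    // Predicate passed straight to SingleOrDefault.
    public static Person WithPredicate(IEnumerable<Person> people, int id)
    {
        return people.SingleOrDefault(p => p.Id == id);
    }

    // Same lambda passed to Where, then SingleOrDefault with no parameters.
    public static Person WithWhere(IEnumerable<Person> people, int id)
    {
        return people.Where(p => p.Id == id).SingleOrDefault();
    }
}
```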
What's the Difference?
So the question that I get from time to time is "What's the difference?" Does it really matter which of these options we use?
Behaviorally, there is no difference.
Whether we put the predicate as a parameter in the specific method or use a separate "Where" method, the results will be the same.
Differences in Readability
There are some differences when we talk about readability. I prefer to use the method syntax for LINQ (also often referred to as the "fluent" syntax). This is where we "dot" our methods together.
When using the fluent syntax, I prefer to put the filter conditional into the method itself (and not have the separate "Where"). This results in a shorter syntax since we have one less method in our chain.
But it's also common to use the query syntax. Here's how we would do the same thing using the query syntax in LINQ:
The query syntax lets us write something that looks more like a SQL query. Here we use the keywords "from", "in", "where", and "select". (And there are others as well. We'll be looking at "join" in an upcoming article.)
One of the limitations of using the query syntax is that not all LINQ functionality is available -- "SingleOrDefault" is one of these unavailable items. So if we want to use this, we wrap our entire query in parentheses and then we can use the method syntax to call "SingleOrDefault" (or "First" or "Count" or many others).
When we use the query syntax, it's very common to use the LINQ method that does *not* take a predicate. It's a bit more readable because we can put more things into the "query" part of the statement, and then add on "SingleOrDefault" right at the end.
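As a sketch, wrapping a query in parentheses and tacking "SingleOrDefault" on the end looks something like this. The "Person" class and the "FindById" method are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the sample's data type.
public class Person
{
    public int Id { get; set; }
    public string LastName { get; set; } = "";
}

public static class QueryThenMethod
{
    public static Person FindById(IEnumerable<Person> people, int id)
    {
        // The filter lives in the "where" clause of the query;
        // SingleOrDefault takes no predicate and is called on the whole query.
        return (from p in people
                where p.Id == id
                select p).SingleOrDefault();
    }
}
```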
Similar LINQ Methods
I really love LINQ. There are so many methods that are very easy to use. Be sure to look through the whole list sometime: IEnumerable<T> Extension Methods.
If we go through this list, we find quite a few that have overloads that take no parameter or a "Func<TSource, bool>". Here are some of my favorites:
"Single" will return a single item. If it finds no items or more than one item, it throws an exception. And we've already seen how "SingleOrDefault" works.
This also has a useful cousin:
"First" will return a single item. If it finds more than one item, it returns the first one it comes to (but unlike "Single", it does not throw an exception if there is more than one match). If it finds no items, it throws an exception. You can probably figure out what "FirstOrDefault" does based on what we've seen so far.
If you're looking for something a little bit different, be sure to look up "Last" and "LastOrDefault".
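To make the "Single" versus "First" behavior concrete, here is a small sketch against a made-up array of numbers. The class and method names are assumptions; the exception type comes from the documentation for these methods.

```csharp
using System;
using System.Linq;

public static class SingleVsFirst
{
    // Sample data: two values are greater than 10, none is greater than 100.
    private static readonly int[] Numbers = { 5, 8, 12, 15 };

    private static bool Throws(Func<int> query)
    {
        try
        {
            query();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // First is happy with several matches and returns the first one.
    public static int FirstOver10() { return Numbers.First(n => n > 10); }

    // Last returns the final match instead.
    public static int LastOver10() { return Numbers.Last(n => n > 10); }

    // Single throws when more than one item matches.
    public static bool SingleOver10Throws() { return Throws(() => Numbers.Single(n => n > 10)); }

    // First throws when nothing matches...
    public static bool FirstOver100Throws() { return Throws(() => Numbers.First(n => n > 100)); }

    // ...but FirstOrDefault returns the default value instead.
    public static int FirstOrDefaultOver100() { return Numbers.FirstOrDefault(n => n > 100); }
}
```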
In addition, we have "Count" and "Any":
"Count" tells us how many items are in the collection, and "Any" returns "true" if the collection is not empty and "false" if it is. This is very useful if we want to know if there is at least one matching item (but we don't really care what the item is).
None of these methods have matching keywords in the LINQ query syntax. This means that if we want to use them, we need to tack them on to the end of our query like we did in the example above. In those instances, it makes sense to use the versions with no parameters (and put the filter inside the query part).
But if we're using the fluent syntax, it's often easier to read if we use the versions that take the predicate as a parameter.
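For "Count" and "Any", a sketch of both overloads (with and without the predicate) might look like this. The numbers and names are made up for illustration.

```csharp
using System;
using System.Linq;

public static class CountAndAnySketch
{
    private static readonly int[] Numbers = { 5, 8, 12, 15 };

    // No predicate: counts everything.
    public static int CountAll() { return Numbers.Count(); }

    // Predicate version: counts only the matching items.
    public static int CountOver10() { return Numbers.Count(n => n > 10); }

    // No predicate: true if the sequence has at least one item.
    public static bool AnyAtAll() { return Numbers.Any(); }

    // Predicate version: true if at least one item matches.
    public static bool AnyOver100() { return Numbers.Any(n => n > 100); }

    // Same question asked with a separate Where.
    public static bool AnyOver100ViaWhere() { return Numbers.Where(n => n > 100).Any(); }
}
```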
Wrap Up
So behaviorally, it doesn't really matter which version we choose. That means we need to think about readability. This often comes down to preference. There are many developers who prefer the query syntax, and there are many developers who prefer the fluent syntax.
The best advice I can give is be consistent. If we use similar methods in similar ways throughout our code, it will be easier to read and follow. But if we mix things up and use different syntax in different parts of our code, our brains have to do a context switch to understand what's going on. It's best to avoid this if we can.
I have a few more questions to answer about LINQ. Specifically, we'll take a quick look at the "OfType" method, and we'll also take a detailed look at the parameters of the "Join" method. If we use the fluent syntax with "Join", we usually end up with 3 separate lambda expressions as parameters. It's not difficult to understand once we break it down a bit. So, stay tuned for that.
I really love LINQ. Once you learn to decode the method signatures from the documentation, it's very easy to take advantage of the power and flexibility that these methods provide.
Happy Coding!
Monday, February 16, 2015
More TDD Practice: Finishing Up the Library
Yesterday, I did some coding practice: using test-driven development to implement a library that makes a service call and then parses the data. There were a few things left undone, so today I coded those up (also using TDD).
You can get the code for this in the sunset-tdd branch of the GitHub project: jeremybytes/house-control. And a collection of all the articles for this project is available here: Rewriting a Legacy Application.
As a reminder, the classes that we are working with are SunsetTDD and SunsetTDDTest.
Implementing GetSunrise
The first step is pretty easy: we just need to implement the "GetSunrise" method of our class. This is pretty similar to our "GetSunset" method.
We'll start with a test:
This looks like our test for "GetSunset" that we saw yesterday. It uses a mock object to get the service data, and our expected output is February 15, 2015 at 6:35:18 a.m.
Of course, this fails. That's because our method is still not implemented:
But we'll grab the functionality from our other method (with appropriate changes for sunrise):
With this code in place, we have a passing test. (And we can see this in the "1/1 passing" note of the Code Lens information.)
Caching Functionality
Now we need to move on to something a bit more difficult: caching. We don't want to make a service call every single time we need the sunrise or sunset data. So, we'll make the service call and save off that data (with the date that the data refers to). Since we're generally only dealing with one day at a time, this is sufficient to reduce the number of service calls that the application needs to make.
Caching Test #1
We'll start by writing a unit test to test for the caching functionality:
Just like with other tests, we create a mock object to supply us with the data. But unlike the other tests, we're calling the "GetSunset" method twice with the same date parameter. Ideally, we should only make one service call in this scenario.
And that's exactly what Moq allows us to check for. Notice the last line of our method. We're using our mock object ("serviceMock") and using the "Verify" method to see how many times a particular method is called. In this case, we want to know how many times the "GetServiceData" method is called. This is the method that actually gets data from the service.
We can see from the 2nd parameter of the "Verify" method that we are expecting this to be called only one time. But that's not what's happening (as we can see from the failing unit test).
If we look at the unit test message, we see why the test failed:
This tells us exactly what we expect to find. Our unit test is expecting that the service will be called once, but it is actually called 2 times. No surprise since we have not yet implemented the cache.
Cache Implementation
To implement the cache, we'll add 2 fields to our class:
These will hold the data that actually comes from the service as well as the date that was used as a parameter to get that data. Again, since we're only dealing with one date at a time (generally), this simple cache will work for us.
To populate the cache, we've created a new method:
This method is fairly straight-forward. It checks to see if the cache date is the same as the date that we're looking for. If it is *not* the same, then it calls the service to get fresh data and populates our 2 cache fields. If the dates do match, then we simply return our cached data.
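The general shape of this kind of single-entry cache can be sketched as follows. This is not the code from the project: the interface, class, and field names are assumptions, the service data is treated as a plain string, and a hand-rolled counting class stands in for the Moq mock so that the snippet has no external dependencies.

```csharp
using System;

// Hypothetical service abstraction.
public interface ISunsetService
{
    string GetServiceData(DateTime date);
}

// Hand-rolled test double that counts service calls
// (a stand-in for what the mock's "Verify" checks).
public class CountingService : ISunsetService
{
    public int CallCount { get; private set; }

    public string GetServiceData(DateTime date)
    {
        CallCount++;
        return "payload #" + CallCount;
    }
}

public class CachedSunsetData
{
    private readonly ISunsetService service;

    // The 2 cache fields: the data and the date it belongs to.
    // Assumption: a nullable date marks the "nothing cached yet" state.
    private string cacheData = "";
    private DateTime? cacheDate;

    public CachedSunsetData(ISunsetService service)
    {
        this.service = service;
    }

    public string GetServiceData(DateTime date)
    {
        // Only call the service when the requested date differs from the cached one.
        if (cacheDate != date)
        {
            cacheData = service.GetServiceData(date);
            cacheDate = date;
        }
        return cacheData;
    }
}
```

Asking for the same date twice should leave "CallCount" at 1, and asking for a different date should bump it to 2. Any method that goes through this "GetServiceData" (sunrise or sunset) ends up sharing the same cached entry.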
The last step is to use this in our "GetSunset" method. This is as simple as swapping out our call to "SunsetService.GetServiceData" (that actually calls the service) with our new method "GetServiceData" (which will use the cache).
With this in place, our test passes. But we have a few more scenarios to test.
Caching Test #2
Our next scenario is to test the cache by calling "GetSunrise" multiple times. Here's our test:
And this test fails. That's because we still need to update the "GetSunrise" method to use our new caching method.
And that's easy enough to do:
Now our test passes. In my normal coding, I would tend to update *both* the "GetSunset" and "GetSunrise" methods at the same time. But since I'm practicing my TDD, I've been resisting the urge to update code before I've written a test for it.
Additional Tests
Just because our cache is working doesn't mean that we're done with testing.
Caching Test #3
As another scenario, I want to check that "GetSunset" and "GetSunrise" both share the same cache. Here's the test:
Notice that instead of calling the same method twice, we call "GetSunrise" one time and "GetSunset" one time (both with the same date parameter). Our expectation is that our service only gets called once due to the cache.
And that's exactly what we see. This test passes without us needing to modify the code.
Caching Test #4
As our last test, I want to make sure that the cache is *not* used when we call methods with different date parameters. Here's that test:
Here we have 2 different date variables ("date1" and "date2"). And we use these to call the "GetSunset" method with different parameters. This time, our "Verify" is a little different: it expects that our service will be called exactly 2 times.
And that's the behavior that we get. This test passes without needing to modify any of our code.
"Test First" Does Not Mean We Stop When the Code is Written
The moral of this is that we aren't necessarily done creating tests after our code is in place. We need to make sure that we test various scenarios to make sure that we have a good set of tests in place. So even if we have "TDD'd" all of our code creation, we need to check for thoroughness in our tests.
There are probably a few more test scenarios that I need for this library. And I'll review the tests some more to look for gaps. (This is one of the things I hope that Smart Unit Tests will be able to help us with once it is released.)
Wrap Up
More coding practice is good. I'm glad that I kept going with this. I wasn't quite sure how I would handle the caching functionality. And it turns out that I ended up with similar code to my original implementation. The methods are a little bit different (and I think a little bit cleaner).
The more I do this, the more I understand the advantages and disadvantages of using TDD. I still need to try this with different types of code (including libraries that call into a database or do data validation). But this is a good start, and I'll keep exploring.
I encourage you to explore different techniques and technologies. Pick the ones that are most useful to you and incorporate them into your development process.
Happy Coding!
You can get the code for this in the sunset-tdd branch of the GitHub project: jeremybytes/house-control. A collection of all the articles for this project is available here: Rewriting a Legacy Application.
As a reminder, the classes that we are working with are SunsetTDD and SunsetTDDTest.
Implementing GetSunrise
The first step is pretty easy: we just need to implement the "GetSunrise" method of our class. This is pretty similar to our "GetSunset" method.
We'll start with a test:
This looks like our test for "GetSunset" that we saw yesterday. It uses a mock object to get the service data, and our expected output is February 15, 2015 at 6:35:18 a.m.
Of course, this fails. That's because our method is still not implemented:
But we'll grab the functionality from our other method (with appropriate changes for sunrise):
With this code in place, we have a passing test. (And we can see this in the "1/1 passing" note of the CodeLens information.)
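A minimal sketch of what this test and method might look like. The "ISunsetService" interface and the "SolarData" type are hypothetical stand-ins (the real service returns data that the class has to parse, and its interface may be named differently); the class names, the "SunsetService" property, and "serviceMock" follow the article.

```csharp
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;

// Hypothetical stand-ins for the real service types.
public class SolarData
{
    public TimeSpan Sunrise { get; set; }
    public TimeSpan Sunset { get; set; }
}

public interface ISunsetService
{
    SolarData GetServiceData(DateTime date);
}

public class SunsetTDD
{
    public ISunsetService SunsetService { get; set; }

    public DateTime GetSunset(DateTime date)
    {
        SolarData data = SunsetService.GetServiceData(date);
        return date.Date + data.Sunset;
    }

    // Same shape as GetSunset, but reads the sunrise value instead.
    public DateTime GetSunrise(DateTime date)
    {
        SolarData data = SunsetService.GetServiceData(date);
        return date.Date + data.Sunrise;
    }
}

[TestClass]
public class SunsetTDDTest
{
    [TestMethod]
    public void GetSunrise_WithValidDate_ReturnsExpectedTime()
    {
        var date = new DateTime(2015, 2, 15);

        // The mock supplies canned data so no real service call is made.
        var serviceMock = new Mock<ISunsetService>();
        serviceMock.Setup(s => s.GetServiceData(It.IsAny<DateTime>()))
            .Returns(new SolarData { Sunrise = new TimeSpan(6, 35, 18) });
        var provider = new SunsetTDD { SunsetService = serviceMock.Object };

        DateTime result = provider.GetSunrise(date);

        Assert.AreEqual(new DateTime(2015, 2, 15, 6, 35, 18), result);
    }
}
```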
Caching Functionality
Now we need to move on to something a bit more difficult: caching. We don't want to make a service call every single time we need the sunrise or sunset data. So, we'll make the service call and save off that data (with the date that the data refers to). Since we're generally only dealing with one day at a time, this is sufficient to reduce the number of service calls that the application needs to make.
Caching Test #1
We'll start by writing a unit test to test for the caching functionality:
Just like with other tests, we create a mock object to supply us with the data. But unlike the other tests, we're calling the "GetSunset" method twice with the same date parameter. Ideally, we should only make one service call in this scenario.
And that's exactly what Moq allows us to check for. Notice the last line of our method. We're using our mock object ("serviceMock") and using the "Verify" method to see how many times a particular method is called. In this case, we want to know how many times the "GetServiceData" method is called. This is the method that actually gets data from the service.
We can see from the 2nd parameter of the "Verify" method that we are expecting this to be called only one time. But that's not what's happening (as we can see from the failing unit test).
If we look at the unit test message, we see why the test failed:
This tells us exactly what we expect to find. Our unit test is expecting that the service will be called once, but it is actually called 2 times. No surprise since we have not yet implemented the cache.
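As a rough sketch, the caching test described above could be written along these lines. "ISunsetService" and "SolarData" are hypothetical stand-ins for the real service types, and the test name is made up for illustration. Because the class here has no cache yet, the "Verify" call throws a Moq "MockException" when the test runs, which is the failing result we expect at this stage.

```csharp
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;

// Hypothetical stand-ins for the real service types.
public class SolarData
{
    public TimeSpan Sunrise { get; set; }
    public TimeSpan Sunset { get; set; }
}

public interface ISunsetService
{
    SolarData GetServiceData(DateTime date);
}

// No cache yet: every call goes straight to the service.
public class SunsetTDD
{
    public ISunsetService SunsetService { get; set; }

    public DateTime GetSunset(DateTime date)
    {
        SolarData data = SunsetService.GetServiceData(date);
        return date.Date + data.Sunset;
    }
}

[TestClass]
public class SunsetTDDTest
{
    [TestMethod]
    public void GetSunset_CalledTwiceWithSameDate_CallsServiceOnce()
    {
        var date = new DateTime(2015, 2, 15);
        var serviceMock = new Mock<ISunsetService>();
        serviceMock.Setup(s => s.GetServiceData(It.IsAny<DateTime>()))
            .Returns(new SolarData());
        var provider = new SunsetTDD { SunsetService = serviceMock.Object };

        // Two calls with the same date parameter.
        provider.GetSunset(date);
        provider.GetSunset(date);

        // The 2nd parameter states the expected call count.
        // Fails until the cache exists: the service is called twice, not once.
        serviceMock.Verify(s => s.GetServiceData(It.IsAny<DateTime>()), Times.Once());
    }
}
```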
Cache Implementation
To implement the cache, we'll add 2 fields to our class:
These will hold the data that actually comes from the service as well as the date that was used as a parameter to get that data. Again, since we're only dealing with one date at a time (generally), this simple cache will work for us.
To populate the cache, we've created a new method:
This method is fairly straightforward. It checks to see if the cache date is the same as the date that we're looking for. If it is *not* the same, then it calls the service to get fresh data and populates our 2 cache fields. If the dates do match, then we simply return our cached data.
The last step is to use this in our "GetSunset" method. This is as simple as swapping out our call to "SunsetService.GetServiceData" (that actually calls the service) with our new method "GetServiceData" (which will use the cache).
With this in place, our test passes. But we have a few more scenarios to test.
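Put together, the cache described in this section might look something like the sketch below. The field names "cacheData" and "cacheDate" are assumptions, as are the "ISunsetService" and "SolarData" types; the null check is an addition in this sketch to cover the very first call.

```csharp
using System;

// Hypothetical stand-ins for the real service types.
public class SolarData
{
    public TimeSpan Sunrise { get; set; }
    public TimeSpan Sunset { get; set; }
}

public interface ISunsetService
{
    SolarData GetServiceData(DateTime date);
}

public class SunsetTDD
{
    public ISunsetService SunsetService { get; set; }

    // The 2 cache fields: the service data and the date it was fetched for.
    private SolarData cacheData;
    private DateTime cacheDate;

    // Returns the cached data when the date matches; otherwise calls the
    // service and refreshes both cache fields.
    private SolarData GetServiceData(DateTime date)
    {
        if (cacheData == null || cacheDate != date)
        {
            cacheData = SunsetService.GetServiceData(date);
            cacheDate = date;
        }
        return cacheData;
    }

    public DateTime GetSunset(DateTime date)
    {
        // Previously this line called SunsetService.GetServiceData(date) directly.
        SolarData data = GetServiceData(date);
        return date.Date + data.Sunset;
    }
}
```

A single-entry cache like this is only a good fit because the application generally asks about one date at a time; each new date simply replaces the previous entry.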
Caching Test #2
Our next scenario is to test the cache by calling "GetSunrise" multiple times. Here's our test:
And this test fails. That's because we still need to update the "GetSunrise" method to use our new caching method.
And that's easy enough to do:
Now our test passes. In my normal coding, I would tend to update *both* the "GetSunset" and "GetSunrise" methods at the same time. But since I'm practicing TDD, I've been resisting the urge to update code before I've written a test for it.
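Here is a sketch of how the second caching test and the updated "GetSunrise" method could fit together. As before, "ISunsetService", "SolarData", the cache field names, and the test name are assumptions made for illustration.

```csharp
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;

// Hypothetical stand-ins for the real service types.
public class SolarData
{
    public TimeSpan Sunrise { get; set; }
    public TimeSpan Sunset { get; set; }
}

public interface ISunsetService
{
    SolarData GetServiceData(DateTime date);
}

public class SunsetTDD
{
    public ISunsetService SunsetService { get; set; }

    private SolarData cacheData;
    private DateTime cacheDate;

    // Calls the service only when the requested date is not the cached one.
    private SolarData GetServiceData(DateTime date)
    {
        if (cacheData == null || cacheDate != date)
        {
            cacheData = SunsetService.GetServiceData(date);
            cacheDate = date;
        }
        return cacheData;
    }

    public DateTime GetSunset(DateTime date)
    {
        return date.Date + GetServiceData(date).Sunset;
    }

    // Updated to go through the caching method instead of the service.
    public DateTime GetSunrise(DateTime date)
    {
        return date.Date + GetServiceData(date).Sunrise;
    }
}

[TestClass]
public class SunsetTDDTest
{
    [TestMethod]
    public void GetSunrise_CalledTwiceWithSameDate_CallsServiceOnce()
    {
        var date = new DateTime(2015, 2, 15);
        var serviceMock = new Mock<ISunsetService>();
        serviceMock.Setup(s => s.GetServiceData(It.IsAny<DateTime>()))
            .Returns(new SolarData());
        var provider = new SunsetTDD { SunsetService = serviceMock.Object };

        provider.GetSunrise(date);
        provider.GetSunrise(date);

        // The second call should be served from the cache.
        serviceMock.Verify(s => s.GetServiceData(It.IsAny<DateTime>()), Times.Once());
    }
}
```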
Additional Tests
Just because our cache is working doesn't mean that we're done with testing.
Caching Test #3
As another scenario, I want to check that "GetSunset" and "GetSunrise" both share the same cache. Here's the test:
Notice that instead of calling the same method twice, we call "GetSunrise" one time and "GetSunset" one time (both with the same date parameter). Our expectation is that our service only gets called once due to the cache.
And that's exactly what we see. This test passes without us needing to modify the code.
Caching Test #4
As our last test, I want to make sure that the cache is *not* used when we call methods with different date parameters. Here's that test:
Here we have 2 different date variables ("date1" and "date2"). And we use these to call the "GetSunset" method with different parameters. This time, our "Verify" is a little different: it expects that our service will be called exactly 2 times.
And that's the behavior that we get. This test passes without needing to modify any of our code.
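The last two scenarios (a shared cache, and different dates bypassing the cache) can be sketched as follows. The "date1" and "date2" variables come from the description above; "ISunsetService", "SolarData", the cache field names, the test names, and the specific second date are assumptions.

```csharp
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;

// Hypothetical stand-ins for the real service types.
public class SolarData
{
    public TimeSpan Sunrise { get; set; }
    public TimeSpan Sunset { get; set; }
}

public interface ISunsetService
{
    SolarData GetServiceData(DateTime date);
}

public class SunsetTDD
{
    public ISunsetService SunsetService { get; set; }

    private SolarData cacheData;
    private DateTime cacheDate;

    // Calls the service only when the requested date is not the cached one.
    private SolarData GetServiceData(DateTime date)
    {
        if (cacheData == null || cacheDate != date)
        {
            cacheData = SunsetService.GetServiceData(date);
            cacheDate = date;
        }
        return cacheData;
    }

    public DateTime GetSunset(DateTime date)
    {
        return date.Date + GetServiceData(date).Sunset;
    }

    public DateTime GetSunrise(DateTime date)
    {
        return date.Date + GetServiceData(date).Sunrise;
    }
}

[TestClass]
public class SunsetTDDTest
{
    private static Mock<ISunsetService> CreateServiceMock()
    {
        var serviceMock = new Mock<ISunsetService>();
        serviceMock.Setup(s => s.GetServiceData(It.IsAny<DateTime>()))
            .Returns(new SolarData());
        return serviceMock;
    }

    [TestMethod]
    public void GetSunriseAndGetSunset_WithSameDate_ShareTheCache()
    {
        var date = new DateTime(2015, 2, 15);
        var serviceMock = CreateServiceMock();
        var provider = new SunsetTDD { SunsetService = serviceMock.Object };

        provider.GetSunrise(date);
        provider.GetSunset(date);

        // Both methods read from the same cache, so only one service call.
        serviceMock.Verify(s => s.GetServiceData(It.IsAny<DateTime>()), Times.Once());
    }

    [TestMethod]
    public void GetSunset_WithDifferentDates_CallsServiceTwice()
    {
        var date1 = new DateTime(2015, 2, 15);
        var date2 = new DateTime(2015, 2, 16);
        var serviceMock = CreateServiceMock();
        var provider = new SunsetTDD { SunsetService = serviceMock.Object };

        provider.GetSunset(date1);
        provider.GetSunset(date2);

        // A different date misses the cache, so the service is called each time.
        serviceMock.Verify(s => s.GetServiceData(It.IsAny<DateTime>()), Times.Exactly(2));
    }
}
```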
"Test First" Does Not Mean We Stop When the Code is Written
The moral of this is that we aren't necessarily done creating tests after our code is in place. We need to test various scenarios to make sure that we have a good set of tests in place. So even if we have "TDD'd" all of our code creation, we need to check for thoroughness in our tests.
There are probably a few more test scenarios that I need for this library. And I'll review the tests some more to look for gaps. (This is one of the things I hope that Smart Unit Tests will be able to help us with once it is released.)
Wrap Up
More coding practice is good. I'm glad that I kept going with this. I wasn't quite sure how I would handle the caching functionality. And it turns out that I ended up with similar code to my original implementation. The methods are a little bit different (and I think a little bit cleaner).
The more I do this, the more I understand the advantages and disadvantages of using TDD. I still need to try this with different types of code (including libraries that call into a database or do data validation). But this is a good start, and I'll keep exploring.
I encourage you to explore different techniques and technologies. Pick the ones that are most useful to you and incorporate them into your development process.
Happy Coding!