Wednesday, May 23, 2012

Next, Please! - A Closer Look at IEnumerable (Part 3 - Fibonacci Sequence)

This is a continuation of "Next, Please! - A Closer Look at IEnumerable".  The articles in the series are collected together here: JeremyBytes - Downloads.

Last time, we took a look at Extension Methods and the added functionality we get from LINQ when using classes that implement IEnumerable<T>.  Today, we'll create our own class that implements IEnumerable<T> (a couple of classes, actually).  We'll see what it takes to implement the interface, a really cool shortcut that we can use, and how we can use LINQ with our newly created classes.

Implementing IEnumerable<T>: IntegerSequence
A few articles back, we explored using the BackgroundWorker Component with MVVM.  While the article did not focus on it, our Model (the ProcessModel object) implemented the IEnumerable<T> interface.  This was so that we could use the model with a foreach loop to simulate a long-running process.

Today, we'll be creating a very similar object: the IntegerSequence.  When we use this class with a foreach loop, it will return a sequence of consecutive positive integers.  A property (NumberOfValues) will specify how many integers should be part of the sequence.  The sample code can be downloaded here: JeremyBytes - Downloads.  Note: the code download includes the sample code for this article as well as the following article, so if you open up the sample projects, you will see some items that we will discuss next time.

Project Setup
The download contains both "Starter" and "Completed" solutions.  If you open the "Starter" solution, you can follow along with the steps here.  The "Completed" solution has all of the code already in place.  Each solution contains 2 projects: SequenceLibrary contains our classes; SequenceLibrary.ConsoleApp contains a console application that we will use to view the output.  So far, both of these projects are nearly empty.  So, let's write some code!

IntegerSequence Class
We'll start by adding a new class to the SequenceLibrary project called "IntegerSequence".  (Just right-click the project, select "Add", then "Class", then type "IntegerSequence.cs" in the dialog.)

First steps, add "using System.Collections" at the top of the file (we'll see why in just a bit).  Then make the class "public" and specify that it implements "IEnumerable<int>":

Visual Studio will help us implement the interface (we've seen this previously: Updating Interface Implementation).  All we need to do is right-click on "IEnumerable", select "Implement Interface" and then "Implement Interface" again.  This will stub out the interface members for us:

This gives us the 2 members that we saw in Part 1: the generic version of GetEnumerator(), and the non-generic version of GetEnumerator().  Notice that the non-generic version (the 2nd one) uses explicit implementation of the IEnumerable (non-generic) interface.  For more information, refer to Explicit Interface Implementation.

The reason that we added "using System.Collections" is because IEnumerable (non-generic) is in this namespace.  Without this using statement, the non-generic versions of IEnumerable and IEnumerator would need to be fully qualified.

But we have a problem: GetEnumerator needs to return IEnumerator<int>.  That means that we need an class that implements this interface.  (Well, technically, there's a shortcut that we'll see a little later; let's run through the full implementation for now.)

IntegerEnumerator Class
Now we'll add the IntegerEnumerator class that will implement IEnumerator<int>,  Let's add a new class to the project: right-click the project, select "Add", then "Class", then type "IntegerEnumerator.cs" in the dialog.  Add "using System.Collections" at the top of the file; then make the class "public" and specify that it implements "IEnumerator<int>":

Then we'll run through the same steps as above to implement the interface.  I've re-arranged the methods just a little bit (to keep the 2 "Current" properties together and move the "Dispose()" method to the end):

Again, we'll see the properties and methods that we saw in Part 1.  Let's run through the logic: we need to iterate through integer values starting with 1 and ending with the total number of values.  To do this, we'll need a couple of internal fields.

Let's start by adding the fields, a constructor, and the "Current" properties:

"_numberOfValues" will tell us when to end our enumeration, and we'll initialize this with a constructor.  "_position" will hold the current position in the enumeration.

Notice that we have 2 "Current" properties: one of type "int" and one of type "object".  The "int" version returns the value of the "_position" field.  The "object" version returns the "int" version of "Current".  This makes IEnumerator<T> backward compatible with IEnumerator.  Although we could return "_position" for both calls to "Current", it's generally considered a best practice to have the non-generic version simply reference the generic version (we'll see this again with IEnumerable<T> and IEnumerable).

The MoveNext() method is where we do the work for our iterator.  Here is MoveNext() along with Reset() and Dispose():

MoveNext() increments the _position field (which defaults to 0 since it is an integer), this will have the effect of making the "Current" value one unit higher than the previous value.  As a reminder, MoveNext() needs to return true or false depending on whether it was successful.  When it hits the end of the enumeration, it needs to return false.  To accomplish this, we compare the current _position to _numberOfValues.  These would be equal for the last value since we're doing a simple integer sequence.

Reset() simply puts the _position back to its initial state.  Dispose() allows us to free any resources that we might be using.  In our case, we don't have anything that needs to be disposed, so we'll just call the Garbage Collector's SuppressFinalize() method.  (This technically isn't necessary, but it will save an extra iteration of the garbage collector.)

Now that we have a fully-implemented enumerator, we can go back to our IntegerSequence class.

Finishing IntegerSequence
As we noted initially, we need to have a NumberOfValues property.  Let's add that property and create a constructor to initialize the value.  Then we just need to implement the GetEnumerator methods:

The generic version of GetEnumerator() will create a new IntegerEnumerator, passing in the NumberOfValues property.  The non-generic version of GetEnumerator simply calls the generic version of GetEnumerator.  Just like we saw with "Current" (above), this keeps our IEnumerable<T> compatible with IEnumerable.

We now have a working implementation of IEnumerable<T>!

Using IntegerSequence
To use IntegerSequence, we'll flip over to our SequenceLibrary.ConsoleApp project.  The "Starter" project already has a reference to the SequenceLibrary project and has the SequenceLibrary namespace in the "using" section.

So, let's "new" up an IntegerSequence, run it through a foreach loop, and output the results:

And the output:

Since IntegerSequence implements IEnumerable<T>, we can use it in a foreach loop.  But, we can also use the IEnumerable<T> extension methods that we saw in Part 2.

Let's say that we only want to output the even numbers between 1 and 12.  We can accomplish this by using the "Where()" extension method to filter our results:

Note how we can simply use the "Where()" method with the sequence variable where our foreach is defined.   The lambda expression (the parameter of the Where) uses the modulo operator (%) to see if the current value (i) is evenly divisible by 2.  If so, then the number is even, and it will be included.

The Shortcut: yield return
There is a much shorter way that we can implement IEnumerable<T>: using the "yield return" statement.  "yield return" implements an enumerator and saves the current state of the method.  When the next item is asked for, then the method continues where it left off.

This sounds a little confusing, so let's look at a sample.  Back to our IntegerSequence:

Take a look at the GetEnumerator() method.  Instead of explicitly returning an IEnumerator<T>, we use the "yield return" statement.  The "while" loop has the same effect as "MoveNext" from our IntegerEnumerator class (by comparing the current position to the total number of values).  The "return ++position" statement has 2 purposes.  First, it pre-increments the position variable, then it returns that value.  Since we have the "yield" statement, the state of the position variable is preserved.

Notice that with this version, we no longer need the IntegerEnumerator class.  But, since we went through the exercise of explicitly implementing the IEnumerator<T> interface, we have a much better idea of what the "yield return" statement is actually doing.

Implementing IEnumerable<T>: FibonacciSequence
The IntegerSequence class is pretty trivial.  Since it simply returns sequential values, it could easily be replaced by a "for" loop.  So, let's take a look at something a little more complicated: implementing the Fibonacci Sequence.

The Fibonacci Sequence has the following values: 1, 1, 2, 3, 5, 8, 13, 21...

I won't go into the mathematical definition (you can look that up if you're curious).  Each number in the sequence is calculated by adding the 2 previous values.  So, 1 + 1 = 2, 1 + 2 = 3, 2 + 3 = 5, 3 + 5 = 8, and so forth.

We'll start with the explicit implementation (both IEnumerable<T> and IEnumerator<T>), then we'll switch it to use the "yield return" version.  (There's a reason for doing the full implementation first; it will lead us directly into the next article.)

FibonacciSequence Class
Just like with the IntegerSequence class, we'll create a new class (calling the file "FibonacciSequence.cs"), add "using System.Collections", stubbing out the implementation of IEnumerable<T>, adding the NumberOfValues property and the constructor.

This looks almost exactly like our IntegerSequence class.  Before we can continue, we need to create the enumerator class.

FibonacciEnumerator Class
Next, we'll create a FibonacciEnumerator class.  We'll run through the same initial steps as with the IntegerEnumerator class: create a new class, add "using System.Collections", add the IEnumerator<int> interface, and stub out the implementation:

Since calculating the Fibonacci Sequence is a bit more complicated than simply incrementing integers, we'll need a few more fields to handle our calculations:

We have "_numberOfValues" (just like before) to keep track of how many values we will enumerate.  "_position" will keep track of our current position in the sequence so that we'll know when we are done.  "_previousTotal" and "_currentTotal" will be used for our calculations.  Notice that the "Current" property returns the "_currentTotal" field.

Reset() and Dispose() are pretty straight forward:

MoveNext() is a bit more complicated:

The "if (_position == 0)" condition will handle our initial state (before we have 2 previous values to add together).  The "else" will handle all the other numbers by adding together the _currentTotal and _previousTotal (the last 2 numbers), and then updating the _previousTotal and _currentTotal to the new values.

Our return value determines whether we have completed our enumeration.  Now we can complete our FibonacciSequence.

Finishing FibonacciSequence
Now we can fill in the the GetEnumerator() methods of the FibonacciSequence class:

Using FibonacciSequence
Now, we can update the console application to use the FibonacciSequence class:

And the output:

We can use the extension methods with the FibonacciSequence as well.  We could return every other value, or only values that are divisible by 3, or any number of creative uses of the extension methods.

Error Handling
Because of the nature of the Fibonacci Sequence, we may run into some issues.  Consider what happens if we create our sequence with 50 values: "new FibonacciSequence(50)" instead of 12.  Here's the output:

It looks like we have a problem toward the end of our values.  Why do we have negative numbers?  Those should not be part of the Fibonacci Sequence.

Our problem is that we are overflowing the standard Int32 that we are using for our values.  The overflow is surfacing as the negative values (since int is signed).  And once we try to continue from there, our values are completely useless.

We'll take care of this by doing a check in our FibonacciEnumerator class.  If we have a possible overflow in the MoveNext() method, we will throw an exception:

What we do is convert the _previousTotal and _currentTotal to long integers, and then see if adding them together is greater than Int32 can hold.  If so, then we throw an exception.

This means that we need to handle this exception in our console app:

Here, we're checking for the OverflowException and printing the message to the console.  Here's the output:

So, now we have some appropriate error handling to make sure that we do not get inappropriate values.  Note: this limits the valid values of our FibonacciSequence class to 47 (before we get an overflow).  If we wanted to make this class more robust, we could update it to use "long" instead of "int".  But we'll keep this as "int" for now; this will make more sense when we look at Part 4.

FibonacciSequence with yield return
Finally, we can update the GetEnumerator() function with the "yield return" statement to eliminate the separate FibonacciEnumerator class.  This is a bit more complex than the previous implementation but still not too difficult.  Basically, we combine the MoveNext() and Current into a single method:

Now we have local variables for _previousTotal and _currentTotal.  We use a "for" loop to return the right number of values.  The rest is similar to what we had in the MoveNext() method.

Next Time
This time, we took a look at how to implement IEnumerable<T> and IEnumerator<T>.  We saw that when we create classes that implement IEnumerable<T>, we can use foreach and the LINQ extension methods.  We also saw how we can use the "yield return" statement to eliminate the need for a separate enumerator class.

But take another look at our IteratorSequence and FibonacciSequence when we *are* using the separate IEnumerator class:

Notice anything about these two classes?  They are exactly the same with the exception of what specific enumerator class they "new" up in the GetEnumerator() method.  This strikes me as a great place for us to add an abstraction.  Using the Strategy Pattern, we can create a generic Sequence class to which we can pass a specific enumerator class.  Next time, we'll take a look at the Strategy Pattern and how to implement it with these classes.

Happy Coding!

No comments:

Post a Comment