Thursday, July 14, 2016

Functional Practice: Euler Problem #2 in F#

I'm still moving forward with my exploration of F#. I decided to take a shot of Euler Problem #2. I've previously looked at this problem using Haskell (back in January 2014). Now it's time to take a shot at it with F#.

I'm not completely happy with the solution I came up with. I'm sure there are better approaches. But it's a good starting point, and I've learned quite a bit about sequences in the process.

[Update: Collected articles for the first 10 Euler Problems in F# are available here: Jeremy Explores Functional Programming.]

Euler Problem #2
The second Euler problem is not all that complex, but it does have a few steps to it:
Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be: 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ... By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.
The biggest challenge is to create a Fibonacci sequence. I've run into issues with this before, so I knew that this would be the hardest part of this challenge.

The Fibonacci Sequence
I started looking for a Fibonacci solution the same way that I did previously: by creating a recursive function. When I was first getting into functional programming, this felt like the right way. But I've since learned that recursion has some issues in this case.

Recursive Implementation
Here is that naive implementation (this shows the script file along with the output from F# Interactive):


This gives us correct values -- the 10th item in the Fibonacci sequence is "55". However, it becomes *very slow* very quickly. This may doesn't seem like a big issue at first, but when we're trying to generate an entire sequence, it is way too slow.

A Better Solution
I spent a little time trying to figure out if I could somehow translate the Haskell solution to F#. It may be possible, but it's beyond my capability at this point.

I went online to look for solutions. My big condition, though, is that I won't use a solution that I do not understand. Here's the solution that I ended up settling on:


This gives us the same answer, but it doesn't suffer from the performance issue that the recursive solution has.

That's because rather than using recursion, it uses a fold. And the fold calculation is simply addition. But let's try to understand this. To do that, we'll pick out 2 of the lines:


I've replaced the "n" (the parameter) with the value of 3. So this starts with a list that consists of the values 1, 2, and 3.

A fold is a way to use an accumulator to calculate an aggregate of some sort. We can think of the accumulator as a running total (although it doesn't have to be a sum). We then have a function to calculate that running total for the list.

The last value on the second line "(0,1)" is our accumulator. This is a tuple that consists of 2 integers: 0 and 1.

The function that we're using "fun (a,b) _ -> b, a + b" is a bit more interesting. The first parameter "(a,b)" represents the accumulator. The second parameter (the underscore) represents the current item from the list. The underscore denotes that we don't really care about that value; we're not using it in the body of the function.

The body of the function "b, a + b" updates the accumulator by giving us a new tuple. The first integer will be whatever the second integer was in our accumulator. And the second integer will be the sum of both integers from the accumulator.

To understand this, let's take a look at the values of the accumulator as we go:

Initial value:          (0, 1)
List item 1: (1, 0+1) = (1, 1)
List item 2: (1, 1+1) = (1, 2)
List item 3: (2, 1+2) = (2, 3)
List item 4: (3, 2+3) = (3, 5)
List item 5: (5, 3+5) = (5, 8)

This gives us the correct values for the Fibonacci sequence.

Back to the "fibonacci" function that we have above, the last function is "fst". This is a built-in function that takes the first value of a tuple. So if we call "fst (2,3)", the result is "2". And that's exactly what's happening (although the parameter order is a bit different since we're using pipelining here).

Shortcomings
This function does have a problem: it only works for 46 items and then the integer overflows:


We won't worry about that here. But if needed, we could change this to use something bigger than an 32-bit integer.

Solving Euler #2
Now that we have this, we can solve Euler problem #2 pretty easily. First, let's use our "fibonacci" function to create a sequence of numbers. I don't have much experience with sequences, So this code will be a bit verbose.

The Fibonacci Sequence
Here's how we can get a Fibonacci sequence from our function:


A sequence in F# is like an IEnumerable in C#. With an iterator, we only get a value when we ask for one. Because of that, we can create an infinite sequence without things getting out of hand.

The "initInfinite" function lets us run a function based on a 0-based index. So this will run "fibonacci 0", then "fibonacci 1", "fibonacci 2", and so on.

If you notice, the output just displays the first few values of the sequence. This keeps us from generating the entire, infinite sequence.

Let's put a limiter on this by asking for just the first 20 values:


The "take" function gives us just the first 20 items from our sequence. I call the "toList" function to force the evaluation of the sequence so that we can see all of the values in the output.

Since the Fibonacci sequence doesn't start with "0", we'll skip the first value:


This gives us the more traditional Fibonacci sequence.

The Starting Value
The Euler problem describes the beginning of the sequence a bit differently. Notice that it shows the values "1, 2, 3, 5, 8". So, we actually want to skip the first "1" that is usually included in the sequence.

That's easy enough by changing our "skip" call:


Now let's move on to the other parts of the problem.

Less Than Four Million
The next part of the problem is that we want to get the terms that are less than four million. For this, we can just update our "take" function to a "takeWhile":


This lets us add a conditional function to determine when we should stop iterating the sequence.

Even Values Only
The next conditional is that we only want the even values. For this, we'll add a "filter":


You'll notice that I've used "x", "y", and "z" for the function parameter names. The primary reason I did this is because these represent different values, and I wanted to make sure that it was clear that they were different.

In this code, "x" represents the numbers from 0 to infinity. "y" represents the values from the Fibonacci sequence. "z" also represents the values from the Fibonacci sequence (so I guess "y" and "z" could probably use the same name here -- that's something to think about).

As a side note, you notice that the "1" was eliminated from the list (since it's not even), so it really doesn't matter if our sequence is "1, 2, 3, 5" or "1, 1, 2, 3, 5".

Summing the Values
The last part of the problem is to sum up all of the values:


We just replaced our "toList" with "sum". And we know the output is correct based on previously looking at this problem.

Final(?) Solution
So here's the solution that I ended up with:


One interesting thing to note here is in the output. F# recognizes that "euler2" has a specific value: 4613732. That's because we don't have any parameters that can alter the outcome. So F# Interactive gives us the evaluated value.

Further Work
There's a few things that I'm not quite happy with here. First, I feel like this is a pretty naive approach to using sequences. I have a feeling that a couple of these functions can be combined. I'll need a bit more experience to work those out.

It's not quite tweetable. Technically the "skip 2" line can be taken out without changing the output. But if I take it out, it will use a different sequence than the one specified by the problem. I don't feel right about that.

Wrap Up
This isn't really about getting to the solution, though. After all, we already know what the answer is, and there are plenty of people who have posted their own implementations online. The point is to get some coding practice and get more experience with language features in F#.

And that's what this has done for me. I did quite a bit of exploring of the "Seq" documentation to try to figure out which functions I should use. The "initInfinite" is pretty cool; that's brand new to me, and I'll be looking at other ways that I can use that.

I'm sure that I'll get quite a bit of feedback on different ways to do this. And I'm looking forward to that. We get better as a community by sharing and learning from each other.

Exploration and practice is a big part of how we learn and how we get better. Also, it's quite a bit of fun.

Happy Coding!

2 comments:

  1. Hi Jeremy, inspired by your enthousiasm I took a stab at this as well - starting from your solution. One "sudo apt-get install fsharp" later:

    First, if the Euler problem states "By starting with 1 and 2", that would be the more correct seed values for List.fold, removing the need to "skip 2".

    As for combining the two functions, it is possible to keep track of the sum (of even elements) by adding a third element to the tuple. This function outputs the tuple (current element, next element, sum):

    let fibonacci2 n =
    [1..n]
    |> List.fold (fun (a,b,sum) _ -> b, a + b, if b % 2=0 then sum + b else sum) (1,2,0)

    Moving [1..n] to be the third argument for List.fold, it becomes a oneliner which can later be inlined in the second function.

    Then, instead of takeWhile, I used Seq.find to find the first tuple where the *next* element exceeds 4 million. And finally, a custom function is needed to extract the output value from the tuple.

    let euler2 =
    Seq.initInfinite (fun x -> fibonacci2 x)
    |> Seq.find (fun (_,next,_) -> next > 4000000)
    |> fun (_,_,sum) -> sum

    The minified combination of these functions is just 140 characters, but for a tweet @fsibot you might have to cut off the tuple extraction:

    Seq.initInfinite(fun x->List.fold(fun(a,b,s)_->b,a+b,if b%2=0 then s+b else s)(1,2,0)[1..x])|>Seq.find(fun(_,n,_)->n>4000000)|>fun(_,_,s)->s

    I have rarely commented on a blog post, but in the spirit of your Social Developer posts, I thought it would be a shame not trying to return the favor of inspiration.

    Happy Coding to you too!

    ReplyDelete
    Replies
    1. I'm glad it inspired you to do some exploration. And thanks for sharing your solution (and the thoughts behind it). One thing that I've found a bit of a challenge when working with FP is thinking about problems and solutions a bit differently. This gives me some good ideas, and I can see the approach creeping into other things I'm working on.
      Thanks!

      Delete