Friday, July 22, 2016

Struggling with Readability in Functional Programming (with Euler Problem #6 in F#)

I've been working through Project Euler as a learning tool to get better with F#. When I looked at Euler Problem #6, I thought, "That will be pretty easy" (I even mentioned that at the end of my last article). But I ended up thinking about readability and barriers that make it difficult for folks to learn functional programming.

This goes to show that we can learn from even "easy" problems.

Here's Euler Problem #6:
The sum of the squares of the first ten natural numbers is,
     12 + 22 + ... + 102 = 385
The square of the sum of the first ten natural numbers is,
     (1 + 2 + ... + 10)2 = 552 = 3025
Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 − 385 = 2640.
Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.
This seems trivial compared to the other problems that we've looked at so far.

[Update: Collected articles for the first 10 Euler Problems in F# are available here: Jeremy Explores Functional Programming.]

A Simple Solution
We basically need two functions: one to get the sum of the squares for a range, and one to get the square of the sum for a range. Then we just take the difference of running those functions.

Sum of the Squares
To calculate the sum of the squares, I used pipelining to run a range through a couple simple functions (showing snippets from a script file and the output in the F# Interactive window):

This maps each item in our range through the "pown" function. This raises a number to a specified power (in this case, 2). So this will calculate the square of each number.

Then we pipe this list of squares through the "sum" function. And we can see that when we do this on our sample range (1 to 10), we get the expected result: 385.

Square of the Sum
Calculating the square of the sum is also pretty easy. Here's my first pass at that:

We can't use pipelining quite the same as in our previous function. That's because with the "pown" function, we want to pipe in the first parameter (as opposed to the last parameter). So instead, I created a local "square" function that would only take a single parameter.

Then we take our range, pipe it to "sum" (to get the sum of our items) and then pipe it into our "square" function.

For our sample range (1 to 10), this gives us the expected output of 3025.

Getting the Difference
Then we just have to get the difference of these two functions:

This subtracts the square of the sums from the sum of the squares. Then I run it through the "abs" function to get the absolute value. The reason I do this is because I don't want to worry about which of the values is bigger (it actually varies based on the range). By taking the absolute value, we get the difference between the two values, regardless of which is bigger.

Trivial. Easy. Obvious. Time to write the article.

Nope. Then I started thinking about other options.

Other Options
I wasn't very happy with the "squareOfSums" function. I felt like I should be able to write this more cleanly. I came up with a few different options, and I wasn't happy with any of them:

I know that I'm not very good at function composition yet (I need to get my brain used to working that way). So, I sent out a tweet to see if someone could point me in a better direction.

I got a response back from Mark Heath (@mark_heath - who also provided direction when I was looking at prime factors):

Using function composition like this seemed much cleaner than the options I came up with, so I gave it a try:

This gives us the same results as our other functions. And there's really no need to use the "pown" function when we're just squaring the value; x * x works just as well.

But where did the "range" parameter go?

Let's compare the signature to our original version:

These both have the same signature: int list -> int -- the parameter is a list of integers, and the output is an integer.

To get a better idea of what's going on, let's make a similar change to our other function.

Losing the Parameter
As a reminder, here is our original "sumOfSquares" function:

The "map" and "sum" functions can be combined into a single "fold" function:

The "fold" goes through each item in our list, squares it (x * x) and then adds the value to our accumulator (which ends up as our sum).

Note: As Mathias Brandewinder (@brandewinder) points out in the comments "sumBy" is a more direct combination of "map" and "sum"Mathias' Comment. I like this because the intent is more clear (the result is a sum), and the code is cleaner since we don't have to deal with the accumulator.

Rather than using pipelining, we can move "range" to the last parameter of the "fold":

Then we can make things a bit more interesting. Rather than providing the "range" parameter explicitly, we can remove it entirely:

This makes "sumOfSquares" a partial application of the "List.fold" function. "fold" takes 3 parameters (the function, the accumulator, and the list), but we're only supplying the first 2 parameters (the function and the accumulator).

Because of this, our "sumOfSquares" function still needs an "int list" parameter to complete the evaluation. But rather than being an explicit parameter in our code, it becomes an implicit parameter needed to complete the partial application.

When we look at the signatures of all of these methods, we see they are identical: int list -> int -- the parameter is a list of integers and the result is an integer.

What About Readability?
This made me really start thinking about readability. I've already been thinking about readability in F#, but for other reasons -- specifically, type inference. But this gave me something else to think about.

I'm not questioning whether this is a "right" way of doing functional programming; I've seen a lot of code that is similar to this while I've been doing my exploration. But I'm wondering if other people have concerns about readability.

Mark was nice enough to provide his view:

This seems to be a common theme when talking to folks about functional programming: It seems weird at first, but you get used to it.

Removing Barriers
For OO developers, there are a lot of barriers to get past when learning functional programming. To be successful with functional programming, you have to think about problems much differently than with object-oriented programming.

I'm not trying to change the way people do functional programming. But I'm wondering if there's a way to make things a bit easier for OO developers.

I'm always thinking about ways to make things more approachable; that's why I've been so successful in teaching people about things like lambda expressions, interfaces, and dependency injection. As I'm learning about functional programming, I'm thinking about ways to make things easier to learn -- to smooth out some of the steep barriers to make more of an even slope.

I don't have answers at this point. But this "easy" problem brought these to the forefront of my mind. I'll definitely be working on this more as I dig deeper into the functional world.

Completing Euler #6
This kind of got off-track for "solving Euler Problem #6". For those who are still interested, here's the combined function:

There's probably a better way to do the difference part of this function (the last two lines), and I'm open to suggestions for that. But we see that this is working, and we do get the correct answer for the sample case: 2640.

Last step is to run this against the target case, from 1 to 100:

We can verify online that the answer to this is correct.

Wrap Up
Just because something is "easy" doesn't mean that we won't still learn something from going through the process. In this case, I learned a bit more about function composition (which I need to work on much more).

More importantly, I'm really thinking about readability and barriers-to-entry for developers who are new to functional programming. It's probably going to take much thought and a lot of conversations with people who have been doing functional programming for a while.

If you have any ideas, I'd love to hear them. The folks I know in the functional community are very helpful and want to bring more developers into the community. There are quite a few barriers that make things difficult. Let's have the conversation about how we can make things easier.

Happy Coding!


  1. Nice! A couple of thoughts / remarks:

    1) if you want to pipe but the arguments are not in the right order, you can also create a function with the right order, for instance let npow n x = pown x n - which is more general than the square function, but 'pipeable', with any exponent.

    2) it's very likely not helping readability, but this might give you some ideas - you could then do something like this:

    let npow n x = pown x n
    let sumOfSquares = List.sumBy (npow 2)
    let squareOfSum = List.sum >> (npow 2)
    let euler6 x = abs (sumOfSquares x - squareOfSum x)

    euler6 [1..100]

    Happy coding :)

    1. Thanks Mathias!
      I didn't think of "sumBy" instead of "fold". I like that because it makes the intent more clear. "npow" is interesting, too. Lots to think about.

  2. The style of composition of functions with partial application is called point-free. You can read more about it in Wikipedia

  3. there are shortcuts!

    let square (x:int64) = x * x

    let sumSquares n = n * (n + 1L) * (2L*n + 1L) / 6L

    let squareSum n = n * (n+1L) / 2L |> square