Monday, March 6, 2017

Rediscovering Fire: Effective Learning?

So I've got a bit of a dilemma. I ran across an article today (thanks to Tomas Petricek (@tomaspetricek) for pointing it out), and after drilling into it a bit, I felt like I've been rediscovering fire while I've been learning functional programming.

So here's the question:
"Does the rediscovery make the learning more effective?"
The Article
The article in question is from Eirik Tsarpalis (@eiriktsarpalis): F# and Purity. This is a great read, particularly if you've been trying to reconcile functional and imperative programming (like I have). Go read it now.

But as I read the article (and started following some of the links), I found quite a few things that looked familiar.

Down the Rabbit Hole
When I started learning functional programming, I tried to be "pure", meaning that I wanted to use functional concepts all the way down. Eirik points to an article about Fibonacci Sequences by Adam Esterline: Fibonacci Sequence in Haskell. If you've tried to write Fibonacci sequences in Haskell, go read this now.

Haskell by it's nature is a "pure" functional language in that it doesn't let you do those imperative things that we have access to in things like F#. But there are still different ways to implement the Fibonacci sequence.

Adam points to the "Slow but Easy" method using recursion. This looked very familiar, because I created something very similar, thinking that recursion was the "right" solution to this problem.

Here's my code (more on this here: A Functional Fibonacci Sequence (as learned by an imperative programmer)):

This code works, but it is *SLOW*. It takes over 20 seconds to calculate the 35th place. And it just gets slower from there.

Adam shows a similar slowness and points to a solution in Haskell using "zipWith". This is a solution that I ran across as well:

This solution is *FAST*. I thought I understood how this worked (and tried to explain it a bit in my article), but Adam points to a better explanation on StackOverflow from someone who understands how Haskell works: Understanding a Recursively Design List. (You should definitely read this as well.)

Granted, Adam's article was published after mine, but the StackOverflow answer was published years before mine. So this information has been around for a while.

So I struggled through this problem (speed of the recursive implementation), and I asked questions from functional devs who have much more experience than I do. This questioning led me to look for other solutions, and it made me understand that I wasn't "doing it wrong" because I didn't use a standard recursive function implementation.

Popping the Stack
So back to Eirik's article. The core of the article talks about how "pure" functional programming is all based on mathematical functions, but our CPUs and environments aren't really set up to deal with these things. Instead, our environments (in my case, Windows and .NET) are really set up to deal with imperative programming.

The result is that if we program only with pure functions, we often run into issues of performance, heap usage, and even stack overflows (which I'm pretty good at generating).

The solution is a compromise: we keep our functions externally "pure", meaning discrete input leads to deterministic output, no mutation or shared state, etc. But we can use imperative code inside of those functions. We just need to make sure the imperative bits don't leak out of our functions.

Again, this sounded really familiar. When I was working on my Digit Recognition application, I was using code from Mathias Brandewinder's book on machine learning. I ran into some performance issues. The functions that he created were great for batch processing, but I was running the functions in a different way, which meant I was running them over and over again.

Here's the original function (you can read more about it here: Exploring the Digit Recognizer with F#):

To me, this seemed like the "right" way of coding this. It was using the functions on the Array object, and it was piping results from one function to another. But this was *SLOW*.

I had a chance to talk to Mathias about this. I showed him my F# code (above) and also the C# code that I had (which used "for" loop). The C# code was significantly faster. So he said that we could do the same thing in the F# code:

This made things significantly faster. I'm really glad that Mathias recommended this because he is a serious functional programmer, and it was like it gave me permission to cheat.

But when I looked at it more closely, it really wasn't cheating -- for the same reasons that Eirik gives in his article. In this case, there is mutable state inside the function, but that mutation never leaks out. It is contained in the function itself. From the outside, this function is "pure" it takes 2 arrays and returns a deterministic result.

(If you're interested in the Digit Recognizer project, here's where you can find all the articles: Machine Learning (sort of)).

Rediscovering Fire
I feel like quite a bit of my learning in programming has been about rediscovering fire. These are just two examples from the functional programming world. And that brings us back to the original question:
Does the rediscovery make the learning more effective?
I know that in my case, going through the pain of finding out that my solution had serious performance issues is what spurred me to look for other solutions.

Pretty much every talk that I do is based on some pain or hurdle that I had in my development career. My goal is to help other developers over those hurdles. I get stuck easily, and I also get easily frustrated. I don't want other folks to get stuck and frustrated.

Finding the Balance
If we are constantly rediscovering fire, then we won't make progress beyond those who came before us. We'll just keep walking the same road over and over again.

If we don't go through the pain ourselves, then we don't fully appreciate the solutions that we are using. I know that when I was a young programmer, I didn't think that I needed to follow "best practices" (okay, so I really hate that term, but I'll use it anyway). When I got myself into trouble too many times, I finally learned that "best practices" are a good place to start, particularly if I'm new to an environment.

There's a balance somewhere here. A place where we can learn from our own pain, but we can also learn from the people who came before us.

I need to keep this balance in mind for myself and also for the developers that I teach on a regular basis.

Happy Coding!


  1. Note that performance in the original Manhattan distance implementation can be radically improved merely by using `Array.fold2` instead of that pipeline. `` creates unnecessary tuple allocations which are the primary reason why this implementation is slow.

    1. Hi Eirik. Thanks for the help. (This is one thing I really love about the F# community.) Using "Array.fold2" is significantly faster than the "" implementation. With the current code/dataset (, "" takes 143 seconds and "Array.fold2" only takes 27 seconds. Using the for loop is a little bit faster with this code/data; it takes 23 seconds.

      I'll definitely keep "Array.fold2" in my toolbox!