Thursday, September 30, 2021

Coding Practice: Learning Rust with Fibonacci Numbers

In my exploration of Rust, I built an application that calculates Fibonacci numbers (this was a suggestion from the end of Chapter 3 of The Rust Programming Language by Steve Klabnik and Carol Nichols).

It helped me learn a bit more about the language and environment.
  • for loops
  • Statements vs. expressions
  • Function returns (expressions)
  • checked_add to prevent overflow
  • Option enum (returned from checked_add)
  • Pattern matching on Option
  • Result enum (to return error rather than panic)
  • .expect with Result
  • Pattern matching on Result
So let's walk through this project.

The code is available on GitHub: https://github.com/jeremybytes/fibonacci-rust, and branches are set up for each step along the way. We will only be looking at the "main.rs" file in each branch, so all of the links will be directly to this file.

Fibonacci Numbers

The task is to calculate the nth Fibonacci number. The Fibonacci sequence is made by adding the 2 previous number in the sequence. So the sequence starts: 1, 1, 2, 3, 5, 8, 13, 21, 34. The 7th Fibonacci number (13) is the sum of the previous 2 numbers (5 and 8).

For our application, we will create a function to generate the nth Fibonacci number based on an input parameter. We'll call this function with multiple values and output the results.

Step 1: Creating the Project

Branch: 01-creation
The first step is to create the project. We can do this by typing the following in a terminal:
    cargo new fib
This will create a new folder called "fib" along with the Rust project file, a "src" folder to hold the code, and a "main.rs" file (in src) which is where we will be putting our code.

The "main.rs" file has placeholder code:
    fn main() {
        println!("Hello, world!");
    }
But we can use "cargo run" to make sure that everything is working in our environment.
    C:\rustlang\fib> cargo run
       Compiling fib v0.1.0 (C:\rustlang\fib)
        Finished dev [unoptimized + debuginfo] target(s) in 0.63s
         Running `target\debug\fib.exe`
    Hello, world!
Going forward, I'll just show the application output (without the compiling and running output).

Step 2: Basic Fibonacci

Branch: 02-basic
Now that we have the shell, let's create a function to return a Fibonacci number. Here's the completed function:
    fn fib(n: u8) -> u64 {
        let mut prev: u64 = 0;
        let mut curr: u64 = 1;
        for _ in 1..n {
            let next = prev + curr;
            prev = curr;
            curr = next;
        }
        curr
    }
There are several interesting bits here. Let's walk through them.

Declaring a Function
Let's start with the function declaration:
    fn fib(n: u8) -> u64 {

    }
The "fn" denotes that this is a function. "fib" is the function name. "n: u8" declares a parameter called "n" that is an unsigned 8-bit integer. And the "u64" after the arrow declares that this function returns an unsigned 64-bit integer.

When declaring parameters and return values for functions, the types are required. Rust does use type inference in some places (as we'll see), but function declarations need to have explicit types.

Declaring Variables
Next, we have some variables declared and assigned:
    let mut prev: u64 = 0;
    let mut curr: u64 = 1;
"let" declares a variable.

By default, variables are immutable. This means that once we assign a value, we cannot change it. For these variables, we use "mut" to denote that they are mutable. So we will be able to change the values later.

The variable names are "prev" and "curr". These will hold the "previous number" and the "current number" in the sequence.

The ": u64" declares these as unsigned 64-bit integer values. Fibonacci numbers tend to overflow very quickly, so I used a fairly large integer type.

Finally, we assign initial values of 0 and 1, respectively.

Looping with "for"
There are several ways to handle the loop required to calculate the Fibonacci number. I opted for a "for" loop:
    for _ in 1..n {

    }
"1..n" represents a range from 1 to the value of the incoming function argument. So if the argument is "3", this represents the range: 1, 2, 3.

The "for" statement will loop once for each value in the range. In this case the "_" denotes that we are not using the actual range value inside the loop. All we really need here is to run the loop 3 times. All of the calculation is done inside the loop itself.

Implicit Typing
Inside the "for" loop we do our calculations:
    let next = prev + curr;
    prev = curr;
    curr = next;
This creates a new variable called "next" inside the loop and assigns it the sum of "prev" and "curr". A couple of things to note. First, this variable is immutable (so we do not have the "mut" keyword). The value is assigned here and then it is not changed. Second, the "next" variable is implicitly typed. Instead of having a type declaration, it is set based on what is assigned to it. Since we are assigning the sum of two u64 values, "next" will also be a u64.

The next two lines update the "prev" and "curr" values. We needed to mark them as mutable when we declared them so that we could update them here.

This is a fairly naïve way of calculating Fibonacci numbers. If you'd like to see more details on how the calculation works, you can take a look at this article: TDDing into a Fibonacci Sequence with C#.

Statements vs. Expressions
The last line of the function is a bit interesting:
    curr
This returns the current value ("curr") from the function.

Rust does not use a "return" keyword to return a value. Instead, the last expression in a function is what is returned. (As a side note, this is similar to how F# works.)

What's the difference between a statement and an expression? A statement does some type of work; an expression returns a value.

In Rust, a statement ends with a semi-colon, and an expression does not end with a semi-colon. To make things more interesting, these are often combined.

Let's take a look back at a line of code:
    let next = prev + curr;
Overall, this is a statement: it declares and assigns a value to a variable called "next". And it ends with a semi-colon.
    prev + curr
"prev + curr" is an expression that returns the result of adding 2 values. So we really have a statement that includes an expression. (We can technically break this down further, but we won't do that here.)

So, let's get back to the return value of the function. The "fib" function returns a u64 value. The last expression in the function is:
    curr
It is important to note that this line does not end with a semi-colon. Because of this, the value of the "curr" variable (which is a u64) is returned for this function.

Because of my coding history, I'm used to putting semi-colons at the end of lines. So I'm sure that I'll mess this up many times before I get used to it. If you get an error that says a function is returning "()" instead of a particular type, it probably means that there's a semi-colon at the end of the expression you meant to return.

Here's the full function:
    fn fib(n: u8) -> u64 {
        let mut prev: u64 = 0;
        let mut curr: u64 = 1;
        for _ in 1..n {
            let next = prev + curr;
            prev = curr;
            curr = next;
        }
        curr
    }
Using the "fib" Function
Now that we have a function that returns a Fibonacci number, it's time to update the "main" function to use it.
    fn main() {
        println!("Fibonacci 1st = {}", fib(1));
        println!("Fibonacci 2nd = {}", fib(2));
        println!("Fibonacci 3rd = {}", fib(3));
        println!("Fibonacci 4th = {}", fib(4));
        println!("Fibonacci 5th = {}", fib(5));
    }
This uses the "println!" macro to output a string to the standard output. On each line, the set of curly braces represents a placeholder in the string. So in the first statement, the curly braces will be replaced by the value that comes back from calling "fib(1)".

So let's run and check the output:
    Fibonacci 1st = 1
    Fibonacci 2nd = 1
    Fibonacci 3rd = 2
    Fibonacci 4th = 3
    Fibonacci 5th = 5
It works!

Well, it mostly works. We'll see a shortcoming in a bit.

Step 3: Testing More Values

Branch: 03-mainloop
Before looking at where we have a problem in the "fib" function, let's make it easier to test for different values. For this, we'll add an array of numbers to test, and then loop through them.

Here's an updated "main" function:
    fn main() {
        let nths = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

        for nth in nths {
            println!("Fibonacci {} = {}", nth, fib(nth));
        }
    }
Creating an Array
The first line sets up an array called "nths" and initializes it with the values of 1 through 10. I use this name because our original task was to calculate the "nth" Fibonacci number. This is a collection of all the ones we want to calculate.

We're using type inference to let the compiler pick the type for "nths". In this case, it determines that it is an array with 10 elements of type u8. It decides on u8 because the values are used as arguments for the "fib" function, and that takes a u8.

As an interesting note, if you comment out the "println!" statement, the "nths" variable is an array with 10 elements of type i32 (a signed 32-bit integer). This is the default integer type.

Type inference works as long as it can be determined at compile time. If it cannot be determined at compile time, then an explicit type needs to be added.

Another "for" Loop
We use a "for" loop to go through the array. Instead of discarding the value from the "for" loop (like we did above), we capture it in the "nth" variable.

Inside the loop, we have a "println!" with 2 placeholders, one for the loop value and one for the result of the "fib" function.

Here's what that output looks like:
    Fibonacci 1 = 1
    Fibonacci 2 = 1
    Fibonacci 3 = 2
    Fibonacci 4 = 3
    Fibonacci 5 = 5
    Fibonacci 6 = 8
    Fibonacci 7 = 13
    Fibonacci 8 = 21
    Fibonacci 9 = 34
    Fibonacci 10 = 55
And now we can more easily test values by adding to the array.

Overflow!
As I noted at the beginning, Fibonacci sequences tend to overflow pretty quickly (they increase the value by half for each item). We can see this by adding a "100" to our array.
    let nths = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100];
Here's the output when we run with these values:
    Fibonacci 1 = 1
    Fibonacci 2 = 1
    Fibonacci 3 = 2
    Fibonacci 4 = 3
    Fibonacci 5 = 5
    Fibonacci 6 = 8
    Fibonacci 7 = 13
    Fibonacci 8 = 21
    Fibonacci 9 = 34
    Fibonacci 10 = 55
    thread 'main' panicked at 'attempt to add with overflow', src\main.rs:13:20
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    error: process didn't exit successfully: `target\debug\fib.exe` (exit code: 101)
This creates a "panic" in our application, and it exits with an error.

With a little experimentation, we will find that the 93rd Fibonacci number is fine, but the 94th will overflow the u64 value.

Checking for Overflow

Branch: 04-checkedadd
Now we can work on fixing the overflow. Our problem is with this line:
    let next = prev + curr;
If "prev" and "curr" are near the upper limits of the u64 range, then adding them together will go past that upper limit.

Some other languages will "wrap" the value (starting over again at 0). Rust will generate an error instead (in the form of a panic). If you do want to wrap the value, Rust does offer a "wrapped_add" function that does just that.

But we do not want to wrap, we would like to catch the error and give our users a better experience.

checked_add
Instead of using the default "+" operator, we can use the "checked_add" function. Here is that code:
    let result = prev.checked_add(curr);
"checked_add" does not panic if the value overflows. This is because it uses the Option enum.

Option Enum
The Option enum lets us return either a valid value or no value. "Some<T>" is used if there is a valid value, otherwise "None" is used.

For example, let's say that "prev" is 1 and "curr" is 2. The "result" would be "Some(3)".

If "prev" and "curr" are big enough to cause an overflow when added together, then "result" would be "None".

Pattern Matching
The great thing about having an Option as a return type is that we can use pattern matching with it.

Here is the inside of the updated "for" loop:
    let result = prev.checked_add(curr);
    match result {
        Some(next) => {
            prev = curr;
            curr = next;
        }
        None => {
            curr = 0;
            break;
        }
    }
The "match" keyword sets up the pattern matching for us.

The first "arm" has "Some(next)" as the pattern. The "next" part lets us assign a name to the value that we can use inside the block. In this case, "next" will hold the same value that it did in the earlier version ("prev" + "curr"), so inside the block, we can assign the "prev" and "curr" values like we did before.

The second "arm" has "None" as the pattern. This will be used if there is an overflow. If there is an overflow, then we set the "curr" variable to 0 and then break out of the "for" loop.

Here is the updated "fib" function:
    fn fib(n: u8) -> u64 {
        let mut prev: u64 = 0;
       let mut curr: u64 = 1;
        for _ in 1..n {
            let result = prev.checked_add(curr);
            match result {
                Some(next) => {
                    prev = curr;
                    curr = next;
                }
                None => {
                    curr = 0;
                    break;
                }
            }
        }
        curr
    }
Here's an updated array to test valid values and overflow values:
    let nths = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 90, 91, 92, 93, 94, 95, 96];
And here's the output:
    Fibonacci 1 = 1
    Fibonacci 2 = 1
    Fibonacci 3 = 2
    Fibonacci 4 = 3
    Fibonacci 5 = 5
    Fibonacci 6 = 8
    Fibonacci 7 = 13
    Fibonacci 8 = 21
    Fibonacci 9 = 34
    Fibonacci 10 = 55
    Fibonacci 90 = 2880067194370816120
    Fibonacci 91 = 4660046610375530309
    Fibonacci 92 = 7540113804746346429
    Fibonacci 93 = 12200160415121876738
    Fibonacci 94 = 0
    Fibonacci 95 = 0
    Fibonacci 96 = 0
Setting "curr" to 0 is not a great way to handle our error state. For now, it is fine because it gets rid of the panic, and our application keeps running.

Next up, we'll work on getting a real error that we can handle.

Using the Result Enum

Branch: 05-result
In the last article, I wrote a bit about my first impressions of error handling in Rust: Initial Impressions of Rust. A big part of that involves the Result enum.

Similar to the Option enum, the Result enum represents exclusive states. For Result, the options are "Ok" and "Err". Each of these can have their own type.

To update our "fib" function to return a Result, we'll need to make 2 updates.

Updating the Function Declaration
First, we'll need to update the signature of the function to return a Result. Here's the new signature:
    fn fib(n: u8) -> Result<u64, &'static str> {

    }
The "Result" enum has 2 generic type parameters. The first represents the type for the "Ok" value; the second represents the type for the "Err".

In this case, the "Ok" will be a u64.

The "Err" is a bit more confusing. We want to return a string, but if we try to use just "str", we get an error that the compiler cannot determine the size at compile time. And as we've seen, Rust needs to be able to determine things at compile time.

Instead of using "str", we can use "&str" to use the address of a string (Rust does use pointers; we won't talk too much about them today). The address is a fixed size, so that gets rid of the previous error. But we get a new error that there is a "missing lifetime specifier".

UPDATE Technical Note: '&str' is a string slice. This allows the Err to borrow the value of the string without taking ownership of it. (I've learned more about ownership since I wrote this article. It's pretty interesting.)

The good news is that the error also gives you a hint to consider using the "static" lifetime with an example.

I'm using Visual Studio Code with the Rust extension, so I get these errors and hints in the editor. But these same messages show up if you build using "cargo build".

Returning a Result
Now that we've updated the function signature, we need to actually return a Result. We can do this with some pattern matching.

Replace the previous expression at the end of the function:
    curr
with a "match":
    match curr == 0 {
        false => Ok(curr),
        true => Err("Calculation overflow")
    }
This looks at the value of the "curr" variable and compares it to 0. (Again, this isn't the best way to handle this, but we'll fix it a bit later).

If "curr" is not 0 (meaning there is a valid value), then we hit the "false" arm and return an "Ok" with the value.

If "curr" is 0 (meaning there was an overflow), then we hit the "true" arm and return an "Err" with an appropriate message.

Here's the updated "fib" function:
    fn fib(n: u8) -> Result<u64, &'static str> {
        let mut prev: u64 = 0;
        let mut curr: u64 = 1;
        for _ in 1..n {
            let result = prev.checked_add(curr);
            match result {
                Some(next) => {
                    prev = curr;
                    curr = next;
                }
                None => {
                    curr = 0;
                    break;
                }
            }
        }
        match curr == 0 {
            false => Ok(curr),
            true => Err("Calculation overflow")
        }
    }
Side Note: In thinking about this later, I could have done the pattern matching more elegantly. I started with an if/else block (which is why I'm matching on a boolean value). But we could also write the pattern matching to use the "curr" value directly:
    match curr {
        0 => Err("Calculation overflow"),
        _ => Ok(curr),
    }
This is more direct (but not really less confusing since 0 is a magic number here). Now the match is on the "curr" value itself. If the value is "0", then we return the Err. For the second arm, the underscore represents a catch-all. So if the value is anything other than "0", we return "Ok". Notice that I did have to reverse the order of the arms. The first match wins with pattern matching, so the default case needs to be at the end.

Both of these matches produce the same results. We won't worry about them too much because we'll be replacing this entirely in just a bit.

But since we changed the return type, our calling code needs to be updated.

Using ".expect"
One way that we can deal with the Result enum is to use the "expect()" function.

Here is the updated code from the "main" function:
    println!("Fibonacci {} = {}", nth, fib(nth).expect("Fibonacci calculation failed"));
After the call to "fib(nth)", we add an ".expect()" call and pass in a message.

"expect()" works on a Result enum. If the Result is "Ok", then it pulls out the value and returns it. So if there is no overflow, then the expected Fibonacci number is used for the placeholder.

But if Result is "Err", then "expect" will panic. That's not exactly what we want here, but this gets us one step closer.

With the "expect" in place, here is our output:
    Fibonacci 1 = 1
    Fibonacci 2 = 1
    Fibonacci 3 = 2
    Fibonacci 4 = 3
    Fibonacci 5 = 5
    Fibonacci 6 = 8
    Fibonacci 7 = 13
    Fibonacci 8 = 21
    Fibonacci 9 = 34
    Fibonacci 10 = 55
    Fibonacci 90 = 2880067194370816120
    Fibonacci 91 = 4660046610375530309
    Fibonacci 92 = 7540113804746346429
    Fibonacci 93 = 12200160415121876738
    thread 'main' panicked at 'Fibonacci calculation failed: "Calculation overflow"', src    \main.rs:5:53
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    error: process didn't exit successfully: `target\debug\fib.exe` (exit code: 101)
Because we get a panic when trying to fetch #94, the application halts and does not process the rest of the values (95 and 96).

If we look at the output, we see both of the messages that we added. "Fibonacci calculation failed" is what we put into the "expect", and "Calculation overflow" is what we put into the "Err" Result.

But I'd like to get rid of the panic. And we can do that with more pattern matching.

Matching on Result

Branch: 06-matchresult
Just like we used pattern matching with Option in the "fib" function, we can use pattern matching with Result in the "main" function.

Pattern Matching
Here's the updated "for" loop from the "main" function:
    for nth in nths {
        match fib(nth) {
            Ok(result) => println!("Fibonacci {} = {}", nth, result),
            Err(e) => println!("Error at Fibonacci {}: {}", nth, e),
        }
    }
Inside the "for" loop, we match on the result of "fib(nth)".

If the Result is "Ok", then we use "println!" with the same string that we had before.

If the Result is "Err", then we output an error message.

Adding an Overflow Flag
The last thing I want to do is get rid of the "curr = 0" that denotes an overflow. Even though this works, it's a bit unclear. (And it can cause problems since some implementations of Fibonacci consider "0" to be a valid value.)

For this, we'll add a new variable called "overflow" to the "fib" function. Here's the completed function with "overflow" in place:
    fn fib(n: u8) -> Result<u64, &'static str> {
        let mut prev: u64 = 0;
        let mut curr: u64 = 1;
        let mut overflow = false;
        for _ in 1..n {
            let result = prev.checked_add(curr);
            match result {
                Some(next) => {
                    prev = curr;
                    curr = next;
                }
                None => {
                    overflow = true;
                    break;
                }
            }
        }
        match overflow {
            false => Ok(curr),
            true => Err("Calculation overflow")
        }
  }
A new mutable "overflow" variable is created and set to "false". Then if there is an overflow, it is set to "true". Finally, "overflow" is used in the final pattern matching to determine whether to return "Ok" or "Err".

Final Output
With these changes in place, here is our final output:
    Fibonacci 1 = 1
    Fibonacci 2 = 1
    Fibonacci 3 = 2
    Fibonacci 4 = 3
    Fibonacci 5 = 5
    Fibonacci 6 = 8
    Fibonacci 7 = 13
    Fibonacci 8 = 21
    Fibonacci 9 = 34
    Fibonacci 10 = 55
    Fibonacci 90 = 2880067194370816120
    Fibonacci 91 = 4660046610375530309
    Fibonacci 92 = 7540113804746346429
    Fibonacci 93 = 12200160415121876738
    Error at Fibonacci 94: Calculation overflow
    Error at Fibonacci 95: Calculation overflow
    Error at Fibonacci 96: Calculation overflow
This version no longer panics if there is an overflow. If we do have an overflow, it gives us an error message. And all of the values will get calculated, even if an overflow occurs in the middle.

Wrap Up

Calculating Fibonacci numbers is not a very complex task. But in this walkthrough, we got to use several features of Rust and understand a bit more about how the language works.
  • for loops
  • Statements vs. expressions
  • Function returns (expressions)
  • checked_add to prevent overflow
  • Option enum (returned from checked_add)
  • Pattern matching on Option
  • Result enum (to return error rather than panic)
  • .expect with Result
  • Pattern matching on Result
Quite honestly, I didn't expect to get this much out of this exercise. I've calculate Fibonacci sequences lots of times before. I was surprised about what I learned.

It's okay to do "simple" exercises. And it's okay to be surprised when they don't go quite as you expected.

Happy Coding!

No comments:

Post a Comment