The Fibonacci problem is a well known mathematical problem that models population growth and was conceived in the 1200s. Leonardo of Pisa aka Fibonacci decided to use a recursive equation: $x_{n} = x_{n-1} + x_{n-2}$ with the seed values $x_0 = 0$ and $x_1 = 1$. Implementing this recursive function is straightforward:

def fib(n):
    if n==0: return 0
    if n==1: return 1
    else: return fib(n-1) + fib(n-2)

Since the Fibonacci sequence was conceived to model population growth, it would seem that there should be a simple equation that grows almost exponentially. Plus, this recursive calling is expensive both in time and memory.1. The cost of this function doesn’t seem worthwhile. To see the surprising formula that we end up with, we need to define our Fibonacci problem in a matrix language.2

$$ \begin{bmatrix} x_{n} \\ x_{n-1} \end{bmatrix} = \bm{x}_{n} = \bm{A}\cdot \bm{x}_{n-1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} x_{n-1} \\ x_{n-2} \end{bmatrix} $$

Calling each of those matrices and vectors variables and recognizing the fact that $\bm{x}_{n-1}$ follows the same formula as $\bm{x}_n$ allows us to write

$$ \begin{aligned} \bm{x}_n &= \bm{A} \cdot \bm{x}_{n-1} \\ &= \bm{A} \cdot \bm{A} \cdots \bm{A} \cdot \bm{x}_0 \\ &= \bm{A}^n \cdot \bm{x}_0 \end{aligned} $$

where we have used $\bm{A}^n$ to mean $n$ matrix multiplications. The corresponding implementation looks something like this:

def fib(n):
    A   = np.asmatrix('1 1; 1 0')
    x_0 = np.asmatrix('1; 0')
    x_n = np.linalg.matrix_power(A, n).dot(x_0)
    return x_n[1]

While this isn’t recursive, there’s still an $n-1$ unnecessary matrix multiplications. These are expensive time-wise and it seems like there should be a simple formula involving $n$. As populations grow exponentially, we would expect this formula to involve scalars raised to the $n$th power. A simple equation like this could be implemented many times faster than the recursive implementation!

The trick to do this rests on the mysterious and intimidating eigenvalues and eigenvectors. These are just a nice way to view the same data but they have a lot of mystery behind them. Most simply, for a matrix $\bm{A}$ they obey the equation

$$ \bm{A}\cdot\bm{x} = \lambda \cdot\bm{x} $$

for different eigenvalues $\lambda$ and eigenvectors $\bm{x}$. Through the way matrix multiplication is defined, we can represent all of these cases. This rests on the fact that the left multiplied diagonal matrix $\bm{\Lambda}$ just scales each $\bm{x}_i$ by $\lambda_i$. The column-wise definition of matrix multiplication makes it clear that this is represents every case where the equation above occurs.

$$ \bm{A} \cdot \begin{bmatrix} \bm{x}_1 & \bm{x}_2\\ \end{bmatrix} = \begin{bmatrix} \bm{x}_1 & \bm{x}_2\\ \end{bmatrix} \cdot \begin{bmatrix} \lambda_{1} & 0\\ 0 & \lambda_2 \end{bmatrix} $$

Or compacting the vectors $\bm{x}_i$ into a matrix called $\bm{X}$ and the diagonal matrix of $\lambda_i$’s into $\bm{\Lambda}$, we find that

$$ \bm{A}\cdot\bm{X} = \bm{X}\cdot\bm{\Lambda} $$

Because the Fibonacci matrix is diagonalizable

$$ \bm{A} = \bm{X}\cdot\bm{\Lambda}\cdot\bm{X}^{-1} $$

And then because a matrix and it’s inverse cancel

$$ \begin{aligned} \bm{A}^n &= \bm{X}\cdot\bm{\Lambda}\cdot\bm{X}^{-1} \cdot\ldots\cdot \bm{X}\cdot\bm{\Lambda}\cdot\bm{X}^{-1}\\ &= \bm{X}\cdot\bm{\Lambda}^n\cdot\bm{X}^{-1} \end{aligned} $$

$\bm{\Lambda}^n$ is a simple computation because $\bm{\Lambda}$ is a diagonal matrix: every element is just raised to the $n$th power. That means the expensive matrix multiplication only happens twice now. This is a powerful speed boost and we can calculate the result by substituting for $\bm{A}^n$

$$ \bm{x}_n = \bm{X}\cdot \bm{\Lambda}^n \cdot \bm{X}^{-1}\cdot\bm{x}_0 $$

For this Fibonacci matrix, we find that $\bm{\Lambda} = \textrm{diag}\left(\frac{1+\sqrt{5}}{2}, \frac{1-\sqrt{5}}{2}\right)= \textrm{diag}\left(\lambda_1, \lambda_2\right)$. We could define our Fibonacci function to carry out this matrix multiplication, but these matrices are simple: $\bm{\Lambda}$ is diagonal and $\bm{x}_0 = \left[1; 0\right]$. So, carrying out this fairly simple computation gives

$$ x_n = \frac{1}{\sqrt{5}}\left(\lambda_{_1}^n - \lambda_{_2}^n\right) \approx \frac{1}{\sqrt{5}} \cdot 1.618034^n $$

We would not expect this equation to give an integer. It involves the power of two irrational numbers, a division by another irrational number and even the golden ratio phi $\phi \approx 1.618$! However, it gives exactly the Fibonacci numbers – you can check yourself!

This means we can define our function rather simply:

def fib(n):
    lambda1 = (1 + sqrt(5))/2
    lambda2 = (1 - sqrt(5))/2
    return (lambda1**n - lambda2**n) / sqrt(5)
def fib_approx(n)
    # for practical range, percent error < 10^-6
    return 1.618034**n / sqrt(5)

As one would expect, this implementation is fast. We see speedups of roughly $1000$ for $n=25$, milliseconds vs microseconds. This is almost typical when mathematics are applied to a seemingly straightforward problem. There are often large benefits by making the implementation slightly more cryptic!

I’ve found that mathematics3 becomes fascinating, especially in higher level college courses, and can often yield surprising results. I mean, look at this blog post. We went from a expensive recursive equation to a simple and fast equation that only involves scalars. This derivation is one I enjoy and I especially enjoy the simplicity of the final result. This is part of the reason why I’m going to grad school for highly mathematical signal processing. Real world benefits $+$ neat theory $=$ <3.

  1. Yes, in some languages some compilers are smart enough to get rid of recursion for some functions. 

  2. I’m assuming you have taken a course that deals with matrices. 

  3. Not math. Courses beyond calculus deserve a different name.