Since computers were invented we have spent a lot of time generating uniform random numbers. A quick search on Google Scholar for “Generating a uniform random variable” gives 850,000 results. But what if we want to generate another random variable? Maybe a Gaussian random variable or a binomial random variable? These are both extremely useful.1

We’ve spent so long focusing on generating uniform random variables that they must be useful. That is, computers try to generate

$$ U \sim \textrm{uniform}(0, 1) $$

which is a random number between 0 and 1, with every value in that interval equally likely. This is exactly the variable that has received so much attention: many pseudo-random number generator algorithms have been developed to produce it, and it’s what rand implements in many programming languages.
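For instance, here’s a minimal sketch of drawing a few uniform samples with NumPy (the generator API and the seed are just illustrative choices, not the only way to do it):

import numpy as np

# a seeded generator so the example is reproducible; the seed is arbitrary
rng = np.random.default_rng(0)

# five independent draws from uniform(0, 1)
u = rng.uniform(0, 1, size=5)
print(u)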

The cumulative distribution function (CDF) completely describes a random variable: it gives the probability that the random variable lies at or below a certain value. The cumulative distribution function of $X$ is defined to be

$$ F_X(x) = \Pr(X \le x) $$
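For example, the uniform random variable above has the simple CDF

$$ F_U(t) = \Pr(U \le t) = \begin{cases} 0 & t < 0 \\ t & 0 \le t \le 1 \\ 1 & t > 1 \end{cases} $$

which we will use at the end of the derivation below.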

Then, let’s define our random variable by feeding $U$ through the inverse of the CDF we’d like, $F(x)$:

$$ X = F^{-1}(U) $$

Because we defined our random variable as $X = F^{-1}(U)$, the CDF of $X$, $F_X$, is

$$ \begin{aligned} F_X(x) &= \Pr(X \le x)\\ &= \Pr(F^{-1}(U) \le x)\\ &= \Pr(U \le F(x))\\ &= F(x) \end{aligned} $$

where the second-to-last step uses that $F$ is non-decreasing, so $F^{-1}(U) \le x$ is equivalent to $U \le F(x)$, and the last step uses that $\Pr(U \le t) = t$ for a uniform random variable on $[0, 1]$: its CDF is a straight line with slope 1 on that interval.

We see that $F_X(x) = F(x)$, exactly as we wanted! This means we only need a uniform random number and $F^{-1}$ to generate any random variable. Once we’ve worked out $F^{-1}$ (which may be difficult), we can generate a random variable with any CDF we like!
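The recipe is short enough to write down generically. Here’s a minimal sketch in Python, assuming NumPy; the function name sample_inverse_cdf and the generator API are my own choices for illustration:

import numpy as np

def sample_inverse_cdf(F_inv, size, rng=None):
    """Draw samples of X = F^{-1}(U) for a given inverse CDF F_inv."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(0, 1, size=size)  # U ~ uniform(0, 1)
    return F_inv(u)                   # X = F^{-1}(U)

# e.g., a standard (lambda = 1) exponential; the worked example below
# derives this inverse CDF
samples = sample_inverse_cdf(lambda u: -np.log(1 - u), size=1000)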

As an example, let’s use the exponential random variable. Its cumulative distribution function is

$$ F_X(x) = (1 - e^{-\lambda x}) u(x) $$

where $u(x)$ is the unit step function, so the CDF is zero for $x < 0$.
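Inverting this CDF is a short calculation: for $x \ge 0$, set $y = F_X(x)$ and solve for $x$,

$$ y = 1 - e^{-\lambda x} \quad\Longrightarrow\quad e^{-\lambda x} = 1 - y \quad\Longrightarrow\quad x = -\frac{1}{\lambda}\log(1 - y) $$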

To generate this random variable, we would define

$$ X = F_X^{-1}(U) = -\frac{1}{\lambda}\log(1 - U) $$

The following Python generates this random variable and computes the theoretical density to test against.

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# our uniform random variable
u = np.random.uniform(0, 1, int(1e4))

# function to compute the inverse CDF, F_X^{-1}
F_x_inv = lambda u, lambda_: -np.log(1 - u) / lambda_

# our exponential random variable
lambda_ = 40
x = F_x_inv(u, lambda_)

t = np.linspace(0, 0.15)
# to test against theory; this is the density f_X(t), not a probability
f_x = lambda_ * np.exp(-lambda_ * t)

And we can view the results with a histogram:
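Here is one way to draw that comparison with plain Matplotlib, reusing x, t, and f_x from above (the plotting details are just one reasonable choice):

# histogram of the generated samples, normalized to estimate the density
plt.hist(x, bins=50, density=True, alpha=0.5, label="generated $X$")

# theoretical density for comparison
plt.plot(t, f_x, label=r"$f_X(t) = \lambda e^{-\lambda t}$")

plt.xlabel("$t$")
plt.legend()
plt.show()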

A quick check of np.random suggests that it generates this random variable in the same way.
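One way to sanity-check that claim is to compare our samples against NumPy’s built-in exponential sampler, which is parameterized by the scale $1/\lambda$ rather than the rate:

# NumPy's exponential takes scale = 1 / lambda
x_np = np.random.exponential(scale=1 / lambda_, size=len(x))

# both sample means should be close to the true mean, 1 / lambda = 0.025
print(x.mean(), x_np.mean(), 1 / lambda_)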

  1. I won’t cover this here, but the Gaussian random variable is useful almost everywhere, and the binomial random variable can represent performing many tests, each of which can either pass or fail.