next up previous
Next: November 13 Up: October 8 Previous: Continuous distributions


Generating a distribution

A subtle point is that $ f_X$ describes the distribution we want, but it doesn't directly help create it. Programming languages often provide a way of generating a sequence of pseudo-random integers in the range $ 0\ldots m$ by starting at some integer $ x_0$ and then continuing using the relation:

$\displaystyle x_{n+1} = (ax_n + b) \bmod m.
$

With a good choice of $ a$, $ b$, and $ m$ this will generate a sequence of integers that satisfy some criteria of randomness, and not repeat any for $ m$ steps. If $ m$ is chosen fairly large, we can divide by it to scale these values into $ [0,1)$ (the half-open interval from zero to 1). But there's more to do to generate our desired distributions.

We need a way to associate each number we generate in $ [0,1)$ with a number in our desired distribution. The first step is to construct the cumulative distribution function $ F_X(x)$ which represents the probability that $ X \leq x$:

$\displaystyle F_X(x) = \int_{-\infty}^x f_X(t) dt.
$

$ F_X$ has the nice property that its values are increasing and in the interval $ [0,1]$. If we apply it to our exponential distribution, we get:

$\displaystyle F_X(x) = \int_0^x r e^{-rt} dt = \left. r(e^{-rt}/(-r)) \right\vert _0^x =
1 - e^{-rx}.
$

Here's the picture:

\resizebox{0.5\textwidth}{!}{
\includegraphics{cumulativeDist.eps}}

Now we can invert $ F_X(x)$ -- turn the graph sideways! For each $ y \in [0,1)$ we generate, we can solve $ y = 1 - e^{-rx}$ in order to find the corresponding $ x$. Rearrange the equation and take $ \ln$ (natural log) of both sides, and you have $ x = -\ln(1-y)/r$. We don't have to worry about taking $ \ln(0)$, since we rigged things so that $ y$ is never 1. A more straight-forward task is to invert a uniform continuous distribution on the interval $ [a,b]$. If $ x \in [a,b]$ the cumulative distribution is $ F_X(x)$ = $ (x-a)/(b-a)$, so now we solve for $ y =
(x-a)/(b-a)$, or $ x = a + y(b-a)$.


next up previous
Next: November 13 Up: October 8 Previous: Continuous distributions
Danny Heap 2002-12-16