The Poisson distribution


The Poisson distribution models the number of times an event occurs in an interval of space or time. For example, a Poisson random variable \(X\) may be:

  • The number of earthquakes of magnitude greater than 6 on the Richter scale occurring over the next 100 years.

  • The number of major floods over the next 100 years.

  • The number of patients arriving at the emergency room during the night shift.

  • The number of electrons hitting a detector in a specific time interval.

The Poisson is a good model when the following assumptions are true:

  • The number of times an event occurs in an interval takes values \(0,1,2,\dots\).

  • Events occur independently.

  • The probability that an event occurs is constant per unit of time.

  • The average rate at which events occur is constant.

  • Events cannot occur at the same time.
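One way to see why these assumptions lead to the Poisson is to chop the interval into many small sub-intervals, each of which holds at most one event, independently and with the same small probability. The following is a minimal simulation sketch of that idea, not a proof; the values of `lam`, `n_bins`, `n_intervals`, and the seed are arbitrary choices made only for illustration.

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

lam = 1.0             # assumed average number of events per interval
n_bins = 1_000        # hypothetical number of small sub-intervals
n_intervals = 50_000  # number of simulated intervals

# Each sub-interval contains an event with probability lam / n_bins,
# independently of the others. The total count in one interval is the
# sum of those indicators, i.e. a Binomial(n_bins, lam / n_bins) draw.
counts = rng.binomial(n_bins, lam / n_bins, size=n_intervals)

ks = np.arange(6)
empirical = np.array([(counts == k).mean() for k in ks])
poisson_pmf = st.poisson(lam).pmf(ks)

# Largest gap between simulated frequencies and the Poisson PMF
max_abs_diff = np.abs(empirical - poisson_pmf).max()
print(np.round(empirical, 3))
print(np.round(poisson_pmf, 3))
```

The simulated frequencies should land close to the Poisson probabilities, and the agreement improves as `n_bins` and `n_intervals` grow.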

When these assumptions are valid, we can write:

\[ X\sim \operatorname{Poisson}(\lambda), \]

where \(\lambda>0\) is the rate at which the events occur, i.e., the average number of events per interval. You read this as:

The random variable \(X\) follows a Poisson distribution with rate parameter \(\lambda\).

The probability mass function (PMF) of the Poisson is:

\[ p(X=k) = \frac{\lambda^ke^{-\lambda}}{k!}. \]
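To make the formula concrete, here is a small sketch that evaluates it directly and compares the result with `scipy.stats.poisson`. The helper name `poisson_pmf` and the rate `lam = 2.5` are hypothetical choices for illustration only.

```python
import math
import scipy.stats as st

def poisson_pmf(k, lam):
    """Evaluate lam**k * exp(-lam) / k! directly from the formula."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 2.5  # arbitrary rate chosen for illustration
for k in range(5):
    by_hand = poisson_pmf(k, lam)
    from_scipy = st.poisson(lam).pmf(k)
    print(f"k={k}: formula={by_hand:.6f}, scipy={from_scipy:.6f}")

# The probabilities over k = 0, 1, 2, ... should sum to one;
# truncating the sum at k = 49 is more than enough for this rate.
total = sum(poisson_pmf(k, lam) for k in range(50))
print(f"sum of PMF values: {total:.6f}")
```

The two columns should agree up to floating-point rounding, and the truncated sum should be numerically indistinguishable from one.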

Let’s look at a specific example. Historical data show that, in a given region, a major earthquake occurs once every 100 years on average. What is the probability that \(k\) such earthquakes will occur within the next 100 years? Let \(X\) be the random variable corresponding to the number of earthquakes over the next 100 years. Assuming the Poisson model is valid, the rate parameter is \(\lambda = 1\) (one event per 100-year interval) and we have:

\[ X\sim \operatorname{Poisson}(1). \]
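As a quick check of what this model implies, plugging \(\lambda = 1\) into the PMF gives the following values (rounded to three decimals):

\[ p(X=0) = \frac{1^0 e^{-1}}{0!} = e^{-1} \approx 0.368, \quad p(X=1) = \frac{1^1 e^{-1}}{1!} = e^{-1} \approx 0.368, \quad p(X=2) = \frac{1^2 e^{-1}}{2!} = \frac{e^{-1}}{2} \approx 0.184. \]

So, under this model, the probability of at least one major earthquake in the next 100 years is \(1 - p(X=0) \approx 0.632\).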

Here is a Poisson random variable:

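import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set(rc={"figure.dpi":100, 'savefig.dpi':300})
# IPython.display.set_matplotlib_formats is deprecated; use matplotlib_inline instead
from matplotlib_inline.backend_inline import set_matplotlib_formats
set_matplotlib_formats('retina', 'svg')
import numpy as np
import scipy.stats as st

# A "frozen" Poisson random variable with rate parameter lambda = 1
X = st.poisson(1.0)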

And some samples from it:

X.rvs(size=10)

array([1, 1, 1, 1, 0, 1, 0, 0, 1, 1])
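Ten samples are too few to say much, so it can help to draw many more and summarize them. For a Poisson random variable both the mean and the variance equal \(\lambda\). The sketch below checks this empirically; the sample size and the `random_state` seed are arbitrary choices for illustration.

```python
import numpy as np
import scipy.stats as st

X = st.poisson(1.0)  # same rate parameter as in the earthquake example

# Draw many samples; random_state is fixed only for reproducibility
samples = X.rvs(size=100_000, random_state=0)

sample_mean = samples.mean()
sample_var = samples.var()
print(f"sample mean:     {sample_mean:.3f}")
print(f"sample variance: {sample_var:.3f}")

# Fraction of simulated 100-year periods with no major earthquake,
# to compare with the exact value exp(-1)
frac_zero = (samples == 0).mean()
print(f"fraction of zeros: {frac_zero:.3f} (exact: {np.exp(-1.0):.3f})")
```

Both summary statistics should be close to 1, and the fraction of zeros should be close to \(e^{-1}\).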

Let’s plot the PMF:

ks = range(6)
fig, ax = plt.subplots(dpi=150)
# Bar chart of the PMF evaluated at k = 0, ..., 5
ax.bar(ks, X.pmf(ks))
ax.set_xlabel('Number of major earthquakes in next 100 years')
ax.set_ylabel('Probability of occurrence');

You will investigate the Poisson distribution more in {ref}`lecture09:homework


  • How would the rate parameter \(\lambda\) change if the rate at which major earthquakes occurred in the past was 2 every 100 years? Plot the PMF of the new Poisson random variable. You may have to add more points on the x-axis.