The Binomial Distribution
Suppose that we are dealing with an experiment with two outcomes, 0 (failure) and 1 (success), and that the probability of success is \(\theta\). We are interested in the random variable \(X\) that counts the number of successful experiments in \(n\) independent trials. This variable is called a Binomial random variable. We write:
\[X \sim \text{Binomial}(n, \theta).\]
It can be shown (the proof is beyond the scope of this class) that the probability of \(k\) successful experiments is given by the PMF:
\[p(X = k) = {n\choose{k}}\theta^k(1-\theta)^{n-k},\]
where \({n\choose{k}}\) is the number of \(k\)-combinations out of \(n\) elements, i.e.:
\[{n\choose{k}} = \frac{n!}{k!(n-k)!}.\]
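As a quick sanity check of the PMF formula, the following sketch evaluates it directly with Python's `math.comb` and compares the result to `scipy.stats.binom.pmf`. The helper name `binomial_pmf` and the values of `n` and `theta` are illustrative choices made here, not something prescribed by the lecture.

```python
import math

import numpy as np
import scipy.stats as st


def binomial_pmf(k, n, theta):
    """Evaluate p(X = k) for X ~ Binomial(n, theta) from the closed-form formula."""
    return math.comb(n, k) * theta**k * (1 - theta) ** (n - k)


n = 5        # number of trials (illustrative value)
theta = 0.6  # probability of success (illustrative value)

# PMF from the formula, for every possible number of successes k = 0, ..., n
manual = np.array([binomial_pmf(k, n, theta) for k in range(n + 1)])
# The same PMF as computed by scipy
reference = st.binom(n, theta).pmf(np.arange(n + 1))

print(manual)
print(np.allclose(manual, reference))  # the two computations should agree
print(np.isclose(manual.sum(), 1.0))   # the PMF should sum to one
```

If both checks print `True`, the hand-written formula and the library implementation agree for these parameter values.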
Here is how to define the binomial in scipy.stats:
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set(rc={"figure.dpi":100, 'savefig.dpi':300})
sns.set_context('notebook')
sns.set_style("ticks")
# IPython.display.set_matplotlib_formats is deprecated in recent IPython versions
from matplotlib_inline.backend_inline import set_matplotlib_formats
set_matplotlib_formats('retina', 'svg')
import numpy as np
import scipy.stats as st
n = 5 # Performing the experiment n times
theta = 0.6 # Probability of success each time
X = st.binom(n, theta) # Number of successes
Here are some samples:
X.rvs(100)
array([4, 4, 3, 3, 3, 3, 2, 5, 4, 5, 4, 3, 3, 2, 4, 3, 5, 2, 1, 3, 4, 4,
       2, 4, 5, 3, 3, 5, 4, 5, 4, 4, 2, 4, 3, 3, 3, 4, 4, 2, 4, 3, 3, 3,
       1, 4, 3, 0, 3, 2, 1, 3, 2, 4, 3, 3, 2, 3, 4, 2, 5, 4, 2, 3, 1, 2,
       0, 4, 5, 3, 3, 4, 2, 3, 5, 1, 4, 2, 4, 2, 4, 4, 4, 2, 5, 4, 2, 4,
       2, 3, 1, 4, 2, 3, 2, 2, 1, 2, 1, 5])
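Samples like the ones above can also be produced by simulating the individual trials and counting the successes, which mirrors the definition of the Binomial random variable. The following is a minimal sketch of that idea. The seed, the number of repetitions `num_samples`, and the variable names are assumptions made for illustration only.

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

n = 5                 # number of trials per experiment
theta = 0.6           # probability of success in each trial
num_samples = 10_000  # number of repetitions (illustrative value)

# Each row holds the n individual success/failure outcomes of one repetition
trials = rng.random((num_samples, n)) < theta
# Counting the successes in each row gives one Binomial sample per repetition
counts = trials.sum(axis=1)

# Fraction of repetitions with k successes, for k = 0, ..., n
empirical = np.bincount(counts, minlength=n + 1) / num_samples
exact = st.binom(n, theta).pmf(np.arange(n + 1))

for k in range(n + 1):
    print(f"k = {k}: empirical = {empirical[k]:.3f}, exact = {exact[k]:.3f}")
```

With this many repetitions, the empirical frequencies should land close to the exact PMF values, though they will not match them exactly.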
Let’s draw the PMF:
fig, ax = plt.subplots(dpi=100)
xs = range(n + 1) # You can have anywhere between 0 and n successes
ax.bar(xs, X.pmf(xs))
ax.set_xlabel('$x$')
ax.set_ylabel('$p(x)$');
Questions
Start increasing the number of trials \(n\). Gradually take it up to \(n=100\). What does the resulting PMF look like? It starts to look like a bell curve, and indeed it approximately is one. We will learn more about this in Lecture 11: Expectations, variances, and their properties.
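One way to explore this question is to plot the PMF side by side for a few values of \(n\). The sketch below does this. The particular values in `ns` and the figure size are arbitrary choices for illustration, and `theta` is kept at the value used earlier.

```python
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as st

theta = 0.6            # same success probability as before
ns = [5, 20, 50, 100]  # illustrative choices for the number of trials

fig, axes = plt.subplots(1, len(ns), figsize=(12, 3), dpi=100)
for ax, n in zip(axes, ns):
    xs = np.arange(n + 1)  # possible numbers of successes: 0, ..., n
    ax.bar(xs, st.binom(n, theta).pmf(xs))
    ax.set_title(f'$n = {n}$')
    ax.set_xlabel('$x$')
axes[0].set_ylabel('$p(x)$')
fig.tight_layout()
```

Try changing `theta` as well to see how the location and width of the bell shape respond.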