# The Binomial Distribution

Suppose that we are dealing with an experiment with two outcomes, 0 (failure) and 1 (success), and that the probability of success is $$\theta$$. We are interested in the random variable $$X$$ that counts the number of successful experiments in $$n$$ independent trials. This variable is called a Binomial random variable. We write:

$X\sim \text{Binomial}(n, \theta).$
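
To make the definition concrete, here is a minimal sketch that builds one binomial sample directly from its description: run $$n$$ independent success/failure trials and count the successes. The helper name `sample_binomial`, the seed, and the values $$n = 5$$, $$\theta = 0.6$$ are illustrative assumptions, and only the Python standard library is used.

```python
import random

def sample_binomial(n, theta, rng):
    """Count successes among n independent trials, each succeeding with probability theta."""
    return sum(1 for _ in range(n) if rng.random() < theta)

rng = random.Random(0)   # fixed seed, only so that reruns give the same numbers
n, theta = 5, 0.6        # illustrative values
samples = [sample_binomial(n, theta, rng) for _ in range(10)]
print(samples)           # ten integers, each between 0 and n
```

Each call performs $$n$$ trials, so every sample is an integer between 0 and $$n$$.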

It can be shown (the derivation is beyond the scope of this class) that the probability of exactly $$k$$ successes in $$n$$ trials is given by the PMF:

$p(X = k) = {n\choose{k}}\theta^k(1-\theta)^{n-k},$

where $${n\choose{k}}$$ is the number of ways to choose $$k$$ elements out of $$n$$, i.e.:

${n\choose{k}} = \frac{n!}{k!(n-k)!}.$
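
As a quick check of the formula, the sketch below evaluates the PMF by hand using `math.comb` from the standard library for the binomial coefficient. The function name `binomial_pmf` and the values $$n = 5$$, $$\theta = 0.6$$ are illustrative assumptions.

```python
from math import comb

def binomial_pmf(k, n, theta):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

n, theta = 5, 0.6  # illustrative values
pmf_values = [binomial_pmf(k, n, theta) for k in range(n + 1)]

for k, p in enumerate(pmf_values):
    print(f"p(X = {k}) = {p:.4f}")

# The probabilities over k = 0, ..., n should add up to one
print("total:", sum(pmf_values))
```

For example, with these values $$p(X = 3) = 10 \cdot 0.6^3 \cdot 0.4^2 = 0.3456$$.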

Here is how to define the binomial in scipy.stats:

import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set(rc={"figure.dpi":100, 'savefig.dpi':300})
sns.set_context('notebook')
sns.set_style("ticks")
# set_matplotlib_formats lives in matplotlib_inline; the IPython.display version is deprecated
from matplotlib_inline.backend_inline import set_matplotlib_formats
set_matplotlib_formats('retina', 'svg')
import numpy as np
import scipy.stats as st

n = 5       # Performing the experiment n times
theta = 0.6 # Probability of success each time
X = st.binom(n, theta) # Number of successes


Here are some samples:

X.rvs(100)

array([4, 4, 3, 3, 3, 3, 2, 5, 4, 5, 4, 3, 3, 2, 4, 3, 5, 2, 1, 3, 4, 4,
2, 4, 5, 3, 3, 5, 4, 5, 4, 4, 2, 4, 3, 3, 3, 4, 4, 2, 4, 3, 3, 3,
1, 4, 3, 0, 3, 2, 1, 3, 2, 4, 3, 3, 2, 3, 4, 2, 5, 4, 2, 3, 1, 2,
0, 4, 5, 3, 3, 4, 2, 3, 5, 1, 4, 2, 4, 2, 4, 4, 4, 2, 5, 4, 2, 4,
2, 3, 1, 4, 2, 3, 2, 2, 1, 2, 1, 5])
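
One way to sanity-check samples like these is to compare how often each value occurs with the probability the PMF assigns to it. The sketch below does this under a few assumptions: a larger sample size of 10,000, a fixed `random_state` for reproducibility, and the same illustrative values $$n = 5$$, $$\theta = 0.6$$.

```python
import numpy as np
import scipy.stats as st

n, theta = 5, 0.6
X = st.binom(n, theta)

# Draw many samples; random_state is fixed only to make the run reproducible
samples = X.rvs(size=10_000, random_state=0)

# Fraction of samples equal to each possible value 0, 1, ..., n
ks = np.arange(n + 1)
freqs = np.bincount(samples, minlength=n + 1) / samples.size

for k, f, p in zip(ks, freqs, X.pmf(ks)):
    print(f"k = {k}: empirical {f:.3f}, PMF {p:.3f}")
```

With this many samples the empirical frequencies should sit close to the PMF values, and the agreement typically improves as the sample size grows.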


Let’s draw the PMF:

fig, ax = plt.subplots(dpi=100)
xs = range(n + 1) # You can have anywhere between 0 and n successes
ax.bar(xs, X.pmf(xs))
ax.set_xlabel('$x$')
ax.set_ylabel('$p(x)$');