The Categorical distribution

We are now going to generalize the six-sided die experiment. A Categorical random variable models an experiment with \(K\) possible outcomes, coded, for example, as \(1, 2,\dots,K\), each with its own probability. We can write:

\[\begin{split} X = \begin{cases} 1,&\;\text{with probability}\;p_1,\\ 2,&\;\text{with probability}\;p_2,\\ \vdots&\\ K,&\;\text{with probability}\;p_K. \end{cases} \end{split}\]

Equivalently, we can write the probability mass function (PMF) as:

\[ p(X=x) = p_x. \]

Another way we can write this is:

\[ X\sim \text{Categorical}(p_1,\dots,p_K), \]

which we read as:

the random variable \(X\) follows a Categorical distribution with \(K\) possible values, which have probabilities \(p_1, p_2, \dots, p_K\), respectively.
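To make the notation concrete, here is a minimal sketch of drawing samples from a Categorical using only NumPy. The helper name `sample_categorical`, the seed, and the three example probabilities are illustrative assumptions and not part of the original text. The check at the top encodes the two requirements on the \(p_k\): they must be non-negative and sum to one.

```python
import numpy as np


def sample_categorical(ps, size, seed=0):
    """Draw `size` samples from Categorical(ps) with values 1, ..., K.

    Hypothetical helper for illustration only.
    """
    ps = np.asarray(ps, dtype=float)
    # A valid probability vector is non-negative and sums to one.
    if np.any(ps < 0) or not np.isclose(ps.sum(), 1.0):
        raise ValueError("probabilities must be non-negative and sum to 1")
    rng = np.random.default_rng(seed)
    values = np.arange(1, ps.size + 1)  # the codes 1, 2, ..., K
    return rng.choice(values, size=size, p=ps)


# Example with K = 3 (the probabilities are made up for this sketch).
samples = sample_categorical([0.2, 0.5, 0.3], size=5)
print(samples)
```

Each entry of `samples` is one of the codes \(1, \dots, K\), drawn with the corresponding probability.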

The fair six-sided die is a particular example of a Categorical, namely the one with:

\[ p(X=x) = \frac{1}{6}, \]

for \(x=1,2,\dots,6\).
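
In the notation introduced above, the fair die has \(K=6\) and all probabilities equal. As a quick check, these probabilities sum to one, as the probabilities of any Categorical must:

\[ X \sim \text{Categorical}\left(\frac{1}{6},\frac{1}{6},\frac{1}{6},\frac{1}{6},\frac{1}{6},\frac{1}{6}\right), \qquad \sum_{x=1}^{6} p(X=x) = 6\cdot\frac{1}{6} = 1. \]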

A specific example

Let’s now pick specific probabilities, build the corresponding Categorical, and sample from it. We are going to play with this one, which has four possible values:

\[ X \sim \text{Categorical}(0.1, 0.3, 0.4, 0.2). \]
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set(rc={"figure.dpi":100, 'savefig.dpi':300})
sns.set_context('notebook')
sns.set_style("ticks")
from matplotlib_inline.backend_inline import set_matplotlib_formats
set_matplotlib_formats('retina', 'svg')
import numpy as np
import scipy.stats as st
# The probabilities:
ps = [0.1, 0.3, 0.4, 0.2] # these must be non-negative and sum to 1
# And here are the corresponding values:
xs = np.array([1, 2, 3, 4])
# Here is how you can define a categorical rv:
X = st.rv_discrete(name='Custom Categorical', values=(xs, ps))

You can evaluate the PMF anywhere you want:

X.pmf(2)
0.3
X.pmf(3)
0.4
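
As a small sketch of what "anywhere" means here, the snippet below rebuilds the same random variable so that it runs on its own, evaluates the PMF at all four values in one call, and evaluates it at a point outside the support. The choice of the test point \(5\) is an assumption made for illustration.

```python
import numpy as np
import scipy.stats as st

# Same Categorical as above, redefined so the snippet is self-contained.
xs = np.array([1, 2, 3, 4])
ps = [0.1, 0.3, 0.4, 0.2]
X = st.rv_discrete(name='Custom Categorical', values=(xs, ps))

# pmf accepts arrays, so all support points can be evaluated at once.
pmf_values = X.pmf(xs)
print(pmf_values)

# A value that X cannot take has probability zero.
p_outside = X.pmf(5)
print(p_outside)

# Summing the PMF over the support should give one (up to rounding).
total = pmf_values.sum()
print(total)
```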

And you can sample from it like this:

X.rvs(size=10)
array([3, 3, 3, 4, 3, 4, 4, 4, 3, 2])
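
Ten samples are too few to see the probabilities emerge. The following sketch draws many more and compares the fraction of samples at each value with the exact PMF. The sample size and the seed passed to `random_state` are arbitrary choices made for this illustration, and the random variable is redefined so the snippet runs on its own.

```python
import numpy as np
import scipy.stats as st

xs = np.array([1, 2, 3, 4])
ps = np.array([0.1, 0.3, 0.4, 0.2])
X = st.rv_discrete(name='Custom Categorical', values=(xs, ps))

# Fixing random_state makes the draw reproducible (the seed is arbitrary).
samples = X.rvs(size=10_000, random_state=12345)

# Fraction of samples equal to each possible value.
freqs = np.array([(samples == x).mean() for x in xs])

for x, f, p in zip(xs, freqs, ps):
    print(f"x = {x}: empirical {f:.3f}, exact {p:.3f}")
```

With this many samples the empirical frequencies should land close to, but not exactly on, the exact probabilities.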

Let’s plot the PMF:

fig, ax = plt.subplots(dpi=150)
ax.bar(xs, X.pmf(xs))
ax.set_xlabel('$x$')
ax.set_ylabel('$p(x)$');
[Figure: bar plot of the PMF of \(X\), with \(x\) on the horizontal axis and \(p(x)\) on the vertical axis.]

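Okay. Now let’s find the probability that \(X\) takes the value \(2\) or \(4\). Since \(X\) cannot be \(2\) and \(4\) at the same time, the two events are mutually exclusive and their probabilities add: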

\[ p(X=2\;\text{or}\;X=4) = p(X=2) + p(X=4). \]

So:

X.pmf(2) + X.pmf(4)
0.5
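
The same idea extends to any set of values: add up the PMF over the distinct values in the set. Below is a minimal sketch under that assumption. The helper `prob_of_event` is a hypothetical name introduced here, not a SciPy function, and the random variable is redefined so the snippet runs on its own.

```python
import numpy as np
import scipy.stats as st

xs = np.array([1, 2, 3, 4])
ps = [0.1, 0.3, 0.4, 0.2]
X = st.rv_discrete(name='Custom Categorical', values=(xs, ps))


def prob_of_event(rv, event_values):
    """Probability that rv takes any of the given values.

    Hypothetical helper: sums the PMF over the distinct values,
    which is valid because distinct values are mutually exclusive.
    """
    distinct = np.unique(event_values)  # avoid double counting repeats
    return rv.pmf(distinct).sum()


p_event = prob_of_event(X, [2, 4])
print(p_event)

# The complement, X in {1, 3}, must carry the remaining probability.
p_complement = prob_of_event(X, [1, 3])
print(p_complement)
```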

Questions

  • Rerun all code segments above for the Categorical \(X\sim \operatorname{Categorical}(0.1, 0.1, 0.4, 0.2, 0.2)\) taking the values \(1, 2, 3, 4,\) and \(5\).