The uniform distribution#

In Example: Uncertainties in steel ball manufacturing, we had a random variable for which all values in a given interval were equally probable. This very common situation is described by the so-called uniform distribution.

Let’s start with the simplest case: a random variable taking values between 0 and 1 with constant probability density. We write:

$$X \sim U([0,1]),$$

and we read

X follows a uniform distribution taking values in [0,1].

The probability density of the uniform is constant in [0,1] and zero outside it. We have:

$$p(x) := \begin{cases} c, & 0 \le x \le 1, \\ 0, & \text{otherwise}. \end{cases}$$

What should the constant c be? Just like in Example: Uncertainties in steel ball manufacturing, you can find it by ensuring that the PDF integrates to one (see PDF Property 5):

$$\int_0^1 p(x)\,dx = 1 \Rightarrow c = 1.$$

So, the PDF is:

$$p(x) := \begin{cases} 1, & 0 \le x \le 1, \\ 0, & \text{otherwise}. \end{cases}$$

To find the CDF, we can use PDF Property 3. For $x$ in $[0,1]$: $F(x) = p(X \le x) = \int_0^x p(\tilde{x})\,d\tilde{x} = \int_0^x d\tilde{x} = x.$ Obviously, we have $F(x) = 0$ for $x < 0$ and $F(x) = 1$ for $x > 1$.

Using this result, we can find the probability that X takes values in [a,b] for a<b in [0,1]. It is: $p(a \le X \le b) = F(b) - F(a) = b - a.$
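The result above can be checked by simulation. The following is a minimal sketch, not part of the original notebook: the endpoints 0.2 and 0.7, the sample size, and the seed are illustrative assumptions. It draws samples from U([0,1]) with NumPy and compares the fraction that falls in [a,b] with b - a.

```python
import numpy as np

# Illustrative endpoints (an assumption); any 0 <= a < b <= 1 would do.
a, b = 0.2, 0.7

rng = np.random.default_rng(0)                  # seeded for reproducibility
samples = rng.uniform(0.0, 1.0, size=100_000)   # draws from U([0,1])

# Fraction of samples that land in [a, b]
estimate = np.mean((samples >= a) & (samples <= b))
exact = b - a

print(f"Monte Carlo estimate: {estimate:.3f}")
print(f"Exact value b - a:    {exact:.3f}")
```

The two numbers should agree to about two decimal places; the agreement improves as the sample size grows.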

Instantiating the uniform using scipy.stats#

Let me now show you how you can make a uniform random variable using scipy:

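# Plotting setup
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set(rc={"figure.dpi":100, 'savefig.dpi':300})
sns.set_context('notebook')
sns.set_style("ticks")
# set_matplotlib_formats is deprecated in IPython.display (since IPython 7.23);
# import it from matplotlib_inline instead
from matplotlib_inline.backend_inline import set_matplotlib_formats
set_matplotlib_formats('retina', 'svg')
# Numerical libraries
import numpy as np
import scipy.stats as st
# A uniform random variable on [0, 1] (the defaults are loc=0, scale=1)
X = st.uniform()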

Here is how you can get some samples:

X.rvs(size=10)
array([0.33842053, 0.57010738, 0.1323871 , 0.58357996, 0.68875828,
       0.67923023, 0.84656861, 0.75593012, 0.25791332, 0.13214875])
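The samples change every time the cell runs. If you need repeatable results, one option is the `random_state` argument of `rvs`. The sketch below uses a separate variable name (`X_demo`) and an arbitrary seed, both of which are illustrative assumptions.

```python
import scipy.stats as st

X_demo = st.uniform()   # U([0,1]); separate name so the X above is untouched

# Passing the same integer seed twice gives the same samples.
first = X_demo.rvs(size=5, random_state=12345)
second = X_demo.rvs(size=5, random_state=12345)

print(first)
print((first == second).all())
```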

You can evaluate the PDF at any point like this:

X.pdf(0.5)
1.0
X.pdf(-0.1)
0.0
X.pdf(1.5)
0.0

Let’s plot the PDF:

fig, ax = plt.subplots()
xs = np.linspace(-0.1, 1.1, 200)
ax.plot(xs, X.pdf(xs))
ax.set_xlabel('$x$')
ax.set_ylabel('$p(x)$');
[Figure: PDF of U([0,1]), p(x) versus x.]

You can evaluate the CDF like this:

X.cdf(-0.5)
0.0
X.cdf(0.5)
0.5
X.cdf(1.2)
1.0

Let’s plot the CDF:

fig, ax = plt.subplots()
ax.plot(xs, X.cdf(xs))
ax.set_xlabel('$x$')
ax.set_ylabel('$F(x)$');
[Figure: CDF of U([0,1]), F(x) versus x.]

Finally, let’s find the probability that X is between two numbers a and b. For the uniform it is, of course, trivial, but let’s see how it is done using the scipy functionality:

a = -1.0
b = 0.3
prob_X_is_in_ab = X.cdf(b) - X.cdf(a)
print('p({0:1.2f} <= X <= {1:1.2f}) = {2:1.2f}'.format(a, b, prob_X_is_in_ab))
p(-1.00 <= X <= 0.30) = 0.30
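The CDF difference is just the integral of the PDF over the interval. As a cross-check, here is a minimal sketch that integrates the PDF numerically with `scipy.integrate.quad`. The variable names are illustrative assumptions, and the `points` argument is used to tell `quad` where the density jumps.

```python
import scipy.stats as st
from scipy.integrate import quad

X_demo = st.uniform()            # U([0,1])
a_demo, b_demo = -1.0, 0.3       # same interval as in the example above

# Probability from the CDF
via_cdf = X_demo.cdf(b_demo) - X_demo.cdf(a_demo)

# Probability from integrating the PDF; the density is discontinuous at 0,
# so we pass that location as a break point.
via_integral, abs_err = quad(X_demo.pdf, a_demo, b_demo, points=[0.0])

print(f"CDF difference:  {via_cdf:.4f}")
print(f"Integral of PDF: {via_integral:.4f}")
```

The part of the interval to the left of 0 contributes nothing, because the PDF is zero there.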

The uniform distribution over an arbitrary interval [a,b]#

The uniform distribution can also be defined over an arbitrary interval [a,b]. We write: $X \sim U([a,b]).$ We read:

X follows a uniform distribution on [a,b].

The PDF of this random variable is:

$$p(x) = \begin{cases} c, & x \in [a,b], \\ 0, & \text{otherwise}, \end{cases}$$

where c is a positive constant. This simply tells us that the probability density of finding X in [a,b] is something positive and that the probability density of finding X outside it is exactly zero. This is exactly the situation we had in Example: Uncertainties in steel ball manufacturing. The positive constant c is determined by imposing the normalization condition:

$$\int_{-\infty}^{+\infty} p(x)\,dx = 1.$$

This gives:

$$1 = \int_{-\infty}^{+\infty} p(x)\,dx = \int_a^b c\,dx = c\int_a^b dx = c(b-a).$$

From this we get:

$$c = \frac{1}{b-a},$$

and we can now write:

$$p(x) = \begin{cases} \frac{1}{b-a}, & x \in [a,b], \\ 0, & \text{otherwise}. \end{cases}$$

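From the PDF, we can now find the CDF for $x \in [a,b]$:

$$F(x) = p(X \le x) = \int_{-\infty}^{x} p(\tilde{x})\,d\tilde{x} = \int_a^x \frac{1}{b-a}\,d\tilde{x} = \frac{1}{b-a}\int_a^x d\tilde{x} = \frac{x-a}{b-a}.$$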
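The closed-form CDF can be compared against scipy. The sketch below is illustrative only: the helper `uniform_cdf`, the endpoints, and the grid are assumptions introduced here. Clipping to [0,1] takes care of the regions below a (where F is 0) and above b (where F is 1).

```python
import numpy as np
import scipy.stats as st

def uniform_cdf(x, lower, upper):
    """CDF of U([lower, upper]): 0 below lower, (x - lower)/(upper - lower) inside, 1 above upper."""
    x = np.asarray(x, dtype=float)
    return np.clip((x - lower) / (upper - lower), 0.0, 1.0)

# Illustrative endpoints (an assumption)
lower, upper = -2.0, 5.0

# Grid that extends one unit beyond the interval on each side
grid = np.linspace(lower - 1.0, upper + 1.0, 50)

# Reference values from scipy; scale is the interval length
reference = st.uniform(loc=lower, scale=upper - lower).cdf(grid)

print(np.allclose(uniform_cdf(grid, lower, upper), reference))
```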

Instantiating the generic uniform using scipy.stats:#

Let’s instantiate it using scipy.stats. The loc argument is the left endpoint a and the scale argument is the interval length b-a:

a = -2.0
b = 5.0
X = st.uniform(loc=a, scale=(b-a))
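A common slip is to pass the right endpoint as the second argument. In scipy, `st.uniform(loc, scale)` is uniform on [loc, loc + scale], so the second argument is the width of the interval. The short check below is a sketch with illustrative variable names (`lower`, `upper`, `U`). It recovers the endpoints from the quantile function and compares the density inside the interval with 1/(b-a).

```python
import scipy.stats as st

lower, upper = -2.0, 5.0
U = st.uniform(loc=lower, scale=upper - lower)

# The 0% and 100% quantiles are the endpoints of the support
left_end, right_end = U.ppf(0.0), U.ppf(1.0)
print(left_end, right_end)

# The density inside the interval should equal 1 / (upper - lower)
density_inside = U.pdf(0.0)
print(density_inside, 1.0 / (upper - lower))
```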

Some samples:

X.rvs(size=10)
array([-1.63839392, -1.08615066, -1.9249188 , -0.86574048, -0.01377033,
        0.91441   ,  2.94486672, -0.90602192, -0.44926847,  4.28172985])

The PDF:

fig, ax = plt.subplots()
xs = np.linspace(a - 0.2, b + 0.2, 200)
ax.plot(xs, X.pdf(xs))
ax.set_xlabel('$x$')
ax.set_ylabel('$p(x)$');
[Figure: PDF of U([-2,5]), p(x) versus x.]

The CDF:

fig, ax = plt.subplots()
ax.plot(xs, X.cdf(xs))
ax.set_xlabel('$x$')
ax.set_ylabel('$F(x)$');
[Figure: CDF of U([-2,5]), F(x) versus x.]

Questions#

  • Repeat the code above so that the random variable is U([1,10]).