Getting the Binomial RV from the Bernoulli

In class we defined the Binomial\((n,p)\) random variable as the sum of \(n\) independent Bernoulli\((p)\) random variables. In other words, the Binomial\((n,p)\) equals the total number of successes (ones) in \(n\) independent Bernoulli trials, each with probability of success (one) equal to \(p\). The point of this document is to convince you that this definition actually makes sense and really does lead to the formulas from class.
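For reference, here is the definition written out in symbols, together with what is presumably the formula from class, namely the standard Binomial probability mass function. The symbols \(X\), \(X_i\) and \(k\) are notation introduced here for illustration:

\[
X = \sum_{i=1}^{n} X_i, \qquad X_1, \dots, X_n \text{ independent Bernoulli}(p),
\]

\[
P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \qquad k = 0, 1, \dots, n.
\]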

R doesn’t have a built-in function to simulate Bernoulli random variables, since it treats a Bernoulli\((p)\) as a Binomial\((n=1,p)\) random variable. Let’s make our own:
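A minimal sketch of such a function is given here. It assumes the name rbern (the name used in the calls further down) and two arguments: n, the number of draws, and p, the probability of success. It is built on sample, drawing with replacement from the values 0 and 1:

```r
# Sketch of a Bernoulli simulator (assumed signature: n draws, success probability p)
rbern <- function(n, p) {
  # Draw n values from {0, 1} with replacement:
  # 0 has probability 1 - p and 1 has probability p
  sample(0:1, size = n, replace = TRUE, prob = c(1 - p, p))
}
```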

The argument prob = c(1-p, p) is new. It tells sample to draw 0 with probability 1-p and 1 with probability p. This is exactly what we want. Let’s test it out:

rbern(30, 0.1)

## [1] 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

rbern(30, 0.5)

## [1] 0 1 0 0 0 0 1 1 0 1 0 1 0 0 1 1 1 1 0 1 1 1 0 0 0 0 1 0 0 0

rbern(30, 0.9)

## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1

The function rbern returns all the individual Bernoulli draws. What we need to make a connection with the Binomial RV is a function that returns the sum of these draws instead. This is easy: we just call rbern and take its sum:
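A minimal sketch, assuming the function is called rbern.sum (the name used in the rest of this document) and that rbern takes the number of draws n and the success probability p. The definition of rbern is repeated so that the block runs on its own:

```r
# Assumed Bernoulli simulator: n draws from {0, 1} with P(1) = p
rbern <- function(n, p) {
  sample(0:1, size = n, replace = TRUE, prob = c(1 - p, p))
}

# Sum of n independent Bernoulli(p) draws, i.e. the number of successes
rbern.sum <- function(n, p) {
  sum(rbern(n, p))
}
```

Each call to rbern.sum(n, p) returns a single whole number between 0 and n.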

The Binomial RV

It turns out that our function rbern.sum makes a single random draw from a Binomial\((n,p)\) random variable. How do I know this? We constructed it by following the definition of the Binomial from class exactly: draw \(n\) independent Bernoulli\((p)\) random variables and sum them up. But don’t take my word for it. Let’s verify this with a simulation.

100,000 Binomial\((n = 10, p = 0.5)\) Draws

To do this, we’ll use rbern.sum together with replicate, which evaluates an expression repeatedly and collects the results:

binom.sims <- replicate(10^5, rbern.sum(10, 0.5))
head(binom.sims)

## [1] 5 4 7 3 4 2

Let’s make a plot of the relative frequencies from this simulation experiment:
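One way to do this is sketched below. It assumes the helper functions rbern and rbern.sum behave as described earlier, and it uses dbinom, R's built-in Binomial probability mass function, as the benchmark. The seed and the names rel.freq, exact and diffs are arbitrary choices made for this illustration:

```r
# Assumed helper definitions, repeated so the block runs on its own
rbern <- function(n, p) sample(0:1, size = n, replace = TRUE, prob = c(1 - p, p))
rbern.sum <- function(n, p) sum(rbern(n, p))

set.seed(103)  # arbitrary seed, used only to make the run reproducible
binom.sims <- replicate(10^5, rbern.sum(10, 0.5))

# Relative frequency of each possible outcome 0, 1, ..., 10
# (factor with explicit levels keeps outcomes that were never drawn)
rel.freq <- table(factor(binom.sims, levels = 0:10)) / length(binom.sims)

# Spike plot of the simulated relative frequencies
plot(0:10, as.numeric(rel.freq), type = "h",
     xlab = "Number of successes", ylab = "Relative frequency")

# Exact Binomial(n = 10, p = 0.5) probabilities and the simulation error
exact <- dbinom(0:10, size = 10, prob = 0.5)
diffs <- as.numeric(rel.freq) - exact
round(diffs, 4)
```

The vector diffs holds, for each outcome from 0 to 10, the simulated relative frequency minus the exact Binomial probability.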

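As you can see, the differences between the simulated relative frequencies and the probabilities given by the Binomial formula from class are tiny. They would typically be even smaller if we used more replications.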

I hope this example has given you a bit more intuition about the Binomial RV. I’m not going to assign any exercises here, but it would be a good idea to try some simulation experiments of your own with different values for \(n\) and \(p\).