Canonical derivation of the mean and variance of the binomial distribution

The mean and variance of the binomial distribution are most easily derived using a simple indicator-function approach. However, while studying a stochastic process involving the binomial distribution, I became interested in deriving them directly from the canonical probability-weighted sums over the values of the random variable. I found it instructive to work out how to do this and want to record the relevant manipulations here.

For the binomial distribution B(N, p) with N a positive integer and 0 < p < 1, the underlying random variable X is one that counts the number of successes in the first N trials of a Bernoulli sequence with probability p of success on any trial and a probability q = 1 - p of failure. The probability function for n successes in N trials is

P(X = n) = C(N, n) p^n q^{N-n} \quad \quad \quad \quad (1)

where

C(N, n) = \frac{N!}{n! (N-n)!} \quad \quad \quad \quad (2)
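As a quick sanity check on (1) and (2), the probability function can be evaluated directly and summed over all n; a minimal Python sketch (the function name is my own, and exact rationals are used to avoid floating-point noise):

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, N, p):
    """P(X = n) = C(N, n) * p^n * q^(N-n) for the binomial distribution B(N, p)."""
    q = 1 - p
    return comb(N, n) * p**n * q**(N - n)

N, p = 10, Fraction(3, 10)  # exact arithmetic: q = 7/10
total = sum(binom_pmf(n, N, p) for n in range(N + 1))
print(total)  # the probabilities over n = 0, ..., N sum to exactly 1
```

Because p is a `Fraction`, the total comes out as exactly 1 rather than a float close to 1.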

The mean and variance of the random variable X can be obtained almost trivially by expressing X as a sum of indicator functions I_k for k = 1, 2, \ldots, N, where each indicator takes the value 1 (success) with probability p and the value 0 (failure) with probability q:

X = \sum_{k=1}^{N} I_k \quad \quad \quad \quad (3)

Given that

E[I_k] = p \quad \quad \quad \quad (4)

and

V[I_k] = E[I_k^2] - E^2[I_k] = pq \quad \quad \quad \quad (5)

it follows immediately that

E[X] = Np \quad \quad \quad \quad (6)

and

V[X] = Npq \quad \quad \quad \quad (7)
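The indicator representation (3) also lends itself to a direct simulation check of (6) and (7); a sketch in plain Python (the sample size, seed, and function name are my own choices), drawing each I_k as an independent Bernoulli trial:

```python
import random

def sample_X(N, p, rng):
    # X = I_1 + ... + I_N, each indicator equal to 1 with probability p
    return sum(1 if rng.random() < p else 0 for _ in range(N))

rng = random.Random(42)  # fixed seed for reproducibility
N, p, trials = 20, 0.3, 200_000
samples = [sample_X(N, p, rng) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(mean)  # should be close to Np  = 6.0
print(var)   # should be close to Npq = 4.2
```

With 200,000 trials the sample mean and variance land within a few hundredths of Np and Npq.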

However, it is also possible to obtain these expressions by manipulating the usual defining sums for E[X] and E[X^2] using the probability functions in (1) above, i.e.,

E[X] = \sum_{n=0}^N n \cdot P(X = n)

= \sum_{n=1}^N n \cdot C(N, n) p^n q^{N-n} \quad \quad \quad \quad (8)

and

E[X^2] = \sum_{n=0}^N n^2 \cdot P(X = n)

= \sum_{n=1}^N n^2 \cdot C(N, n) p^n q^{N-n} \quad \quad \quad \quad (9)

In the case of E[X], we can manipulate n \cdot C(N, n) as follows:

n \cdot C(N, n) = n \cdot \frac{N!}{n! (N-n)!}

= N \cdot \frac{(N-1)!}{(n-1)! (N-n)!} = N \cdot C(N-1, n-1) \quad \quad \quad \quad (10)
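Identity (10) is easy to verify exhaustively over a range of N with Python's `math.comb` (the range bound is an arbitrary choice of mine):

```python
from math import comb

# Check identity (10): n * C(N, n) = N * C(N-1, n-1)
for N in range(1, 15):
    for n in range(1, N + 1):
        assert n * comb(N, n) == N * comb(N - 1, n - 1)
print("identity (10) holds for 1 <= n <= N <= 14")
```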

Then we have

E[X] = \sum_{n=1}^N n \cdot C(N, n) p^n q^{N-n}

= Np \sum_{n=1}^N C(N-1, n-1) p^{n-1} q^{N-n}

= Np (p + q)^{N-1}

= Np \quad \quad \quad \quad (11)

where the penultimate line follows from the binomial theorem and the last from p + q = 1.
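The chain of equalities in (11) can be checked exactly by evaluating the defining sum (8) with rational arithmetic; a sketch (the function name and test values are my own):

```python
from fractions import Fraction
from math import comb

def mean_by_sum(N, p):
    """E[X] computed from the defining sum (8), using exact rationals."""
    q = 1 - p
    return sum(n * comb(N, n) * p**n * q**(N - n) for n in range(N + 1))

p = Fraction(2, 7)
for N in (1, 5, 12):
    assert mean_by_sum(N, p) == N * p  # agrees with E[X] = Np from (11)
print("E[X] = Np confirmed for the test cases")
```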

In the case of E[X^2], we need to manipulate n \cdot C(N-1, n-1) as follows:

n \cdot C(N-1, n-1) = n \cdot \frac{(N-1)!}{(n-1)! (N-n)!}

= [(n-1) + 1] \cdot \frac{(N-1)!}{(n-1)! (N-n)!}

= \frac{(N-1)!}{(n-2)! (N-n)!} + \frac{(N-1)!}{(n-1)! (N-n)!}

= (N-1) \cdot \frac{(N-2)!}{(n-2)! (N-n)!} + \frac{(N-1)!}{(n-1)! (N-n)!} \quad \quad \quad \quad (12)

(For n = 1 the first term vanishes, by the convention that 1/(-1)! = 0; this is why the corresponding sum below starts at n = 2.)
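As with (10), identity (12) can be verified exhaustively for n >= 2 (where both factorials are well defined); the range bound below is my own choice:

```python
from math import comb

# Check identity (12): n * C(N-1, n-1) = (N-1) * C(N-2, n-2) + C(N-1, n-1)
for N in range(2, 15):
    for n in range(2, N + 1):
        lhs = n * comb(N - 1, n - 1)
        rhs = (N - 1) * comb(N - 2, n - 2) + comb(N - 1, n - 1)
        assert lhs == rhs
print("identity (12) holds for 2 <= n <= N <= 14")
```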

Then we have

E[X^2] = \sum_{n=1}^N n^2 \cdot C(N, n) p^n q^{N-n}

= Np \sum_{n=1}^N n \cdot C(N-1, n-1) p^{n-1} q^{N-n}

= Np \sum_{n=2}^N (N-1)p \cdot C(N-2, n-2) p^{n-2} q^{N-n}

+ Np \sum_{n=1}^N C(N-1, n-1) p^{n-1} q^{N-n}

= Np (N-1)p \sum_{n=2}^N C(N-2, n-2) p^{n-2} q^{N-n} + Np

= N^2p^2 - Np^2 + Np \quad \quad \quad \quad (13)

Therefore,

V[X] = E[X^2] - E^2[X]

= N^2p^2 - Np^2 + Np - N^2p^2

= Np - Np^2

= Npq \quad \quad \quad \quad (14)
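Finally, the whole derivation of (13) and (14) can be confirmed exactly by computing both raw moments from the defining sums (8) and (9); a sketch with exact rationals (the function name and the values of N and p are my own):

```python
from fractions import Fraction
from math import comb

def raw_moments(N, p):
    """E[X] and E[X^2] from the defining sums (8) and (9)."""
    q = 1 - p
    EX = sum(n * comb(N, n) * p**n * q**(N - n) for n in range(N + 1))
    EX2 = sum(n * n * comb(N, n) * p**n * q**(N - n) for n in range(N + 1))
    return EX, EX2

N, p = 9, Fraction(1, 3)
q = 1 - p
EX, EX2 = raw_moments(N, p)
assert EX2 == N**2 * p**2 - N * p**2 + N * p   # agrees with (13)
assert EX2 - EX**2 == N * p * q                # V[X] = Npq, as in (14)
print("V[X] = Npq verified exactly")
```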

Published by Dr Christian P. H. Salas

Mathematics Lecturer
