Introduction to Probability Models

Lecture 33

Qi Wang, Department of Statistics

Nov 10, 2017

Normal Approximation to the Binomial

If $n$ is large enough and $p$ is not too close to 0 or 1, a Binomial distribution behaves much like a Normal distribution, so we can use the Normal distribution to approximate the original Binomial distribution:

  • If $X \sim Bin(n, p)$, and $np > 5, n(1 - p) > 5$
  • Then we can use $X^\star \sim N(\mu = np, \sigma = \sqrt{np(1-p)})$, to approximate $X$

You may notice that Binomial is Discrete, and Normal is Continuous. This means the approximation comes at a cost of accuracy that we must try to correct. When we use the approximation, we need to perform a continuity correction:

  • If you’re looking for: $P(a \le X \le b)$
  • Use $P(a - 0.5 < X^\star < b + 0.5)$
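To see what the continuity correction buys, the following minimal Python sketch compares the exact Binomial probability with the Normal approximation, with and without the correction. The values $n = 50$, $p = 0.3$ and the interval $12 \le X \le 18$ are hypothetical choices for illustration, not taken from the lecture, and the function names are our own.

```python
from math import comb, erf, sqrt

def binom_prob(n, p, a, b):
    """Exact P(a <= X <= b) for X ~ Bin(n, p), summing the pmf."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, b + 1))

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma) written with the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def normal_approx(n, p, a, b, correction=True):
    """Normal approximation to P(a <= X <= b), optionally continuity-corrected."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    shift = 0.5 if correction else 0.0
    return normal_cdf(b + shift, mu, sigma) - normal_cdf(a - shift, mu, sigma)

# Hypothetical example values: np = 15 > 5 and n(1 - p) = 35 > 5.
n, p = 50, 0.3
exact = binom_prob(n, p, 12, 18)
with_cc = normal_approx(n, p, 12, 18, correction=True)
without_cc = normal_approx(n, p, 12, 18, correction=False)

print(f"exact Binomial:         {exact:.4f}")
print(f"Normal, with correction: {with_cc:.4f}")
print(f"Normal, no correction:   {without_cc:.4f}")
```

For these values the corrected approximation should land noticeably closer to the exact probability than the uncorrected one.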

Example 1

If all conditions are satisfied, find the Normal approximation to each of the following probability statements, where $X$ follows a Binomial distribution:

  1. $P(4 \le X \le 10)$
  2. $P(4 < X < 10)$
  3. $P(X \le 6)$
  4. $P(X < 5)$
  5. $P(X \ge 9)$
  6. $P(X > 8)$
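One way to work through these, sketched under the assumption that $X$ takes only integer values: first rewrite any strict inequality as a non-strict one, then move each included endpoint outward by 0.5.

$$
\begin{aligned}
1.\quad & P(4 \le X \le 10) \approx P(3.5 < X^\star < 10.5) \\
2.\quad & P(4 < X < 10) = P(5 \le X \le 9) \approx P(4.5 < X^\star < 9.5) \\
3.\quad & P(X \le 6) \approx P(X^\star < 6.5) \\
4.\quad & P(X < 5) = P(X \le 4) \approx P(X^\star < 4.5) \\
5.\quad & P(X \ge 9) \approx P(X^\star > 8.5) \\
6.\quad & P(X > 8) = P(X \ge 9) \approx P(X^\star > 8.5)
\end{aligned}
$$

Items 5 and 6 describe the same event, so they lead to the same approximation.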

Three Approximations in this Course

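  1. The Binomial approximation to the Hypergeometric: If $X$ follows a Hypergeometric distribution with population size $N$, $M$ successes in the population, and sample size $n$, and the sample is small relative to the population (a common rule of thumb is $n/N \le 0.05$), we can use $X^\star \sim Bin(n, p = M/N)$ to approximate $X$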
  2. The Poisson approximation to the Binomial: If $X \sim Bin(n, p)$ with $n>100$ and $p<0.01$, we can use $X^\star \sim Poisson(\lambda = np)$, to approximate $X$
  3. The Normal approximation to the Binomial: If $X \sim Bin(n, p)$, and $np > 5, n(1 - p) > 5$, then we can use $X^\star \sim N(\mu = np, \sigma = \sqrt{np(1-p)})$, to approximate $X$
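As a quick numerical check of the Poisson approximation to the Binomial, the sketch below compares the two probability mass functions side by side. The values $n = 500$ and $p = 0.004$ are hypothetical, chosen only so that $n > 100$ and $p < 0.01$; the helper names are our own.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Hypothetical example values satisfying n > 100 and p < 0.01.
n, p = 500, 0.004
lam = n * p  # Poisson rate, lambda = np

for k in range(6):
    b = binom_pmf(k, n, p)
    q = poisson_pmf(k, lam)
    print(f"k={k}: Binomial={b:.5f}  Poisson={q:.5f}  diff={abs(b - q):.5f}")
```

The differences should be small at every $k$, which is what makes the Poisson a convenient stand-in when $n$ is large and $p$ is small.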

Five Number Summary

  • Minimum (Min)
  • First Quartile (Q1)
  • Median (M)
  • Third Quartile (Q3)
  • Maximum (Max)

Calculate quartiles using the indexing method: sort the data in increasing order and compute $i = np$, where $n$ is the number of observations and $p$ is the percentile written as a decimal. If $i$ is not an integer, round $i$ up to the next integer; the value at that position is the percentile. If $i$ is an integer, average the $i$th and $(i + 1)$th values; that average is the percentile of interest.

Example 2

Calculate the Five Number Summary for the following dataset: $$\{4.1, 6.2, 10.4, 5.5, 9.7, 21.3, 7.1\}$$
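The indexing method can be sketched in Python and applied to this dataset. This is a minimal illustration rather than a standard library routine: the function names are our own, and the percentile is passed as a whole number (25, 50, 75) so that the integer test on $i = np$ avoids floating-point rounding. Other software may use different quartile conventions and give slightly different answers.

```python
def percentile_by_index(sorted_data, pct):
    """Percentile via the indexing method; pct is a whole number such as 25."""
    n = len(sorted_data)
    # i = n * pct / 100, split into its integer part and remainder.
    whole, remainder = divmod(n * pct, 100)
    if remainder:
        # i is not an integer: round up to position whole + 1 (1-based),
        # which is index `whole` in Python's 0-based indexing.
        return sorted_data[whole]
    # i is an integer: average the i-th and (i + 1)-th values (1-based).
    return (sorted_data[whole - 1] + sorted_data[whole]) / 2

def five_number_summary(data):
    """Return Min, Q1, Median, Q3, Max using the indexing method."""
    s = sorted(data)
    return {
        "Min": s[0],
        "Q1": percentile_by_index(s, 25),
        "Median": percentile_by_index(s, 50),
        "Q3": percentile_by_index(s, 75),
        "Max": s[-1],
    }

data = [4.1, 6.2, 10.4, 5.5, 9.7, 21.3, 7.1]
summary = five_number_summary(data)
print(summary)
# {'Min': 4.1, 'Q1': 5.5, 'Median': 7.1, 'Q3': 10.4, 'Max': 21.3}
```

Here $n = 7$, so $i$ is 1.75, 3.5, and 5.25 for Q1, the median, and Q3; each rounds up, giving the 2nd, 4th, and 6th values of the sorted data.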