## Introduction to Probability Models

Lecture 35

Qi Wang, Department of Statistics

Nov 15, 2017

• Range
• Variance
• Standard deviation
• $p_{th}$ percentile
• Interquartile Range (IQR)

### Range

• Range = max - min

### Variance

Variance: a measure of spread based on the squared differences between each observation and the mean

• Population variance: $$\sigma^2 = \frac{\sum(x_i - \mu)^2}{N}$$
• Sample variance: $$s^2 = \frac{\sum(x_i - \bar{x})^2}{n - 1}$$
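As a minimal sketch of the two formulas above, the following Python computes both versions directly from their definitions. The function names are illustrative assumptions, and the data are the example values listed later in these notes.

```python
def population_variance(xs):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


data = [-20, 1, 23, 25, 32.5, 33, 67]
print("population variance:", population_variance(data))
print("sample variance:", sample_variance(data))
```

Because the sample version divides by $n - 1$ rather than $n$, it is always at least as large as the population version computed on the same values.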

### Standard Deviation

Standard deviation: the most commonly used measure of how far observations are from the mean

• Population version: $$\sigma = \sqrt{\sigma^2}$$
• Sample version: $$s = \sqrt{s^2}$$
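One possible way to check these definitions numerically is with Python's standard `statistics` module, whose `pstdev` and `stdev` functions correspond to the population and sample versions. The data are the example values listed later in these notes; this is a sketch, not the only way to do the calculation.

```python
import math
import statistics

data = [-20, 1, 23, 25, 32.5, 33, 67]

# Population version: square root of the population variance.
sigma = statistics.pstdev(data)
# Sample version: square root of the sample variance.
s = statistics.stdev(data)

# Each standard deviation is the square root of the matching variance.
assert math.isclose(sigma, math.sqrt(statistics.pvariance(data)))
assert math.isclose(s, math.sqrt(statistics.variance(data)))

print("population standard deviation:", sigma)
print("sample standard deviation:", s)
```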

### $p_{th}$ percentile

$p_{th}$ percentile: the value such that p% of the observations fall at or below it

• Median: $M = 50_{th}$ percentile
• First quartile: $Q_1 = 25_{th}$ percentile
• Third quartile: $Q_3 = 75_{th}$ percentile

### How to Find a Percentile for Data

1. Order the data in increasing order
2. Calculate $i=\frac{np}{100}$, where $n$ is the sample size and $p$ is the desired percentile
• If $i$ is not an integer, round $i$ up to the next integer, then take the $i_{th}$ value
• If $i$ is an integer, take the average of the $i_{th}$ and $(i + 1)_{th}$ values

Example: -20, 1, 23, 25, 32.5, 33, 67
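The rule above can be sketched in Python as follows. The function name `percentile` and the restriction to $0 < p < 100$ are assumptions made for this illustration. Library routines such as `numpy.percentile` use different interpolation conventions by default, so their results may differ from this rule.

```python
def percentile(data, p):
    """p-th percentile using the round-up / average rule from these notes."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    values = sorted(data)                   # step 1: order the data
    whole, remainder = divmod(len(values) * p, 100)  # step 2: i = n*p/100
    k = int(whole)
    if remainder != 0:
        # i is not an integer: round up and take that value (1-based position).
        return values[k]
    # i is an integer: average the i-th and (i + 1)-th values.
    return (values[k - 1] + values[k]) / 2


data = [-20, 1, 23, 25, 32.5, 33, 67]
print("Q1 =", percentile(data, 25))      # i = 1.75, so take the 2nd value
print("median =", percentile(data, 50))  # i = 3.5, so take the 4th value
print("Q3 =", percentile(data, 75))      # i = 5.25, so take the 6th value
```

For these seven values the rule gives $Q_1 = 1$, $M = 25$, and $Q_3 = 33$.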

### Interquartile Range (IQR)

• IQR = $Q_3 - Q_1$
• Outliers: an observation is a suspected outlier if it is $$> Q_3 + 1.5 \times \text{IQR}$$ or $$< Q_1 - 1.5 \times \text{IQR}$$
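A short sketch of the outlier check, applied to the example data from the percentile section. Using the percentile rule from these notes on those seven values gives $Q_1 = 1$ (the 2nd ordered value) and $Q_3 = 33$ (the 6th ordered value), which are passed in directly here. The function names are illustrative assumptions.

```python
def outlier_fences(q1, q3):
    """Return (lower, upper) cutoffs: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def suspected_outliers(data, q1, q3):
    """Observations strictly below the lower cutoff or above the upper one."""
    low, high = outlier_fences(q1, q3)
    return [x for x in data if x < low or x > high]


data = [-20, 1, 23, 25, 32.5, 33, 67]
q1, q3 = 1, 33  # quartiles of the example data under the rule in these notes

print("IQR =", q3 - q1)
print("cutoffs =", outlier_fences(q1, q3))
print("suspected outliers =", suspected_outliers(data, q1, q3))
```

Here IQR = 32, so the cutoffs are $-47$ and $81$, and none of the seven observations is flagged.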

### Boxplot

A boxplot is a graphical depiction of the five-number summary (minimum, $Q_1$, median, $Q_3$, maximum)

1. Draw a horizontal or vertical axis that is evenly spaced and well-labeled (make sure it covers the full range of the data)
2. Locate $Q_1$ and $Q_3$. These are the "ends" of your box. Draw the box.
3. Within the box, locate the median and mark it
4. Locate and mark the minimum and maximum. Extend a line ("whisker") from each end of the box to the minimum or maximum

### Modified Boxplot

Steps 1, 2, and 3 are the same, but we mark each outlier with a $o$ or a $\star$. Then draw a line from each end of the box to the highest or lowest data point that is NOT an outlier. Most software-generated boxplots are modified boxplots.
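As a rough sketch of where the whiskers of a modified boxplot end, the following Python finds the most extreme observations that are not suspected outliers under the $1.5 \times \text{IQR}$ rule. The function name `whisker_ends` is an illustrative assumption, and the quartiles are supplied by the caller ($Q_1 = 1$ and $Q_3 = 33$ for the example data under the percentile rule in these notes).

```python
def whisker_ends(data, q1, q3):
    """Lowest and highest observations that are not suspected outliers."""
    iqr = q3 - q1
    lower_cutoff = q1 - 1.5 * iqr
    upper_cutoff = q3 + 1.5 * iqr
    # Keep only the points that would not be drawn as outlier symbols.
    inside = [x for x in data if lower_cutoff <= x <= upper_cutoff]
    return min(inside), max(inside)


data = [-20, 1, 23, 25, 32.5, 33, 67]
print("whiskers end at:", whisker_ends(data, q1=1, q3=33))
```

The example data contain no suspected outliers, so the whiskers reach the minimum and maximum and the modified boxplot matches the ordinary one.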