Example. Solutions of alcohol are used for calibrating Breathalyzers. The following data are the alcohol concentrations of samples taken from six bottles randomly selected from a large batch of alcohol solution. The objective is to determine whether all bottles in the batch have the same alcohol concentration.
\[ \begin{array}{ccccc} \text{Bottle} & \multicolumn{4}{c}{\text{Concentration}} \\ \hline 1 & 1.4357 & 1.4348 & 1.4336 & 1.4309 \\ 2 & 1.4244 & 1.4232 & 1.4213 & 1.4256 \\ 3 & 1.4153 & 1.4137 & 1.4176 & 1.4164 \\ 4 & 1.4331 & 1.4325 & 1.4312 & 1.4297 \\ 5 & 1.4252 & 1.4261 & 1.4293 & 1.4272 \\ 6 & 1.4179 & 1.4217 & 1.4191 & 1.4204 \end{array} \]
We are not interested in differences among the six particular bottles used in the experiment; rather, we treat them as a random sample from the batch and use a random-effects model. If our interest were in these six specific bottles, we would use a fixed-effects model.
For a completely randomized design with \(v\) randomly selected levels of a treatment factor \(T\), the random-effects one-way model is \[ \begin{gathered} Y_{i t}=\mu+T_i+\epsilon_{i t}, \\ \epsilon_{i t} \sim N\left(0, \sigma^2\right), \quad T_i \sim N\left(0, \sigma_T^2\right), \end{gathered} \] where the \(\epsilon_{i t}\)'s and \(T_i\)'s are all mutually independent, for \[ t=1, \ldots, r_i, \quad i=1, \ldots, v . \]
Note that the model parameters are \(\mu\), \(\sigma^2\), and \(\sigma_T^2\); the treatment effects \(T_i\) are random variables here, not parameters as in the fixed-effects model.
The expected value of \(Y_{i t}\) is \[ E\left[Y_{i t}\right]=E[\mu]+E\left[T_i\right]+E\left[\epsilon_{i t}\right]=\mu . \] The variance of \(Y_{i t}\) is \[ \operatorname{Var}\left(Y_{i t}\right)=\operatorname{Var}\left(\mu+T_i+\epsilon_{i t}\right)=\operatorname{Var}\left(T_i\right)+\operatorname{Var}\left(\epsilon_{i t}\right)+2 \operatorname{Cov}\left(T_i, \epsilon_{i t}\right)=\sigma_T^2+\sigma^2, \] since \(T_i\) and \(\epsilon_{i t}\) are independent and so have zero covariance. Therefore, the distribution of \(Y_{i t}\) is \[ Y_{i t} \sim N\left(\mu, \sigma_T^2+\sigma^2\right) . \] The two components \(\sigma_T^2\) and \(\sigma^2\) of the variance of \(Y_{i t}\) are known as variance components. Observations on the same treatment are correlated, since they share the same \(T_i\): \[ \operatorname{Cov}\left(Y_{i t}, Y_{i s}\right)=\operatorname{Cov}\left(\mu+T_i+\epsilon_{i t}, \mu+T_i+\epsilon_{i s}\right)=\operatorname{Var}\left(T_i\right)=\sigma_T^2 \quad (t \neq s) . \]
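These moment calculations are easy to verify by simulation. The following minimal sketch (Python with numpy; the parameter values \(\mu=1.43\), \(\sigma_T=0.008\), \(\sigma=0.002\) are hypothetical choices for illustration, not estimates from the data) draws many batches from the model and compares the empirical variance and same-treatment covariance with \(\sigma_T^2+\sigma^2\) and \(\sigma_T^2\).

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sig_T, sig = 1.43, 0.008, 0.002   # hypothetical parameter values for illustration
v, r, reps = 6, 4, 200_000            # treatments, observations each, simulated experiments

T = rng.normal(0.0, sig_T, size=(reps, v, 1))   # random treatment effects T_i
eps = rng.normal(0.0, sig, size=(reps, v, r))   # errors eps_it
Y = mu + T + eps                                # Y_it = mu + T_i + eps_it

print(Y.var(), sig_T**2 + sig**2)               # empirical vs. theoretical Var(Y_it)
# Observations on the same treatment share T_i, so their covariance is sigma_T^2:
d0 = Y[:, :, 0] - Y[:, :, 0].mean()
d1 = Y[:, :, 1] - Y[:, :, 1].mean()
print((d0 * d1).mean(), sig_T**2)               # empirical vs. theoretical Cov(Y_i1, Y_i2)
```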
The error sum of squares is \[ SSE=\sum_{i=1}^v \sum_{t=1}^{r_i} Y_{i t}^2-\sum_{i=1}^v r_i \bar{Y}_{i .}^2 . \] Remember that the variance of a random variable \(X\) satisfies \(\operatorname{Var}(X)=E\left[X^2\right]-(E[X])^2\). So, we have \[ E\left[Y_{i t}^2\right]=\operatorname{Var}\left(Y_{i t}\right)+\left(E\left[Y_{i t}\right]\right)^2=\left(\sigma_T^2+\sigma^2\right)+\mu^2 . \] Now, \[ \bar{Y}_{i .}=\mu+T_i+\frac{1}{r_i} \sum_{t=1}^{r_i} \epsilon_{i t}, \] so \[ \operatorname{Var}\left(\bar{Y}_{i .}\right)=\sigma_T^2+\frac{\sigma^2}{r_i} \quad\text{and}\quad E\left[\bar{Y}_{i .}\right]=\mu . \] Consequently, \[ E\left[\bar{Y}_{i .}^2\right]=\left(\sigma_T^2+\frac{\sigma^2}{r_i}\right)+\mu^2 . \] Thus, writing \(n=\sum_{i=1}^v r_i\), \[ \begin{aligned} E[SSE] &=\sum_{i=1}^v \sum_{t=1}^{r_i}\left(\sigma_T^2+\sigma^2+\mu^2\right)-\sum_{i=1}^v r_i\left(\sigma_T^2+\frac{\sigma^2}{r_i}+\mu^2\right) \\ &=n\left(\sigma_T^2+\sigma^2+\mu^2\right)-\left(n \sigma_T^2+v \sigma^2+n \mu^2\right) \\ &=(n-v) \sigma^2, \end{aligned} \] giving \[ E[MSE]=E[SSE /(n-v)]=\sigma^2 . \]
Therefore the MSE is an unbiased estimator of \(\sigma^2\).
We can show that \(SSE/\sigma^2\) has a \(\chi^2_{n-v}\) distribution. Hence an upper \(100(1-\alpha)\%\) confidence bound for \(\sigma^2\) can be computed exactly as under the fixed-effects model, that is, \[ \sigma^2 \leq \frac{s s E}{\chi_{n-v, 1-\alpha}^2}, \] where \(\chi_{n-v, 1-\alpha}^2\) is the percentile of the chi-squared distribution with \(n-v\) degrees of freedom and probability \(1-\alpha\) in the right-hand tail.
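In code, the bound is one line once the quantile convention is fixed: \(\chi_{n-v, 1-\alpha}^2\) puts probability \(1-\alpha\) in the right-hand tail, so it is the \(\alpha\)-quantile of the \(\chi^2_{n-v}\) distribution. A minimal sketch, assuming scipy is available (the function name `sigma2_upper_bound` is ours):

```python
from scipy.stats import chi2

def sigma2_upper_bound(ssE, n, v, alpha=0.05):
    """Upper 100(1-alpha)% confidence bound for sigma^2: ssE / chi2_{n-v, 1-alpha}.

    chi2_{n-v, 1-alpha} has probability 1-alpha in its right-hand tail,
    i.e. it is the alpha-quantile of the chi-squared distribution on n-v df.
    """
    return ssE / chi2.ppf(alpha, df=n - v)
```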
Similarly, the treatment sum of squares is \[ SST=\sum_{i=1}^v r_i \bar{Y}_{i .}^2-n \bar{Y}_{. .}^2 . \] Using the same type of calculation as in Sect. 17.3.2 above, we have \[ \bar{Y}_{. .}=\mu+\frac{1}{n} \sum_i r_i T_i+\frac{1}{n} \sum_{i=1}^v \sum_{t=1}^{r_i} \epsilon_{i t} , \] so \[ E\left[\bar{Y}_{. .}\right]=\mu \quad\text{and}\quad \operatorname{Var}\left(\bar{Y}_{. .}\right)=\frac{\sum r_i^2}{n^2} \sigma_T^2+\frac{\sigma^2}{n} . \] Also, from (17.3.3), \[ E\left[\bar{Y}_{i .}\right]=\mu \quad\text{and}\quad \operatorname{Var}\left(\bar{Y}_{i .}\right)=\sigma_T^2+\frac{\sigma^2}{r_i} . \] Therefore, \[ \begin{aligned} E[SST] &=\sum_{i=1}^v r_i\left(\sigma_T^2+\frac{\sigma^2}{r_i}+\mu^2\right)-n\left(\frac{\sum r_i^2}{n^2} \sigma_T^2+\frac{\sigma^2}{n}+\mu^2\right) \\ &=\left(n-\frac{\sum r_i^2}{n}\right) \sigma_T^2+(v-1) \sigma^2 . \end{aligned} \]
Dividing by \(v-1\) gives \[ E[MST]=c\, \sigma_T^2+\sigma^2, \quad\text{where } c=\frac{1}{v-1}\left(n-\frac{\sum_i r_i^2}{n}\right), \] and \(c=r\) when all \(r_i=r\). Therefore \[ E\left[\frac{MST-MSE}{c}\right]=\sigma_T^2 , \] so \((MST-MSE)/c\) is an unbiased estimator of \(\sigma_T^2\).
Note that this unbiased estimator of \(\sigma_T^2\) is not guaranteed to be positive; in practice a negative estimate is usually truncated to zero. The simulation below illustrates both the unbiasedness and the possibility of negative estimates.
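In the sketch below (Python with numpy; the parameter values are hypothetical, with \(\sigma_T^2\) deliberately small relative to \(\sigma^2\)), the average of \((MST-MSE)/c\) over many simulated experiments is close to \(\sigma_T^2\), yet a sizeable fraction of the individual estimates are negative.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sig_T, sig = 1.43, 0.001, 0.002   # hypothetical values, sigma_T^2 < sigma^2
v, r, reps = 6, 4, 100_000
n, c = v * r, r                       # c = r when all group sizes are equal

T = rng.normal(0.0, sig_T, size=(reps, v, 1))
Y = mu + T + rng.normal(0.0, sig, size=(reps, v, r))

ybar_i = Y.mean(axis=2)               # treatment means
ybar = Y.mean(axis=(1, 2))            # grand means
msT = r * ((ybar_i - ybar[:, None]) ** 2).sum(axis=1) / (v - 1)
msE = ((Y - ybar_i[:, :, None]) ** 2).sum(axis=(1, 2)) / (n - v)
est = (msT - msE) / c                 # unbiased estimator of sigma_T^2

print(est.mean(), sig_T**2)           # close: the estimator is unbiased
print((est < 0).mean())               # but many individual estimates are negative
```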
Consider testing \[ H_0: \sigma_T^2=0 \quad \text{against} \quad H_1: \sigma_T^2>0 . \]
It can be shown that \[ SST /\left(c \sigma_T^2+\sigma^2\right) \sim \chi_{v-1}^2 \quad\text{and}\quad SSE / \sigma^2 \sim \chi_{n-v}^2 , \] and that \(SST\) and \(SSE\) are independent. Consequently, we have \[ \frac{MST /\left(c \sigma_T^2+\sigma^2\right)}{MSE / \sigma^2} \sim \frac{\chi_{v-1}^2 /(v-1)}{\chi_{n-v}^2 /(n-v)} \sim F_{v-1, n-v} . \]
Therefore, under \(H_0\), we have \[ \frac{MST}{MSE} \sim F_{v-1, n-v} . \] Hence, \[ \text{reject } H_0 \text{ if } \frac{m s T}{m s E}>F_{v-1, n-v, \alpha} , \] where \(\alpha\) is the chosen significance level.
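As a sketch (Python with scipy; the helper name `f_test_variance` is ours), the decision rule compares \(msT/msE\) with the critical value \(F_{v-1, n-v, \alpha}\):

```python
from scipy.stats import f

def f_test_variance(msT, msE, v, n, alpha=0.05):
    """Test H0: sigma_T^2 = 0 against H1: sigma_T^2 > 0."""
    ratio = msT / msE
    crit = f.ppf(1 - alpha, dfn=v - 1, dfd=n - v)   # F_{v-1, n-v, alpha}
    pval = f.sf(ratio, dfn=v - 1, dfd=n - v)        # right-tail p-value
    return ratio > crit, ratio, pval
```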
ANOVA Table \[ \begin{array}{llllll} \hline \text{Source of variation} & \text{Degrees of freedom} & \text{Sum of squares} & \text{Mean square} & \text{Ratio} & \text{Expected mean square} \\ \hline \text{Treatments} & v-1 & s s T & \frac{s s T}{v-1} & \frac{m s T}{m s E} & c \sigma_T^2+\sigma^2 \\ \text{Error} & n-v & s s E & \frac{s s E}{n-v} & & \sigma^2 \\ \text{Total} & n-1 & s s t o t & & & \\ \hline \end{array} \]
Computational formulae \[ \begin{array}{ll} s s T=\sum_i r_i \bar{y}_{i .}^2-n \bar{y}_{. .}^2 & s s E=\sum_i \sum_t y_{i t}^2-\sum_i r_i \bar{y}_{i .}^2 \\ s s t o t=\sum_i \sum_t y_{i t}^2-n \bar{y}_{. .}^2 & c=\frac{n^2-\sum_i r_i^2}{n(v-1)} \end{array} \]
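Putting the pieces together for the bottle data, the following sketch (Python with numpy and scipy; the variable names are ours) evaluates the computational formulae, carries out the F-test, and estimates the variance components.

```python
import numpy as np
from scipy.stats import f, chi2

# Alcohol concentrations: six randomly selected bottles, four samples each
y = np.array([
    [1.4357, 1.4348, 1.4336, 1.4309],
    [1.4244, 1.4232, 1.4213, 1.4256],
    [1.4153, 1.4137, 1.4176, 1.4164],
    [1.4331, 1.4325, 1.4312, 1.4297],
    [1.4252, 1.4261, 1.4293, 1.4272],
    [1.4179, 1.4217, 1.4191, 1.4204],
])
v, r = y.shape
n = v * r

ybar_i = y.mean(axis=1)                        # bottle means  ybar_{i.}
ybar = y.mean()                                # grand mean    ybar_{..}

ssT = r * np.sum(ybar_i**2) - n * ybar**2
ssE = np.sum(y**2) - r * np.sum(ybar_i**2)
msT, msE = ssT / (v - 1), ssE / (n - v)
c = (n**2 - v * r**2) / (n * (v - 1))          # = r here, since all r_i = r

ratio = msT / msE                              # test statistic for H0: sigma_T^2 = 0
pval = f.sf(ratio, dfn=v - 1, dfd=n - v)
sigmaT2_hat = max((msT - msE) / c, 0.0)        # estimate of sigma_T^2, truncated at zero
sigma2_upper = ssE / chi2.ppf(0.05, df=n - v)  # 95% upper confidence bound for sigma^2

print(f"ssT={ssT:.4e}  ssE={ssE:.4e}  msT={msT:.4e}  msE={msE:.4e}")
print(f"F={ratio:.2f}  p={pval:.3e}  sigma_T^2 hat={sigmaT2_hat:.3e}  "
      f"sigma^2 <= {sigma2_upper:.3e}")
```

The printout supplies every entry of the ANOVA table above, together with the variance-component estimates and the one-sided confidence bound for \(\sigma^2\).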