Confidence Intervals for Variance Components

Confidence Intervals for \(\sigma_T^2 / \sigma^2\)

From (17.3.7), p. 623, we know that \[ \frac{MST}{MSE\left(c \sigma_T^2 / \sigma^2+1\right)} \sim F_{v-1, n-v}, \] where \(c=\left(n^2-\sum r_i^2\right) /(n(v-1))\); if the \(r_i\) are all equal to \(r\), then \(c=r\). From this, we can write down an interval in which this ratio lies with probability \(1-\alpha\); that is, \[ P\left(F_{v-1, n-v, 1-\alpha / 2} \leq \frac{MST}{MSE\left(c \sigma_T^2 / \sigma^2+1\right)} \leq F_{v-1, n-v, \alpha / 2}\right)=1-\alpha . \]

Rearranging the left-hand inequality, we find that \[ c \sigma_T^2 / \sigma^2 \leq \frac{MST}{MSE \, F_{v-1, n-v, 1-\alpha / 2}}-1, \] and similarly for the right-hand inequality, \[ c \sigma_T^2 / \sigma^2 \geq \frac{MST}{MSE \, F_{v-1, n-v, \alpha / 2}}-1 . \]

So, replacing the random variables by their observed values, we obtain a \(100(1-\alpha) \%\) confidence interval for \(\sigma_T^2 / \sigma^2\): \[ \frac{1}{c}\left[\frac{m s T}{m s E \, F_{v-1, n-v, \alpha / 2}}-1\right] \leq \frac{\sigma_T^2}{\sigma^2} \leq \frac{1}{c}\left[\frac{m s T}{m s E \, F_{v-1, n-v, 1-\alpha / 2}}-1\right] . \]
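This interval is easy to compute with the SAS quantile function finv. The following data step is a minimal sketch with hypothetical values of msT, msE, v, n, and c; replace them with the quantities from your own analysis of variance. Note that finv(p, ndf, ddf) returns the lower 100p% point of the F distribution, so the upper critical value \(F_{v-1,n-v,\alpha/2}\) is finv(1-alpha/2, v-1, n-v).

data ci_ratio;
  /* hypothetical inputs; replace with your own values */
  msT = 450; msE = 100;
  v = 5; n = 30; c = 6;
  alpha = 0.05;
  Fup  = finv(1 - alpha/2, v-1, n-v);  /* F_{v-1,n-v,alpha/2}   */
  Flow = finv(alpha/2,     v-1, n-v);  /* F_{v-1,n-v,1-alpha/2} */
  lower = (msT/(msE*Fup)  - 1)/c;
  upper = (msT/(msE*Flow) - 1)/c;
run;
proc print data=ci_ratio;
  var lower upper;
run;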

Confidence Intervals for \(\sigma_T^2\)

Note the following unbiased estimator of \(\sigma_T^2\): \[ U=c^{-1}(MST-MSE), \] where \(c=\left(n^2-\sum r_i^2\right) /(n(v-1))\), and \(c=r\) when the sample sizes are equal. Let \[x=\frac{(m s T-m s E)^2}{m s T^2 /(v-1)+m s E^2 /(n-v)}.\] Then \(xU/\sigma_T^2\) has approximately a \(\chi^2\) distribution with \(x\) degrees of freedom (a Satterthwaite approximation). Hence \[ P\left(\chi_{x, 1-\alpha / 2}^2 \leq \frac{x U}{\sigma_T^2} \leq \chi_{x, \alpha / 2}^2\right) \approx 1-\alpha \] and \[ P\left( \frac{x U}{\chi_{x, \alpha / 2}^2} \leq \sigma_T^2 \leq \frac{x U}{\chi_{x, 1-\alpha / 2}^2} \right) \approx 1-\alpha. \] We have therefore obtained an approximate \(100(1-\alpha)\%\) confidence interval for \(\sigma_T^2\).
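The same approach works for this approximate interval, using the SAS chi-square quantile function cinv. Again a minimal sketch with hypothetical inputs; cinv(p, df) returns the lower 100p% chi-square point and accepts the noninteger degrees of freedom x.

data ci_sigT2;
  /* hypothetical inputs; replace with your own values */
  msT = 450; msE = 100;
  v = 5; n = 30; c = 6;
  alpha = 0.05;
  u = (msT - msE)/c;                                   /* unbiased estimate of sigma_T^2 */
  x = (msT - msE)**2 / (msT**2/(v-1) + msE**2/(n-v));  /* approximate df */
  lower = x*u / cinv(1 - alpha/2, x);   /* divide by the upper chi-square point */
  upper = x*u / cinv(alpha/2, x);       /* divide by the lower chi-square point */
run;
proc print data=ci_sigT2;
  var u x lower upper;
run;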

Two or More Random Effects

In a factorial experiment in which both factors have random effects, the two-way complete model is as follows. \[ \begin{gathered} Y_{i j t}=\mu+A_i+B_j+(A B)_{i j}+\epsilon_{i j t} \\ A_i \sim N\left(0, \sigma_A^2\right), B_j \sim N\left(0, \sigma_B^2\right) \\ (A B)_{i j} \sim N\left(0, \sigma_{A B}^2\right), \epsilon_{i j t} \sim N\left(0, \sigma^2\right) \\ A_i \text { 's, } B_j \text { 's, }(A B)_{i j} \text { 's and } \epsilon_{i j t} \text { 's are mutually independent } \\ t=1, \ldots, r_{i j}, \quad i=1, \ldots, a, \quad j=1, \ldots, b . \end{gathered} \] If \(\sigma_{A B}^2\) is positive, then interaction effects are present, and hence both factors affect the response. If \(\sigma_A^2\) or \(\sigma_B^2\) is positive, then the corresponding main effects are present as well.
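To make the model structure concrete, the following data step is a minimal sketch (all dimensions, the mean, and the standard deviations are hypothetical) that simulates one data set from this model. Each \(A_i\), \(B_j\), and \((AB)_{ij}\) is drawn once and then shared by every observation in the corresponding cell, which is exactly what makes these effects random rather than fixed.

data sim;
  call streaminit(2024);                /* reproducible random numbers */
  a = 3; b = 4; r = 2;                  /* hypothetical numbers of levels and replicates */
  mu = 10;                              /* hypothetical overall mean */
  sA = 2; sB = 1; sAB = 0.5; s = 1;     /* hypothetical standard deviations */
  array Aeff{3};  array Beff{4};  array ABeff{3,4};
  do i = 1 to a;  Aeff{i} = rand('normal', 0, sA);  end;
  do j = 1 to b;  Beff{j} = rand('normal', 0, sB);  end;
  do i = 1 to a;  do j = 1 to b;
    ABeff{i,j} = rand('normal', 0, sAB);
  end; end;
  /* each observation shares its cell's random effects */
  do i = 1 to a; do j = 1 to b; do t = 1 to r;
    y = mu + Aeff{i} + Beff{j} + ABeff{i,j} + rand('normal', 0, s);
    output;
  end; end; end;
  keep i j t y;
run;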

ANOVA Table (equal sample sizes, with \(r\) observations per treatment combination):

\[ \begin{array}{lccccc} \text { Source } & \text { DF } & {\text { SS }} & \text { MS } & \text { EMS } & \text { F-ratio } \\ \text { A } & a-1 & S S A & M S A & b r \sigma_A^2+r \sigma_{A B}^2+\sigma^2 & \frac{M S A}{M S A B} \\ \text { B } & b-1 & S S B & M S B & a r \sigma_B^2+r \sigma_{A B}^2+\sigma^2 & \frac{M S B}{M S A B} \\ \text { AB } & (a-1)(b-1) & S S A B & M S A B & r \sigma_{A B}^2+\sigma^2 & \frac{M S A B}{M S E} \\ \text { Error } & a b(r-1) & S S E & M S E & & \end{array} \]

A Rule for Determining an Expected MS:

First note the subscripts on the term representing the factor in the model. Write down a variance component for the factor of interest, for the error, and for every interaction involving the factor. Multiply each variance component by the number of observations taken at each level (or level combination) of the corresponding effect. Add up the terms; a worked instance follows.
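For instance, applying the rule to factor \(A\) in the two-way random-effects model above, with \(r\) observations per cell: \(A\) appears in the terms \(A_i\) and \((AB)_{ij}\), there are \(br\) observations at each level of \(A\) and \(r\) at each \(AB\) combination, so \[ EMS(A)=br\,\sigma_A^2+r\,\sigma_{AB}^2+\sigma^2, \] which is exactly the \(A\) row of the ANOVA table.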

Hypothesis Testing:

\(H_0^{A B}: \sigma_{A B}^2=0\) (No interaction) against \(H_1^{A B}: \sigma_{A B}^2>0\).

Reject \(H_0^{A B}\) if \(F=\frac{M S A B}{M S E}>F_{(a-1)(b-1), n-a b, \alpha}\).

\(H_0^A: \sigma_A^2=0\) against \(H_1^A: \sigma_A^2>0\)

Reject the null hypothesis if \(F=\frac{M S A}{M S A B}>F_{a-1,(a-1)(b-1), \alpha}\).

\(H_0^B: \sigma_B^2=0\) against \(H_1^B: \sigma_B^2>0\).

Reject the null hypothesis if \(F=\frac{M S B}{M S A B}>F_{b-1,(a-1)(b-1), \alpha}\).

Note that for the main-effect tests the denominator is \(MSAB\), not \(MSE\)!
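In SAS, all three tests can be obtained from proc glm; the test option on the random statement asks SAS to construct each F-ratio with the appropriate denominator. This is a minimal sketch assuming a data set named twoway with class variables a and b and response y.

proc glm data=twoway;
  class a b;
  model y = a b a*b;
  /* test requests F tests with the correct denominators:
     msA/msAB, msB/msAB, and msAB/msE */
  random a b a*b / test;
run;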

Mixed-Effects Models or Mixed Models

Models that contain both random and fixed treatment effects are called mixed models. The analysis of random effects proceeds in exactly the same way as described in the previous sections. All that is needed is a way to write down the expected mean squares. The fixed effects can be analyzed as before, except that, here, too, we may need to replace the mean square for error by a different one.

As an example, consider a completely randomized design with three factors \(A\), \(B\), and \(D\). We choose a model containing the main effects of factors \(A\), \(B\), and \(D\) and the interactions \(AB\) and \(BD\); no other interactions are included in the model. Suppose that factors \(A\) and \(B\) have fixed effects and factor \(D\) has random effects. Then interaction \(AB\) is a fixed effect, but interaction \(BD\) is a random effect.

\[ \begin{array}{lll} \text {Effect} & \text{EMS} & \text{F-ratio}\\ \hline A & Q(A, AB)+\sigma^2 & msA/msE \\ B & Q(B, AB)+a r \sigma_{B D}^2+\sigma^2 & msB/ms(BD)\\ D & a b r \sigma_D^2+a r \sigma_{B D}^2+\sigma^2 & msD/ms(BD) \\ AB & Q(AB)+\sigma^2 & msAB/msE\\ BD & a r \sigma_{B D}^2+\sigma^2 & ms(BD)/msE \\ \text {Error} & \sigma^2 & \\ \hline \end{array} \] In each case, the denominator of the F-ratio is the mean square whose expected value matches the expected value of the numerator when the null hypothesis is true; this is why \(msA\) and \(msAB\) are compared with \(msE\), while \(msB\) and \(msD\) are compared with \(ms(BD)\). For example, the decision rule for testing \(H_0^B\) against the alternative hypothesis that the \(\beta_j^*\) are not all equal is \[ \text { reject } H_0^B \text { if } \frac{m s B}{m s(B D)}>F_{(b-1),(b-1)(d-1), \alpha} \text {. } \]

To test the hypothesis \(H_0^D:\left\{\sigma_D^2=0\right\}\) against the alternative hypothesis \(H_A^D:\left\{\sigma_D^2>0\right\}\), the decision rule is \[ \operatorname{reject} H_0^D \text { if } \frac{m s D}{m s(B D)}>F_{d-1,(b-1)(d-1), \alpha} \text {. } \]
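In proc mixed, the fixed effects go on the model statement and the random effects on the random statement. A minimal sketch for this model, assuming a data set named threeway with class variables a, b, d and response y:

proc mixed data=threeway;
  class a b d;
  model y = a b a*b;   /* fixed effects: A, B, and AB */
  random d b*d;        /* random effects: D and BD    */
run;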

Confidence Intervals in Mixed Models

For fixed effects in a mixed model with equal sample sizes, confidence intervals (including simultaneous confidence intervals) can be calculated exactly as if there were no random effects in the model, except that we may need to replace msE in the standard error by the appropriate mean square. Apart from this, we may use the Bonferroni, Scheffé, Tukey, and Dunnett methods of multiple comparisons in the usual way.

When the sample sizes are unequal, computing least squares estimates and appropriate standard errors is more complicated. PROC MIXED in SAS handles this situation.

Block Designs and Random Block Effects

In certain types of experiments, it is extremely common for the levels of a blocking factor to be randomly selected. For example, in medical, psychological, educational, or pharmaceutical experiments, blocks frequently represent subjects that have been selected at random from a large population of similar subjects. In agricultural experiments, blocks may represent different fields selected from a large variable population of fields. In industrial experiments, different machine operators may represent different levels of the blocking factor and may be similar to a random sample from a large population of possible operators. Raw material may be delivered to the factory in batches, a random selection of which are used as blocks in the experiment.

In this case, the block effects are treated as random effects, and all interactions with block are treated as random as well.

Read the design and analysis of the Temperature Experiment in Section 17.9.

Using SAS

For models with mixed effects, proc mixed is recommended over proc glm. Use the latter only for equal sample sizes; the former can always be used and provides more options. proc mixed is based on restricted maximum likelihood (REML) estimation, so its estimates can differ from those produced by proc glm.

Example: The Candle Experiment (described on page 667, Exercise 5). In this complete block design, each experimenter is a block, and the only treatment factor is candle color, with 4 levels. We will fit a model with block and block-by-color interaction as random effects.

data candle;
  /* each person (block) contributes two lines of raw data;
     each line holds two burning times for each of the 4 colors,
     so every person-by-color cell has 4 observations */
  do person = 1 to 4;
    do row = 1, 2;
      do color = 1 to 4;
        do col = 1, 2;
          input time @@;
          output;
        end;
      end;
    end;
  end;
  drop row col;
  datalines;
   989 1032  1044  979  1011  951   974  998
  1077 1019   987 1031   928 1022  1033 1041
   899  912   847  880   899  800   886  859
   911  943   879  830   820  812   901  907
   898  840   840  952   909  790   950  992
   955 1005   961  915   871  905   920  890
   993  957   987  960   864  925   949  973
  1005  982   920 1001   824  790   978  938
;
run;
proc print data=candle;
run;

proc mixed data=candle;
  class person color;
  model time = color;            /* color is the (fixed) treatment factor */
  random person person*color;    /* block and block-by-treatment effects are random */
  lsmeans color / cl pdiff adjust=Tukey;
run;

Partial output:

\[ \begin{aligned} &\text { Covariance Parameter Estimates }\\ &\begin{array}{|l|r|} \hline \text { Cov Parm } & \text { Estimate } \\ \hline \text { person } & 3049.70 \\ \hline \text { person*} \text {color } & 12.2483 \\ \hline \text { Residual } & 1708.85 \\ \hline \end{array} \end{aligned} \] The variance of the block effects is estimated as 3049.70, the variance of the interaction effects as 12.2483, and the error variance as 1708.85.

\[ \begin{aligned} &\text { Type } 3 \text { Tests of Fixed Effects }\\ &\begin{array}{|l|r|r|r|r|} \hline \text { Effect } & \text { Num DF } & \text { Den DF } & \text { F Value } & \operatorname{Pr}>\text { F } \\ \hline \text { color } & 3 & 9 & 11.44 & 0.0020 \\ \hline \end{array} \end{aligned} \] The \(p\)-value for testing that all levels of color have the same effect is 0.0020. Note that the number of denominator degrees of freedom is 9, which is \((b-1)(v-1)\), the degrees of freedom for the interaction term.

\[ \text{Least Squares Means } \\ \begin{array}{|l|l|r|r|r|r|r|r|r|r|} \hline \text { Effect } & \text { color } & \text { Estimate } & \begin{array}{r} \text { Standard } \\ \text { Error } \end{array} & \text { DF } & \text { t Value } & \operatorname{Pr>|t|} & \text { Alpha } & \text { Lower } & \text { Upper } \\ \hline \text { color } & 1 & 963.56 & 29.5346 & 9 & 32.62 & <.0001 & 0.05 & 896.75 & 1030.37 \\ \hline \text { color } & 2 & 938.31 & 29.5346 & 9 & 31.77 & <.0001 & 0.05 & 871.50 & 1005.12 \\ \hline \text { color } & 3 & 882.56 & 29.5346 & 9 & 29.88 & <.0001 & 0.05 & 815.75 & 949.37 \\ \hline \text { color } & 4 & 949.31 & 29.5346 & 9 & 32.14 & <.0001 & 0.05 & 882.50 & 1016.12 \\ \hline \end{array} \]

The table provides the least squares estimates of the treatment means of color and the confidence intervals for the means. Note these are not simultaneous confidence intervals.

The Differences of Least Squares Means table (not shown) provides confidence intervals for pairwise differences, both individual confidence intervals and simultaneous confidence intervals using Tukey's method. For example, the individual 95% confidence interval for \(color_1-color_2\) is (-8.2827, 58.7827), while the 95% simultaneous (Tukey) interval is (-21.0253, 71.5253).
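As a check on how much the Tukey adjustment widens the interval, note that both intervals are centered at the estimated difference \(963.56-938.31=25.25\). The individual interval uses the \(t\) critical value \(t_{9,0.025}=2.262\), while the Tukey interval uses the Studentized range critical value \(q_{4,9,0.05}\approx 4.415\) divided by \(\sqrt{2}\), so the half-widths differ by the factor \[ \frac{q_{4,9,0.05}/\sqrt{2}}{t_{9,0.025}}=\frac{4.415/\sqrt{2}}{2.262}\approx 1.38, \] and indeed \(71.5253-25.25 \approx 1.38\times(58.7827-25.25)\).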

Comparing with proc glm

For comparison, here is the corresponding proc glm code, although proc mixed is preferred.

proc glm data=candle;
  class person color;
  model time = person color person*color;
  random person person*color / test;   /* test: build F-ratios with the correct denominators */
  lsmeans color / cl pdiff adjust=Tukey E=person*color;  /* E=: error term for the intervals */
run;

Note that the test option in the random statement is recommended: it identifies the correct denominators for the F-ratios. In addition, the E=person*color option is necessary for correct confidence intervals. The results from proc mixed and proc glm are identical for this experiment because the sample sizes are equal. In general, however, proc glm requires you to specify the correct error term yourself for both hypothesis tests and confidence intervals.