Least Squares Estimation

The two-way main-effects model is \[ \begin{gathered} Y_{i j t}=\mu+\alpha_i+\beta_j+\epsilon_{i j t}, \\ \sum_{i=1}^a \alpha_i=0, ~\sum_{j=1}^b \beta_j=0,\\ \epsilon_{i j t} \sim N\left(0, \sigma^2\right), \end{gathered} \] where the \(\epsilon_{i j t}\)'s are mutually independent, \(t=1, \ldots, r_{i j} ; \quad i=1, \ldots, a ; \quad j=1, \ldots, b\).

Note that, unlike the two-way complete model, the two-way main-effects model has no equivalent one-way model. It is a simpler model than the two-way complete model. There are \(1+(a-1)+(b-1)=a+b-1\) freely varying parameters. The number of degrees of freedom for error is therefore \(n-a-b+1\).

Note also that the least squares estimator of \(\mu_{ij}\) is no longer the sample mean \(\bar Y_{ij\cdot}\), because the \(\mu_{ij}\) are now constrained by the additive structure. Precise formulas for these estimators can be given but are more complex; they are most naturally derived and expressed in matrix and vector notation. We will skip that derivation in this class.

In case of equal sample sizes, we have \[\hat \mu=\bar Y_{\cdot\cdot\cdot}, \hat\alpha_i=\bar Y_{i\cdot\cdot}-\bar Y_{\cdot\cdot\cdot}, \hat\beta_j=\bar Y_{\cdot j \cdot}-\bar Y_{\cdot\cdot\cdot}. \] Hence the least squares estimator for \(\mu_{ij}\) is \(\hat\mu+\hat\alpha_i+\hat\beta_j=\bar Y_{i\cdot\cdot}+\bar Y_{\cdot j \cdot}-\bar Y_{\cdot\cdot\cdot}\). Note that the three terms are not independent. Therefore the variance of \(\hat\mu_{ij}\) is NOT the sum of the variances of the three additive terms.
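As a quick numerical illustration, these estimators can be computed directly from a balanced data array. This is a minimal numpy sketch; the array `y` and its dimensions (\(a=2\), \(b=3\), \(r=2\)) are made-up, not from the notes:

```python
import numpy as np

# Hypothetical balanced data: y[i, j, t] holds Y_{ijt}, a=2, b=3, r=2.
y = np.array([
    [[10., 12.], [14., 16.], [18., 20.]],
    [[11., 13.], [15., 17.], [19., 21.]],
])

mu_hat = y.mean()                          # \bar Y_{...}
alpha_hat = y.mean(axis=(1, 2)) - mu_hat   # \bar Y_{i..} - \bar Y_{...}
beta_hat = y.mean(axis=(0, 2)) - mu_hat    # \bar Y_{.j.} - \bar Y_{...}

# Fitted cell means: \hat\mu_{ij} = \hat\mu + \hat\alpha_i + \hat\beta_j
mu_ij_hat = mu_hat + alpha_hat[:, None] + beta_hat[None, :]
```

Because the estimators are deviations from the grand mean, the constraints \(\sum_i\hat\alpha_i=0\) and \(\sum_j\hat\beta_j=0\) hold automatically.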

Derivation of LSE When Sample Sizes are Equal

We estimate \(\mu, \alpha_i, \beta_j\) by minimizing \[ \sum_{i=1}^a \sum_{j=1}^b \sum_{t=1}^r\left(y_{i j t}-\left(\mu+\alpha_i+\beta_j\right)\right)^2 \] subject to \(\sum_{i=1}^a \alpha_i=0\) and \(\sum_{j=1}^b \beta_j=0\).

Setting the derivatives with respect to \(\mu\), \(\alpha_i\) and \(\beta_j\) to zero, we get \[ \begin{aligned} y_{\cdot\cdot\cdot}-a b r \hat{\mu}-b r \sum_i \hat{\alpha}_i-a r \sum_j \hat{\beta}_j &=0, \\ y_{i \cdot\cdot}-b r \hat{\mu}-b r \hat{\alpha}_i-r \sum_j \hat{\beta}_j &=0, \quad i=1, \ldots, a, \\ y_{\cdot j \cdot}-a r \hat{\mu}-r \sum_i \hat{\alpha}_i-a r \hat{\beta}_j &=0, \quad j=1, \ldots, b . \end{aligned} \] Imposing the constraints \(\sum_i \hat\alpha_i=0\) and \(\sum_j \hat\beta_j=0\), the sums drop out and the solution is \[ \begin{aligned} &\hat{\mu}=\bar{y}_{\cdot\cdot\cdot},\\ &\hat{\alpha}_i=\bar{y}_{i \cdot\cdot}-\bar{y}_{\cdot\cdot\cdot}, \quad i=1, \ldots, a,\\ &\hat{\beta}_j=\bar{y}_{\cdot j \cdot}-\bar{y}_{\cdot\cdot\cdot}, \quad j=1, \ldots, b. \end{aligned} \]

Main Effects Contrasts for Factor A

Recall a contrast for the main effects of A is \(\sum_{i=1}^a c_i \bar \mu_{i\cdot}\) with \(\sum_{i=1}^a c_i=0\). Therefore \[ \sum_{i=1}^a c_i \bar\mu_{i\cdot}=\sum_{i=1}^a c_i\alpha_i, \text{ with } \sum_i c_i=0. \]

The contrast is estimated by \(\sum_{i=1}^a c_i\hat \alpha_i=\sum_i c_i(\bar Y_{i\cdot\cdot}-\bar Y_{\cdot\cdot\cdot})=\sum_i c_i \bar Y_{i\cdot\cdot}\), since \(\sum_i c_i=0\).

Its variance is \[ \operatorname{Var}\left(\sum_i c_i \bar{Y}_{i \cdot\cdot}\right)=\frac{\sigma^2}{b r} \sum_i c_i^2 . \] For example, \(\alpha_2-\alpha_1\) has least squares estimator and associated variance \[ \hat{\alpha}_2-\hat{\alpha}_1=\bar{Y}_{2 \cdot\cdot}-\bar{Y}_{1 \cdot\cdot} \quad \text { with } \quad \operatorname{Var}\left(\bar{Y}_{2 \cdot\cdot}-\bar{Y}_{1 \cdot\cdot}\right)=\frac{2 \sigma^2}{b r} . \]
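A small Monte Carlo sketch can confirm the variance formula for \(\hat\alpha_2-\hat\alpha_1\). All parameter values below (true effects, \(\sigma\), sample sizes) are made up for the simulation:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, r, sigma = 2, 3, 4, 1.0
# Hypothetical true effects (each set sums to zero).
mu = 5.0
alpha = np.array([-1.0, 1.0])
beta = np.array([-0.5, 0.0, 0.5])

# Monte Carlo check: Var(Ybar_{2..} - Ybar_{1..}) should be 2*sigma^2/(b*r).
est = []
for _ in range(10000):
    eps = rng.normal(0.0, sigma, size=(a, b, r))
    y = mu + alpha[:, None, None] + beta[None, :, None] + eps
    ybar_i = y.mean(axis=(1, 2))
    est.append(ybar_i[1] - ybar_i[0])

emp_var = np.var(est)
theory = 2 * sigma**2 / (b * r)   # = 1/6 here
```

The empirical variance of the 10,000 simulated contrast estimates should be close to \(2\sigma^2/(br)\), and their average close to the true value \(\alpha_2-\alpha_1=2\).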

Main Effects Contrasts of Factor B

For Factor \(B\), a main-effect contrast \(\sum k_j \beta_j\) with \(\sum_j k_j=0\) has least squares estimator and associated variance \[ \sum_j k_j \hat{\beta}_j=\sum_j k_j \bar{Y}_{. j .} \quad \text { and } \quad \operatorname{Var}\left(\sum_j k_j \bar{Y}_{. j .}\right)=\frac{\sigma^2}{a r} \sum_j k_j^2 \]

Estimation of \(\sigma^2\) in the Main-Effects Model

First note \[ S S E=\sum_i \sum_j \sum_t\left(Y_{i j t}-\bar{Y}_{i . .}-\bar{Y}_{. j .}+\bar{Y}_{\ldots}\right)^2 . \]

We can show \[ E[S S E]=(n-a-b+1) \sigma^2 \text{ where } n=abr. \] Hence the unbiased estimator for \(\sigma^2\) is \[ MSE=SSE/(n-a-b+1). \]
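The unbiasedness \(E[MSE]=\sigma^2\) can be checked by simulation. A minimal sketch with hypothetical dimensions and \(\sigma\); since the distribution of SSE does not depend on the true effects, they are set to zero:

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, r, sigma = 3, 4, 2, 2.0
n = a * b * r
df = n - a - b + 1                 # 24 - 3 - 4 + 1 = 18

# Simulate under the model with mu = alpha_i = beta_j = 0.
mses = []
for _ in range(5000):
    y = rng.normal(0.0, sigma, size=(a, b, r))
    ybar_i = y.mean(axis=(1, 2), keepdims=True)
    ybar_j = y.mean(axis=(0, 2), keepdims=True)
    ybar = y.mean()
    sse = ((y - ybar_i - ybar_j + ybar) ** 2).sum()
    mses.append(sse / df)

mean_mse = np.mean(mses)           # should be close to sigma^2 = 4
```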

It can be shown that \(\operatorname{SSE} / \sigma^2\) has a chi-squared distribution with \(n-a-b+1\) degrees of freedom. This remains true when the sample sizes are not all equal, except that then \(n=\sum_i\sum_j r_{ij}\).

Given a dataset, the calculated SSE is denoted by ssE. An upper \(100(1-\alpha) \%\) confidence bound for \(\sigma^2\) is therefore given by \[ \sigma^2 \leq \frac{\text{ssE}}{\chi_{n-a-b+1,1-\alpha}^2} \]
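Assuming scipy is available, this bound can be computed with `scipy.stats.chi2.ppf`; note that `chi2.ppf(alpha, df)` returns the lower-\(\alpha\) quantile, which is \(\chi^2_{df,1-\alpha}\) in the notation above (subscript = right-tail probability). The values of \(n\), \(a\), \(b\), and ssE below are hypothetical:

```python
from scipy import stats

# Hypothetical values: n = 24, a = 3, b = 4, observed ssE = 70.0.
n, a, b = 24, 3, 4
df = n - a - b + 1                 # 18
ssE = 70.0
alpha = 0.05

# chi2.ppf(alpha, df) = lower-alpha quantile = chi^2_{df, 1-alpha} here.
upper_bound = ssE / stats.chi2.ppf(alpha, df)
```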

Confidence Interval for a Single Contrast

The \(100(1-\alpha)\%\) confidence interval of a single contrast is of the form

\[ \text{ estimate }\pm t_{n-a-b+1, \alpha/2} \times (\text{ standard error}). \]

For example, the \(100(1-\alpha)\%\) confidence interval for the main-effects contrast \(\alpha_2-\alpha_1\) is \[ (\bar{y}_{2 \cdot\cdot}-\bar{y}_{1 \cdot\cdot})\pm t_{n-a-b+1, \alpha/2} \times \sqrt{msE\times 2/(br)}. \] Note that the number of degrees of freedom of the \(t\)-distribution is always the degrees of freedom of \(MSE\).
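A sketch of this interval using scipy; the summary statistics (sample means, msE, degrees of freedom) are made up for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical summary statistics.
ybar_2, ybar_1 = 7.8, 6.2          # \bar y_{2..}, \bar y_{1..}
b, r = 3, 4
msE, df = 1.5, 18                  # df = n - a - b + 1
alpha = 0.05

half_width = stats.t.ppf(1 - alpha / 2, df) * np.sqrt(msE * 2 / (b * r))
ci = (ybar_2 - ybar_1 - half_width, ybar_2 - ybar_1 + half_width)
```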

Simultaneous Confidence Intervals

We may obtain simultaneous confidence intervals for several main-effects contrasts for Factor A, or for several main-effects contrasts for Factor B. These intervals take the same form as in the one-way model and the two-way complete model:

\[ \text{ estimate }\pm w \times (\text{ standard error}), \] where \(w\) depends on the particular method used to construct the simultaneous confidence intervals. Given below are the \(w\) values for different methods for the main-effects contrasts for Factor \(A\): \[\begin{align} &w_B=t_{n-a-b+1, \alpha / 2 m} ; \\ &w_S=\sqrt{(a-1) F_{a-1, n-a-b+1, \alpha}} ; \\ &w_T=q_{a, n-a-b+1, \alpha} / \sqrt{2} ; \\ &w_{D 2}=|t|_{a-1, n-a-b+1, \alpha}^{(0.5)} . \end{align}\]
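Assuming scipy is available, \(w_B\), \(w_S\), and \(w_T\) can be computed as below (`scipy.stats.studentized_range` provides the quantile \(q\); the Dunnett quantile \(w_{D2}\) has no direct scipy equivalent, so it is omitted). The values of \(a\), the error degrees of freedom, and the number of contrasts \(m\) are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical setting: a = 4 levels, error df = 18, m = 3 preplanned contrasts.
a, df, alpha, m = 4, 18, 0.05, 3

w_B = stats.t.ppf(1 - alpha / (2 * m), df)                         # Bonferroni
w_S = np.sqrt((a - 1) * stats.f.ppf(1 - alpha, a - 1, df))         # Scheffe
w_T = stats.studentized_range.ppf(1 - alpha, a, df) / np.sqrt(2)   # Tukey
```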

The \(w\) values for simultaneous confidence intervals for the main-effects contrasts for Factor \(B\) are obtained similarly, with the roles of \(a\) and \(b\) interchanged. They match the \(w\) values for the one-way model and the two-way complete model except that the degrees of freedom for error differ.

Unequal Variances

When the variances are unequal, i.e., \(\epsilon_{ijt}\sim N(0, \sigma_{ij}^2)\) with the \(\sigma_{ij}^2\) not all equal, we may use a variance-stabilizing transformation or use Satterthwaite's approximation to the degrees of freedom. In SAS, provide the option DDFM=SATTERTHWAITE in the MODEL statement.

Analysis of Variance

For the two-way main effects model, we may test two hypotheses \[ H_0^A: \alpha_1=\alpha_2=\cdots=\alpha_a \] and \[ H_0^B: \beta_1=\beta_2=\cdots=\beta_b. \]

Often, these are the hypotheses we test first. If a null hypothesis is rejected, we may use contrasts for further analysis to find out more about the differences that exist among the main effects.

The F-tests are similar to those in the two-way complete model: they compare the SSE under \(H_0\) with the SSE under the full model (which is now the two-way main-effects model).

The exact formulas are easy to give when the sample sizes are equal. For example, under \(H_0^B\), the model reduces to a one-way model with \(a\) treatments, each of which has sample size \(br\). Then \[ ssE_0^B=\sum_{i=1}^a \sum_{j=1}^b \sum_{t=1}^r\left(y_{i j t}-\bar{y}_{i \cdot\cdot}\right)^2. \]

\[ \begin{aligned} s s B &=\sum_{i=1}^a \sum_{j=1}^b \sum_{t=1}^r\left(y_{i j t}-\bar{y}_{i \cdot\cdot}\right)^2-\sum_{i=1}^a \sum_{j=1}^b \sum_{t=1}^r\left(y_{i j t}-\bar{y}_{i \cdot\cdot}-\bar{y}_{\cdot j \cdot}+\bar{y}_{\cdot\cdot\cdot}\right)^2 \\ &=a r \sum_{j=1}^b\left(\bar{y}_{\cdot j \cdot}-\bar{y}_{\cdot\cdot\cdot}\right)^2 . \end{aligned} \] Therefore, when \(H_0^B\) is true, \[ \frac{S S B /\left[(b-1) \sigma^2\right]}{S S E /\left[(n-a-b+1) \sigma^2\right]}=\frac{M S B}{M S E} \sim F_{b-1, n-a-b+1}, \] and the decision rule for testing \(H_0^B\) against \(H_A^B\) is \[ \text { reject } H_0^B \text { if } \frac{m s B}{m s E}>F_{b-1, n-a-b+1, \alpha} . \]

Similarly, \[ ssA =b r \sum\left(\bar{y}_{i\cdot\cdot}-\bar{y}_{\cdot\cdot\cdot}\right)^2. \] We reject \[H_0^A \text { if } \frac{m s A}{m s E}>F_{a-1, n-a-b+1, \alpha}. \]

ANOVA Table for the Two-Way Main-Effects Model
| Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square | Ratio |
| --- | --- | --- | --- | --- |
| Factor A | \(a-1\) | ssA | ssA/(a-1) | msA/msE |
| Factor B | \(b-1\) | ssB | ssB/(b-1) | msB/msE |
| Error | \(n-a-b+1\) | ssE | ssE/(n-a-b+1) | |
| Total | \(n-1\) | sstot | | |
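The whole table can be assembled from the formulas above. A minimal numpy sketch with made-up balanced data; note that in a balanced layout the decomposition \(sstot = ssA + ssB + ssE\) holds exactly:

```python
import numpy as np

# Hypothetical balanced data: a=2 levels of A, b=3 levels of B, r=2 replicates.
y = np.array([
    [[10., 12.], [14., 15.], [18., 21.]],
    [[11., 14.], [15., 17.], [19., 20.]],
])
a, b, r = y.shape
n = a * b * r                       # 12; error df = n - a - b + 1 = 8

ybar = y.mean()                     # grand mean
ybar_i = y.mean(axis=(1, 2))        # \bar y_{i..}
ybar_j = y.mean(axis=(0, 2))        # \bar y_{.j.}

ssA = b * r * ((ybar_i - ybar) ** 2).sum()
ssB = a * r * ((ybar_j - ybar) ** 2).sum()
ssE = ((y - ybar_i[:, None, None] - ybar_j[None, :, None] + ybar) ** 2).sum()
sstot = ((y - ybar) ** 2).sum()

msA, msB = ssA / (a - 1), ssB / (b - 1)
msE = ssE / (n - a - b + 1)
FA, FB = msA / msE, msB / msE       # compare with F quantiles to test H0^A, H0^B
```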