SAS Lab 2

Objective of the Lab

Run SAS for the two-way models
Test hypotheses for the interaction effects
Test hypotheses for the main effects
Multiple comparisons for the main effects
Use the estimate statement for any contrast
Know how the ordering of factors in the class statement affects the coding
Know the difference between the lsmeans and means statements

An experiment was conducted to determine the effects of three different pesticides on the yield of fruit from two different varieties of a citrus tree. Six trees of each variety were randomly selected from an orchard. The three pesticides were then randomly assigned to two trees of each variety, and applications were made according to recommended levels. Yields of fruit (in bushels per tree) were obtained after the test period.

	Pesticide
Variety	1	2	3
1	49	50	38
	39	55	*

2	55	67	53
	41	58	42

* Note: This tree was accidentally hit by a vehicle and has no yield data.

Two-way Complete Model

This is a completely randomized design. Without any knowledge about the interaction, we will employ the two-way complete model with Pesticide as Factor A (with three levels) and Variety as Factor B (with two levels).

data pest;
input variety   pesticide   yield;
lines; 
1   1   49
1   1   39
1   2   50
1   2   55
1   3   38
2   1   55
2   1   41
2   2   58
2   3   53
2   3   42
2   2   67
;
proc print;
run;

proc glm data=pest;
class pesticide variety;
model yield=variety pesticide variety*pesticide;
run;
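
Before reading the ANOVA table, it can help to confirm the cell sizes, since the missing tree makes the design unbalanced. A minimal sketch, assuming the pest data set created above is still available:

```sas
/* Cell counts and cell means for each variety-by-pesticide combination. */
/* Assumes the pest data set from the data step above. */
proc means data=pest n mean;
  class variety pesticide;
  var yield;
run;
```

The (variety 1, pesticide 3) cell should show n=1 and the other five cells n=2. From the data above, the cell means work out to 44, 52.5 and 38 for variety 1, and 48, 62.5 and 47.5 for variety 2.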

Is there a significant interaction effect?

The p-value is 0.82. We fail to reject the null hypothesis: there are no significant interaction effects.

Always use the p-value that corresponds to the Type III SS.
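
The reason is that the design is unbalanced (one cell has a single observation), so the Type I (sequential) sums of squares depend on the order of the terms in the model statement, whereas the Type III sums of squares do not. As a sketch, the ss3 option on the model statement limits the printed tests to Type III:

```sas
/* Print only the Type III sums of squares and tests. */
/* Assumes the pest data set from the data step above. */
proc glm data=pest;
  class pesticide variety;
  model yield=variety pesticide variety*pesticide / ss3;
run;
```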

Are there significant differences in the mean yields among the three pesticides? Use \(\alpha=0.10\)

The p-value is 0.089. We reject the null hypothesis at the \(\alpha=0.10\) level of significance and conclude that there are significant differences among the three main effects of pesticide.

Are there significant differences in the mean yields between the two varieties? Use \(\alpha=0.10\).

The p-value is 0.1998, so we fail to reject the null hypothesis of no difference between the main effects of variety.

Find 90% simultaneous confidence intervals for the pairwise comparisons of the main effects of pesticide.

Use Tukey’s method by adding an lsmeans statement after the model statement:

proc glm data=pest;
class pesticide variety;
model yield=variety pesticide variety*pesticide;
lsmeans pesticide/cl pdiff adjust=Tukey;
run;

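There are three pesticide levels and consequently three pairwise comparisons of the main effects. The SAS output is below; because no alpha= option was given, the limits are at the default 95% level rather than 90%.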

i	j	Difference Between Means	Simultaneous 95% Confidence Limits for LSMean(i)-LSMean(j)
1	2	-11.500000	-28.139482, 5.139482
1	3	3.250000	-15.353507, 21.853507
2	3	14.750000	-3.853507, 33.353507
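
The limits above are at the 95% level. To obtain the 90% simultaneous intervals that the question asks for, one option is to set alpha= in the lsmeans statement. A sketch, assuming the pest data set from above:

```sas
proc glm data=pest;
  class pesticide variety;
  model yield=variety pesticide variety*pesticide;
  /* alpha=0.10 requests 90% limits; adjust=tukey keeps them simultaneous. */
  lsmeans pesticide / cl pdiff adjust=tukey alpha=0.10;
run;
```

The estimated differences are unchanged; only the confidence limits become narrower.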

Note what each option in the lsmeans statement does:

cl: Produces confidence limits for each least squares mean (these are not simultaneous confidence intervals).
pdiff: Requests all pairwise differences of the least squares means; together with cl, it provides confidence intervals for those differences. If a method for simultaneous confidence intervals is specified through adjust=, then the intervals are simultaneous. Compare the output with what you get from the following statement:

lsmeans pesticide/cl pdiff;

You will see that the confidence intervals differ from those provided by Tukey’s method. Why?
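
One way to see the reason is to compare the critical values. Both kinds of interval have the form estimate ± (critical value) × (standard error); only the critical value changes. The sketch below works this out for pesticide 1 versus pesticide 2. It assumes MSE = 52.3 on 5 error degrees of freedom, computed by hand from the data above rather than read from the printed output, and the table value \(t_{5,0.025}=2.5706\).

```latex
\begin{align*}
\text{SE} &= \sqrt{\text{MSE}\cdot\tfrac{1}{4}\left(\tfrac12+\tfrac12+\tfrac12+\tfrac12\right)}
           = \sqrt{52.3/2} \approx 5.114 \\
\text{Tukey half-width} &= 5.139482-(-11.5) \approx 16.639 \approx 3.254 \times 5.114 \\
\text{unadjusted half-width} &= t_{5,0.025}\times 5.114 = 2.5706 \times 5.114 \approx 13.15
\end{align*}
```

The Tukey multiplier (about 3.25, which is the studentized range quantile for 3 means and 5 degrees of freedom divided by \(\sqrt{2}\)) exceeds the t multiplier of 2.5706 because it protects all three comparisons at once, so the simultaneous intervals are wider.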

Find a 95% confidence interval for the difference between the mean yield for the two varieties (variety 2-variety 1) when pesticide 1 is applied.

This requires some careful work. We are comparing two specific treatment means in a two-way model. It is doable but not as straightforward as in the one-way model.

First we must know how SAS coded the treatments. Since pesticide is entered into the class statement before variety, pesticide is Factor A and variety is Factor B, so \(\mu_{ij}\) denotes the mean for pesticide \(i\) and variety \(j\). We are therefore comparing \(\mu_{12}\) with \(\mu_{11}\):

\[ \begin{align} \mu_{12}-\mu_{11} &= (\mu+\alpha_1+\beta_2+(\alpha\beta)_{12})-(\mu+\alpha_1+\beta_1+(\alpha\beta)_{11}) \\ &= (-\beta_1+\beta_2)+(-(\alpha\beta)_{11}+(\alpha\beta)_{12}). \end{align} \]

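Therefore, this contrast involves the main effects of Factor B (variety) and the interaction effects. Add the following estimate statement after the model statement, with pesticide listed first in the class statement: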

estimate "v2-v1|pesticide=1" variety -1 1 variety*pesticide -1 1 0 0 0 0;

SAS output provides estimate=4.0000000 and standard error=7.23187389. The error degrees of freedom are \(11-6=5\), so we need the \(t\) critical value \(t_{5, 0.025}=\) 2.5706. We therefore have the 95% confidence interval \(4.0\pm 18.59\), that is, \((-14.59, 22.59)\).
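
The hand calculation can be cross-checked in SAS. As a sketch, the clparm option on the model statement asks proc glm to print confidence limits for the results of estimate statements (95% by default). The class order below lists pesticide first, matching the coding assumed in this problem.

```sas
/* Assumes the pest data set from the data step above. */
proc glm data=pest;
  class pesticide variety;
  /* clparm adds confidence limits for each estimate statement. */
  model yield=variety pesticide variety*pesticide / clparm;
  estimate "v2-v1|pesticide=1" variety -1 1 variety*pesticide -1 1 0 0 0 0;
run;
```

The reported limits should agree with \(4.0\pm 18.59\) up to rounding.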

Find a 95% confidence interval for the difference between the mean yield for pesticide 2 and pesticide 3 when they are applied to Variety 1 trees.

Here we are estimating \(\mu_{21}-\mu_{31}\), which we must first express in terms of main effects and interaction effects.

\[ \begin{align} \mu_{21}-\mu_{31} &= (\mu+\alpha_2+\beta_1+(\alpha\beta)_{21})-(\mu+\alpha_3+\beta_1+(\alpha\beta)_{31}) \\ &=(\alpha_2-\alpha_3)+((\alpha\beta)_{21}-(\alpha\beta)_{31}). \end{align} \]

The SAS code is

estimate "P2-P3|v=1" pesticide 0 1 -1 variety*pesticide 0 0 1 0 -1 0;

The estimate and standard error are 14.5000000 and 8.85720046, respectively. The confidence interval can be obtained in the same way as in the previous question.
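
Carrying out the same steps with the values reported here, and reusing \(t_{5,0.025}=2.5706\) from the previous problem, gives the following sketch of the arithmetic:

```latex
\begin{align*}
14.5 \pm t_{5,0.025}\times 8.8572 &= 14.5 \pm 2.5706 \times 8.8572 \\
&\approx 14.5 \pm 22.77
\end{align*}
```

That is roughly \((-8.27, 37.27)\). The interval is wider than the previous one because the variety 1, pesticide 3 cell contains only one observation.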

This problem shows that the order in which the factors appear in the class statement is important. Compare the output from the following program:

proc glm data=pest;
class pesticide variety;
model yield=variety pesticide variety*pesticide/solution;
run;

with those from

proc glm data=pest;
class variety pesticide;
model yield=variety pesticide variety*pesticide/solution ;
run;

We see that the interaction plots are different and the treatments are coded differently.

Run the following code

proc glm data=pest;
class variety pesticide;
model yield=variety pesticide variety*pesticide/solution;
estimate "v2-v1|pesticide=1" variety -1 1 variety*pesticide -1 1 0 0 0 0;
run;

You will get the note (in the log file) that “v2-v1|pesticide=1 is not estimable”.

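Let us see how to code for Problem 6 now. Since variety is now the first factor in the class statement, variety is Factor A and pesticide is Factor B, so the first subscript of \(\mu_{ij}\) now indexes variety. We are therefore comparing \(\mu_{21}\) with \(\mu_{11}\). We can write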

\[ \mu_{21}-\mu_{11}=(\alpha_2-\alpha_1)+(\alpha\beta)_{21}-(\alpha\beta)_{11}. \]

proc glm data=pest;
class variety pesticide;
model yield=variety pesticide variety*pesticide;
estimate "v2-v1|pesticide=1" variety -1 1 variety*pesticide -1 0 0 1 0 0;
run;
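
Two checks can guard against coding mistakes of this kind. The sketch below adds the e option to the corrected estimate statement, which prints the coefficient attached to each parameter, and an lsmeans statement on the interaction, which reports every pairwise difference of cell means without hand-built coefficients. adjust=t is spelled out so that the limits are unadjusted t intervals, comparable to the hand calculation.

```sas
/* Assumes the pest data set from the data step above. */
proc glm data=pest;
  class variety pesticide;
  model yield=variety pesticide variety*pesticide;
  /* e prints the full coefficient (L) vector used for this estimate. */
  estimate "v2-v1|pesticide=1" variety -1 1 variety*pesticide -1 0 0 1 0 0 / e;
  /* Unadjusted pairwise differences of the six cell means, with confidence limits. */
  lsmeans variety*pesticide / cl pdiff adjust=t;
run;
```

With variety listed first, the six interaction coefficients follow the order (v1,p1), (v1,p2), (v1,p3), (v2,p1), (v2,p2), (v2,p3). Both statements should report a difference of 4 in absolute value between the variety 2 and variety 1 cells under pesticide 1.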

Two-way Main-Effects Model

Suppose the experimenter has knowledge that no interaction effects exist. Then the two-way main effects model can be used. Let us see how to get the simultaneous confidence intervals in Problem 5 and the confidence interval in Problem 6.

proc glm data=pest;
class pesticide variety;
model yield=pesticide variety;
lsmeans pesticide/cl pdiff adjust=Tukey;
estimate 'v2-v1' variety -1 1;
run;

Note how the p-values changed! The confidence intervals for the pairwise comparisons become shorter.
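
Part of the change comes from the error degrees of freedom. With \(n=11\) observations, \(a=3\) pesticides and \(b=2\) varieties, a short accounting follows (a sketch; the t quantiles are standard table values, not taken from the SAS output):

```latex
\begin{align*}
\text{complete model:} \quad & df_E = n - ab = 11 - 6 = 5, & t_{5,0.025} &= 2.5706 \\
\text{main-effects model:} \quad & df_E = n - (a+b-1) = 11 - 4 = 7, & t_{7,0.025} &= 2.3646
\end{align*}
```

The two interaction degrees of freedom are pooled into error, so the critical value is smaller. The mean squared error changes as well, which is why the p-values also move.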

Some additional options are available in the model statement, such as solution. The “solution” option shows you how SAS codes the treatment combinations as well as the estimates of \(\alpha_i, \beta_j\) and \((\alpha\beta)_{ij}\). The option DDFM=Satterthwaite provides Satterthwaite’s approximation to the denominator degrees of freedom; note that it belongs to the model statement of proc mixed rather than proc glm. This only works when each treatment has replicates (i.e., \(r_{ij}>1\)).

Hao Zhang
