Comparisons of Several Multivariate Means

The focus of this lab is paired comparisons, repeated measures, comparing two samples, and MANOVA. Several data sets from the text are used as examples.

  • Paired Comparisons

  • We will use the Effluent Data from Example 6.1 in the text. Wastewater samples were sent to two labs for chemical analysis: one a state lab and the other a commercial lab. The variables are biochemical oxygen demand (BOD) and suspended solids (SS). The goal is to assess whether the two labs' analyses agree.

    Read in and check the Effluent Data

    > eff <- read.table("effluent.data", header=T)
    > eff
    
    (notice that the first 11 observations are from the commercial lab and the remaining 11 from the state lab)

    Make a plot with a different label for the two labs and with lines joining the paired samples.

    > win.graph()
    > plot(eff$bod,eff$ss,type="n",xlab="bod",ylab="ss")
    
    (There are several plot types: "p": points; "l": lines; "b": both points and lines; "h": histogram-like (high-density) vertical lines; "s": stairstep plot; "n": no data plotted yet)
    > text(eff$bod,eff$ss,c(rep("c",11),rep("s",11)))
    > for(i in 1:11) lines(c(eff$bod[i],eff$bod[i+11]),c(eff$ss[i],eff$ss[i+11]))
    
    (When i=1, the points (eff$bod[1], eff$ss[1]) and (eff$bod[12], eff$ss[12]) come from the same experimental unit, so we join them with a line)

    Compute the difference between the two labs.

    > d <- eff[1:11,3:4] - eff[12:22,3:4]
    > d
    
    The mean
    > mu <- apply(d, 2, mean)
    > mu
    
    The variance
    > S <- var(d)
    > S
    
    The T^2 statistic for testing that the true mean difference is 0:
    > t2 <- 11 * mu %*% solve(S) %*% mu
    > t2
    
    The critical value is
    > fcrit <- (2*10/9) * qf(0.95, 2, 9)
    > fcrit
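    The two steps above (statistic, then critical value) recur throughout this lab, so they can be wrapped in a small helper function. This is only a sketch, not part of the text, and the name t2test is made up:

```r
# Hypothetical helper (not from the text): one-sample Hotelling T^2 test of mean = 0
t2test <- function(x, alpha = 0.05) {
  n <- nrow(x); p <- ncol(x)
  mu <- colMeans(x)                        # sample mean vector
  S <- var(x)                              # sample covariance matrix
  t2 <- n * drop(t(mu) %*% solve(S) %*% mu)
  # critical value: (p(n-1)/(n-p)) * F_{p, n-p}(1 - alpha)
  fcrit <- (p * (n - 1) / (n - p)) * qf(1 - alpha, p, n - p)
  list(t2 = t2, fcrit = fcrit, reject = t2 > fcrit)
}
```

    With the effluent differences, t2test(d) should reproduce the t2 and fcrit computed above, since n = 11 and p = 2 give p(n-1)/(n-p) = 2*10/9.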
    
    So, what is your conclusion about the null hypothesis that there is no difference between the commercial and the state labs? Note that if we compute the simultaneous confidence intervals for each variable separately
    > sintv1 <- c(mu[1]-sqrt(fcrit)*sqrt(S[1,1]/11), mu[1]+sqrt(fcrit)*sqrt(S[1,1]/11))
    > sintv1
    > sintv2 <- c(mu[2]-sqrt(fcrit)*sqrt(S[2,2]/11), mu[2]+sqrt(fcrit)*sqrt(S[2,2]/11))
    > sintv2
    
    Both intervals contain zero, so looking at one variable at a time the null hypothesis would be (falsely) accepted. This can be seen pictorially:
    > library(ellipse)   # ellipse() comes from the ellipse package
    > plot(ellipse(S/11, centre=mu, t=sqrt(fcrit)), type="l")
    > abline(h=c(mu[2]-sqrt(fcrit)*sqrt(S[2,2]/11),mu[2]+sqrt(fcrit)*sqrt(S[2,2]/11)))
    > abline(v=c(mu[1]-sqrt(fcrit)*sqrt(S[1,1]/11),mu[1]+sqrt(fcrit)*sqrt(S[1,1]/11)))
    > points(0,0)
    
    See what happens if you use the t-critical value:
    > tcrit <- qt(0.975, 10)
    > abline(v=c(mu[1]-tcrit*sqrt(S[1,1]/11),mu[1]+tcrit*sqrt(S[1,1]/11)),lty=2)
    > abline(h=c(mu[2]-tcrit*sqrt(S[2,2]/11),mu[2]+tcrit*sqrt(S[2,2]/11)),lty=2)
    
    Thus the person who knew nothing of multivariate methods would also be misled.

  • Repeated Measures

  • The data come from an anesthetics experiment. Nineteen dogs were given the drug pentobarbital and then administered CO2 at two pressure levels. Halothane was then added and the CO2 administration repeated. The response is the time in milliseconds between heartbeats.

    The treatments are

    1. high CO2 pressure, no Halothane
    2. low CO2 pressure, no Halothane
    3. high CO2 pressure, Halothane
    4. low CO2 pressure, Halothane
    Read in the data
    > dog <- read.table("dog.data")
    > dog
    
    Plot the four groups:
    > boxplot(dog$V1,dog$V2,dog$V3,dog$V4,names=c("high-no","low-no","high-yes","low-yes"))
    
    Also make a plot with lines joining the same dog
    > matplot(1:4,t(dog),type="b",axes=F)
    > axis(2)
    > axis(1,at=1:4,c("high-no","low-no","high-yes","low-yes"))
    
    Notice that the lines cross much less than would be expected by chance. This can be attributed to the dependence between measurements on the same dog.

    Now compute the difference between successive columns:

    > ndog <- cbind(dog$V1-dog$V2, dog$V2-dog$V3, dog$V4-dog$V3)
    > ndog
    
    What graphical examination of this new matrix would be appropriate?
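    One possible answer (a sketch only; other displays are just as reasonable): since the null hypothesis says each difference has mean zero, boxplots centered on zero plus a pairs plot of the correlations are natural.

```r
# One suggestion for examining the matrix of differences ndog computed above
boxplot(as.data.frame(ndog))    # are the differences centered near 0?
abline(h = 0, lty = 2)          # reference line at zero
pairs(ndog)                     # how correlated are the differences?
```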

    Compute the mean and variance:

    > mu <- apply(ndog, 2, mean)
    > mu
    > S <- var(ndog)
    
    The T^2 statistic:
    > t2 <- 19 * mu %*% solve(S) %*% mu
    > t2
    
    which is much bigger than the F-critical value
    > fcrit <- (3*18/16) * qf(0.95, 3, 16)
    > fcrit
    
    so we reject the null hypothesis of equality.

    We can also do this using contrast matrices and the original data:

    > cm <- cbind(1, diag(rep(-1,3)))
    > cm
    > omu <- apply(dog, 2, mean)
    > omu
    > oS <- var(dog)
    > oS
    > t2 <- 19 * t(cm %*% omu) %*% solve(cm %*% oS %*% t(cm)) %*% (cm %*% omu)
    > t2
    
    and get the same result but the first method seems easier.

    Apart from lack of equality, what are the conclusions? Our choice of contrasts is not well suited to comparing the treatment levels. A better choice is to compare high versus low CO2 pressure (averaged over Halothane), Halothane versus no Halothane (averaged over CO2), and the remaining interaction effect.

    > ndog <- cbind(dog$V1+dog$V3-dog$V2-dog$V4, dog$V3+dog$V4-dog$V1-dog$V2, dog$V1+dog$V4-dog$V2-dog$V3)
    > ndog
    
    cbind() is a function that forms a matrix from several vectors. Recompute T^2:
    > mu <- apply(ndog, 2, mean)
    > mu
    > S <- var(ndog)
    > S
    > t2 <- 19 * mu %*% solve(S) %*% mu
    > t2
    
    and get the same value: T^2 is unchanged because the two sets of contrasts span the same space. Now compute the simultaneous CI for each of the contrasts:
    > for(i in 1:3)
    +   print(c(mu[i]-sqrt(fcrit*S[i,i]/19), mu[i]+sqrt(fcrit*S[i,i]/19)))
    
    The first shows that low CO2 pressure produces longer times between heartbeats. The second shows that Halothane produces longer times between heartbeats. The third shows that the interaction effect is not significant (zero is in the interval). Note that the Halothane runs occurred after the no-Halothane runs, so there could be a confounding time effect.
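    The three contrasts just interpreted can also be written as a contrast matrix and applied to the original data, in the style of the cm calculation earlier. A quick sketch (assuming dog is the 19 x 4 data frame read in above) that should reproduce ndog:

```r
# Contrast matrix form of the CO2, Halothane and interaction contrasts
C <- rbind(c( 1, -1,  1, -1),   # high vs low CO2 pressure (averaged over Halothane)
           c(-1, -1,  1,  1),   # Halothane vs no Halothane (averaged over CO2)
           c( 1, -1, -1,  1))   # interaction
ndog2 <- as.matrix(dog) %*% t(C)
# ndog2 should equal the ndog built with cbind() above, column by column
```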

  • Comparing Two Multivariate Samples

  • Let's reanalyze the effluent data. In truth each water sample was divided in two, with one half sent to each lab; this was the correct way to run the comparison. But suppose (for the sake of the example) that there had been 22 independent samples - half sent to one lab and half to the other.

    Compute the means and variances in the two groups:

    > mu1 <- apply(eff[1:11,3:4], 2, mean)
    > mu2 <- apply(eff[12:22,3:4], 2, mean)
    > S1 <- var(eff[1:11,3:4])
    > S2 <- var(eff[12:22,3:4])
    
    and the pooled variance:
    > Spool <- ((11-1)*S1 + (11-1)*S2)/(11+11-2)
    > Spool
    
    The T^2 statistic for testing equality of the two means is
    > t2 <- (mu1-mu2) %*% solve(2*Spool/11) %*% (mu1-mu2)
    > t2
    
    whereas the critical value is
    > fcrit <- (20*2/19) * qf(0.95, 2, 19)
    > fcrit
    
    so the hypothesis of equality is rejected.
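    As in the paired case, the statistic and critical value can be packaged together. A sketch of a hypothetical two-sample version (the name t2test2 is made up; it assumes a common covariance matrix):

```r
# Hypothetical two-sample Hotelling T^2 test of equal mean vectors
t2test2 <- function(x1, x2, alpha = 0.05) {
  n1 <- nrow(x1); n2 <- nrow(x2); p <- ncol(x1)
  d <- colMeans(x1) - colMeans(x2)
  Spool <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)
  t2 <- drop(t(d) %*% solve((1/n1 + 1/n2) * Spool) %*% d)
  # critical value: ((n1+n2-2)p/(n1+n2-p-1)) * F_{p, n1+n2-p-1}(1 - alpha)
  fcrit <- ((n1 + n2 - 2) * p / (n1 + n2 - p - 1)) *
           qf(1 - alpha, p, n1 + n2 - p - 1)
  list(t2 = t2, fcrit = fcrit, reject = t2 > fcrit)
}
```

    Here t2test2(eff[1:11,3:4], eff[12:22,3:4]) should match the t2 and fcrit above, since n1 = n2 = 11 and p = 2 give (1/11 + 1/11)Spool = 2*Spool/11 and the factor 20*2/19.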

    Let's take a look at the confidence region:

    > plot(ellipse(2*Spool/11, centre=(mu1-mu2), t=sqrt(fcrit)), type="l")
    > abline(v=0,h=0,lty=2)
    > points(mu1[1]-mu2[1],mu1[2]-mu2[2])
    
    Here's the simultaneous confidence interval for bod:
    > sintv1 <- c(mu1[1]-mu2[1]-sqrt(fcrit)*sqrt(2*Spool[1,1]/11), mu1[1]-mu2[1]+sqrt(fcrit)*sqrt(2*Spool[1,1]/11))
    > sintv1
    > abline(v=sintv1, lty=5)
    
    Can you compute and plot the interval for ss?

    Look at the ratio of corresponding entries:

    > S1/S2
    
    There is a somewhat worrying difference in variation.
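    The informal ratio check above can be backed up with Box's M test for equality of the two covariance matrices. This is a sketch, not from the text; with n1 = n2 = 11 the chi-squared approximation is rough, so treat the p-value as indicative only.

```r
# Box's M test for H0: Sigma1 = Sigma2 (two groups), chi-squared approximation
boxM2 <- function(S1, S2, n1, n2) {
  p <- nrow(S1)
  Sp <- ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)   # pooled covariance
  M <- (n1 + n2 - 2) * log(det(Sp)) -
       (n1 - 1) * log(det(S1)) - (n2 - 1) * log(det(S2))
  # small-sample correction factor
  c1 <- (1/(n1 - 1) + 1/(n2 - 1) - 1/(n1 + n2 - 2)) *
        (2 * p^2 + 3 * p - 1) / (6 * (p + 1))
  stat <- (1 - c1) * M
  df <- p * (p + 1) / 2        # (g-1)p(p+1)/2 with g = 2 groups
  c(statistic = stat, p.value = pchisq(stat, df, lower.tail = FALSE))
}
```

    For example, boxM2(S1, S2, 11, 11) with the two lab covariance matrices computed above.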

  • MANOVA

  • Three measurements were made on anteater skulls sampled from three different locations. Are anteater skulls from the different locations comparable? The measurements are log(basal length), log(occipital-nasal length) and log(maximum nasal length). First read in and check the data:
    > ant <- read.table("anteater.data", header=T)
    > ant
    
    Now fit the model - the response is a matrix of three columns and the predictor is the location:
    > g <- manova(cbind(basal, occipital, nasal) ~ location, data=ant)
    
    Now compute the 4 different test statistics - note that the Pillai trace is the default:
    > summary(g)
    > summary(g, test="Wilks")
    > summary(g, test="Roy")
    > summary(g, test="Hotelling-Lawley")
    
    The "approx F." refers to the fact that all 4 test statistics are transformed to approximately match an F distribution for the computation of the p-value. Note that the p-value varies but the null is accepted by all 4 tests. This is a small sample so not finding a difference is hardly surprising.

    Now compute some of the auxiliary quantities:

    > gs <- summary(g)
    
    The eigenvalues (lambda):
    > gs$Eigenvalues
    
    Note the last one is 0 (within rounding error).

    The sums of squares and cross-products matrices: the first is the between-groups matrix (B) and the second is the within-groups matrix (W)

    > gs$SS
    
    With these quantities, we can construct the MANOVA tables.
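    As a check on the table, the printed statistics can be recomputed directly from B and W. This sketch assumes gs$SS lists the location matrix first and the residual matrix second, as printed above:

```r
B <- gs$SS[[1]]                        # between-groups SSP matrix (location)
W <- gs$SS[[2]]                        # within-groups SSP matrix (Residuals)
det(W) / det(B + W)                    # Wilks lambda
sum(diag(B %*% solve(B + W)))          # Pillai trace
sum(diag(B %*% solve(W)))              # Hotelling-Lawley trace
max(Re(eigen(solve(W) %*% B)$values))  # Roy's largest root
```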