In this note, I show how to choose a sample size so that the simultaneous confidence intervals are no wider than a specified length. We have already discussed how to choose a sample size that gives the F-test a desired power to detect a target difference. These are the two primary ways to determine the sample size for planning purposes.
Note that all simultaneous confidence intervals have the form \[\text{estimate} \pm w \times \text{standard error},\] where \(w\) depends on the method. The following table summarizes the \(w\) value to be used.
Method | Bonferroni | Scheffe | Tukey | Dunnett |
---|---|---|---|---|
\(w\) | \(t_{n-v, \alpha /(2 m)}\) | \(\sqrt{(v-1) F_{v-1, n-v, \alpha}}\) | \(\frac{q_{v, n-v, \alpha}}{\sqrt{2}}\) | needs the multivariate t-distribution |
SAS function | \(tinv(1-alpha/(2*m),df)\) | \(finv(1-alpha,v-1,df)\) | \(probmc('range',.,prob,df,v)\) | \(probmc('dunnett2',.,prob,df,v-1)\) |
Table | A.4, p802 | F value in A.6, p804 | q value in A.8, p814 | A.10, p818 |
The standard error depends on the contrast as well as the \(MSE\). For example, for a pairwise difference \(\mu_i-\mu_j\) with equal sample size \(r\), the standard error is \(\sqrt{MSE\times(2/r)}\). At the planning stage we must supply a value for \(MSE\). One either uses an educated guess or an upper confidence limit for \(\sigma^2\), because \(MSE\) is an estimate of \(\sigma^2\). For example, one may use a 90% upper confidence limit for \(\sigma^2\) in place of \(MSE\).
Consider the trout experiment in Exercise 15 of Chap. 3. The SAS code for the analysis of variance is given below.
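As a worked illustration for Tukey's method with equal sample sizes, combining the \(w\) value from the table with the standard error of a pairwise difference gives the half-width \[\frac{q_{v, n-v, \alpha}}{\sqrt{2}} \times \sqrt{MSE \times \frac{2}{r}} = q_{v, n-v, \alpha} \sqrt{\frac{MSE}{r}}.\] With \(n = vr\), the error degrees of freedom are \(n-v = v(r-1)\), so the critical value \(q_{v, n-v, \alpha}\) itself changes with \(r\). This is why the sample size is usually found by trying several values of \(r\) rather than by solving a single equation.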
data trout;
do sulfa = 1 to 4;
do rep = 1 to 10;
input hemo @@;
output;
end; end;
lines;
6.7 7.8 5.5 8.4 7.0 7.8 8.6 7.4 5.8 7.0
9.9 8.4 10.4 9.3 10.7 11.9 7.1 6.4 8.6 10.6
10.4 8.1 10.6 8.7 10.7 9.1 8.8 8.1 7.8 8.0
9.3 9.3 7.2 7.8 9.3 10.2 8.7 8.6 9.3 7.2
;
run;
proc glm data=trout;
class sulfa;
model hemo=sulfa;
lsmeans sulfa/cl adjust=Tukey;
run;
The ANOVA table is produced below.
Source | DF | Sum of Squares | Mean Square | F Value | Pr>F |
---|---|---|---|---|---|
Model | 3 | 26.80275000 | 8.93425000 | 5.70 | 0.0027 |
Error | 36 | 56.47100000 | 1.56863889 | | |
Corrected Total | 39 | 83.27375000 | | | |
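The sums of squares in the ANOVA table can be checked directly from the data. The following is a minimal sketch in Python rather than SAS, using only the standard library; the function name `one_way_anova` is a hypothetical helper introduced here for illustration and is not part of the original analysis.

```python
# Hemoglobin readings from the trout data step, one list per sulfa level.
groups = [
    [6.7, 7.8, 5.5, 8.4, 7.0, 7.8, 8.6, 7.4, 5.8, 7.0],
    [9.9, 8.4, 10.4, 9.3, 10.7, 11.9, 7.1, 6.4, 8.6, 10.6],
    [10.4, 8.1, 10.6, 8.7, 10.7, 9.1, 8.8, 8.1, 7.8, 8.0],
    [9.3, 9.3, 7.2, 7.8, 9.3, 10.2, 8.7, 8.6, 9.3, 7.2],
]


def one_way_anova(groups):
    """Return (ss_model, ss_error, df_model, df_error, f_value)."""
    n = sum(len(g) for g in groups)          # total number of observations
    v = len(groups)                          # number of treatments
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-treatment (model) sum of squares.
    ss_model = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-treatment (error) sum of squares.
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_model, df_error = v - 1, n - v
    f_value = (ss_model / df_model) / (ss_error / df_error)
    return ss_model, ss_error, df_model, df_error, f_value


ss_model, ss_error, df_model, df_error, f_value = one_way_anova(groups)
print(f"SS model = {ss_model:.5f} on {df_model} df")
print(f"SS error = {ss_error:.5f} on {df_error} df")
print(f"MSE = {ss_error / df_error:.8f}, F = {f_value:.2f}")
```

The error sum of squares from this check is the \(SSE\) that feeds the upper confidence limit for \(\sigma^2\).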
Suppose the experiment were to be repeated and we would like the 95% simultaneous confidence intervals from Tukey’s method to have a half-width of at most 1 g per 100 ml. We will use the 90% upper confidence limit for \(\sigma^2\) for planning purposes. Assuming equal sample sizes, how large should the sample size \(r\) per treatment be?
First, find the 90% upper confidence limit for \(\sigma^2\). It is given by \(SSE/\chi_{n-v, 0.90}^2=56.4710/\chi_{36, 0.90}^2=56.4710/25.6433=\) 2.2022, where \(\chi_{36, 0.90}^2\) is the value exceeded with probability 0.90 by a chi-square variable with 36 degrees of freedom (its 10th percentile). Then we can calculate the half-width of Tukey’s 95% simultaneous confidence intervals, that is, the minimum significant difference (MSD), for several trial values of \(r\).
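The upper limit quoted above can be reproduced without a table. The sketch below is in Python rather than SAS and uses only the standard library. The helpers `chi2_cdf` and `chi2_quantile` are hypothetical names written for this illustration: they evaluate the chi-square distribution function through the series for the incomplete gamma function and invert it by bisection.

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_quantile(p, df):
    """Value x with P(X <= x) = p, found by bisection."""
    lo, hi = 0.0, df + 10.0 * math.sqrt(2.0 * df) + 10.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


sse, df_error = 56.4710, 36
# The 10th percentile is the value exceeded with probability 0.90.
crit = chi2_quantile(0.10, df_error)
sigma2_upper = sse / crit
print(f"chi-square critical value = {crit:.4f}")
print(f"90% upper confidence limit for sigma^2 = {sigma2_upper:.4f}")
```

With this planning value for \(\sigma^2\) in hand, the half-width calculation in SAS follows.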
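data q;
input r @@;                       /* trial number of observations per treatment */
alpha=0.05;                       /* 1 - simultaneous confidence level */
v=4;                              /* number of treatments */
MSE=2.2022;                       /* 90% upper confidence limit for sigma^2 */
n=v*r;                            /* total sample size */
df=n-v;                           /* error degrees of freedom */
prob=1-alpha;
qT=probmc("range",.,prob,df,v);   /* studentized range critical value */
msd=(qT/2**0.5)*(MSE*2/r)**0.5;   /* Tukey half-width for a pairwise difference */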
lines;
20 30 40
;
proc print;
run;
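The SAS output is reproduced below. With \(r=20\) the half-width (msd) is 1.23269, which is too large, while \(r=30\) gives 0.99878, just under the target of 1 g per 100 ml.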
Obs | r | alpha | v | MSE | n | df | prob | qT | msd |
---|---|---|---|---|---|---|---|---|---|
1 | 20 | 0.05 | 4 | 2.2022 | 80 | 76 | 0.95 | 3.71485 | 1.23269 |
2 | 30 | 0.05 | 4 | 2.2022 | 120 | 116 | 0.95 | 3.68638 | 0.99878 |
3 | 40 | 0.05 | 4 | 2.2022 | 160 | 156 | 0.95 | 3.67263 | 0.86174 |
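The msd column can be verified by hand from the qT column. The sketch below is in Python rather than SAS. It takes the studentized range critical values from the SAS output above instead of computing them, since the Python standard library has no counterpart to `probmc`. The function name `tukey_half_width` is a hypothetical helper.

```python
import math


def tukey_half_width(q, mse, r):
    """Half-width (q / sqrt(2)) * sqrt(MSE * 2 / r) for a pairwise difference."""
    return (q / math.sqrt(2)) * math.sqrt(mse * 2 / r)


# Critical values qT copied from the SAS output, keyed by r.
q_table = {20: 3.71485, 30: 3.68638, 40: 3.67263}
mse_plan = 2.2022   # 90% upper confidence limit for sigma^2
target = 1.0        # desired half-width in g per 100 ml

half_widths = {r: tukey_half_width(q, mse_plan, r) for r, q in q_table.items()}
adequate = [r for r in sorted(half_widths) if half_widths[r] <= target]

for r in sorted(half_widths):
    print(f"r = {r}: half-width = {half_widths[r]:.5f}")
print("trial values meeting the target:", adequate)
```

Only the trial values listed in `q_table` are examined, so the sketch confirms which of them meet the target rather than searching over every possible \(r\).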
Read Example 4.5.1 about the bean-soaking experiment. We revise the code slightly to easily get the desired sample size.
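data size;
input r @@;                       /* trial number of observations per treatment */
alpha=0.05;                       /* 1 - simultaneous confidence level */
v=5;                              /* number of treatments */
MSE=10;                           /* planning value for sigma^2 */
n=v*r;                            /* total sample size */
df=n-v;                           /* error degrees of freedom */
prob=1-alpha;
qT=probmc("range",.,prob,df,v);   /* studentized range critical value */
msd=(qT/2**0.5)*(MSE*2/r)**0.5;   /* Tukey half-width for a pairwise difference */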
lines;
10 15 16 17 18 19
;
proc print data=size;
run;
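The SAS output is below. The msd column is the half-width, so the full width of an interval is twice that value. With \(r=17\) the width is \(2\times 3.02723=6.05\), and with \(r=18\) it is \(2\times 2.93797=5.88\). Hence a sample size of 18 per treatment is the smallest that makes the width of the simultaneous confidence intervals less than 6.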
Obs | r | alpha | v | MSE | n | df | prob | qT | msd |
---|---|---|---|---|---|---|---|---|---|
1 | 10 | 0.05 | 5 | 10 | 50 | 45 | 0.95 | 4.01842 | 4.01842 |
2 | 15 | 0.05 | 5 | 10 | 75 | 70 | 0.95 | 3.96001 | 3.23334 |
3 | 16 | 0.05 | 5 | 10 | 80 | 75 | 0.95 | 3.95308 | 3.12519 |
4 | 17 | 0.05 | 5 | 10 | 85 | 80 | 0.95 | 3.94703 | 3.02723 |
5 | 18 | 0.05 | 5 | 10 | 90 | 85 | 0.95 | 3.94170 | 2.93797 |
6 | 19 | 0.05 | 5 | 10 | 95 | 90 | 0.95 | 3.93696 | 2.85617 |
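An equivalent way to read this output is to rearrange the half-width condition. Since msd equals \(q\sqrt{MSE/r}\), a full width of at most \(W\) requires \(r \ge MSE\,(2q/W)^2\), where \(q\) is the critical value for that same \(r\). The sketch below, in Python rather than SAS, applies this check with the qT values copied from the SAS output; the variable names are illustrative only.

```python
# Critical values qT copied from the SAS output, keyed by r.
q_table = {
    10: 4.01842, 15: 3.96001, 16: 3.95308,
    17: 3.94703, 18: 3.94170, 19: 3.93696,
}
mse_plan = 10.0      # planning value for sigma^2 used in the SAS step
target_width = 6.0   # desired full width of each interval

# Smallest r that the inequality would allow, given the q for each trial r.
required = {r: mse_plan * (2 * q / target_width) ** 2 for r, q in q_table.items()}

# A trial r is adequate when it is at least as large as its own requirement.
adequate = [r for r in sorted(q_table) if r >= required[r]]
smallest = min(adequate)

for r in sorted(q_table):
    print(f"r = {r}: needs r >= {required[r]:.2f}")
print("smallest adequate trial value:", smallest)
```

The requirement shrinks slowly as \(r\) grows because the critical value decreases with the error degrees of freedom, which is why the check has to be repeated for each trial value.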