Blocks are used to block out the effects of nuisance factors that may have a major effect on the response but are not of interest to us. Designs involving blocks are called block designs. The sets of similar experimental units are grouped together to form blocks, and the conditions that vary from block to block form the levels of the blocking factor. The intent of blocking is to prevent large differences in the experimental units from masking differences between treatment effects, while at the same time allowing the treatments to be examined under different experimental conditions.
Since the levels of the blocking factor do not necessarily need to be measured, the block design is very popular.
Agricultural experimenters may know that plots close together in a field are alike, while those far apart are not alike. Groups of adjacent plots form the blocks.
Industrial experimenters may know that two items produced by one machine have similar characteristics, while those produced by two different machines are somewhat different. Machines are blocks.
Medical experimenters may know that measurements taken on the same subject will be alike, while those taken on different subjects will not be alike. The subjects are blocks.
The analysis of a block design is quite similar to that of a factorial design without blocks, except that we usually do not investigate the block effects or the interactions between the blocking factor and the treatment factors.
The block size is the number of experimental units in a block. Block sizes are commonly equal, in which case the common size is denoted by \(k\). Sometimes the block sizes are naturally defined, and sometimes they need to be specifically selected by the experimenter. It is not uncommon in industry for an experiment to be automatically divided into blocks according to time of day as a precaution against changing experimental conditions.
The term complete block design has different definitions in the literature. In a broader sense, it refers to a block design in which all treatments are used in each block. In a narrower sense, it refers to a block design in which the block size is a multiple of the number of treatments and each treatment is allocated the same number of experimental units in each block. Whichever way it is defined, all treatment means can be compared without confounding with block effects.
If the block size equals the number of treatments and, within each block, each treatment is assigned to one unit at random, the design is a randomized complete block design, or simply a randomized block design.
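The within-block randomization of a randomized complete block design can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomize_complete_blocks`, the treatment labels, and the seed are hypothetical and do not come from the text.

```python
import random


def randomize_complete_blocks(treatments, n_blocks, seed=None):
    """Return one random treatment order per block.

    Each block receives every treatment exactly once, and the order
    is shuffled independently within each block.
    """
    rng = random.Random(seed)  # seeded generator so a layout can be reproduced
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # independent random permutation for this block
        layout.append(order)
    return layout


# Hypothetical example: 3 treatments laid out in 4 blocks.
layout = randomize_complete_blocks(["A", "B", "C"], n_blocks=4, seed=1)
for block, order in enumerate(layout, start=1):
    print(f"block {block}: {order}")
```

Position `j` in each printed row stands for the `j`-th experimental unit of that block. Shuffling separately per block is what distinguishes this from a completely randomized design, where one shuffle would be applied across all units.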
If the block size is predetermined, we can calculate the number of blocks \(b\) required to achieve a confidence interval of given length, or a hypothesis test of desired power, in much the same way as we calculated sample sizes in factorial designs. If the number of blocks \(b\) is fixed, but the block sizes can be large, then the same techniques can be used to calculate the block size for a general complete block design.
In a general complete block design, we assume each of the \(v\) treatments is assigned to \(s\) units in each block, so the block size is \(k=vs\). Having every level of the treatment factor observed more than once per block gives sufficient degrees of freedom to be able to measure a block \(\times\) treatment interaction if one is anticipated. Therefore, there are two standard models for the general complete block design, the block-treatment model (without interaction)
\[\begin{equation} Y_{h i t}=\mu+\theta_h+\tau_i+\epsilon_{h i t} \tag{1} \end{equation}\]
and the block-treatment interaction model, which includes the effect of block-treatment interaction: \[\begin{equation} Y_{h i t}=\mu+\theta_h+\tau_i+(\theta \tau)_{h i}+\epsilon_{h i t}, \tag{2} \end{equation}\]
\[ h=1, \ldots, b ; \quad i=1, \ldots, v; \quad t=1, \ldots, s, \] where \(\mu\) is a constant, \(\theta_h\) is the effect of the \(h\)th block, \(\tau_i\) is the effect of the \(i\)th treatment, \((\theta\tau)_{hi}\) is the interaction effect of the \(h\)th block and the \(i\)th treatment, \(Y_{hit}\) is the random variable representing the \(t\)th measurement on treatment \(i\) observed in block \(h\), and \(\epsilon_{hit}\) is the associated random error.
In each case, the model includes the error assumptions: the \(\epsilon_{hit}\) are mutually independent, each with a \(N\left(0, \sigma^2\right)\) distribution.
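To make the block-treatment model (1) concrete, the following sketch generates artificial data from it. All parameter values in the usage lines are made up for illustration, and the function name `simulate_block_treatment` is hypothetical. It is not part of the analysis in the text.

```python
import random


def simulate_block_treatment(mu, block_effects, treatment_effects, s, sigma, seed=None):
    """Simulate y[h][i][t] = mu + theta_h + tau_i + eps_hit under model (1).

    The errors eps_hit are independent N(0, sigma^2) draws.
    """
    rng = random.Random(seed)
    return [
        [
            [mu + theta + tau + rng.gauss(0.0, sigma) for _ in range(s)]
            for tau in treatment_effects
        ]
        for theta in block_effects
    ]


# ASSUMED values: b = 2 blocks, v = 3 treatments, s = 2 observations per cell.
y = simulate_block_treatment(
    mu=10.0, block_effects=[-1.0, 1.0], treatment_effects=[0.0, 2.0, 4.0],
    s=2, sigma=0.5, seed=42,
)
print(len(y), len(y[0]), len(y[0][0]))  # dimensions b, v, s
```

Setting `sigma=0` returns the cell means \(\mu+\theta_h+\tau_i\) exactly, which is a simple way to see that the model has no block-by-treatment interaction: the difference between two treatments is the same in every block.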
The analyses for these two models are carried out similarly to the two-way main-effects model and the two-way complete model except now we are not interested in the block effects.
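Analysis of variance for the block-treatment model (1):

Source of variation | Degrees of freedom | Sum of squares | Mean square | Ratio |
---|---|---|---|---|
Block | \(b-1\) | \(ss\theta\) | \(ms\theta=\frac{ss\theta}{b-1}\) | - |
Treatment | \(v-1\) | \(ssT\) | \(msT=\frac{ssT}{v-1}\) | \(\frac{msT}{msE}\) |
Error | \(bvs-b-v+1\) | \(ssE\) | \(msE=\frac{ssE}{bvs-b-v+1}\) | |
Total | \(bvs-1\) | \(sstot\) | | |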
Analysis of variance for the block-treatment interaction model (2):

Source of variation | Degrees of freedom | Sum of squares | Mean square | Ratio |
---|---|---|---|---|
Block | \(b-1\) | \(ss\theta\) | \(ms\theta=\frac{ss\theta}{b-1}\) | - |
Treatment | \(v-1\) | \(ssT\) | \(msT=\frac{ssT}{v-1}\) | \(\frac{msT}{msE}\) |
Interaction | \((b-1)(v-1)\) | \(ss\theta T\) | \(ms\theta T=\frac{ss\theta T}{(b-1)(v-1)}\) | \(\frac{ms\theta T}{msE}\) |
Error | \(bv(s-1)\) | \(ssE\) | \(msE=\frac{ssE}{bv(s-1)}\) | |
Total | \(bvs-1\) | \(sstot\) | | |
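As a quick consistency check on the two analysis of variance tables, the degrees of freedom of the individual sources add up to the total \(bvs-1\) under either model:

\[
\begin{aligned}
\text{model (1):}\quad & (b-1)+(v-1)+(bvs-b-v+1)=bvs-1, \\
\text{model (2):}\quad & (b-1)+(v-1)+(b-1)(v-1)+bv(s-1) \\
& =(b-1)+(v-1)+(bv-b-v+1)+(bvs-bv)=bvs-1 .
\end{aligned}
\]

Note that with \(s=1\) the error degrees of freedom in model (2) are \(bv(s-1)=0\), so the interaction model cannot be fitted with only one observation per treatment per block.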
Although models (1) and (2) resemble the two-way main-effects model and the two-way complete model, respectively, a block design differs from a two-way design with two treatment factors in how treatments are randomly assigned. In a completely randomized two-way design, treatment combinations are assigned at random to all experimental units. In a block design, the units are first grouped into blocks, and the treatments are then randomly assigned to units within each block. This difference leads to some controversy as to whether or not a test of equality of block effects is valid.
Consider a block design in agriculture with 3 blocks. Each block is a 10-acre piece of land divided into 12 plots, and within each block the 6 treatments are assigned to the plots at random, each treatment to 2 plots. This is a complete block design. On the other hand, if you take the block factor as a treatment factor and consider a \(3\times 6\) factorial design, what would the experimental units be? A level of the block factor cannot be randomly assigned to a plot, since each plot already belongs to one piece of land. From this perspective, the difference between a block design and a factorial design is clear. We therefore do not perform hypothesis tests on the block effects.
Moreover, when blocks represent nuisance sources of variation, we do not need to know much about the block effects, since it is very unlikely that we can use identical blocks again. Rather than testing for equality of block effects, we merely compare the block mean square \(ms\theta\) with the error mean square \(msE\) to determine whether or not blocking was beneficial in the experiment at hand. If \(ms\theta\) is considerably larger than \(msE\), the blocks accounted for variation that would otherwise have inflated the error mean square.
Example: Resting metabolic rate experiment
The experiment was run to compare the effects of inpatient and outpatient protocols on the in-laboratory measurement of resting metabolic rate (RMR) in humans. A previous study had indicated measurements of RMR on elderly individuals to be 8% higher using an outpatient protocol than with an inpatient protocol. The experimenters hoped to conclude that the effect on RMR of different protocols was negligible.
The experimental treatments consisted of three protocols: (1) an inpatient protocol in which meals were controlled—the patient was fed the evening meal and spent the night in the laboratory, then RMR was measured in the morning; (2) an outpatient protocol in which meals were controlled—the patient was fed the same evening meal at the laboratory but spent the night at home, then RMR was measured in the morning; and (3) an outpatient protocol in which meals were not strictly controlled—the patient was instructed to fast for 12 hours prior to measurement of RMR in the morning. The three protocols formed the v = 3 treatments in the experiment.
Since subjects tend to differ substantially from each other, error variability can be reduced by using the subjects as blocks and measuring the effects of all treatments for each subject. In this experiment, there were nine subjects (healthy, adult males of similar age) and they formed the \(b=9\) levels of a blocking factor “subject.” Every subject was measured under all three treatments (in a random order), so the blocks were of size \(k=3=v\).
Subject | Protocol 1 | Protocol 2 | Protocol 3 |
---|---|---|---|
1 | 7131 | 6846 | 7095 |
2 | 8062 | 8573 | 8685 |
3 | 6921 | 7287 | 7132 |
4 | 7249 | 7554 | 7471 |
5 | 9551 | 8866 | 8840 |
6 | 7046 | 7681 | 6939 |
7 | 7715 | 7535 | 7831 |
8 | 9862 | 10087 | 9711 |
9 | 7812 | 7708 | 8179 |
Plot for the RMR experiment.
The plot shows that the variation from subject to subject is considerably larger than the variation within a subject. The following is the SAS code and its output.
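/* Read the subject number, then the three RMR values (one per protocol). */
data resting;
  input subject @@;
  do protocol=1 to 3;
    input rate @@;
    output;   /* one observation per subject-protocol pair */
  end;
  lines;
1 7131 6846 7095
2 8062 8573 8685
3 6921 7287 7132
4 7249 7554 7471
5 9551 8866 8840
6 7046 7681 6939
7 7715 7535 7831
8 9862 10087 9711
9 7812 7708 8179
;
/* List the data to check that it was read correctly. */
proc print data=resting;
run;
/* Fit the block-treatment model (1): subject is the block, protocol the treatment. */
proc glm data=resting;
  class subject protocol;
  model rate=subject protocol;
  lsmeans protocol;   /* least-squares mean for each protocol */
run;
quit;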
Source of variation | Degrees of freedom | Sum of squares | Mean square | Ratio | p-value |
---|---|---|---|---|---|
Subject | 8 | 23,117,462.30 | 2,889,682.79 | - | - |
Protocol | 2 | 35,948.74 | 17,974.37 | 0.23 | 0.7950 |
Error | 16 | 1,235,483.26 | 77,217.70 | | |
Total | 26 | 24,388,894.30 | | | |
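The sums of squares in this table can be reproduced by hand from the data. The following Python sketch does so for the block-treatment model (1) with one observation per cell (\(s=1\)). It is an illustrative cross-check, not the SAS procedure itself. The function and variable names are hypothetical, and the p-value is omitted because the standard library has no F distribution.

```python
# RMR data from the example: one row per subject (block), one column per protocol.
rmr = [
    [7131, 6846, 7095],
    [8062, 8573, 8685],
    [6921, 7287, 7132],
    [7249, 7554, 7471],
    [9551, 8866, 8840],
    [7046, 7681, 6939],
    [7715, 7535, 7831],
    [9862, 10087, 9711],
    [7812, 7708, 8179],
]


def block_treatment_anova(y):
    """Sums of squares for model (1) with one observation per cell (s = 1).

    y[h][i] is the response in block h under treatment i.
    """
    b, v = len(y), len(y[0])
    n = b * v
    grand_mean = sum(sum(row) for row in y) / n
    block_means = [sum(row) / v for row in y]
    trt_means = [sum(row[i] for row in y) / b for i in range(v)]

    ss_block = v * sum((m - grand_mean) ** 2 for m in block_means)
    ss_trt = b * sum((m - grand_mean) ** 2 for m in trt_means)
    ss_total = sum((x - grand_mean) ** 2 for row in y for x in row)
    ss_error = ss_total - ss_block - ss_trt  # left over after blocks and treatments

    df_error = n - b - v + 1  # bvs - b - v + 1 with s = 1
    ms_block = ss_block / (b - 1)
    ms_trt = ss_trt / (v - 1)
    ms_error = ss_error / df_error
    return {
        "ss_block": ss_block, "ss_trt": ss_trt,
        "ss_error": ss_error, "ss_total": ss_total,
        "df_block": b - 1, "df_trt": v - 1, "df_error": df_error,
        "ms_block": ms_block, "ms_trt": ms_trt, "ms_error": ms_error,
        "f_trt": ms_trt / ms_error,
    }


anova = block_treatment_anova(rmr)
for name, value in anova.items():
    print(f"{name:>9}: {value:,.2f}")
```

On these data the subject mean square is roughly 37 times the error mean square, which suggests that blocking on subjects was worthwhile, while the protocol ratio of about 0.23 gives no indication of a difference between protocols.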
The Bonferroni, Scheffé, Tukey, and Dunnett methods described previously for factorial designs can all be used for obtaining simultaneous confidence intervals for sets of treatment contrasts in a general complete block design. For example, the block-treatment model (1), without interaction, is similar to the two-way main-effects model. Thus, a set of \(100(1-\alpha) \%\) simultaneous confidence intervals for treatment contrasts \(\sum c_i \tau_i\) is of the form \[ \sum c_i \tau_i \in\left(\sum c_i \bar{y}_{. i .} \pm w \sqrt{msE \sum c_i^2 /(b s)}\right), \] where the critical coefficients for the four methods are, respectively, \[ \begin{gathered} w_B=t_{df, \alpha /(2 m)} ; \quad w_S=\sqrt{(v-1) F_{v-1, df, \alpha}} ; \\ w_T=q_{v, df, \alpha} / \sqrt{2} ; \quad w_{D 2}=|t|_{v-1, df, \alpha}^{(0.5)}, \end{gathered} \] where \(m\) is the number of contrasts in the Bonferroni method, \(n=b v s\), and \(df=n-b-v+1\).
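The interval formula can be applied to the RMR example as in the sketch below. The function name `contrast_interval` is hypothetical, and the critical coefficient `w` must be supplied by the user. Here the Tukey coefficient uses \(q_{3,16,0.05}\approx 3.65\), a value assumed to be read from a table of the studentized range rather than computed.

```python
import math


def contrast_interval(trt_means, coeffs, ms_error, b, s, w):
    """Interval sum(c_i * ybar_i) +/- w * sqrt(msE * sum(c_i^2) / (b * s))."""
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(c * m for c, m in zip(coeffs, trt_means))
    half_width = w * math.sqrt(ms_error * sum(c * c for c in coeffs) / (b * s))
    return estimate - half_width, estimate + half_width


# RMR example: protocol means are the column totals of the data divided by b = 9.
protocol_means = [71349 / 9, 72137 / 9, 71883 / 9]
ms_error = 77217.70            # error mean square from the ANOVA table
w_tukey = 3.65 / math.sqrt(2)  # ASSUMPTION: q_{3,16,0.05} ~ 3.65 taken from a table

low, high = contrast_interval(protocol_means, [1, -1, 0], ms_error, b=9, s=1, w=w_tukey)
print(f"tau_1 - tau_2 lies in ({low:.1f}, {high:.1f})")
```

Under these assumptions the interval for \(\tau_1-\tau_2\) is roughly \((-425.6, 250.5)\). It contains zero, which agrees with the non-significant protocol effect in the analysis of variance. Passing `w` in as an argument keeps the sketch independent of any particular source of critical values.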