Repeated Measures

Statistical Modelling and Analysis

The modelling and analysis of repeated measures is a complex topic. In this section, we highlight only a few models and analyses by working through some real data sets.

The Univariate Analysis of Variance Approach

Example 1. (Alzheimer’s Data, Hand and Taylor, 1987, Table G.1) Two groups of patients with Alzheimer’s disease were compared: one group of 26 patients received a placebo, and the other group of 22 patients was treated with lecithin. The response variable is the number of words that a patient can recall from lists of words. It was measured at time units 0, 1, 2, 4, and 6. Plots of the data are given in Figure 1.

Figure 1: Alzheimer study response profiles: Placebo group on right, lecithin group on left.

From the graph, we can see differences between subjects within each group as well as differences between the two groups. In general, we regard subject effects as random effects. In some analyses, the repeated measures from the same subject are assumed to be independent once the subject effect is accounted for. Taking this position leads to the univariate analysis of variance approach. The corresponding statistical model for this experiment is \[\begin{equation} y_{ijk}=\mu+\alpha_i+d_{j(i)}+\tau_k+(\alpha \tau)_{ik}+\epsilon_{ijk}, \tag{1} \end{equation}\] where \(\alpha_i, \tau_k\) and \((\alpha \tau)_{ik}\) are fixed effects of treatment \(i\), time \(k\), and their interaction, respectively, \(d_{j(i)}\) is the random effect associated with the \(j^{th}\) subject in group \(i\), \(\epsilon_{ijk}\) is the random error associated with the \(j^{th}\) subject in group \(i\) at time \(k\), \(d_{j(i)}\) are i.i.d. \(N(0, \sigma_s^2)\) and \(\epsilon_{ijk}\) are i.i.d. \(N(0, \sigma^2)\). Note that \[E(y_{ijk})=\mu+\alpha_i+\tau_k+(\alpha \tau)_{ik}, ~Var(y_{ijk})=\sigma_s^2+\sigma^2,\] and the covariance between any two different observations on the same subject is \(Cov(y_{ijk}, y_{ijk'})=Var(d_{j(i)})=\sigma_s^2,~k \ne k'\). Such a covariance structure is called compound symmetric.

Compound symmetry also implies that \(Var(y_{ijk}-y_{ijk'})=2\sigma^2\) is constant for any \(k\ne k'\). This condition is called sphericity. Many computer programs report the results of Mauchly's test of sphericity, although this test appears to have little power to detect small departures from sphericity. Adjusted F-tests are available for use when sphericity fails. Model (1) is similar to the model we used for split-plot designs, since subjects are nested within the treatment groups.
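The covariance implied by model (1) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the variance values `sigma_s2` and `sigma2` are illustrative assumptions, not estimates from the Alzheimer data. It builds the covariance matrix of the five measurements on one subject and confirms that the variance of the difference between any two time points is the same.

```python
# Covariance of the five repeated measures on one subject under model (1).
# sigma_s2 and sigma2 are illustrative values, NOT estimates from the data.
sigma_s2 = 4.0   # between-subject variance, Var(d_j(i))
sigma2 = 1.0     # within-subject error variance, Var(eps_ijk)
n_times = 5      # measurements at time units 0, 1, 2, 4, 6

# Diagonal entries: sigma_s2 + sigma2; off-diagonal entries: sigma_s2.
cov = [[sigma_s2 + (sigma2 if k == m else 0.0) for m in range(n_times)]
       for k in range(n_times)]

def var_of_difference(cov, k, m):
    """Var(y_k - y_m) = Var(y_k) + Var(y_m) - 2 Cov(y_k, y_m)."""
    return cov[k][k] + cov[m][m] - 2.0 * cov[k][m]

# Collect the variance of every pairwise difference between time points.
diffs = {var_of_difference(cov, k, m)
         for k in range(n_times) for m in range(n_times) if k != m}
print(diffs)  # {2.0}, i.e. a single value equal to 2 * sigma2
```

The subject variance cancels in every difference, which is why the common value depends only on the error variance.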

We can use a very flexible SAS procedure proc mixed for model (1).

proc mixed;
 class group subj time;
 model response=group time group*time;
 random subj(group);
run;

The model statement specifies three fixed effects in the model and the random statement specifies the random effect(s).

This model has the same form as the model for a split-plot design, with subjects playing the role of whole plots.

Modelling Covariance Structure

As we said before, repeated measures from the same subject are usually dependent. Consider the Alzheimer experiment again. The measurements taken on the same subject on the 5 occasions are likely to be correlated. In this scenario, the model is essentially the same, but the error terms \(\epsilon_{ijk}\) for the same subject are correlated, and we should model this correlation structure. There are three commonly used covariance structures: compound symmetric, first-order autoregressive (AR(1)), and unstructured.

Compound Symmetry

\[Var(\epsilon_{ijk})=\sigma^2,~Cov(\epsilon_{ijk}, \epsilon_{ijk'})=\rho \sigma^2,~k\ne k'\]

AR(1)

The errors \(\epsilon_{ijk}, k=1, 2, \cdots\) are assumed to follow an AR(1) process. Therefore, \(Var(\epsilon_{ijk})=\sigma^2\) and \(Cov(\epsilon_{ijk}, \epsilon_{ijk'})=\sigma^2 \rho^{|k-k'|}\).

Unstructured Covariance

No mathematical pattern is imposed on the covariance matrix. The covariance structure of the repeated measures is estimated using the facts that it is the same for every subject and that measurements taken on different subjects are independent.
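To make the three structures concrete, here is a small sketch that builds the compound symmetric and AR(1) covariance matrices for \(p\) repeated measures and counts the covariance parameters each structure needs. The function names and the values of `sigma2` and `rho` are assumptions chosen for illustration only.

```python
def cs_cov(p, sigma2, rho):
    """Compound symmetry: sigma2 on the diagonal, rho * sigma2 elsewhere."""
    return [[sigma2 if k == m else rho * sigma2 for m in range(p)]
            for k in range(p)]

def ar1_cov(p, sigma2, rho):
    """AR(1): covariance sigma2 * rho^|k - m| decays with the lag |k - m|."""
    return [[sigma2 * rho ** abs(k - m) for m in range(p)]
            for k in range(p)]

def n_cov_params(structure, p):
    """Number of covariance parameters to estimate for p repeated measures."""
    if structure in ("cs", "ar1"):
        return 2                    # sigma2 and rho
    if structure == "un":
        return p * (p + 1) // 2     # p variances plus p(p-1)/2 covariances
    raise ValueError("unknown structure: " + structure)

p = 5  # five measurement occasions, as in the Alzheimer example
for row in ar1_cov(p, 1.0, 0.5):
    print(row)
print({s: n_cov_params(s, p) for s in ("cs", "ar1", "un")})
```

With five occasions the unstructured form needs 15 parameters against 2 for either patterned structure, which is the trade-off between flexibility and parsimony when choosing among them.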

SAS Program

We use the repeated statement in proc mixed with the type option to specify one of the three covariance structures. For example, if we use the compound symmetric covariance structure for the Alzheimer experiment, the SAS program is

proc mixed;
 class group subj time;
 model response=group time group*time;
 repeated/type=cs sub=subj(group) r rcorr;
run;

In the repeated statement, type=cs specifies that the covariance structure is compound symmetric, and sub specifies that this structure applies to the submatrix corresponding to each subject within each group. The options r and rcorr request printing of the covariance matrix and the correlation matrix.

If we were to use AR(1), we would change the repeated statement to

 repeated/type=ar(1) sub=subj(group) r rcorr;

Note that type=ar(1) is not appropriate for this experiment, since the repeated measures were taken at unequally spaced times (0, 1, 2, 4, and 6). Use type=sp(pow) for unequally spaced measurements.
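The difference between the two structures can be seen numerically. In the sketch below, AR(1) is assumed to use the occasion index as the lag, while the spatial power structure is assumed to use the actual time distance, giving correlation \(\rho^{|t_k-t_{k'}|}\). The value of `rho` is illustrative, not an estimate.

```python
times = [0, 1, 2, 4, 6]   # measurement times in the Alzheimer study
rho = 0.5                 # illustrative correlation parameter (assumption)

def ar1_corr(k, m, rho):
    """AR(1): correlation depends on the difference in occasion index."""
    return rho ** abs(k - m)

def sp_pow_corr(t, s, rho):
    """Spatial power: correlation depends on the actual time distance."""
    return rho ** abs(t - s)

# Compare the two structures on each pair of adjacent occasions.
adjacent = []
for k in range(len(times) - 1):
    m = k + 1
    adjacent.append((times[k], times[m],
                     ar1_corr(k, m, rho),
                     sp_pow_corr(times[k], times[m], rho)))

for t, s, ar, sp in adjacent:
    print(f"times {t} and {s}: AR(1) = {ar}, spatial power = {sp}")
```

AR(1) assigns the same correlation to every pair of adjacent occasions, whereas the spatial power form gives a lower correlation to the pairs (2, 4) and (4, 6), which are two time units apart.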

If we use unstructured covariance, we change the repeated statement to

 repeated/type=un sub=subj(group) r rcorr;

Some criteria exist for choosing the covariance structure, among which are Akaike’s Information Criterion (AIC) and Schwarz’s Bayesian Criterion (SBC). Both penalize the log-likelihood with a term that increases with the number of covariance parameters. We then choose the structure that maximizes the penalized log-likelihood. Some programs report these criteria multiplied by \(-2\), in which case smaller values are better.
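As a sketch of how such a comparison works, the code below computes both criteria in the common "smaller is better" form, \(-2\ell + 2q\) and \(-2\ell + q\log n\), where \(\ell\) is the maximized log-likelihood and \(q\) the number of covariance parameters. The log-likelihood values are made-up placeholders, not results from the Alzheimer data, and taking \(n\) to be the number of subjects (48) is an assumption; check which \(n\) your software uses.

```python
import math

def aic(loglik, n_params):
    """AIC in smaller-is-better form; -AIC/2 is the penalized log-likelihood."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n):
    """Schwarz's criterion (SBC/BIC) in smaller-is-better form."""
    return -2.0 * loglik + n_params * math.log(n)

# PLACEHOLDER fits: (log-likelihood, number of covariance parameters).
# These numbers are hypothetical and only illustrate the bookkeeping.
fits = {"cs": (-520.0, 2), "sp(pow)": (-515.0, 2), "un": (-508.0, 15)}
n_subjects = 48  # 26 placebo + 22 lecithin patients

aic_values = {name: aic(ll, q) for name, (ll, q) in fits.items()}
bic_values = {name: bic(ll, q, n_subjects) for name, (ll, q) in fits.items()}

best_by_aic = min(aic_values, key=aic_values.get)
best_by_bic = min(bic_values, key=bic_values.get)
print(aic_values)
print(best_by_aic, best_by_bic)
```

In this hypothetical comparison the unstructured fit has the highest log-likelihood but loses on both criteria because of its 15 parameters.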

Figure 2: Growth curves of chicks on four different protein diets.

Modeling Time As a Regression Variable

Consider the study on body weights of chicks on different diets. There are four groups, each on a different protein diet. Body weights are measured on alternate days. The body weights for the four groups are plotted in Figure 2.

From the plots, we can see differences between the groups. In addition, there are between-chick differences within each group. For each chick, the growth curve can be reasonably modeled as a quadratic function of time. A reasonable model would be

\[\begin{equation} y_{ijt}=\mu+\alpha_i+t \beta_{i} +t^2 \gamma_{i} + t b_{j(i)}+ t^2 c_{j(i)}+\epsilon_{ijt}, \end{equation}\] where \(\mu, \alpha_i,\beta_i\) and \(\gamma_i\) are fixed parameters, which account for between-group differences, and \(b_{j(i)}\) and \(c_{j(i)}\) are random coefficients, with \(b_{j(i)}\) i.i.d. \(N(0,\sigma_{i, b}^2)\) and \(c_{j(i)}\) i.i.d. \(N(0, \sigma_{i,c}^2)\). The two random coefficients account for the between-subject differences within a group. \(b_{j(i)}\) and \(c_{j(i)}\) can be correlated.
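One consequence of the random coefficients is that the covariance between two measurements on the same chick depends on the times at which they are taken. The sketch below computes this implied covariance for one group, under the added assumption that the errors \(\epsilon_{ijt}\) are independent with common variance \(\sigma^2\) and independent of the random coefficients. All numerical values and the function name are illustrative.

```python
def marginal_cov(t, s, var_b, var_c, cov_bc, sigma2):
    """Cov(y_t, y_s) for one chick under the random coefficient model.

    The random part of y_t is t*b + t^2*c + eps_t, so
    Cov = t*s*Var(b) + (t*s^2 + t^2*s)*Cov(b, c) + t^2*s^2*Var(c),
    plus sigma2 when t == s (independent errors are an assumption here).
    """
    cov = t * s * var_b + (t * s**2 + t**2 * s) * cov_bc + (t * s) ** 2 * var_c
    if t == s:
        cov += sigma2
    return cov

# Illustrative variance components for a single diet group (assumptions).
var_b, var_c, cov_bc, sigma2 = 1.0, 0.25, 0.5, 2.0

for t in (0, 1, 2):
    print(t, marginal_cov(t, t, var_b, var_c, cov_bc, sigma2))
print(marginal_cov(1, 2, var_b, var_c, cov_bc, sigma2))
```

Because the model has no random intercept, the variance at \(t=0\) equals the error variance alone, and with these illustrative positive components it grows as \(t\) increases.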