Longitudinal or growth curve data (where individuals are repeatedly measured over time) are often analyzed using (linear) mixed-effects models. It can be instructive to examine how the results from such models can be approximated by a (simpler) two-stage analysis (e.g., section 4.2.3 in Diggle et al., 2002; section 3.2 in Verbeke & Molenberghs, 2000; Stukel & Demidenko, 1997).

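As an illustration, we can use the Orthodont data from the nlme package, which contain four distance measurements (in millimeters) per subject taken at different ages. A linear mixed-effects model with random intercepts and random slopes for age is fitted to these data, with age coded so that the intercept corresponds to age 8.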
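A minimal sketch of such a fit is given below. It assumes that age is centered at 8 years (so that the intercept refers to age 8) and that the model is fitted with `lme()` using its default REML estimation; the object names `dat` and `res1` are chosen here purely for illustration.

```r
library(nlme)

# work on a copy of the data so that the original Orthodont object stays untouched
dat <- Orthodont

# assumption: center age at 8 so that the intercept is the expected distance at age 8
dat$age <- dat$age - 8

# linear mixed-effects model with correlated random intercepts and slopes per subject
res1 <- lme(distance ~ age, random = ~ age | Subject, data = dat)

summary(res1)
```

The fixed effects, the standard deviations and correlation of the random effects, and the residual standard deviation can then be read off the summary output.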

Based on the fitted model, the estimated average distance at age 8 is $b_0 = 22.04$ millimeters ($SE = .420$). For each year, the distance is estimated to increase on average by $b_1 = 0.66$ millimeters ($SE = .071$). However, there is variability in the intercepts and slopes, as reflected by their estimated standard deviations ($SD(b_{0i}) = 1.887$ and $SD(b_{1i}) = 0.226$, respectively). Also, intercepts and slopes appear to be somewhat correlated ($\hat{\rho} = .21$). Finally, residual variability remains (reflecting deviations of the measurements from the subject-specific regression lines), as given by the residual standard deviation of $\hat{\sigma} = 1.310$.

Two-Stage Approach

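Now let's use a two-stage approach to analyze these data. First, a linear regression model (distance regressed on age) is fitted to the data of each person separately (i.e., based on the four observations per individual). This yields an estimated intercept and slope for each subject, together with the variance-covariance matrix of these two estimates.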
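The first stage could be carried out along the following lines. This sketch assumes that `lmList()` from the nlme package is used for the per-subject fits and that age is again centered at 8; `dat`, `res.list`, `b`, and `V` are illustrative names.

```r
library(nlme)

dat <- Orthodont
dat$age <- dat$age - 8   # assumption: intercept refers to age 8

# stage 1: fit a separate simple regression to the observations of each subject
res.list <- lmList(distance ~ age | Subject, data = dat)

# per-subject coefficient vectors (intercept and slope)
b <- lapply(res.list, coef)

# per-subject 2x2 variance-covariance matrices of the coefficients; applying
# vcov() to each lm object uses that subject's own residual variance
V <- lapply(res.list, vcov)

b[[1]]
V[[1]]
```

With only four observations and two coefficients per subject, each residual variance rests on two degrees of freedom, so the individual sampling variances are estimated quite imprecisely.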

Second, we conduct a multivariate meta-analysis with the model coefficients (since we have two correlated coefficients per subject). The V matrix contains the variances and covariances of the sampling errors; since the subjects are independent, it is block-diagonal, with one $2 \times 2$ block per subject. We also allow for heterogeneity in the true outcomes (i.e., coefficients) and allow them to be correlated (by using an unstructured variance-covariance matrix for the true outcomes).
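Both stages can be sketched end to end as follows. The code assumes that the second stage is fitted with `rma.mv()` from the metafor package and that the per-subject matrices are combined with `bldiag()`; the data frame `coefs` and the variable names `estm` and `subj` are introduced here only for illustration.

```r
library(nlme)
library(metafor)

dat <- Orthodont
dat$age <- dat$age - 8   # assumption: intercept refers to age 8

# stage 1: per-subject regressions, coefficients, and their var-cov matrices
res.list <- lmList(distance ~ age | Subject, data = dat)
b.list <- lapply(res.list, coef)
V.list <- lapply(res.list, vcov)

# stack the coefficients into long format (two rows per subject, intercept first)
coefs <- data.frame(
  b    = unlist(b.list),
  estm = rep(c("intercept", "slope"), times = length(b.list)),
  subj = rep(names(b.list), each = 2)
)

# block-diagonal V matrix with one 2x2 block per subject (same order as coefs)
V <- bldiag(V.list)

# stage 2: multivariate meta-analysis of the coefficients, with an unstructured
# var-cov matrix for the true intercepts and slopes within subjects
res2 <- rma.mv(b ~ estm - 1, V, random = ~ estm | subj, struct = "UN", data = coefs)
res2
```

Removing the model intercept (`- 1`) makes the two fixed effects directly interpretable as the average intercept and the average slope, while `struct = "UN"` yields separate variance components for the two coefficients plus their correlation.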

The results are similar to the ones given earlier: the estimated average intercept is $b_0 = 22.28$ ($SE = .410$), the estimated average slope is $b_1 = .58$ ($SE = .056$), the estimated standard deviations of the underlying true intercepts and slopes are $SD(b_{0i}) = 1.987$ and $SD(b_{1i}) = 0.219$, respectively, and the estimated correlation between the underlying true intercepts and slopes is $\hat{\rho} = .20$. No residual standard deviation is given, since that source of variability is already incorporated into the V matrix.

Discussion

This example illustrates how a two-stage procedure (i.e., fit the "level 1 model" per person, then aggregate the coefficients using essentially a meta-analytic model with corresponding random effects) is quite similar to what a mixed-effects model does. There are of course some underlying differences and the mixed-effects model approach should be more efficient. However, as described by Stukel and Demidenko (1997), there are circumstances where the two-stage approach is more robust to certain forms of model misspecification (when interest is focused on a subset of the model coefficients or some functions thereof).