How can I perform mediation with multilevel data? (Method 1) | Stata FAQ

NOTE: We are not fully confident that the methods on this page are valid for testing for mediated effects in multilevel models. Proceed at your own risk.

Mediator variables are variables that sit between the independent variable (IV) and the dependent variable (DV) and
mediate the effect of the IV on the DV. A model with one mediator is shown in the figure below.

The idea in mediation analysis is that some of the effect of the predictor variable, the IV, is transmitted to the DV through the mediator variable, the MV, while some of the effect of the IV passes directly to the DV. The portion of the effect of the IV that passes through the MV is the indirect effect. The program ml_mediation will compute direct and indirect effects for multilevel data (see How can I use the search command to search for programs and get additional help? for more information about using search). The approach used in ml_mediation was adapted from Krull & MacKinnon (2001).

When you have multilevel data, the variables may come from different levels of the model. The DV will always be a level 1 variable. Depending on your data, the IV and MV may be either level 1 or level 2 variables. According to Krull & MacKinnon (2001), a predictor variable may be mediated by a variable at the same level or lower. Thus a level 2 predictor may be mediated by a level 2 or level 1 variable. A level 1
predictor may only be mediated by another level 1 variable. Logically, a level 1 predictor cannot affect a level 2 mediator.
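
One way to check by hand which level a variable sits at is to test whether it varies within clusters. The sketch below is only an illustration and not necessarily how ml_mediation decides. It assumes a dataset in memory with a cluster identifier cid and a candidate variable abil (the names used in the example later on this page), with no missing values in abil. The helper variable varies and the local macro level are hypothetical names.

```stata
* sort within cluster so the smallest value comes first and the largest last;
* the two differ only if abil is not constant inside that cluster
bysort cid (abil): generate byte varies = abil[1] != abil[_N]

* count observations that belong to a cluster where abil is not constant
quietly count if varies
local level = cond(r(N) > 0, 1, 2)
display "abil looks like a level `level' variable"

drop varies
```

A variable that is constant within every cluster can only carry between-cluster information, which is why it is treated as level 2.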

ml_mediation computes the indirect effect as the product of coefficients, i.e., indirect effect = coef[a]*coef[b], where coef[a] is the coefficient of the IV in the equation for the MV and coef[b] is the coefficient of the MV in the equation for the DV. When the response variable of an equation is at level 1, ml_mediation uses the xtmixed, reml command by default, with xtmixed, mle available as an option. When the response variable is at level 2, i.e., the MV is level 2, ml_mediation uses the xtreg, be command. The ml_mediation program will detect which variables are level 1 and which are level 2.
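
The product-of-coefficients arithmetic can be sketched by hand for the case where every variable is at level 1. This is an illustration only, not the internals of ml_mediation. It assumes random-intercept models and uses the variable names from the example dataset introduced further down (write as DV, hon as IV, abil as MV, cid as the cluster identifier). The scalar names are hypothetical and chosen so that they cannot be mistaken for abbreviations of variable names.

```stata
* MV on IV: the coefficient of hon is coef[a]
xtmixed abil hon || cid:, reml
scalar coef_a = _b[hon]

* DV on MV and IV: the coefficient of abil is coef[b],
* and the coefficient of hon is the direct effect
xtmixed write abil hon || cid:, reml
scalar coef_b  = _b[abil]
scalar dir_eff = _b[hon]

* indirect effect as the product of coefficients
scalar ind_eff = scalar(coef_a)*scalar(coef_b)
display "indirect effect = " scalar(ind_eff)
display "direct effect   = " scalar(dir_eff)
```

The scalar() wrapper is used because Stata gives variables, including abbreviated variable names, precedence over scalars in expressions.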

The DV and MV must be continuous variables. The IV may be a continuous or binary predictor variable. The CVs (covariates) may be continuous, binary or factor variables.

We will illustrate the use of the ml_mediation command with a simulated multilevel dataset, ml_med.dta. Let’s look at the data.
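
A minimal way to load and inspect the file, assuming ml_med.dta has been saved in the current working directory (the location is an assumption, so adjust the path as needed) and that cid is the cluster identifier:

```stata
use ml_med, clear

* variable names, storage types and labels
describe

* summary statistics for all variables
summarize

* number of clusters: tag one observation per cluster and count the tags
egen byte first = tag(cid)
count if first
drop first
```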

The variables write, socst, abil and hon are all level 1 variables. The variable cid is the cluster (level 2) identifier. The variable hon is a binary variable that indicates membership in the honor society, and abil is a composite measure of academic ability. Now we are ready to try a multilevel mediation model in which all of the variables are at level 1.
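
The call for this model might look like the following, with write as the DV, hon as the IV and abil as the MV. The option names dv(), iv(), mv() and l2id() are assumptions about the program's syntax, so confirm them with help ml_mediation for the installed version.

```stata
* DV = write, IV = hon, MV = abil, all at level 1; clusters identified by cid
* (option names are assumed; see: help ml_mediation)
ml_mediation, dv(write) iv(hon) mv(abil) l2id(cid)

* list the saved results left behind by the command
return list
```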

The output includes the results of three equations: 1) the DV on the IV, 2) the MV on the IV, and 3) the DV on the MV and IV. The direct, indirect and total effects along with various proportions and ratios are shown below the results of the three equations.

We see that hon is significant in equation 1 and is also a significant predictor of the mediator variable, abil, in equation 2. However, hon is not significant in equation 3, when the mediator is included in the model. This pattern suggests that there is mediation. The output includes the indirect, direct and total effects. It does not, however, include standard errors or confidence intervals. To get these, you need to bootstrap the results. You can bootstrap any of the effects found in the return list.

We will illustrate this by bootstrapping the ml_mediation command with 500 replications. You may want to do more than 500 reps, maybe a lot more. You will probably also want to use a different seed value. Please note that we are bootstrapping clusters, so we need the cluster option. We also need to give the clusters a new id when they are resampled, hence the idcluster option. Note that we now have to use the new cluster name, ncid, in the ml_mediation command.
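
A sketch of such a bootstrap is shown below. The bootstrap options reps(), seed(), cluster() and idcluster() are standard Stata. The ml_mediation option names and the saved-result names r(ind_eff), r(dir_eff) and r(tot_eff) are assumptions, so substitute whatever names return list shows after running ml_mediation. The seed value is arbitrary. The /// line continuations require running this from a do-file.

```stata
* resample whole clusters (cluster(cid)); each resampled cluster receives
* a new identifier ncid, which is then passed to ml_mediation
bootstrap indeff=r(ind_eff) direff=r(dir_eff) toteff=r(tot_eff), ///
    reps(500) seed(12345) cluster(cid) idcluster(ncid): ///
    ml_mediation, dv(write) iv(hon) mv(abil) l2id(ncid)

* percentile and bias-corrected confidence intervals
estat bootstrap, percentile bc
```

The new identifier matters because a cluster drawn more than once would otherwise be pooled into a single, larger cluster under its original cid value.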