In Part 2 of this seven-part overview, I described obtaining the sample size(s) or conditional variance (CV) associated with an effect-size (ES) estimate to quantify this estimate’s sampling error or (im)precision. Here in Part 3 I’ll address coding features linked to ESs. Whereas this overview’s first three parts focus on collecting data used in a research synthesis, its subsequent four parts will address meta-analyzing these data and presenting results. (Part 1 lists the topics for all seven parts.)

Task 3: Collect Features of Effect Sizes

Research synthesists typically collect several types of info about each ES they estimate (besides measures of sampling error). In a simple situation where each study contributes one ES estimate, we might assign every study a value on one or more study-level variables, thereby linking each study’s ES estimate to these coded features. More complex situations might also involve features of lower- or higher-level units, such as when a study contributes multiple ESs or subsets of studies are clustered in an important way (e.g., from the same publication, project, or research team). Below I’ll comment on uses for these ES features and associated issues. My description of this task will be shorter than for the previous two, mainly because coding ES features is less unique to research synthesis than are ESs and CVs.
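For the simple one-ES-per-study situation described above, the coded data can be organized as one record per study that links the ES estimate, its CV, and the coded features. Here's a minimal sketch of such a record; the field names and example values are illustrative, not a standard.

```python
from dataclasses import dataclass, field

@dataclass
class StudyRecord:
    """One study's row in a simple one-ES-per-study synthesis dataset."""
    study_id: str
    es: float                 # effect-size estimate
    cv: float                 # conditional variance (sampling error) of the ES
    features: dict = field(default_factory=dict)  # coded study-level features

# Hypothetical study with two coded features
rec = StudyRecord("Doe 1999", es=0.42, cv=0.03,
                  features={"setting": "worksite", "published": True})
print(rec.features["setting"])
```

More complex situations (multiple ESs per study, studies clustered within projects) would need additional levels of nesting or identifiers, but the same linking idea applies.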

Types of and Uses for Features

It’s not feasible to enumerate the vast diversity of features research synthesists collect. What’s most useful to code depends heavily on one’s review objectives, available data, audience, and other considerations. Nevertheless, many variables typically of interest to synthesists fall into one of the following fuzzy categories:

Context: aspects of the circumstances or environment in which the study was conceived, conducted, or disseminated, such as geographic location, time period, funding, data-collection setting, or constraints

Investigators: aspects of the people who designed, implemented, or reported the study, such as professional affiliation, academic discipline, or position on relevant theories or practices

Subjects: aspects of the entities from whom data were obtained (e.g., humans, non-human animals, plants, groups, regions), of which some may vary among a study’s subjects and others may not (e.g., due to inclusion/exclusion criteria); for human participants these might include demographic characteristics, info on group affiliations, numerous physical or psychological variables, and so on

Methods: aspects of how the study was conducted and the data were analyzed, such as sampling or allocation schemes, masking/blinding of personnel or subjects, interventions, stimuli, measures of key variables, data processing, and analysis techniques

Dissemination: aspects of how the study was reported, such as publication date, thoroughness of info about methods, and features of the outlet (e.g., medium, academic discipline)

Perhaps the most prominent use for ES features is as covariates (i.e., predictor, explanatory, or independent variables) in certain meta-analytic procedures aimed at understanding variation among ESs, such as meta-analytic analogs of ANOVA and regression. In that role, ES features are often called moderators. Some specialized types of meta-analysis use particular ES features in unique ways to model ES variation; for example, validity generalization often entails using measures of (un)reliability, range restriction, and other “artifacts” that tend to distort estimates of predictive-validity coefficients (e.g., correlations, covariances, regression slopes). Research synthesists also use ES features in other ways in different phases of research synthesis, such as the following:

as inclusion or exclusion criteria for selecting studies or particular results to be synthesized

as indicators of methodological quality, which may be used to make judgments about threats to validity or risk of bias or to assess these potential problems more rigorously in statistical analyses

in descriptive analyses to characterize the meta-analytic sample or, more broadly, the pertinent literature’s “landscape” (e.g., types of studies well-represented in vs. absent from available literature)

in procedures for dealing with missing data, such as multiple imputation
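The moderator use mentioned above can be illustrated with a bare-bones weighted regression of ESs on a single coded feature, with weights equal to inverse CVs (a common-effect analog of meta-regression; a real analysis would usually also model between-study heterogeneity). All data here are hypothetical.

```python
import numpy as np

# Hypothetical ES estimates, their conditional variances, and one
# dichotomous coded feature (e.g., funded = 1, unfunded = 0)
es = np.array([0.30, 0.45, 0.10, 0.60, 0.25])
cv = np.array([0.02, 0.05, 0.01, 0.08, 0.03])
funded = np.array([1, 1, 0, 1, 0])

w = 1.0 / cv                                      # inverse-variance weights
X = np.column_stack([np.ones_like(es), funded])   # intercept + moderator

# Weighted least squares: beta = (X'WX)^{-1} X'Wy
XtW = X.T * w
beta = np.linalg.solve(XtW @ X, XtW @ es)
se = np.sqrt(np.diag(np.linalg.inv(XtW @ X)))

print(beta)  # [intercept, moderator slope]
print(se)    # standard errors under the common-effect model
```

With a single dummy moderator, the intercept is the weighted mean ES for the reference group and the slope is the weighted between-group difference, which is why such analyses address whether ESs differ across the feature's levels.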

Complicating Issues

Numerous issues can arise when deciding which ES features to collect and extracting them from study reports. Although many of these also arise in primary studies, others are more likely in research syntheses, largely because synthesists rely on the conduct and reporting of primary studies. Here I’ll mention just a few issues that are especially common and relevant to later meta-analytic procedures.

Missing data: Rarely are all desired ES features available from all studies. A study’s value on a given variable is usually unavailable because the investigators didn’t record it (especially for variables attended to less by authors in the focal research domain), recorded it but didn’t report it, or recorded and reported it but not in the synthesist’s desired form (e.g., subjects’ age range instead of mean age). As a variant of the latter problem, a study’s authors might report info about a desired variable ambiguously or incompletely (e.g., vaguely defined inclusion criteria). Obtaining certain unreported info from the study’s authors is sometimes feasible but can be discouragingly fruitless.

Alternative coding schemes: Some features can be quantified or otherwise coded in multiple ways. Choosing among these schemes often involves balancing the coded info’s quality versus availability, in that including more studies tends to require cruder coding. For instance, when coding studies’ funding we might be able to determine for most studies simply whether it was funded or by whom, but fewer studies will provide more detailed info (e.g., amount or duration of award, funder’s stance on relevant policy). As another example, suppose that for studies comparing treatment and control groups we wish to code participants’ sex: Most studies might report at least whether each sex was included, but we might expect fewer studies to report this separately for each group or provide more refined info (e.g., majority sex, each sex’s percentage). A related issue involves harmonizing info on features operationalized differently among studies, such as classifying trials’ criteria for screening participants based on their varying definitions.
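The quality-versus-availability trade-off above often plays out as collapsing whatever detail each study provides into the crudest scheme available for all studies. Here's a small sketch using the funding example; the study names, keys, and categories are made up for illustration.

```python
# Hypothetical funding info as extracted from three reports,
# at different levels of detail
raw = {
    "Smith 2001": {"funder": "NIH", "amount_usd": 250_000},  # detailed
    "Jones 2004": {"funded": True},                          # funded, no details
    "Lee 2007":   {},                                        # not reported
}

def crude_funding_code(info):
    """Collapse to the crudest scheme codable for every study:
    'funded', 'unfunded', or 'unreported'."""
    if info.get("funder") or info.get("funded"):
        return "funded"
    if info.get("funded") is False:
        return "unfunded"
    return "unreported"

codes = {study: crude_funding_code(info) for study, info in raw.items()}
print(codes)
```

The finer-grained info (funder identity, award amount) could be retained as additional variables coded only for the subset of studies reporting it.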

Lower-level features: Meta-analysts are often interested in features at a lower level than an ES. For instance, a focal feature might vary among two or more groups compared by an ES, such as aspects of each group’s data-collection setting. More common, perhaps, are characteristics that could vary among subjects who contributed data to the ES, in which case meta-analysts often record an aggregate measure as the ES feature. For example, we might record the percentage of participants at each level of a categorical variable (e.g., attriter and complier; female and male; low, middle, and high socioeconomic status) or measures of a continuous variable’s level/location/central tendency or spread/dispersion/variability (e.g., mean or SD of age, median or range of income). Such subject-level aggregates are usually random, in that they’d vary over hypothetical samples of subjects; this has implications for how these aggregates are used in meta-analytic procedures.
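Computing the subject-level aggregates described above is straightforward when participant-level values are available (from a report's table or from IPD). A minimal sketch with hypothetical data for one study:

```python
from statistics import mean, stdev

# Hypothetical participant-level data contributed to one study's ES
ages = [24, 31, 45, 29, 38, 52, 27]
sexes = ["F", "F", "M", "F", "M", "M", "F"]

# Subject-level aggregates recorded as this study's ES features;
# note these are random in that they'd vary over hypothetical samples
age_mean = mean(ages)
age_sd = stdev(ages)  # sample SD as a measure of spread
pct_female = 100 * sexes.count("F") / len(sexes)

print(round(age_mean, 1), round(age_sd, 1), round(pct_female, 1))
```

In practice, of course, reports usually give only the aggregates themselves, not the participant-level values, which is why aggregates are so often what gets coded.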

How best to address these and other issues depends in part on one’s aims and available resources. More rigorous and defensible strategies for dealing with “messy” data tend to be more technically complicated, require more data, or consume more resources. For instance, improving the extraction of difficult variables from reports might entail enlisting content-area experts to craft valid but practicable operational definitions, conducting pilot studies to refine coding schemes, employing multiple coders, and assessing inter-coder reliability. Some data-extraction challenges would be mitigated if all eligible studies provided individual participant data (IPD), so that features of each subject were available, but in most research domains that’s not currently feasible. Several authors have written about benefits of IPD and related ways to enhance meta-analytic capabilities, such as data repositories, trial registries, and prospective meta-analyses.

Example—Workplace Exercise: Numerous features of ESs and studies played roles in Conn, Hafdahl, Cooper, Brown, and Lusk’s (2009) quantitative review of workplace exercise interventions, introduced in Part 1 of this overview. Here I’ll mention several features they (well, we) used, mainly for three purposes.

Inclusion/exclusion criteria: They explicitly excluded studies that focused on chronically ill workers, were not reported in English, or were published before 1969 or after late 2007. Their inclusion criteria were fairly broad with respect to publication status, sample size, and study design (e.g., not only true experiments); a study’s having fewer than three participants was their only reported exclusion criterion based on these features, but they implicitly excluded designs that didn’t permit comparing intervention versus non-intervention groups or conditions (e.g., 1-group posttest).

Descriptive analyses: Before addressing meta-analyses of ESs, they described several aspects of the retrieved studies. Some of these aspects would be relevant to many research syntheses, such as publication outlet and year as well as funding status. Other aspects were more particular to their review topic, such as characteristics of the company or companies included in a study (e.g., for-profit status, size, number, number of locations, sector). They also reported several characteristics of the studies’ exercise interventions, such as indicators of how they were integrated into the workplace (e.g., designed by workplace, delivered at worksite, delivered by an employee, accompanied by worksite policy) as well as “dosage” variables (e.g., frequency or duration of supervised exercise or motivational content).

Moderators: They examined several ES features—a subset of those used in descriptive analyses—as potential moderators of the intervention effect. These were all categorical variables, and most were dichotomies representing company characteristics or the intervention’s integration into the workplace. As we’ll consider more closely in Part 5 of this overview, analyses of such a variable address whether and how ESs differ between its two or more levels. Although these analyses involved only the four outcome variables on which the most ES estimates were available, some potential moderators were reported for too few ESs to support an analysis. For instance, relatively few studies reported collecting data not at the worksite or making an organizational-level policy change as part of the intervention, so each of these two potential moderators was analyzed for only two of the four outcome variables.

Conn et al. encountered a variety of challenges when collecting features of ESs and studies. Many of these were not detailed in the published report. Besides numerous instances of the aforementioned issues—missing data, alternative coding schemes, lower-level features—they confronted related complications such as multiple reports of a given project and multiple treatment or control groups per study. These “multiples” can cause trouble when desired info is reported inconsistently across them (e.g., from different sub-samples of participants), which often requires judgment calls that influence the recorded data’s quality.

With that, I’ll end this segment on collecting features associated with ESs. As in the previous two parts of this overview, I’ve raised several issues superficially, and many of them deserve more extensive consideration. Although future blog posts will probably focus less on this particular data-collection task than on other tasks in meta-analysis, I welcome suggestions for specific issues I’ve mentioned above—or others—that you’d like discussed in more depth (e.g., with details and reference citations).