Sunday, December 19, 2010

Alan and Bo are both quoted in this new article from Brain, Child magazine. In addition to discussing substantive issues of causal inference, author Katy Read also probes the chain of diffusion of social-science research.

The chain begins, of course, with the investigators who conducted the research. Those who conduct correlational studies typically include a statement of limitations at the end of their articles, noting that the findings are open to alternative causal interpretations. In less-guarded moments, however, even research scientists will use phraseology that implies a preferred causal direction.

Universities, research institutes, and professional organizations may then issue press releases on a particular study. Ultimately, a research finding may make it into the media. With each reporting step removed from the (methodologically trained) scientific investigators, therefore, statements of caution regarding causality become less likely to appear.

Saturday, November 27, 2010

Angela Lee Duckworth, Eli Tsukayama, and Henry May have published an article entitled, "Establishing Causality Using Longitudinal Hierarchical Linear Modeling: An Illustration Predicting Achievement From Self-Control" (abstract) in the October 2010 issue of the new journal Social Psychological and Personality Science.

The authors are interested in the personality trait of self-control (which encompasses such abilities as persistence and delay of gratification) and whether it can actually be shown to cause academic achievement (GPA) during the years from fifth to eighth grade. Duckworth and colleagues acknowledge the difficulty early on:

"Manipulating personality in a random-assignment experiment could, in theory, establish its causal role for later outcomes but, alas, personality is not easily manipulated. To our knowledge, no empirical investigation to date has successfully manipulated trait-level self-control and measured subsequent effects on life outcomes" (p. 311).

The authors also carefully review the arguments for why longitudinal predictive models using path-analysis and structural equation modeling (controlling for prior levels of the dependent variable), though an improvement over cross-sectional correlations, are still vulnerable to unmeasured third variables. "Longitudinal growth-curve modeling using hierarchical linear models (HLM)" is then presented as offering "a partial solution to the third variable problem" (p. 312).
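That vulnerability can be made concrete with a toy simulation (a hypothetical sketch of my own, not taken from the article): a stable unmeasured variable z drives self-control, prior GPA, and later GPA. Even though self-control's true effect here is exactly zero, the lagged regression assigns it a clearly positive coefficient.

```python
import numpy as np

# Hypothetical data: a stable confound z drives both self-control and GPA;
# the TRUE effect of self-control on later achievement is zero.
rng = np.random.default_rng(0)
n = 100_000

z = rng.normal(size=n)                       # unmeasured, time-invariant confound
sc = z + rng.normal(size=n)                  # self-control at time 1
gpa1 = z + rng.normal(size=n)                # achievement at time 1
gpa2 = 0.5 * gpa1 + z + rng.normal(size=n)   # achievement at time 2 (no sc effect)

# Longitudinal regression: predict time-2 GPA from prior GPA and self-control
X = np.column_stack([np.ones(n), gpa1, sc])
b = np.linalg.lstsq(X, gpa2, rcond=None)[0]
print(round(b[2], 2))  # coefficient on self-control: ~0.33, despite a true effect of 0
```

Controlling for the prior level of the dependent variable soaks up some, but not all, of the stable confound, so a spurious "effect" of self-control survives.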

The core of the argument appears to be this: "By treating a predictor as a time-varying covariate in the prediction of trajectories, one can rule out the possibility of all time invariant confounds (e.g., relatively stable variables such as socioeconomic status). Specifically, if short-term changes in a predictor predict subsequent short-term changes in achievement, a confounding variable z would have to predict these changes and also be tightly yoked to changes in the predictor over time (i.e., the confound and predictor would have to go up and down together in synchrony over time)" (p. 312).
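The authors implement this logic with HLM growth-curve models; the following simplified sketch (simulated data, and within-person demeaning rather than full HLM, so an analogue rather than their actual analysis) conveys the core idea that time-invariant confounds drop out of short-term within-person change.

```python
import numpy as np

# Hypothetical data: a time-invariant confound z shifts both self-control and
# GPA levels; the true within-person effect of self-control is 0.5.
rng = np.random.default_rng(0)
n, waves = 1000, 4
b_true = 0.5

z = rng.normal(size=n)                               # unmeasured stable confound
sc = z[:, None] + rng.normal(size=(n, waves))        # time-varying self-control
gpa = 2.0 * z[:, None] + b_true * sc + rng.normal(scale=0.5, size=(n, waves))

# Naive pooled slope is inflated by the stable confound
x, y = sc.ravel(), gpa.ravel()
b_naive = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Within-person demeaning: time-invariant confounds cancel out of the changes
sc_w = sc - sc.mean(axis=1, keepdims=True)
gpa_w = gpa - gpa.mean(axis=1, keepdims=True)
b_within = (sc_w * gpa_w).sum() / (sc_w ** 2).sum()
print(round(b_naive, 2), round(b_within, 2))  # roughly 1.5 vs. 0.5
```

The pooled estimate absorbs the stable confound, while the within-person estimate recovers the true coefficient; as the authors note, an unmeasured *time-varying* confound would still bias even the within-person estimate.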

The authors indeed found self-control to predict GPA longitudinally (and not the reverse), concluding as follows:

"This longitudinal HLM study illustrated an innovative analytic strategy that effectively controlled for all time-invariant third-variable confounds... What our analyses did not rule out, however, is the possibility of an unmeasured time-varying third variable that changes in sync with self-control and causally determines subsequent academic performance" (p. 316; my emphasis added).

I would recommend this article for its excellent exposition on causality with non-experimental designs and for the approach it demonstrates. The article is also rich in technical detail; even readers experienced with the general techniques may find themselves pausing periodically to wrap their minds around each specific analytic decision the authors describe.

Tuesday, September 28, 2010

Nearly two years ago, we first mentioned propensity scores as a tool for drawing as strong a causal inference as possible, in nonexperimental designs, between exposure to a "treatment" experience and some later outcome. For example, how might attending private (vs. public) schools affect students' intellectual curiosity? An article by Harder, Stuart, and Anthony in the September 2010 Psychological Methods offers practical advice and concrete examples for conducting propensity-score analyses.

Propensity scores involve matching the treatment and control groups on other variables (covariates) that are thought to predict treatment status or have "potential to confound the relationship between the treatment and the outcome" (p. 235). Harder et al. discuss several specific methods for implementing such matching, including 1:1, 1:k (where one treated participant can be matched to multiple comparison persons), and full matching (where one matched set can include multiple treated individuals and multiple comparison counterparts).

The article also discusses software packages for implementing propensity scores, although these do not appear very plentiful at the moment. The article is definitely worth a look by anyone contemplating a correlational or quasi-experimental study that one hopes to frame in causal terms (acknowledging that a full causal claim will be out of reach with such designs).
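As a concrete illustration of the basic two-step workflow, here is a sketch on simulated data (the one-covariate setup and variable names are my own invention, not Harder et al.'s example): model treatment status from the covariates, then match treated and comparison cases on the estimated propensity score.

```python
import numpy as np

# Hypothetical data: family SES drives both private-school attendance
# ("treatment") and the outcome; the true treatment effect is 0.8.
rng = np.random.default_rng(0)
n = 4000
ses = rng.normal(size=n)
treat = rng.binomial(1, 1 / (1 + np.exp(-1.5 * ses)))
outcome = ses + 0.8 * treat + rng.normal(scale=0.5, size=n)

# Naive group comparison is confounded by SES
naive = outcome[treat == 1].mean() - outcome[treat == 0].mean()

def propensity(x, t, iters=25):
    # One-covariate logistic regression (intercept + slope) via Newton-Raphson
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ w))
        grad = X.T @ (t - p)                        # gradient of log-likelihood
        hess = -(X * (p * (1 - p))[:, None]).T @ X  # Hessian
        w -= np.linalg.solve(hess, grad)
    return 1 / (1 + np.exp(-X @ w))

# Step 1: model treatment status from the covariate -> propensity scores
ps = propensity(ses, treat)

# Step 2: 1:1 nearest-neighbor matching on the propensity score (with replacement)
t_idx = np.flatnonzero(treat == 1)
c_idx = np.flatnonzero(treat == 0)
nearest = np.abs(ps[c_idx][None, :] - ps[t_idx][:, None]).argmin(axis=1)
matched = outcome[t_idx].mean() - outcome[c_idx[nearest]].mean()
print(round(naive, 2), round(matched, 2))  # naive is inflated; matched is near 0.8
```

Matching only adjusts for the covariates fed into the propensity model, which is why a full causal claim remains out of reach: any unmeasured confounder is untouched.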

For the table of contents, preface, and more information, please click here. As you can see, I have had a natural indirect effect on the cover design, but zero controlled direct effect.

2. A symposium on causality and related topics by some of the contributors to Heuristics, Probabilities and Causality was held at UCLA on March 12. Videos of lectures, by C. Hitchcock, S. Greenland, T. Richardson, J. Robins, R. Scheines, J. Tian, Y. Shoham and J. Pearl, can be viewed here. Videos of additional lectures will be posted in the near future.

3.1 An open letter from Judea Pearl to Nancy Cartwright concerning "Causal Pluralism," a topic central to the discussion of her book Hunting Causes that appeared recently in Economics and Philosophy 26:69-77. (Posted May 31, 2010)

3.2 A lively discussion by T. Richardson, J. Robins, and J. Pearl on the structure of the causal hierarchy and the scientific role of untestable counterfactual assumptions. (Posted May 3 and May 15, 2010)

5. Another posting of potential interest is Technical Report R-364, T. Kyono's master's thesis, titled "Commentator: A Front-End User-Interface Module for Graphical and Structural Equation Modeling." It takes a DAG as input and prints: (1) all identifiable direct effects, (2) all identifiable causal effects, (3) all (minimal) sets of admissible covariates, (4) all instrumental variables, and (5) (almost) all testable implications of a model. The source code is available upon request.

6. Finally, I have received inquiries regarding a slide that I used at NYU, in which an instrumental variable poses as an innocent confounder and, upon adjustment, amplifies, rather than reduces, confounding bias. The moral of the story was (and is) that "outcome assignment" is safer to model than "treatment assignment." The pertinent paper is R-356 (link).
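The phenomenon is easy to reproduce in a small simulation (hypothetical coefficients of my own choosing; see R-356 for the formal analysis): z influences the outcome only through the treatment, yet "adjusting" for it makes the estimate worse.

```python
import numpy as np

# Hypothetical setup: z is an instrument (affects treatment x only),
# u is an unmeasured confounder of x and y; the true effect of x on y is 1.0.
rng = np.random.default_rng(0)
n = 200_000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 2.0 * z + u + rng.normal(size=n)
y = 1.0 * x + u + rng.normal(size=n)

def slope_on_x(covs, y):
    # OLS coefficient on x (first covariate), with an intercept
    X = np.column_stack([np.ones(len(y))] + covs)
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

b_plain = slope_on_x([x], y)       # bias ~ +0.17
b_adjz  = slope_on_x([x, z], y)    # bias ~ +0.50: adjustment AMPLIFIES the bias
print(round(b_plain, 2), round(b_adjz, 2))
```

Intuitively, conditioning on the instrument removes the clean, exogenous variation in x, so the confounded variation due to u makes up a larger share of what remains.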

7. As always, your thoughts are welcome and will surely be put into some good cause when conveyed to other blog readers.

Monday, May 17, 2010

The March 2010 issue of the journal Psychological Methods contained a special series on quasi-experiments and causal inference, highlighting the complementary approaches of Donald Rubin and the late Donald Campbell. The heart of the series is a pair of lead articles, by Will Shadish, and by Stephen West and Felix Thoemmes, with a brief introductory piece by journal editor Scott Maxwell and three commentaries following the two main articles. It's taken me a while, but I'm now ready to comment on the special series.

Being able to demonstrate causality in social science research is a daunting task. Even in a tightly controlled one-hour laboratory experiment, there are potential threats such as experimenter bias or demand characteristics. In long-term field studies, even with random assignment, there is a lot of time for things to go wrong, namely the threats to internal validity enumerated by Campbell (with colleagues Stanley and Cook) such as history effects, attrition, or diffusion and imitation of treatments (here and here). Finally, in quasi-experiments, which lack the crucial element of random assignment, one knows from the start that a full causal inference is out of reach.

Where Campbell and Rubin differ is in their main techniques for trying to extract as much probative evidence as possible from a given research design (experimental, quasi-experimental, or correlational/observational). As Shadish notes, with some qualification, "[Rubin] has focused more on analysis, whereas [Campbell] has focused more on design" (p. 11). West and Thoemmes phrase the contrast as follows: "Campbell's perspective emphasizes the prevention of threats to internal validity rather than their correction" (p. 22).

Campbell's approach, in essence, is to examine ahead of time where the flaws in a research design might lie, and then shore up the design as well as possible. The latter might be done via steps such as pre-testing, extra control groups, and extra dependent variables (any of which might be affected by an artifactual phenomenon, but only one of which should be affected by the intervention). West and Thoemmes describe a previous quasi-experimental study by Reynolds and West (1987), who sought to test the effectiveness of an intervention to help convenience stores sell lottery tickets. This example, which involved matched control groups because store managers would not accept random assignment, was very clear and came with accompanying graphics to illustrate interpretation of the findings.

Rubin's approach relies on quantitative operations to assemble comparable experimental and control groups, when original group membership was not determined randomly. Propensity scores appear to be the main weapon in Rubin's arsenal. West and Thoemmes also walk the reader through an example of propensity scores, using 2008 research by Wu, West, and Hughes on retention in first grade and later academic achievement.

Even with my background teaching research methods for 13 years and working on projects using quasi-experimentation and propensity scores, I still found the articles highly educational. For example, the idea that Campbell's internal-validity-threats framework could be taken beyond the simple presence or absence of threats to possible quantification of the magnitude of the threats' effects was new to me. I also found the discussion of the continuing controversy over matching to be highly informative. (As part of the extended commentary, Rubin writes in his piece about being "flabbergasted" upon hearing Campbell's denunciation of matching in a face-to-face meeting between the two in the early 1970s [p. 39].)

I enthusiastically recommend the Psych Methods special series as an aid to non-experimental research and as assigned reading for graduate-level methodology courses.