In an interview, researcher Joshua Johnson stated the relationship in causal terms, too:

We found that when people spent more time commuting [or spending] extra hours on their commute, that made them less likely to be engaged in politics.

a) Think for a moment: What are the variables in this research? Would they most likely be measured or manipulated?

b) Sketch a scatterplot that would show the relationship between commuting time and political engagement.

c) Can the researchers really support a causal claim from these data, at least as described here?

Shankar Vedantam, the NPR host who described the study, explained:

There's something about commuting in particular that seems to affect engagement, and the researchers are drawing here on this paper on earlier work by the behavioral economist Daniel Kahneman. He's found that commuting ranks among the most unpleasant parts of people's day. There's something uniquely stressful about commuting, and so when you get home after a hellacious day, you really have nothing to give to other people in terms of civic engagement, in terms of getting involved in your neighborhood politics.

Here's a reminder to my readers--just because you can offer a good explanation for a correlation, that doesn't change the fact that a correlational study cannot definitively support causation!

But back to the story. The correlational pattern gets more complex--it involves a moderator! I'll quote the transcript of the NPR report:

INSKEEP: Are all people affected the same way by this stress?

VEDANTAM: So this is a really important point, Steve, because it turns out that even though there's a general connection between political engagement and commuting, the effect is not experienced evenly by everyone. Commuting disproportionately seems to cause the poor to disengage from politics. And as we go up the income ladder, the effects that commuting have on political engagement actually decrease....

Commuting is stressful for everyone, but the poor find it harder to buffer themselves against the effects of the stress. When you're well off, you come home from a terrible day, you can go out for dinner. You can buy yourself a treat. When you're poor, you have less access to those kinds of safety nets.

d) In this example, we can say that social class moderates the relationship between commuting and political engagement. Sketch a moderator table that would capture this pattern.

Suggested Answers:

a) Think for a moment: What are the variables in this research? Would they most likely be measured or manipulated?

The two variables are time spent commuting and degree of political engagement. These are both measured variables, because it would be practically and ethically impossible to assign people to have a long commute.

b) Sketch a scatterplot that would show the relationship between commuting time and political engagement.

Your scatterplot should have Time spent commuting on one axis, and Degree of political engagement on the other. The dots should be spread in a negatively sloping pattern, because the more time people spend commuting, the less politically engaged they are.

c) Can the researchers really support a causal claim from these data, at least as described here?

No. There is covariance here: time spent commuting goes with less political engagement. Temporal precedence seems ambiguous: Which variable came first? Did commuting come first, followed by less engagement? Or do less politically engaged people prefer to work farther from the center of things? Finally, internal validity isn't clear--there may be third variables that explain this relationship. Perhaps younger people tend to commute farther, and they are also less politically engaged. Perhaps less educated people commute farther and are also less engaged. We would want to see Johnson's original research report--he probably statistically controlled for several such variables. His report would tell us which of these possible third variables were statistically controlled for, to help rule out third variable explanations.

d) In this example, we can say that social class moderates the relationship between commuting and political engagement. Sketch a moderator table that would capture this moderation pattern.

Here's one possible pattern (These data are fabricated, but I made them to fit the pattern described):

Social class level    Relationship between commuting time and engagement
Lower SES             -0.21*
Middle SES            0.03
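One way to see where a moderator table like this comes from is to compute the commuting-engagement correlation separately within each level of the moderator. The Python sketch below does that on simulated data. The sample sizes, the slopes, and the function names are all hypothetical, chosen only so the output resembles the fabricated pattern above; none of it comes from the actual study.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_group(rng, n, slope):
    """Hypothetical data: engagement = slope * commute + random noise."""
    commute = [rng.gauss(0, 1) for _ in range(n)]
    engagement = [slope * c + rng.gauss(0, 1) for c in commute]
    return commute, engagement

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Assumed slopes (invented): negative for lower SES, zero for middle SES.
assumed_slopes = {"Lower SES": -0.2, "Middle SES": 0.0}

# One correlation per level of the moderator = one row of the moderator table.
moderator_table = {}
for label, slope in assumed_slopes.items():
    commute, engagement = simulate_group(rng, 2000, slope)
    moderator_table[label] = pearson_r(commute, engagement)

for label, r in moderator_table.items():
    print(f"{label:<12} r = {r:+.2f}")
```

If the correlation differs noticeably from one row to the next, as it does here by construction, that is the signature of a moderator.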

Note: Initially, to make the moderator easier to understand, I oversimplified the quote. But in fact, the moderation pattern was a little more complex:

Commuting disproportionately seems to cause the poor to disengage from politics. And as we go up the income ladder, the effects that commuting have on political engagement actually decrease until we get to the very wealthy, where the longer your commute, the more likely you are to be politically engaged.

To represent the full pattern, you might show this (again, I fabricated the data to illustrate):

If you read the text of the NBC story, you get a fairly detailed picture of the variables in this study. For example, the variable "Breastfeeding" was apparently analyzed in two ways--as a categorical variable (breastfeeding vs. not) and as a quantitative variable (how long babies were breastfed):

Young children who were breastfed as infants scored higher on
intelligence tests than formula-fed kids, and the longer and more
exclusively they were breastfed, the greater the difference, say Harvard
University researchers in a study published today in JAMA Pediatrics.

According to the NBC story,

The Harvard study, unlike most past studies, controlled for these and
other variables, including the mother’s intelligence, education level,
and any postpartum depression; family income and home environment; and
the child’s race, ethnicity, sex and birth weight.

“As a result, we felt we were able to get a reasonable estimate of what the
relationship is between the length of breastfeeding and the IQ of the
child at school age,” says Dr. Mandy Belfort, lead author and assistant
professor of pediatrics at Harvard Medical School.

The language above should provide a big clue that the researchers used multiple regression in their analyses, because it says, "controlled for."
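To make "controlled for" concrete, here is a minimal Python sketch of a multiple regression on simulated data. Everything in it is an assumption for illustration: the variable names, the sample size, and the invented effect sizes are not taken from the Harvard study. It compares the beta for breastfeeding when it is the only predictor with the beta when mother's IQ is also in the equation.

```python
import random

def standardize(values):
    """Convert a list to z-scores (mean 0, SD 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def standardized_betas(predictors, outcome):
    """Betas from regressing the z-scored outcome on z-scored predictors."""
    zs = [standardize(p) for p in predictors]
    zy = standardize(outcome)
    k = len(zs)
    xtx = [[sum(a * b for a, b in zip(zs[i], zs[j])) for j in range(k)] for i in range(k)]
    xty = [sum(a * y for a, y in zip(zs[i], zy)) for i in range(k)]
    return solve(xtx, xty)

rng = random.Random(1)
n = 5000
# Invented causal structure: mother's IQ influences both breastfeeding and child IQ.
mother_iq = [rng.gauss(0, 1) for _ in range(n)]
breastfeeding = [0.5 * m + rng.gauss(0, 1) for m in mother_iq]
child_iq = [0.2 * b + 0.5 * m + rng.gauss(0, 1) for b, m in zip(breastfeeding, mother_iq)]

beta_simple = standardized_betas([breastfeeding], child_iq)[0]
betas_controlled = standardized_betas([breastfeeding, mother_iq], child_iq)

print(f"Breastfeeding alone:            beta = {beta_simple:.2f}")
print(f"Breastfeeding, mother IQ added: beta = {betas_controlled[0]:.2f}")
```

In this made-up example the beta for breastfeeding shrinks once mother's IQ is included but stays positive, which is the kind of result a phrase like "the association remained even when controlling for..." describes.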

c) Why would the researchers want to control for mother's intelligence when they look for the association between breastfeeding and child's IQ?

d) Using the information in the two paragraphs quoted above, sketch a regression table. Place the dependent variable at the top, and list this study's independent variables underneath it. Which beta is the focus of the story? What do you know about this beta--was it significant, or not? What do you think the other betas on the table might be?

One of the findings of the study was that children whose mothers breastfed for an entire 12 months had verbal IQ scores 4.2 points higher than children who were not breastfed. A doctor was quoted by NBC as follows:

“I would take three or four IQ points any given day,” says pediatrician
Michael Georgieff, director of the Center for Neurobehavioral
Development at the University of Minnesota Medical School. Georgieff
was not involved in the current study. “It’s a pretty significant
shift, especially demographically across the world if everyone were to
make that gain.”

This pediatrician is discussing the effect size of this finding. He is suggesting that the effect size, in practical terms, is very important.
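To put the 4.2-point difference on a standardized scale, the short Python sketch below divides it by the standard deviation of IQ scores. It assumes the test is normed to a standard deviation of 15 points, which is conventional for many IQ scales but is not stated in the NBC story, so treat the result as a rough illustration.

```python
# Assumption: the IQ test is normed to a standard deviation of 15 points;
# the NBC story does not report the SD.
iq_sd = 15.0
point_difference = 4.2  # verbal IQ difference reported in the story

standardized_difference = point_difference / iq_sd
print(f"{point_difference} points is about {standardized_difference:.2f} standard deviations")
# prints: 4.2 points is about 0.28 standard deviations
```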

e) In your own words, what does the textbook say about the relationship between effect size and the importance of a finding?

05/20/2013

The Atlantic provides a clearly written summary of a recent longitudinal study. The headline reads:

Study: Math Skills at Age 7 Predict How Much Money You'll Make

According to the summary, the researchers measured kids' IQ, math skills, reading skills, and SES (socioeconomic status) at age 7, and then measured their income at age 42.

They found that

How much money the people made at midlife
was predicted by math ability at age seven. The other factors may have helped them on the
path to success, but even when those were controlled for, the
association between basic math and reading skills and future
socioeconomic status remained, and remained
significant...

Do you notice how this description uses the key phrase, "even when those were controlled for..."? This should signal to you that the researchers used multiple regression in their analyses.

a) What might the regression table have looked like in the study? What would the DV have been? What would the predictor variables have been? Estimate what you think the beta might have been for the predictor, "Math skills at age 7" (that is, is it positive or negative? Significant or not?).

b) Suppose a critic reads this article and says, "I don't think that it's math skills--I think it's IQ. Smarter people just earn more money and they did better at math as kids--that's all." What should you say in response? In this study, is IQ a potential third variable that could explain the association between math skills and future income?

c) Now suppose that a critic suggests that school quality might be a third variable. Kids who go to higher-quality schools had better math skills and also made more money. In this study, is school quality a potential third variable that could have explained the association between math skills and future income?

d) One part of the story says,

...How much money the people made at midlife
was predicted, for girls only, by
early reading ability.

Is this sentence describing a mediator? A moderator? Or a third variable?

04/10/2013

Are charter schools effective? Philadelphia Inquirer columnist Joel Naroff wrote about the data we might use to decide whether charter schools are doing well. He describes the pitfalls of using achievement test scores to measure school effectiveness. His article uses the language of multiple regression and internal validity to describe how difficult this process is.

Specifically, Naroff mentions the following about charter schools, discussing

....what researchers call "selection bias." Parents have to choose to send
their children to a charter school, meaning that charters start with
students who have high parental involvement.

a) To translate his comment into our own research methods terms: Imagine a study that compared achievement test scores in a city's charter schools with the city's traditional public schools. Sketch a simple bar graph showing that charter schools score higher in achievement than traditional schools. Does this result mean that charter schools cause better school performance? What are some internal validity issues here? Apply the term "selection bias" or "selection effect," as the author is doing above.

This article also states that

Unfortunately, we don't adjust test scores for differences in factors
such as intelligence, income, parental involvement, or school
facilities. Instead, we use misleading, unadjusted test-score
comparisons. That means some schools that test well might not be doing
all they can, while those with failing scores might be making
outstanding progress. We are largely clueless about what is happening.

In this paragraph, the author seems to be advocating for using multiple regression to evaluate the effectiveness of charter schools.

b) Why would multiple regression help us rule out the selection bias problem the author mentioned earlier?

c) Sketch a possible table of multiple regression results, as the author suggests above: Assume that achievement test score is the dependent variable of interest. What would be your predictor variables? What results would show that charter schools really are more effective than traditional schools, even controlling for these selection effect variables? What results would show that charter schools are not more effective than traditional schools?

d) Do you agree that "we are largely clueless" about what is going on? Do you think the regression approach he mentions is a good idea? Are there any downsides to the regression approach?

Suggested answers to question c:

Below is a possible table of (fabricated) results that would show that charter schools really are more effective than traditional schools, even controlling for the selection effect variables the author mentioned.

* represents a statistically significant effect. This table suggests that charter schools are associated with better achievement test scores, even when controlling for earlier scores, IQ of child, income, parental involvement, and school facilities.

Now, below is a possible table of (fabricated) results that would show that charter schools are not more effective than traditional schools, even controlling for these selection effect variables.

* represents a statistically significant effect. This table suggests that charter schools are not associated with better achievement test scores, when controlling for earlier scores, IQ of child, income, parental involvement, and school facilities.
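The logic behind this second scenario can be sketched with simulated data. The Python sketch below assumes a world in which parental involvement raises test scores and makes charter enrollment more likely, while charter schools themselves add nothing. Note that it uses a simple stratified comparison (charter vs. traditional students within each level of parental involvement) rather than a full multiple regression, and every number in it is invented.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

students = []
for _ in range(20000):
    involved = rng.random() < 0.5  # high parental involvement?
    # Assumed selection effect: involved parents choose charters more often.
    charter = rng.random() < (0.6 if involved else 0.2)
    # Assumed truth in this simulation: involvement raises scores, charters add nothing.
    score = 70 + (10 if involved else 0) + rng.gauss(0, 10)
    students.append((involved, charter, score))

def mean_score(rows):
    """Average test score for a list of (involved, charter, score) rows."""
    return sum(score for _, _, score in rows) / len(rows)

def charter_gap(rows):
    """Mean charter score minus mean traditional-school score."""
    charter_rows = [r for r in rows if r[1]]
    traditional_rows = [r for r in rows if not r[1]]
    return mean_score(charter_rows) - mean_score(traditional_rows)

# Unadjusted comparison, like the simple bar graph.
raw_gap = charter_gap(students)

# Adjusted comparison: hold parental involvement constant and compare again.
gap_high = charter_gap([r for r in students if r[0]])
gap_low = charter_gap([r for r in students if not r[0]])

print(f"Unadjusted charter advantage:           {raw_gap:+.1f} points")
print(f"Advantage among high-involvement kids:  {gap_high:+.1f} points")
print(f"Advantage among low-involvement kids:   {gap_low:+.1f} points")
```

In this invented world the unadjusted comparison makes charters look better, but the advantage largely disappears once parental involvement is held constant. A real evaluation would need to adjust for several variables at once, which is where multiple regression comes in.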

10/10/2012

Perhaps you know a perfectionist in your life--a person who has very high standards and tends to be self-critical about any errors they make. Do perfectionists really do better work? Are they happier than others?

According to most research, perfectionism is not particularly good for people. This short summary features work by psychologist Gordon Flett (York University, Toronto) and his colleagues. He explains that perfectionists may actually be less likely to complete quality work--even though they try harder to do so. Producing lower quality work may be a "cost of perfectionism," according to this research.

The price of perfectionism also includes mental and physical health consequences. Here is one research finding described in the summary:

Some of the largest costs associated with perfectionism may be in terms of poor health. A longitudinal study following a sample of Canadians over 6.5 years showed that trait perfectionism predicted earlier mortality (Fry & Debats, 2009). This finding held even after controlling for other health risk factors such as pessimism and low conscientiousness.

a) Do you recognize the telltale signs of multiple regression here? Reread the quote and find the key phrase that tells you regression methods were used.

b) Create a table of results, placing the dependent variable at the top, and listing the independent variables underneath, using the tables in Chapter 8 as a model. What variables did they measure in this study? Which beta is significant, according to the quote above? (Hint: There are at least three independent variables in this study, according to the quote.)

07/10/2012

Corporal punishment was associated with increased odds of anxiety and mood disorders, including major depression, panic disorder, post-traumatic stress disorder, agoraphobia and social phobia. Several personality disorders and alcohol and drug abuse were also linked to physical punishment, the researchers found.

The study, originally published in the journal Pediatrics, intended to test the effects of physical punishment separately from more severe physical or sexual abuse. The journalist reported that the method involved

about 34,000 individuals aged 20 or older gathered from the U.S. National Epidemiologic Survey on Alcohol and Related Conditions. Participants were questioned face-to-face and asked, on a scale of "never" to "very often," how often they were ever pushed, grabbed, shoved, slapped or hit by their parents or another adult living [in] their home.

a. The journalist's story mentions that while the study shows that physical punishment is associated with some mental illnesses, the results cannot "prove that one causes the other." Why not? What are three possible causal paths for this association? (Hint: Think about both temporal precedence and the third variable problem)

Part of the story reports that

The researchers adjusted the data to take into account socio-demographic factors and any family history of dysfunction.

b. Was family history of dysfunction one of your third variables in question a? If so, what does the above statement mean?

One of the experts who was interviewed about this study said,

"While it's a well-done study, looking at a national data sample, there are limitations in the way the study was done," said Dr. Andrew Adesman, chief of developmental and behavioral pediatrics at Steven and Alexandra Cohen Children's Medical Center in New Hyde Park, N.Y. "There are limitations to relying on adults recalling childhood experiences...."

c. This expert is evaluating two of the big validities. Which of his quoted statements evaluate which validities, and why?

Suggested answers

a. The two variables in this association are a history of physical punishment and a diagnosis of mental illness. It could be the case that physical punishment causes mental illness. It could also be the case that kids who show early signs of mental illness (somehow) cause their parents to be more physical with them. Or, it is possible that a third variable, such as family dysfunction or poverty, is associated with both mental illness and physical punishment.

b. The statement means that the researchers used a multivariate correlational design, testing whether there is still a link between physical punishment and mental illness even when controlling for family dysfunction. Since the link is still there, family dysfunction can be ruled out as a third variable explaining the relationship.

c. The expert praised the external validity of the study when he said, "While it's a well-done study, looking at a national data sample". But he criticized the construct validity of the study when he said, "there are limitations to relying on adults recalling childhood experiences". By mentioning that people may not accurately recall their childhoods, this expert is suggesting that the study's operationalization of physical punishment may have been flawed.

04/10/2012

More good diet news from the New York Times, along with a chance to apply lessons from Chapter 8 in the text.

A recent study showed that people who eat chocolate more frequently tend to weigh 5 to 7 pounds less than people who don't.

The people who ate chocolate the most frequently, despite eating more calories and exercising no differently from those who ate the least chocolate, tended to have lower B.M.I.’s.

Importantly, the variable concerning chocolate consumption was about frequency, not quantity. It was the frequency of eating chocolate, not how much people ate, that was associated with lower body weight.

“It’s not the case that eating the largest amount of chocolate is beneficial; it’s that eating it more often was favorable,” Dr. Golomb said. “If you eat 10 pounds of chocolate a day, that’s not going to be a favorable thing.”

Can we make a causal statement here? Can eating chocolate more frequently cause people to lose weight? Here's what the article says:

Dietary studies can be unreliable, since so many complicating factors can influence results, and it is difficult to pinpoint cause and effect. But the researchers adjusted their results for a number of variables, including age, gender, depression, vegetable consumption, and fat and calorie intake. “It didn’t matter which of those you added, the relationship remained very stably significant,” said Dr. Golomb.

a. In the quoted text you should recognize the telltale signs of multiple regression. As an exercise, create a regression table with a dependent variable indicated up top, and then a list of independent variables that corresponds to what you read above. Then estimate beta weights that might correspond to what the researcher is describing above.

b. Imagine that a friend says, "It's probably not the chocolate. It's just that the frequent chocolate eaters somehow consume fewer calories." What should you say in reply?

Suggested answers:

a. The dependent variable in this analysis is BMI. The independent variable of interest is frequency of chocolate consumption. The other independent variables include age, gender, depression, vegetable consumption, and fat and calorie intake. Your invented beta for the frequency of chocolate consumption variable should be negative and significant (indicating that as frequency of chocolate consumption goes up, BMI goes down, controlling for the other variables in the equation).

(For more fun, you can even check your invented values against the original article, linked here!)

b. Your friend isn't right, because even when the authors controlled for fat and calorie intake, the beta for chocolate is still significant. Chocolate consumption is still related to lower BMI, even taking that variable into account.

02/10/2012

In the journal-to-journalism cycle, the details of a study can get distorted on their way to the web. Here are three different takes on a recent article published by Canadian researchers. The study had apparently investigated links between spanking and child outcomes.

Notice that the headlines of the three stories make different claims and focus on different variables.

11/03/2011

This Fox News article describes the results of an empirical journal article (published in Child Development), presenting the results for a popular audience. Here is an excerpt from the journalist's description:

The researchers assessed the infants' temperaments and the mothers' parenting styles in the first six months of life (by observing the pair during feeding time) and during the toddler years... The researchers followed up with the mothers and the children's teachers when their kids were in kindergarten (ages 5 and 6).

The results showed that

...children who exhibit aggressive, defiant and explosive behavior by the time they're in kindergarten very often have tumultuous relationships with their parents from early on.

What they didn't find was a correlation between difficult early-life behaviors of the child (if the child was irritable or quick to switch mood in his or her first six months) and later aggressive actions and attitudes.

a.) Is this study correlational or experimental? How do you know?

b.) There are four variables described in the excerpt above. What are they?

c.) Following the model in Figure 8.3, sketch the results of this study. The exact correlations are not given here, but you can estimate which correlation is strongest.

d.) The headline of the story is "Negative parenting starts aggressive personalities early." How does the study establish temporal precedence for this causal claim? Can the study establish internal validity? If so, how?

Women who took folic acid supplements in the first two months of pregnancy were less likely to have kids with severe language delays in a new study from Norway.

b. Given this summary, what are the variables in this study?

Here are some of the methodological details:

The researchers gave surveys to close to 40,000 Norwegian women a few months into their pregnancies. Those included questions on what supplements women were taking in the four weeks before they got pregnant through eight weeks after conception.

Then, when their kids were three years old, Susser and his colleagues asked the same women about kids' language skills, including how many words they could string together in a phrase.

c. The information above can help you interrogate construct validity. How well does it seem they measured their variables?

Finally, consider this comment:

The pattern remained after Susser's team took into account other factors that were linked to both folic acid supplementation and language skills, such as a mom's weight and education, and whether or not she was married.

d. If you've read Chapter 8, you can probably figure out what this means. Explain, in your own words, how the researchers can "take into account" these other factors.

If you’re a research methods instructor or student and would like us to consider your guest post for everydayresearchmethods.com, please contact Dr. Morling. If, as an instructor, you write your own critical thinking questions to accompany the entry, we will credit you as a guest blogger.