Correlational Studies

07/20/2018

Is social media use responsible for depressed mood? Photo: Ian Allenden/Alamy stock

Do smartphones harm teenagers? If so, how much? In this blog, I've written before about the quasi-experimental and correlational designs used in research on screen time and well-being in teenagers. In that post you can practice identifying the different designs we can use to study this question.

Today's topic is more about the size of the effect in studies that have been published. A recent Wired story tried to put the effect size in perspective.

One side of the argument, as presented by Robbie Gonzalez in Wired, scares us into seeing social media as dangerous.

For example, first

...there were the books. Well-publicized. Scary-sounding. Several, really, but two in particular. The first, Irresistible: The Rise of Addictive Technology and the Business of Keeping Us Hooked, by NYU psychologist Adam Alter, was released March 2, 2017. The second, iGen: Why Today's Super-Connected Kids are Growing Up Less Rebellious, More Tolerant, Less Happy – and Completely Unprepared for Adulthood – and What That Means for the Rest of Us, by San Diego State University psychologist Jean Twenge, hit stores five months later.

In addition,

...Former employees and executives from companies like Facebook worried openly to the media about the monsters they helped create.

But is worry over phone use warranted? Here's what Gonzalez wrote after talking to more researchers:

When Twenge and her colleagues analyzed data from two nationally representative surveys of hundreds of thousands of kids, they calculated that social media exposure could explain 0.36 percent of the covariance for depressive symptoms in girls.

But those results didn’t hold for the boys in the dataset. What's more, that 0.36 percent means that 99.64 percent of the group’s depressive symptoms had nothing to do with social media use. Przybylski puts it another way: "I have the data set they used open in front of me, and I submit to you that, based on that same data set, eating potatoes has the exact same negative effect on depression. That the negative impact of listening to music is 13 times larger than the effect of social media."

In datasets as large as these, it's easy for weak correlational signals to emerge from the noise. And a correlation tells us nothing about whether new-media screen time actually causes sadness or depression.

There are several things to notice in the extended quote above. First, let's unpack what it means to "explain 0.36% of the covariance." Sometimes researchers will square the correlation coefficient r to create the value R². The R² tells you the percentage of variance in one variable that is explained by the other (incidentally, researchers usually say "percent of the variance" instead of "percent of covariance"). In this case, it tells you how much of the variance in depressive symptoms is explained by social media time (and by elimination, it tells you what percentage is attributable to something else). We can take the square root of 0.0036 (that's 0.36% written as a proportion) to get the original r between depressive symptoms and social media use. It's r = .06.
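If you'd like to check that arithmetic yourself, here is a minimal Python sketch. The variable names are my own; the only numbers used are the 0.36% and 99.64% figures from the Wired quote.

```python
import math

# Percent of variance explained, as reported in the Wired quote (0.36%).
percent_explained = 0.36
r_squared = percent_explained / 100  # 0.0036 when written as a proportion

# The correlation coefficient is the square root of R-squared.
# (The square root alone does not tell you the sign of r.)
r = math.sqrt(r_squared)

print(f"R^2 = {r_squared:.4f}, r = {r:.2f}")
# Share of the variance that is NOT explained by social media use
print(f"unexplained: {100 - percent_explained:.2f}%")
```

Note that squaring makes small correlations look even smaller: an r of .06 becomes an R² of only .0036.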

Questions

a) Based on the guidelines you learned in Chapter 8, is an r of .06 small, medium, or large?

b) Przybylski claims that the effect of social media use on depression is the same size as eating potatoes. On what data might he be basing this claim? Illustrate your answer with two well-labelled scatterplots, one for social media and the other for potatoes. Now add a third scatterplot, showing listening to music.

c) When Przybylski states that the correlation held for the girls, but not the boys, what kind of model is that? (Here are your choices: moderation, mediation, or a third variable problem?)

d) Finally, Przybylski notes that in large data sets, it's easy for weak correlation signals to appear from the noise. What statistical concepts are being applied here?

e) Chapter 8 presents another example of a large data set that found a weak (but statistically significant) correlation. What is it?

f) The discussion above between Gonzalez and Przybylski concerns which of the four big validities?

Suggested answers

a) An r of .06 is probably going to be characterized as "small" or "very small" or even "trivial." That's what the "potatoes" point is trying to illustrate, in a more concrete way.

b) One scatterplot should be labeled with "potato eating" on the x axis and "depression symptoms" on the y axis. The second scatterplot should be labeled with "social media use" on the x axis and "depression symptoms" on the y axis. These first two plots should show a positive slope, with the points very spread out to indicate the weakness of the association. The spread of the first two scatterplots should be almost the same, to represent the claim that the two relationships are equal in magnitude. The third scatterplot should be labeled with "listening to music" on the x axis and "depression symptoms" on the y axis, and this plot should show a much stronger, positive correlation (a tighter cloud of points).

c) It is a moderator. Gender moderates (changes) the relationship between screen use and depression.

d) Very large data sets have a lot of statistical power. Therefore, large data sets can show statistical significance for even very, very small correlations--even correlations that are not of much practical interest. A researcher might report a "statistically significant" correlation, but it's essential to also ask about the effect size and its practical value (the potatoes argument). Note: you can see the r = .06 value in the original empirical article here, on p. 9.

e) The example in Chapter 8 is the one about meeting one's spouse online and having a happier marriage--that was a statistically significant relationship, but r was only .03. That didn't stop the media from hyping it up, however.

f) Statistical validity

g) The research on smartphones and depressive symptoms is correlational, making causal claims (and causal language) inappropriate. That means that we can't be sure if social media is leading to the (slight) increase in depressive symptoms, or if people who have more depressive symptoms end up using more social media, or if there's some third variable responsible for both social media use and depressive symptoms. As the Wired article states,

...research on the link between technology and wellbeing, attention, and addiction finds itself in need of similar initiatives. They need randomized controlled trials, to establish stronger correlations between the architecture of our interfaces and their impacts; and funding for long-term, rigorously performed research.
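Returning to answer d) above: to see how sample size alone can turn a tiny correlation into a "statistically significant" one, here is a minimal Python sketch. It holds r fixed at .06 and changes only n. The function name and the sample sizes are my own illustrative choices, and the p-values use the Fisher z approximation, which is an assumption of this sketch and not necessarily the test used in the original article.

```python
import math
from statistics import NormalDist


def approx_p_value(r, n):
    """Approximate two-sided p-value for a correlation r with sample size n.

    Uses the Fisher z transformation (an assumption of this sketch):
    atanh(r) * sqrt(n - 3) is treated as a standard normal z score.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


r = 0.06  # the correlation discussed in the post
for n in (100, 1_000, 10_000, 100_000):  # hypothetical sample sizes
    p = approx_p_value(r, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {n:>7}: p = {p:.4f} ({verdict} at .05)")
```

The effect size never changes in this loop; only the sample size does. That is why it's essential to ask about r itself, and not just about the p-value.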

b) What foods might be associated with your own cultural identity (or identities?)

Here are some elements of the journalist's story. NPR reported about...

...a recent study in the Journal of Experimental Social Psychology, authored by Jay Van Bavel, social psychologist at New York University and his colleagues. The researchers found that the stronger your sense of social identity, the more you are likely to enjoy the food associated with that identity. The subjects of this study were Southerners and Canadians, two groups with proud food traditions.

The first experiment, containing 103 people, found that the more strongly someone self-identifies as Southern, the more they would expect Southern food to taste good, food like fried catfish or black-eyed peas.

c) In the study above, what are the two variables? Do they seem to be manipulated or measured?

d) Given your answer to question c) is this study really an "experiment"?

e) Can this study (above) support the causal claim that "identity impacts the food you like"? What are some alternative explanations? Hint: Think about temporal precedence and third variable explanations.

Here's the description of a second study:

In a second experiment, containing 151 people, researchers also found that when Southerners were reminded of their Southernness — primed, in psychology speak — their perception of the tastiness of Southern food was even higher. That is, the more Southern a person was feeling at that moment, the better the food tasted [compared to a group who was not primed].

e) What are the two variables in the study above? Were the variables manipulated or measured?

f) Given your answer to question e) is this study really an "experiment"?

g) Can this study support the claim that "identity impacts the food you like"?

They found a similar result when taste-testing with Canadians, finding that Canadian test subjects only preferred the taste of maple syrup over honey in trials when they were first reminded of their Canadian identity.

h) You know the drill: For the study above, what kind of study was it? What are its variables?

i) Challenge question: Can you tell if the independent variable in the Canadian study was manipulated as between groups or within groups?

In sum, it appears that two out of the three studies reviewed by this NPR article were experimental, so they're more likely to support the causal claim about "identity impacting the food you like." The journalist calls attention to this manipulation of identity in this description:

The relationship between identity and food preference is not new. However, the use of priming to induce identity makes this study different from its predecessors.

"Priming is like opening a filing drawer and bringing to your attention all the things that are in the drawer," says Paul Rozin, food psychologist at University of Pennsylvania, who was not involved in the study. "You can't really change peoples' identities in a 15-minute setting, but you can make one of their identities more salient, and that's what they've done in this study."

j) What other ways might you manipulate cultural identity in an experimental design?

Good news! The empirical journal article is open-access here. When you read it, you'll see that the journalist simplified the design of the studies for her article in NPR.

05/10/2018

I'm standing at my desk as I compose this post....could that make my writing go better? Yes, according to an editorial entitled, "Standing up at your desk could make you smarter." The editorial leads with a strong causal claim and then describes three studies, each with a different design. Here's one of the studies:

A study published last week...showed that sedentary behavior is associated with reduced thickness of the medial temporal lobe, which contains the hippocampus, a brain region that is critical to learning and memory.

The researchers asked a group of 35 healthy people, ages 45 to 70, about their activity levels and the average number of hours each day spent sitting and then scanned their brains with M.R.I. They found that the thickness of their medial temporal lobe was inversely correlated with how sedentary they were; the subjects who reported sitting for longer periods had the thinnest medial temporal lobes.

a) What were the two variables in this study? Were they manipulated or measured? Was this a correlational or experimental study?

b) The author writes that the study "showed that sedentary behavior is associated with reduced thickness of the medial temporal lobe." Did he use the correct verb? Why or why not?

Here's a second study described in the editorial:

Intriguingly, you don’t even have to move much to enhance cognition; just standing will do the trick. For example, two groups of subjects were asked to complete a test while either sitting or standing [randomly assigned]. The test — called Stroop — measures selective attention. Participants are presented with conflicting stimuli, like the word “green” printed in blue ink, and asked to name the color. Subjects thinking on their feet beat those who sat by a 32-millisecond margin.

c) What are the two variables in this study? Were they manipulated or measured? Was this a correlational or experimental study?

d) Does this study support the author's claim that "you don't have to move much to enhance cognition; just standing will do the trick"? Why or why not?

e) Bonus: What kind of experiment was being described here? (Posttest only, pretest/posttest, repeated measures, or concurrent measures?) Comment, as well, on the effect size.

It’s also yet another good argument for getting rid of sitting desks in favor of standing desks for most people. For example, one study assigned a group of 34 high school freshmen to a standing desk for 27 weeks. The researchers found significant improvement in executive function and working memory by the end of the study.

f) What are the variables in this study? Were they manipulated or measured?

g) Do you think this study can support a causal claim about standing desks improving executive function and working memory?

The author added the following statement to the third study on high school freshmen:

True, there was no control group of students using a seated desk, but it’s unlikely that this change was a result of brain maturation, given the short study period.

h) What threat to internal validity has the author identified in this statement?

i) What do you think of his evaluation of this threat?

j) Of the three studies presented, which provides the strongest evidence for the claim that "standing up at your desk could make you smarter"? What do you think? On the basis of this evidence, should I keep standing here?

03/20/2018

Maybe you shouldn't get into the car with this guy. Credit: Shutterstock.com

You probably know drivers who honk, tailgate, and shake their fists; you know others who give drivers space and respect. Now researchers have identified a trait the aggressive drivers might share: Narcissism.

The participants answered questions from the Narcissistic Personality Inventory, a set of questions used since 1988 to measure narcissism. This questionnaire had participants rate how strongly they agreed with items such as: “I like to be the center of attention,” or “I am an extraordinary person” on a 1 to 5 scale. They then addressed similar items about aggressive driving behavior: “I often swear when driving a car,” or “When driving my car, I easily get angry about other drivers.” ...The researchers report that the more narcissistic drivers are, the more angry and aggressive they reported becoming on the road.

a) Let's talk measurement first. How was narcissism measured--Did they use a self report? An observational measure? or a physiological measure?

b) Now for the second variable, aggressive driving: How was this measured--Was it self report? An observational measure? or a physiological measure?

c) Was this a correlational or experimental study? How do you know?

d) Sketch a graph (with well-labeled axes) of the results of the study.

Next, the researchers conducted a lab-based study with university students. They measured narcissism just as they'd done before, but they measured aggressive driving differently. Here's how they measured aggressive driving in the lab:

...participants sat in the driver’s seat of a 2010 Honda Accord, surrounded on three sides by a curved projection screen. In a 15- to 25-minute driving exercise, the participants saw other computer-generated cars and were told that some of them were being operated by other study participants. (In fact, the experimenters were controlling the other vehicles.)

During the exercise, the participants encountered:

a car pulling suddenly in front of them;

a traffic jam with two 10-second full traffic stops, one after another;

a construction zone with one lane closed and the other slowed down;

a second car mimicking the human driver’s behavior; and

a traffic light that was red for 60 seconds and green for just 5 seconds.

The researchers found that the participants who scored high on narcissism measures were more likely to tailgate, speed, drive off-road, cross the center line into oncoming traffic, drive on the shoulder, honk their horn, or use “verbal aggression” or “aggressive gestures,” in the experimenters’ chaste wording.

e) In this study, how was aggressive driving measured--Did they use a self report? An observational measure? or a physiological measure?

f) Was this a correlational or experimental study? How do you know?

g) Sketch a graph of the results of the study. Label your axes mindfully.

h) Can you think of moderators of this basic relationship? For example, might there be situations or settings for which narcissism is especially strongly linked to aggressive driving? (As you answer, consider this: Past work on narcissism has established that narcissists aren't always aggressive; they are mainly aggressive when others reject them or when they are provoked.) Create a moderator table like those seen in Chapter 8 (e.g., Figure 8.19 or Table 8.5)

i) Can you think of a mediator that explains the relationship between narcissism and aggressive driving? If so, sketch a mediator diagram like those seen in Chapter 9 (e.g., Figure 9.11 or Figure 9.13)

01/20/2018

Do people with wider faces (left) show more antisocial tendencies than those with narrow faces (right)? Photo: istockphoto

According to several previous studies in psychological science, men with wider faces--a greater ratio of width to height (like in the photo on the left, compared to the right)--tend to show antisocial tendencies such as racial bias, exploitation, and even aggression. Researchers attributed this link to exposure to testosterone during development, which, they say, causes both wider facial structure and antisocial behavior.

Kosinski found that previous studies often had methodological shortcomings such as small sample sizes. Half of the previous studies that he identified involved fewer than 25 participants and the average sample size was 40. And seven out of ten of the studies only just crossed the conventional threshold for significance of p=.05.

These factors led Kosinski to conduct a large-scale study of face measurements and behavioral tendencies. His research, published in Psychological Science, finds no relationship between facial width-to-height ratios (fWHR) and behavioral tendencies in a large sample of over 135,000 participants.

Questions

a) Review the material in Chapters 11 and 14, and explain why studies based on small samples can lead to results that are difficult to replicate. (You might also want to review the "kindergarten height" example in this recent blog post).

b) Why is it a problem that, in 7 out of 10 studies, the results "only just crossed the conventional threshold for significance?"

Now read a bit more about the "big data" methods that Kosinski employed in his research:

Kosinski turned to a very large dataset collected via a Facebook app called MyPersonality.org. The app comprised a collection of psychometric tests and surveys that Facebook users could take and then see how they scored — they could also volunteer their scores and Facebook profile data to be used in research projects. Using this bank of over 800,000 users’ surveys and over 2 million profile pictures, Kosinski tested his research question: Do broad faces indicate antisocial tendencies? [...]

After a preliminary experiment with 1,692 users showed that a computer could measure width-to-height ratios with the same accuracy that humans could, Kosinski analyzed 173,241 photos from 137,163 male and female participants (some users had multiple profile pictures and their measurements were averaged before analysis).

The results showed that facial broadness didn’t substantially correlate with any of the 55 personality measures tested.... “For example, broader-faced people reported themselves to be more prosocial, sympathetic, trusting, and cooperative,” says Kosinski. “Also, broader-faced people reported less interest in drug use, weapons, piercing, and tattoos. Moreover, broader-faced people did not score significantly higher on any of the traits positively related to antisocial and aggressive behavioral tendencies, including the personality facets of excitement-seeking and anger, impulsiveness, and militarism (i.e., interest in paramilitary groups, the armed forces, bodybuilding, martial arts, and survivalism).”

c) According to this description, Kosinski is basically running a series of bivariate correlations. Each one was between a self-reported trait and _________?

d) Pick one of the personality variables tested in the study. Now sketch a scatterplot of the result, labelling your axes carefully.

e) Kosinski's sample included more than a hundred thousand users. Why might this lead to a more stable estimate of the true relationship between facial broadness and personality? (This is the complement to question a), above)

f) Kosinski's study is an example of a "failure to replicate." Review the concepts in Table 14.1 and indicate which elements might apply in this case.

g) What questions might you ask about the construct validity of the personality measures used in Kosinski's study?

Suggested answers

a) and e) Small samples are more likely to be affected by one or two extreme scores, whereas in very large samples, the extreme scores are much more likely to be balanced out by other scores. The gifs in this blog post show the principle dynamically.

b) Some researchers have proposed that when a manuscript reports p-values very close to the conventional cutoff of .05 (p-values of .04 or .03), it's a sign that a researcher might have "p-hacked" the study. P-hacking is when a researcher goes through a series of options when analyzing the data, such as eliminating outliers, adding covariates, or testing multiple dependent measures, stopping analysis only when p just crosses under the .05 threshold. Therefore, when, in a body of literature, most of the p-values are just below .05, we might suspect that the underlying finding is a fluke, not a real result.

c) Facial broadness, as measured by width-to-height ratio.

d) One axis should be labelled "facial broadness" and the other might be labelled "interest in drug use." The cloud of points should be extremely spread out, showing no pattern or discernible slope.

e) see a) answer above.

f) The concepts in Table 14.1 that seem to apply best are the third (the original study's sample was very small) and perhaps the fourth (the original study may have tried multiple statistical analyses). (We cannot be sure without more investigation into the original studies, but these are the two issues raised in the APS summary of Kosinski's work.)

g) Indeed, we don't know much about the personality measures used in the study. The full manuscript might report more about whether data collected with these personality measures shows that they are reliable and valid.
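To make the p-hacking point in answer b) more concrete, here is a minimal Python sketch of how quickly false positives pile up when a researcher tries several analyses on a null effect. The function name and the numbers of analyses are my own illustrative choices. The formula assumes the analyses are independent, which is a simplification: real p-hacked analyses reuse the same data, so the exact percentages would differ, though the direction of the problem is the same.

```python
def chance_of_false_positive(k, alpha=0.05):
    """Probability that at least one of k independent tests of a true
    null hypothesis comes out 'significant' at the given alpha level."""
    return 1 - (1 - alpha) ** k


for k in (1, 3, 5, 10, 20):  # hypothetical numbers of analyses tried
    chance = chance_of_false_positive(k)
    print(f"{k:>2} analyses tried: {chance:.0%} chance of at least one p < .05")
```

With a single planned analysis, the false positive rate stays at the advertised 5%. A researcher who keeps trying options until something crosses the threshold is working with a much higher rate, which is why a literature full of p-values just under .05 raises suspicion.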

09/20/2017

One conspiracy theory holds that the government spreads dangerous chemicals in the exhaust from airplanes (the "chemtrails" theory). According to the studies, who is more likely to believe this theory? Credit: Jason Batterham/Shutterstock

There's a causal claim in this journalist's story about research on people who believe conspiracy theories. First, here's the causal headline:

b) Draw a diagram of this causal claim, using this form: A ---> B. (That is, which variable comes first, according to the wording the journalist used?)

Now that you've established the claim, does the research support it? The journalist reports on several studies conducted by researchers at the University of Mainz. Here's one:

The researchers first asked a sample of 238 US participants recruited via Amazon’s Mechanical Turk survey website to complete a self-reported “Need For Uniqueness” scale (they rated their agreement with items like “being distinctive is extremely important to me”) and a Conspiracy Mentality scale (e.g. “Most people do not see how much our lives are determined by plots hatched in secret.”) before indicating whether or not they believed in a list of 99 conspiracy theories circulating online....[The results showed that] participants’ self-reported Need For Uniqueness ... correlated with their stronger endorsement of the conspiracy beliefs.

c) In the study above, sketch a scatterplot of the results they described. Label your axes mindfully.

d) Is the study above correlational or experimental? Why or why not?

e) Does this study support the claim that the journalist assigned to it? That is, does it show that believing conspiracy theories causes people's need for uniqueness to be satisfied? As you explain your answer, pay careful attention to possible third variables. That is, what third variables ("C"s) might be reasonably correlated with both need for uniqueness and belief in conspiracies?

Here's another study in the series:

The second study replicated this finding with a further 465 Mechanical Turk participants based in the US, but this time half the sample read a list of the five most well known conspiracy theories and the five least known ones, whereas the other half of the group read the five most popular conspiracy theories and the five least popular. Again, self-reported Need For Uniqueness correlated with stronger agreement with the various conspiracy theories.

f) Is the second study correlational or experimental? Why? Sketch the results in a well-labeled scatterplot.

g) Does this second study rule out any of the third variables you came up with when answering question e) above?

Here's a final study:

Note, the conspiracy theory that featured in this final experiment was entirely made-up by the researchers. ...The conspiracy theory was about smoke detectors and the claim was that they produce dangerous hypersound. The researchers led half of 290 participants on Amazon’s Mechanical Turk to believe that this was a popular conspiracy theory in Germany where it was alleged to be believed by 81 per cent of Germans. The rest of the participants were led to believe that the theory was doubted by 81 per cent of Germans.

While information about the popularity of the theory didn’t affect participants overall, it did impact those who said that they tended to endorse a lot of conspiracy theories. Among these conspiracy-prone participants, their belief in the made-up smoke detector conspiracy was enhanced on average when the conspiracy was framed as a minority opinion.

h) Is the above study correlational or experimental? Why?

i) Sketch the results of the study--Hint: you might need to use a bar graph that also includes two colors of bars.

j) To what extent does this final study support the claim that "believing conspiracy theories causes people's need for uniqueness to be satisfied"?

k) If the study doesn't support the claim given in j), is there another, modified causal claim that the study can support? What is it?

In the final section of the story, the journalist concludes, "Taken together, the findings provide convincing evidence that some people are motivated to agree with conspiracy theories with an aura of exclusiveness."

l) Should this final claim be classified as causal or association? To what extent is this claim supported by the studies?

My thanks to Marianne Lloyd of Seton Hall University for sharing this story!

Selected answers

d) This is a correlational study because both need for uniqueness and belief in conspiracy theories were measured variables.

e) In this correlational study, we can't know for sure whether Need for Uniqueness --> Belief in Conspiracy Theories, whether Belief in Conspiracy Theories --> Need for Uniqueness, or whether some third variable is associated with both. One possible third variable might be social anxiety: Perhaps socially anxious people believe more conspiracy theories and socially anxious people are also high on need for uniqueness. Being highly educated is associated with lower belief in conspiracy theories (according to this study), and if highly educated people are also low in need for uniqueness, then education is a potential third variable as well.

f) Both variables were measured again, so....correlational study!

g) You could raise the same objections here that you did in e)

h) This study is experimental because they manipulated the proportion of people who supposedly believed in the theory. There are two measured variables in this study as well; one is the tendency to believe in conspiracy theories (high or low), and the other is the extent of belief in the smoke detector theory.

j) No; in fact need for uniqueness wasn't reported by the journalist as a variable at all, so it probably can't support that claim.

k) I believe this study can support the following claim: Telling people that only a minority of Germans support a conspiracy theory causes people to believe it more, but only if they already are the type of person who believes in conspiracy theories.

l) The final statement is an association claim. Because most of the studies were correlational, the studies do support it!

06/20/2017

The study found an association between screen time and language delay in a sample of 900 18-month-olds. Photo: Maria Sbytova/Shutterstock

This CNN story reports on research presented at an academic conference for Pediatrics. According to the conference presentation,

[A] study found that the more time children between the ages of six months and two years spent using handheld screens such as smartphones, tablets and electronic games, the more likely they were to experience speech delays.

According to this description, the two main conceptual variables in the study are "time spent using a handheld screen" and "speech delay." Read on to find out how each variable was operationalized:

In the study, which involved nearly 900 children, parents reported the amount of time their children spent using screens in minutes per day at age 18 months. Researchers then used an infant toddler checklist, a validated screening tool, to assess the children's language development also at 18 months. They looked at a range of things, including whether the child uses sounds or words to get attention or help and puts words together, and how many words the child uses.

a) According to the text, how did the study operationalize the variable, time spent using a handheld screen? Do you think this was a valid measure? Why or why not?

b) According to the text, how did the study operationalize the variable, speech delay?

c) The passage describes the infant toddler checklist as "a validated screening tool"--how do you think this tool was validated? What data might they have collected to validate this measure? (apply concepts from Chapter 5)

d) Sketch a scatterplot of the association they reported in the first quoted section.

e) Does this association allow us to conclude that "exposure to handheld screens causes children to experience speech delays?" Why or why not?

Read the following description; identify the mediator that the speaker is proposing.

"We do know that young kids learn language best through interaction and engagement with other people, and we also know that children who hear less language in their homes have lower vocabularies." It may be the case that the more young children are engaged in screen time, then the less time they have to engage with caretakers, parents and siblings, said [an expert who commented on the findings].

f) Sketch the mediator pattern that is hypothesized above.

Suggested answers

a) This variable appears to have been operationalized via parents' reports of their children's screen time.

b) Speech delay may have been operationalized with multiple measures. It is not clear from the reporting if the Infant Toddler Checklist is one measure, separate from whether the "child uses sounds or words to get attention or help and puts words together, and how many words the child uses," or if these three components comprise the infant toddler checklist.

c) One way to validate a measure such as the Infant Toddler Checklist would be through a known groups paradigm. Professional speech language pathologists might identify groups of children who either do have speech delay or do not. Then all children in the two groups would be administered the Checklist. If the Checklist is valid, then the group who have been diagnosed with speech delay should score lower on it than the group who have been diagnosed as developing normally.

d) Your scatterplot should have "Speech delay" (or, alternatively, "Infant Toddler Checklist") on one axis and "Time spent on screens" on the other axis. The dots should slope upwards from left to right.

e) The results show covariance: Speech delay is associated with time on screens. The method does not allow us to establish temporal precedence, since both variables were apparently measured at the same time. We therefore do not know if the screen time came first (contributing to speech delay) or if children with speech delays are more likely to be drawn to screens (perhaps to cope with the frustration of not communicating easily). Internal validity also is not established. A reasonable third-variable explanation might be time in day care: it is possible that children in day care are both less likely to be on screens (because most day cares have lots of other toys) and exposed to more language (because there are more people around).

f) You'd draw this mediator pattern:

exposure to screens ---> less social engagement with caretakers ---> language delay
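The third-variable reasoning in answer e) can be illustrated with a small simulation. This is a minimal sketch, not an analysis of the study: the children are simulated, the numbers (group means, noise, sample size) are arbitrary assumptions, and the function names are made up for illustration. In the simulated world, day care lowers both screen time and speech delay, and screen time has no direct effect on speech delay at all.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate_children(n=5000, seed=1):
    """Simulated (not real) children where day care drives both variables."""
    rng = random.Random(seed)
    children = []
    for _ in range(n):
        in_day_care = rng.random() < 0.5
        # Day care lowers screen time and lowers the speech-delay score.
        # Screen time has NO direct effect on speech delay in this simulation.
        screen_hours = (1.0 if in_day_care else 2.0) + rng.gauss(0, 0.5)
        delay_score = (1.0 if in_day_care else 2.0) + rng.gauss(0, 0.5)
        children.append((in_day_care, screen_hours, delay_score))
    return children


children = simulate_children()
overall = pearson_r([c[1] for c in children], [c[2] for c in children])
print(f"all children: r = {overall:.2f}")

# Within each day-care group the third variable is held constant.
for group in (True, False):
    subset = [c for c in children if c[0] == group]
    r = pearson_r([c[1] for c in subset], [c[2] for c in subset])
    print(f"day care = {group}: r = {r:.2f}")
```

Across all simulated children, screen time and speech delay are clearly positively correlated, yet within each day-care group the correlation is close to zero. That is the signature of a third variable: the overall association exists, but it does not reflect a causal path from screens to speech delay.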

05/20/2017

Can you find North Korea on this map? Credit: ASDFGH/Wikimedia Commons

Readers can explore several frequency and association claims in this New York Times story headlined, "If Americans Can Find North Korea on a Map, They're More Likely to Prefer Diplomacy."

The story reports that when shown a map of East Asia (like the one shown here), only 36% of Americans could correctly locate the country of North Korea. The country has been in the news a lot lately because it has launched ballistic missiles. What should the American response be to such aggression?

An experiment led by Kyle Dropp of Morning Consult from April 27-29, conducted at the request of The New York Times, shows that respondents who could correctly identify North Korea tended to view diplomatic and nonmilitary strategies more favorably than those who could not. These strategies included imposing further economic sanctions, increasing pressure on China to influence North Korea and conducting cyberattacks against military targets in North Korea.

They also viewed direct military engagement – in particular, sending ground troops – much less favorably than those who failed to locate North Korea.

a) The finding that only 36% of Americans can locate North Korea on a map is what kind of claim? How would you state the variable(s) involved in this claim?

b) The report says "respondents who could correctly identify North Korea tended to view diplomatic and nonmilitary strategies more favorably than those who could not." What kind of claim is this? What are the variable(s) involved in this claim?

c) In the quoted text above, the journalist called this "An experiment." It's not--how come?

The lead image in the story (linked here) depicts people's guesses about where North Korea is located. Could you locate it?

d) Go to the story now and scroll down about halfway to see how they have presented correlations between "Knowing where North Korea is" and people's endorsement of various responses to North Korean aggression. What does a negative number mean in this table?

e) Make a prediction: Who do you think is more likely to know where North Korea is: Men or women? Democrats or Republicans? People who've visited another country or those who've never been abroad? Older or younger Americans?

Now, scroll down the story to the pink bar graphs and see if your predictions were supported.
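If the sign of a correlation with a yes/no variable is hard to picture (question d), here is a minimal sketch using made-up ratings. None of these numbers come from the Morning Consult survey. The coding (1 = located North Korea, 0 = did not) and the 1-to-5 rating scale are assumptions chosen for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical respondents: 1 = located North Korea, 0 = did not.
could_locate = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical support for sending ground troops (1 = oppose, 5 = favor).
support_ground_troops = [1, 2, 2, 1, 3, 4, 3, 4]

r = pearson_r(could_locate, support_ground_troops)
print(f"r = {r:.2f}")  # negative: locators rated this option lower
```

With this coding, a negative correlation means that respondents who could locate the country (the higher value, 1) tended to give lower ratings to that option. A positive correlation would mean the opposite.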

04/10/2017

Does giving a child a sip change his or her long-term drinking habits? Photo: Tang Ming Tung/Getty Images

It's a strong causal claim: Giving kids sips of beer turns them into teenage drunks. Did the journalist get it right? Here are some quotes from the story, posted on the food website Munchies:

Those innocent tastes of Chianti at the Thanksgiving dinner table could morph your child from a sweet, sober cherub into a bleary-eyed teenage booze-guzzling ne'er-do-well.

New research in the Journal of Studies on Alcohol and Drugs has found that children who sip alcohol as youngsters have an increased likelihood of becoming drinkers by the time they reach high school. In a long-term study by Brown University of 561 students in Rhode Island, researchers found that those who had tried even small sips were a whopping five times more likely to have tried a whole beer or cocktail by the time they reached ninth grade, and four times more likely to have gotten rip-roaring drunk.

a) What keywords in this quote indicate that the journalist is making a causal claim?

b) What were the two variables studied by the researchers? Explain whether you think each one was measured or manipulated.

c) What kind of study is this claim apparently based upon--correlational or experimental?

d) Given the study's design, is the causal claim appropriate? Apply the three causal criteria.

In an interview with Munchies, the lead researcher, Kristina Jackson, mentions several possible third variables for the association:

But Jackson also believes that other factors correlate with these numbers, in addition to the "early sipper" factor. Parents' drinking habits, a family history of alcoholism, and general personality and behavioral characteristics also have strong impacts on the boozy worldviews of children and teenagers.

e) Chapter 9 readers: Do you see any evidence that the researchers controlled for these potential internal validity problems in their analyses? You might have to hunt down the original journal article to find out.

f) The journalist made a dramatic point of the statistic that kids who'd sipped beer were "four times more likely to have gotten rip-roaring drunk." Which of the four big validities is this statement about?

Even though the journalist's causal claim is probably not justified, adolescent substance use is a serious issue. The journalist supplemented the story with several frequency claims. You might be interested in some of these statistics.

Roughly 30 percent of the students said that they had tasted alcohol when in sixth grade..., mostly due to exposure from their parents while at a party, on vacation, or in other special circumstances. Of that group (the "early sippers"), 26 percent reported having consumed a full alcoholic drink by ninth grade, while only 6 percent of non-early-sippers had experienced the pleasures of an ice-cold Natural Ice or homemade Screwdriver. And at that same age (roughly 14-15 years old), 9 percent of early sippers had gotten totally trashed, while only 2 percent of those with less-loose parents had.
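To see how the percentages quoted above relate to phrases like "five times more likely," here is a minimal Python sketch that computes three common ways of comparing two proportions. The function names are made up for illustration. The story does not say whether its "five times" and "four times" figures are risk ratios, odds ratios, or estimates from a model with control variables, so treat the comparison as illustrative only.

```python
def risk_difference(p_exposed, p_unexposed):
    """Absolute difference between two proportions."""
    return p_exposed - p_unexposed


def risk_ratio(p_exposed, p_unexposed):
    """How many times larger the first proportion is than the second."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of the odds p / (1 - p) in the two groups."""
    odds_exposed = p_exposed / (1 - p_exposed)
    odds_unexposed = p_unexposed / (1 - p_unexposed)
    return odds_exposed / odds_unexposed


# Percentages quoted in the Munchies story (early sippers vs. non-sippers).
outcomes = {
    "full drink by ninth grade": (0.26, 0.06),
    "gotten drunk by ninth grade": (0.09, 0.02),
}

for label, (sippers, non_sippers) in outcomes.items():
    diff_points = risk_difference(sippers, non_sippers) * 100
    print(label)
    print(f"  difference: {diff_points:.0f} percentage points")
    print(f"  risk ratio: {risk_ratio(sippers, non_sippers):.1f}")
    print(f"  odds ratio: {odds_ratio(sippers, non_sippers):.1f}")
```

The same result can sound very different depending on the summary. For getting drunk, the ratio of the two percentages is about 4.5, but the absolute difference is only 7 percentage points (9 percent versus 2 percent). Both descriptions are accurate, so it is worth asking which one a headline is using.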

02/10/2017

The death of a fellow officer on duty was rated as one of the most stressful events for police officers. Credit: W. Smith/Epa/REX/Shutterstock

What happens when people are exposed to very stressful events? A study has investigated this question using a sample of police officers. The study found a correlation between exposure to stressful job events and cortisol change over time. This is a journalist's take on the study.

Here's an overview:

For most people, cortisol, the vital hormone that controls stress, increases when they wake up. It's the body's way of preparing us for the day....[Now, a] study of more than 300 members of the Buffalo Police Department suggests that police events or conditions considered highly stressful by the officers may be associated with disturbances of the normal awakening cortisol pattern. That can leave the officers vulnerable to disease, particularly cardiovascular disease, which already affects a large number of officers.

The study's two main variables were the experience of major stress and cortisol patterns. First, read how they measured stress in the sample:

For this study, participating officers assessed a variety of on-the-job stressors using a questionnaire that asks officers to rate 60 police-related events with a "stress rating." Events perceived as very stressful are assigned a higher rating.

Exposure to battered or dead children ranked as the most stressful event, followed by: killing someone in the line of duty; having a fellow officer killed on duty; a situation requiring the use of force; and being physically attacked.

Identifying the five most intense stressors police can face was significant, Violanti said. "When we talk about interventions to help prevent disease, it's tricky because these stressors are things that can't be prevented," he said. ... The survey showed that the officers experienced one of the five major stressors, on average, 2.4 times during the month before the survey was completed.

Second, read how they measured cortisol patterns. Notice how in this case, the variable is operationalized not as a single outcome, but as a pattern over four time periods:

Cortisol was measured using saliva samples taken upon waking up, and 15, 30 and 45 minutes thereafter.
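One way to turn four saliva samples into a single "pattern" score is to fit a straight line through them and use its slope. The sketch below does that for two hypothetical officers. The cortisol values are invented, the units are arbitrary, and the story does not say that the researchers summarized the pattern this way, so this is only one possible operationalization.

```python
def least_squares_slope(times, values):
    """Slope of the best-fitting straight line through (time, value) points."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


# Sampling schedule from the story: at waking, then 15, 30, and 45 minutes later.
minutes_after_waking = [0, 15, 30, 45]

# Hypothetical cortisol readings in arbitrary units (NOT data from the study).
steep_profile = [10, 14, 18, 21]
flat_profile = [10, 11, 12, 12]

steep_slope = least_squares_slope(minutes_after_waking, steep_profile)
flat_slope = least_squares_slope(minutes_after_waking, flat_profile)
print(f"steep profile: {steep_slope:.3f} units per minute")
print(f"flat profile:  {flat_slope:.3f} units per minute")
```

A larger slope means cortisol rises faster after waking. Reducing each officer's four readings to one number like this makes it possible to correlate the cortisol pattern with another variable, such as a stress score.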

Here's how the journalist described and interpreted the result:

Officers who weren't as stressed showed a steep and steady, or regular, increase in cortisol from baseline. However, officers with a moderate and high major stress index had a blunted response over time.

That's because stress affects a system in the body known as the hypothalamic pituitary adrenal axis, or HPA Axis. When you're stressed, the HPA Axis elicits cortisol, a hormone that gets the body going and activates against the stressor, Violanti explained. Under normal circumstances, the body's cortisol pattern looks like a normal bell curve: It rises when we wake up, peaks around midday and comes back down at bed time.

"If you experience chronic stress or high stress situations, the cortisol can no longer adjust normally like this. So what happens with people under a lot of stress, the cortisol flattens out. For some people it goes down and others it goes up and stays up. That's called the dysregulation of the HPA axis," said Violanti, who served with the New York State Police for 23 years before shifting into academia.

Questions to answer:

1. Draw a well-labeled figure depicting the study's main result. (Will it be a bar graph, line graph, or a scatterplot? What are the best labels?)

2. What makes this a correlational study?

3. Evaluate the construct validity of a) the operationalization of stress and b) the operationalization of cortisol patterns. In your opinion, how well did they measure these two variables?

4. Consider the external validity of this study. What are the characteristics of the sample? Can we assume that the results will generalize to other cops? Do we know if the results generalize to other professions or people who have experienced stress? Why or why not?

5. Can the study support the causal claim that "exposure to stress causes cortisol dysregulation in cops?" Consider temporal precedence (the directionality problem) as well as internal validity (the third variable problem).

Question 4 is about external validity. You might be interested to read the authors' take on this question:

While the current study focused on Buffalo officers, the findings have implications for cops around the country, said paper co-author Michael Andrew, PhD, chief of the Biostatistics and Epidemiology Branch of the CDC/NIOSH Health Effects Laboratory Division in Morgantown, West Virginia.

"These findings show that exposure to major events inherent to police work may lead to a temporary reduction in the biological ability to respond to further stressful events. Since the major stressor events in this study were originally developed to reflect events that can apply to any police department, these results should generalize, more or less, to any police department in the U.S.," Andrew said, adding, "This points to the need for continued focus on supporting police officer health."

For Question 5, you already know that this study establishes covariance. However, temporal precedence is not very clear. It's possible that cops with poor cortisol regulation are more likely to be involved in future stressful events (for some reason). Internal validity is more of a problem, because, at least based on what's presented here, we don't know if they controlled for third variables such as what type of neighborhood the cops usually patrolled, or for personality characteristics such as impulsiveness, Type A personality, or other traits. For example, an impulsive personality might be associated with more stressors on the job, and might also be associated with cortisol patterns.

If you’re a research methods instructor or student and would like us to consider your guest post for everydayresearchmethods.com, please contact Dr. Morling. If, as an instructor, you write your own critical thinking questions to accompany the entry, we will credit you as a guest blogger.