Chapter 08: Bivariate Correlational Research

07/20/2018

Is social media use responsible for depressed mood? Photo: Ian Allenden/Alamy stock

Do smartphones harm teenagers? If so, how much? In this blog, I've written before about the quasi-experimental and correlational designs used in research on screen time and well-being in teenagers. In that post you can practice identifying the different designs we can use to study this question.

Today's topic is more about the size of the effect in studies that have been published. A recent Wired story tried to put the effect size in perspective.

One side of the argument, as presented by Robbie Gonzalez in Wired, scares us into seeing social media as dangerous.

For example, first

...there were the books. Well-publicized. Scary-sounding. Several, really, but two in particular. The first, Irresistible: The Rise of Addictive Technology and the Business of Keeping Us Hooked, by NYU psychologist Adam Alter, was released March 2, 2017. The second, iGen: Why Today's Super-Connected Kids are Growing Up Less Rebellious, More Tolerant, Less Happy – and Completely Unprepared for Adulthood – and What That Means for the Rest of Us, by San Diego State University psychologist Jean Twenge, hit stores five months later.

In addition,

...Former employees and executives from companies like Facebook worried openly to the media about the monsters they helped create.

But is worry over phone use warranted? Here's what Gonzalez wrote after talking to more researchers:

When Twenge and her colleagues analyzed data from two nationally representative surveys of hundreds of thousands of kids, they calculated that social media exposure could explain 0.36 percent of the covariance for depressive symptoms in girls.

But those results didn’t hold for the boys in the dataset. What's more, that 0.36 percent means that 99.64 percent of the group’s depressive symptoms had nothing to do with social media use. Przybylski puts it another way: "I have the data set they used open in front of me, and I submit to you that, based on that same data set, eating potatoes has the exact same negative effect on depression. That the negative impact of listening to music is 13 times larger than the effect of social media."

In datasets as large as these, it's easy for weak correlational signals to emerge from the noise. And a correlation tells us nothing about whether new-media screen time actually causes sadness or depression.

There are several things to notice in the extended quote above. First, let's unpack what it means to "explain 0.36% of the covariance." Sometimes researchers square the correlation coefficient r to create the value R². R² tells you the percentage of variance in one variable that is explained by the other (incidentally, researchers usually say "percent of the variance" rather than "percent of covariance"). In this case, it tells you how much of the variance in depressive symptoms is explained by social media time (and, by elimination, what percentage is attributable to something else). We can take the square root of 0.0036 (that's 0.36% expressed as a proportion) to get the original r between depressive symptoms and social media use: r = .06.
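If you want to check the arithmetic yourself, the conversion between R² and r is a one-liner. Here's a quick sketch in Python, using the 0.36% figure from the quote above:

```python
import math

# The article reports that social media use explains 0.36% of the
# variance in girls' depressive symptoms.
r_squared = 0.36 / 100        # 0.36% as a proportion: 0.0036
r = math.sqrt(r_squared)      # r is the square root of R-squared
print(round(r, 2))            # 0.06
```

Going the other direction (squaring r to get R²) works the same way, which is why an r of .30 explains only 9% of the variance.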

Questions

a) Based on the guidelines you learned in Chapter 8, is an r of .06 small, medium, or large?

b) Przybylski claims that the effect of social media use on depression is the same size as eating potatoes. On what data might he be basing this claim? Illustrate your answer with two well-labelled scatterplots, one for social media and the other for potatoes. Now add a third scatterplot, showing listening to music.

c) When the article reports that the correlation held for the girls but not the boys, what kind of model is that? (Here are your choices: moderation, mediation, or a third variable problem.)

d) Finally, Przybylski notes that in large data sets, it's easy for weak correlation signals to appear from the noise. What statistical concepts are being applied here?

e) Chapter 8 presents another example of a large data set that found a weak (but statistically significant) correlation. What is it?

f) The discussion above between Gonzalez and Przybylski concerns which of the four big validities?

Suggested answers

a) An r of .06 would probably be characterized as "small," "very small," or even "trivial." That's what the "potatoes" point is trying to illustrate in a more concrete way.

b) One scatterplot should be labeled with "potato eating" on the x axis and "depression symptoms" on the y axis. The second scatterplot should be labeled with "social media use" on the x axis and "depression symptoms" on the y axis. These first two plots should show a positive slope of points with the points very spread out--to indicate the weakness of the association. The spread of the first two scatterplots should be almost the same, to represent the claim the two relationships are equal in magnitude. The third scatterplot should be labeled with "listening to music" on the x axis and "depression symptoms" on the y axis, and this plot should show a much stronger, positive correlation (a tighter cloud of points).

c) It is a moderator. Gender moderates (changes) the relationship between screen use and depression.

d) Very large data sets have a lot of statistical power. Therefore, large data sets can show statistical significance for even very, very small correlations--even correlations that are not of much practical interest. A researcher might report a "statistically significant" correlation, but it's essential to also ask about the effect size and its practical value (the potatoes argument). Note: you can see the r = .06 value in the original empirical article here, on p. 9.
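One way to see the power point concretely: for large samples, the smallest correlation that reaches p < .05 (two-tailed) is approximately 1.96/√n. This is a rough back-of-the-envelope sketch, not an analysis from the article:

```python
import math

def critical_r(n, z=1.96):
    # Smallest |r| that reaches p < .05 (two-tailed) in a sample of
    # size n, using the large-sample approximation r_crit ~ z / sqrt(n).
    return z / math.sqrt(n)

for n in [40, 1_000, 100_000]:
    print(n, round(critical_r(n), 3))
# In a sample of 40, r must reach about .31 to be significant;
# with 100,000 people, even r = .007 clears the bar.
```

So in a data set of hundreds of thousands of teens, an r of .06 is easily "significant" while still being tiny in practical terms.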

e) The example in Chapter 8 is the one about meeting one's spouse online and having a happier marriage--that was a statistically significant relationship, but r was only .03. That didn't stop the media from hyping it up, however.

f) Statistical validity

g) The research on smartphones and depressive symptoms is correlational, making causal claims (and causal language) inappropriate. That means that we can't be sure if social media is leading to the (slight) increase in depressive symptoms, or if people who have more depressive symptoms end up using more social media, or if there's some third variable responsible for both social media use and depressive symptoms. As the Wired article states,

...research on the link between technology and wellbeing, attention, and addiction finds itself in need of similar initiatives. They need randomized controlled trials, to establish stronger correlations between the architecture of our interfaces and their impacts; and funding for long-term, rigorously performed research.

07/10/2018

Her school achievement later in life can be predicted from her ability to wait for a treat (or by her family's SES). Photo: Manley099/Getty Images

There's a new replication study about the famous "marshmallow study", and it's all over the popular press. You've probably heard of the original research: Kids are asked to sit alone in a room with a single marshmallow (or some other treat they like, such as pretzels). If the child can wait for up to 15 minutes until the experimenter comes back, they receive two marshmallows. But if they eat the first one early, they don't. As part of the original study, kids were tracked over several years. One of the key findings was that the longer children were able to wait at age 4, the better they were doing in school as teenagers. Psychologists have often used this study as an illustration of how self-control is related to important life outcomes.

The press coverage of this year's replication study illustrates at least two things. First, it's a nice example of multiple regression. Second, it's an example of how different media outlets assign catchy--but sometimes erroneous--headlines to the same study.

First, let's talk about the multiple regression piece. Regression analyses often try to understand a core bivariate relationship more fully. In this case, the core relationship they start with is between the two variables, "length of time kids waited at age 4" and "test performance at age 15." Here's how it was described by Payne and Sheeran in the online magazine Behavioral Scientist:

The result? Kids who resisted temptation longer on the marshmallow test had higher achievement later in life. The correlation was in the same direction as in Mischel’s early study. It was statistically significant, like the original study. The correlation was somewhat smaller, and this smaller association is probably the more accurate estimate, because the sample size in the new study was larger than the original. Still, this finding says that observing a child for seven minutes with candy can tell you something remarkable about how well the child is likely to do in high school.

a) Sketch a well-labelled scatterplot of the relationship described above. What direction will the dots slope? Will they be fairly tight to a straight line, or spread out?

b) The writers (Payne and Sheeran) suggest that a larger sample size leads to a more accurate estimate of a correlation. Can you explain why a large sample size might give a more accurate statistical estimate? (Hint: Chapter 8 talks about outliers and sample size--see Figures 8.10 and 8.11.)

Now here's more about the study:

The researchers next added a series of “control variables” using regression analysis. This statistical technique removes whatever factors the control variables and the marshmallow test have in common. These controls included measures of the child’s socioeconomic status, intelligence, personality, and behavior problems. As more and more factors were controlled for, the association between marshmallow waiting and academic achievement as a teenager became nonsignificant.

c) What's proposed above is that social class is a third variable ("C") that might be associated with both waiting time ("A") and school achievement ("B"). Using Figure 8.15, draw this proposal. Think about it, too: Why does it make sense that lower SES might go with lower waiting time (A)? Why might lower SES go with lower school achievement (B)?

d) Now create a mockup regression table that might fit the pattern of results being described above. Put the DV at the top (what is the DV?), then list the predictor variables underneath, starting with Waiting time at Age 4, and including things like Child's Socioeconomic Status and Intelligence. Which betas should be significant? Which should not?

Basically, here we have a core bivariate relationship (between wait time and later achievement), and then a critic suggests a possible third variable (SES). They used regression to see if the core relationship was still there when the third variable was controlled for. The core relationship went away, suggesting that SES was a third variable that can help explain why kids who wait longer do better in school later on.
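The logic of "controlling for" a third variable can be sketched with simulated data. This is a toy illustration, not the study's dataset: we generate a hypothetical SES variable that drives both waiting time and achievement, and watch the waiting-time coefficient collapse once SES enters the regression.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: SES drives BOTH waiting time ("A") and later
# achievement ("B"); waiting has no direct effect of its own.
ses = rng.normal(size=n)
wait = 0.5 * ses + rng.normal(size=n)
achieve = 0.6 * ses + rng.normal(size=n)

# The bivariate correlation between waiting and achievement is clearly positive...
r_bivariate = np.corrcoef(wait, achieve)[0, 1]

# ...but in an ordinary least squares regression that also includes SES
# (plus an intercept column), the waiting-time coefficient shrinks toward zero.
X = np.column_stack([np.ones(n), wait, ses])
betas, *_ = np.linalg.lstsq(X, achieve, rcond=None)

print(round(r_bivariate, 2))   # positive
print(round(betas[1], 2))      # near zero once SES is controlled
```

That is exactly the pattern the replication reported: a significant bivariate beta for waiting time that became nonsignificant as controls were added.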

Next let's talk about some of the hype around this replication study. The Behavioral Scientist piece (quoted above) is one of the more balanced descriptions. Its headline was, Try to Resist Misinterpreting the Marshmallow Test. It emphasized that the core relationship was replicated. It also explains in some detail why SES is related to self-control, and how the two probably cannot be meaningfully separated--it's a nuanced report. But other press coverage had a doomsday feel:

One person on Twitter even wrote, "The marshmallow/delayed gratification study always felt 'wrong' to me - this year it was reported to be hopelessly flawed."

Are these headlines and comments fair? Probably not. As Payne and Sheeran write in Behavioral Scientist,

The problem is that scholars have known for decades that affluence and poverty shape the ability to delay gratification. Writing in 1974, Mischel observed that waiting for the larger reward was not only a trait of the individual but also depended on people’s expectancies and experience. If researchers were unreliable in their promise to return with two marshmallows, anyone would soon learn to seize the moment and eat the treat. He illustrated this with an example of lower-class black residents in Trinidad who fared poorly on the test when it was administered by white people, who had a history of breaking their promises. Following this logic, multiple studies over the years have confirmed that people living in poverty or who experience chaotic futures tend to prefer the sure thing now over waiting for a larger reward that might never come. But if this has been known for years, where is the replication crisis?

05/10/2018

I'm standing at my desk as I compose this post ... could that make my writing go better? Yes, according to an editorial entitled "Standing up at your desk could make you smarter." The editorial leads with a strong causal claim and then describes three studies, each with a different design. Here's one of the studies:

A study published last week...showed that sedentary behavior is associated with reduced thickness of the medial temporal lobe, which contains the hippocampus, a brain region that is critical to learning and memory.

The researchers asked a group of 35 healthy people, ages 45 to 70, about their activity levels and the average number of hours each day spent sitting and then scanned their brains with M.R.I. They found that the thickness of their medial temporal lobe was inversely correlated with how sedentary they were; the subjects who reported sitting for longer periods had the thinnest medial temporal lobes.

a) What were the two variables in this study? Were they manipulated or measured? Was this a correlational or experimental study?

b) The author writes that the study "showed that sedentary behavior is associated with reduced thickness of the medial temporal lobe." Did he use the correct verb? Why or why not?

Here's a second study described in the editorial:

Intriguingly, you don’t even have to move much to enhance cognition; just standing will do the trick. For example, two groups of subjects were asked to complete a test while either sitting or standing [randomly assigned]. The test — called Stroop — measures selective attention. Participants are presented with conflicting stimuli, like the word “green” printed in blue ink, and asked to name the color. Subjects thinking on their feet beat those who sat by a 32-millisecond margin.

c) What are the two variables in this study? Were they manipulated or measured? Was this a correlational or experimental study?

d) Does this study support the author's claim that "you don't have to move much to enhance cognition; just standing will do the trick"? Why or why not?

e) Bonus: What kind of experiment was being described here? (Posttest only, pretest/posttest, repeated measures, or concurrent measures?) Comment, as well, on the effect size.

And here's a third study described in the editorial:

It’s also yet another good argument for getting rid of sitting desks in favor of standing desks for most people. For example, one study assigned a group of 34 high school freshmen to a standing desk for 27 weeks. The researchers found significant improvement in executive function and working memory by the end of the study.

f) What are the variables in this study? Were they manipulated or measured?

g) Do you think this study can support a causal claim about standing desks improving executive function and working memory?

The author added the following statement to the third study on high school freshmen:

True, there was no control group of students using a seated desk, but it’s unlikely that this change was a result of brain maturation, given the short study period.

h) What threat to internal validity has the author identified in this statement?

i) What do you think of his evaluation of this threat?

j) Of the three studies presented, which provides the strongest evidence for the claim that "standing up at your desk could make you smarter"? What do you think? On the basis of this evidence, should I keep standing here?

04/04/2018

In Chapter 8, one of the examples features a study that found that the more "deep talk" people engage in (as measured by the EAR), the happier they reported being (Mehl et al., 2010) (see Figure 8.1).

The same 2010 study also reported that the amount of engagement in small talk was associated with lower well-being (this result is presented in Figure 8.9).

Now a team of researchers (including several of the same authors) has published a second study with similar methodology (Milek et al., in press). The team collected new data in a larger, more heterogeneous sample of U.S. adults. (The original study included only college students.) The authors used Bayesian analytic techniques, including pooling the new samples with the sample from the 2010 study. You can view a preprint of the report here. It's in press at Psychological Science.

The new paper confirmed evidence for the "deep talk" effect. That is, substantive conversations were linked with greater well-being, with a moderate effect size. But the team did not find evidence for the complementary effect of small talk: in the new analysis, the estimate of the small talk effect was not different from zero.

If you teach this example, it's worth updating students: The "deep talk" result in Figure 8.1 has been replicated, but one of the effects in Figure 8.9 (the small talk effect) has not.

03/20/2018

Maybe you shouldn't get into the car with this guy. Credit: Shutterstock.com

You probably know drivers who honk, tailgate, and shake their fists; you know others who give drivers space and respect. Now researchers have identified a trait the aggressive drivers might share: Narcissism.

The participants answered questions from the Narcissistic Personality Inventory, a set of questions used since 1988 to measure narcissism. This questionnaire had participants rate how strongly they agreed with items such as: “I like to be the center of attention,” or “I am an extraordinary person” on a 1 to 5 scale. They then addressed similar items about aggressive driving behavior: “I often swear when driving a car,” or “When driving my car, I easily get angry about other drivers.” ...The researchers report that the more narcissistic drivers are, the more angry and aggressive they reported becoming on the road.

a) Let's talk measurement first. How was narcissism measured: Did they use a self-report, an observational measure, or a physiological measure?

b) Now for the second variable, aggressive driving: How was this measured? Was it self-report, an observational measure, or a physiological measure?

c) Was this a correlational or experimental study? How do you know?

d) Sketch a graph (with well-labeled axes) of the results of the study.

Next, the researchers conducted a lab-based study with university students. They measured narcissism just as they'd done before, but they measured aggressive driving differently. Here's how they measured aggressive driving in the lab:

...participants sat in the driver’s seat of a 2010 Honda Accord, surrounded on three sides by a curved projection screen. In a 15- to 25-minute driving exercise, the participants saw other computer-generated cars and were told that some of them were being operated by other study participants. (In fact, the experimenters were controlling the other vehicles.)

During the exercise, the participants encountered:

a car pulling suddenly in front of them;

a traffic jam with two 10-second full traffic stops, one after another;

a construction zone with one lane closed and the other slowed down;

a second car mimicking the human driver’s behavior; and

a traffic light that was red for 60 seconds and green for just 5 seconds.

The researchers found that the participants who scored high on narcissism measures were more likely to tailgate, speed, drive off-road, cross the center line into oncoming traffic, drive on the shoulder, honk their horn, or use “verbal aggression” or “aggressive gestures,” in the experimenters’ chaste wording.

e) In this study, how was aggressive driving measured: Did they use a self-report, an observational measure, or a physiological measure?

f) Was this a correlational or experimental study? How do you know?

g) Sketch a graph of the results of the study. Label your axes mindfully.

h) Can you think of moderators of this basic relationship? For example, might there be situations or settings in which narcissism is especially strongly linked to aggressive driving? (As you answer, consider this: Past work on narcissism has established that narcissists aren't always aggressive; they are mainly aggressive when others reject them or when they are provoked.) Create a moderator table like those seen in Chapter 8 (e.g., Figure 8.19 or Table 8.5).

i) Can you think of a mediator that explains the relationship between narcissism and aggressive driving? If so, sketch a mediator diagram like those seen in Chapter 9 (e.g., Figure 9.11 or Figure 9.13).

02/10/2018

People in studies who spend more time socializing, rather than time on their phones, are generally happier. Photo: Maria Taglienti-Molinari/Getty Images

Are smartphones making young people lonely, anxious and depressed? Are teens spending time on phones instead of dating, driving, or drinking? That's the argument of a new data-based book by psychologist Jean Twenge.

Several graphs presented in the CNN interview depict patterns of teenage behavior over time, from nationally representative surveys of youth. The graphs show how various healthy activities such as "hanging out with friends" dropped starting in 2012. Twenge's explanation is that 2012 is the first year that more than 50% of Americans owned smartphones. Twenge argues that instead of going out with friends, driving, or being independent, today's teenagers are staying home and connecting with friends only via Snapchat and Instagram.

a) The graph in the CNN story (see minute 1:50 to 2:05; and also here) shows several trends over time. These figures come from a study that is quasi-experimental. Which type of quasi-experiment does this seem to be?

b) Take a look at the "More likely to feel lonely" figure at minute 2:02. (Again, you can also see it here by scrolling down.) The change in loneliness after 2007 has been described as "dramatic" and "precipitous." What do you think? Specifically, look at the y-axis of the graph. How dramatic would the data look if the axis ranged from, say, 0 to 100?

c) Twenge's argument is that the rise of smartphones around 2012 is responsible for decreased social contact and increased loneliness, anxiety, and depression among youth. What might be some plausible alternative explanations for the pattern, other than smartphones? (That is, what are some internal validity threats?)

The data above are longitudinal data, collected over time. Twenge's research has also included correlational studies collected in one group of teenagers at a single point in time. This report in the Washington Post presents the results of an empirical study that found, among other things, that teens who spent more time on the Internet, texting, computer games, or social media were lower in happiness.

d) Scroll down to the graph created by the Washington Post, which is titled "What makes teens happy?" You'll see that each bar represents a correlation between one use of time and teen happiness. What does a gray bar (a negative correlation) mean? For example, what does the -0.11 correlation for Internet use mean? Sketch a little scatterplot of this correlation.

e) In the same graph, what does a blue bar (a positive correlation) mean? For example, what does the 0.14 correlation for sports or exercise mean? Sketch a little scatterplot of this correlation.

f) Consider the correlation mentioned in d). Does this correlation, on its own, allow us to conclude that time on the Internet causes lower levels of happiness? What third variables might be responsible for this negative correlation? What about temporal precedence of the two variables?

g) How might you describe the effect size of these correlations: are they weak, moderate, or strong? How do you know?

h) Consider both the longitudinal data (in questions a, b, and c) and the correlational data (in questions d-g). Both types of studies support the same conclusion, which makes them an example of "pattern and parsimony." Explain why this is the case, in your own words.

Instructors: You and your students might also be interested in a curvilinear relationship described in the same Post piece:

The report’s findings were not all dire: Teenagers who get a small amount of exposure to screen time, between one and five hours a week, are happier than those who get none at all. The least happy ones were those who used screens for 20 or more hours a week.

01/20/2018

Do people with wider faces (left) show more antisocial tendencies than those with narrow faces (right)? Photo: istockphoto

According to several previous studies in psychological science, men with wider faces--a greater ratio of width to height (like in the photo on the left, compared to the right)--tend to show antisocial tendencies such as racial bias, exploitation, and even aggression. Researchers attributed this link to exposure to testosterone during development, which, they say, causes both wider facial structure and antisocial behavior.

Kosinski found that previous studies often had methodological shortcomings such as small sample sizes. Half of the previous studies that he identified involved fewer than 25 participants and the average sample size was 40. And seven out of ten of the studies only just crossed the conventional threshold for significance of p=.05.

These factors led Kosinski to conduct a large-scale study of face measurements and behavioral tendencies. His research, published in Psychological Science, finds no relationship between facial width-to-height ratios (fWHR) and behavioral tendencies in a large sample of over 135,000 participants.

Questions

a) Review the material in Chapters 11 and 14, and explain why studies based on small samples can lead to results that are difficult to replicate. (You might also want to review the "kindergarten height" example in this recent blog post).

b) Why is it a problem that, in 7 out of 10 studies, the results "only just crossed the conventional threshold for significance?"

Now read a bit more about the "big data" methods that Kosinski employed in his research:

Kosinski turned to a very large dataset collected via a Facebook app called MyPersonality.org. The app comprised a collection of psychometric tests and surveys that Facebook users could take and then see how they scored — they could also volunteer their scores and Facebook profile data to be used in research projects. Using this bank of over 800,000 users’ surveys and over 2 million profile pictures, Kosinski tested his research question: Do broad faces indicate antisocial tendencies? [...]

After a preliminary experiment with 1,692 users showed that a computer could measure width-to-height ratios with the same accuracy that humans could, Kosinski analyzed 173,241 photos from 137,163 male and female participants (some users had multiple profile pictures and their measurements were averaged before analysis).

The results showed that facial broadness didn’t substantially correlate with any of the 55 personality measures tested.... “For example, broader-faced people reported themselves to be more prosocial, sympathetic, trusting, and cooperative,” says Kosinski. “Also, broader-faced people reported less interest in drug use, weapons, piercing, and tattoos. Moreover, broader-faced people did not score significantly higher on any of the traits positively related to antisocial and aggressive behavioral tendencies, including the personality facets of excitement-seeking and anger, impulsiveness, and militarism (i.e., interest in paramilitary groups, the armed forces, bodybuilding, martial arts, and survivalism).”

c) According to this description, Kosinski is basically running a series of bivariate correlations. Each one was between a self-reported trait and _________?

d) Pick one of the personality variables tested in the study. Now sketch a scatterplot of the result, labelling your axes carefully.

e) Kosinski's sample included more than a hundred thousand users. Why might this lead to a more stable estimate of the true relationship between facial broadness and personality? (This is the complement to question a), above)

f) Kosinski's study is an example of a "failure to replicate." Review the concepts in Table 14.1 and indicate which elements might apply in this case.

g) What questions might you ask about the construct validity of the personality measures used in Kosinski's study?

Suggested answers

a) and e) Small samples are more likely to be affected by one or two extreme scores, whereas in very large samples, the extreme scores are much more likely to be balanced out by other scores. The gifs in this blog post show the principle dynamically.
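The same principle can be demonstrated with a small simulation (a sketch using only the standard library; the true correlation of .2 and the sample sizes are arbitrary choices): observed correlations swing widely from sample to sample when n is small and settle down when n is large.

```python
import random
import statistics

def spread_of_r(n, rho=0.2, reps=500, seed=1):
    # Draw `reps` samples of size n from a population whose true
    # correlation is rho, and return the standard deviation of the
    # observed sample correlations (how much r bounces around).
    rnd = random.Random(seed)
    rs = []
    for _ in range(reps):
        xs = [rnd.gauss(0, 1) for _ in range(n)]
        ys = [rho * x + (1 - rho**2) ** 0.5 * rnd.gauss(0, 1) for x in xs]
        mx, my = statistics.mean(xs), statistics.mean(ys)
        sx, sy = statistics.stdev(xs), statistics.stdev(ys)
        r = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)
        rs.append(r)
    return statistics.stdev(rs)

# The observed r's scatter far more with 25 people per sample than with 400:
print(round(spread_of_r(25), 2))
print(round(spread_of_r(400), 2))
```

With n = 25 (close to the sample sizes in the original fWHR studies), a true correlation of .2 can easily show up as anything from strongly negative to strongly positive in a given sample.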

b) Some researchers have proposed that when a manuscript reports p-values very close to the conventional cutoff of .05 (p-values of .04 or .03), it's a sign that a researcher might have "p-hacked" the study. P-hacking is when a researcher goes through a series of options when analyzing the data, such as eliminating outliers, adding covariates, or testing multiple dependent measures, stopping analysis only when p just crosses under the .05 threshold. Therefore, when, in a body of literature, most of the p-values are just below .05, we might suspect that the underlying finding is a fluke, not a real result.
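One way to see why flexible analysis is a problem: if each extra analysis of pure noise is (simplifying) an independent shot at a 5% false positive, trying several analyses and keeping the one that "works" inflates the error rate quickly. A back-of-the-envelope sketch:

```python
# Chance of at least one false positive when a researcher tries k
# independent analyses of pure noise, each at alpha = .05.
def false_positive_rate(k, alpha=0.05):
    return 1 - (1 - alpha) ** k

for k in [1, 5, 20]:
    print(k, round(false_positive_rate(k), 2))
# One analysis: 5%. Five analyses: about 23%. Twenty: about 64%.
```

Real analytic choices are usually correlated rather than independent, so the true inflation is smaller than this, but the direction of the problem is the same.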

c) Facial broadness, as measured by width-to-height ratio.

d) One axis should be labelled "facial broadness" and the other might be labelled "interest in drug use." The cloud of points should be extremely spread out, showing no pattern or discernible slope.

e) see a) answer above.

f) The concepts in Table 14.1 that seem to apply best are the third (the original study's sample was very small) and perhaps the fourth (the original study may have tried multiple statistical analyses). (We cannot be sure without more investigation into the original studies, but these are the two issues raised in the APS summary of Kosinski's work.)

g) Indeed, we don't know much about the personality measures used in the study. The full manuscript might report more about whether data collected with these personality measures shows that they are reliable and valid.

09/20/2017

One conspiracy theory holds that the government spreads dangerous chemicals in the exhaust from airplanes (the "chemtrails" theory). According to the studies, who is more likely to believe this theory? Credit: Jason Batterham/Shutterstock

There's a causal claim in this journalist's story about research on people who believe conspiracy theories. First, here's the causal headline:

b) Draw a diagram of this causal claim, using this form: A ---> B. (That is, which variable comes first, according to the wording the journalist used?)

Now that you've established the claim, does the research support it? The journalist reports on several studies conducted by researchers at the University of Mainz. Here's one:

The researchers first asked a sample of 238 US participants recruited via Amazon’s Mechanical Turk survey website to complete a self-reported “Need For Uniqueness” scale (they rated their agreement with items like “being distinctive is extremely important to me”) and a Conspiracy Mentality scale (e.g. “Most people do not see how much our lives are determined by plots hatched in secret.”) before indicating whether or not they believed in a list of 99 conspiracy theories circulating online....[The results showed that] participants’ self-reported Need For Uniqueness ... correlated with their stronger endorsement of the conspiracy beliefs.

c) In the study above, sketch a scatterplot of the results they described. Label your axes mindfully.

d) Is the study above correlational or experimental? Why or why not?

e) Does this study support the claim that the journalist assigned to it? That is, does it show that believing conspiracy theories causes people's need for uniqueness to be satisfied? As you explain your answer, pay careful attention to possible third variables. That is, what third variables ("C"s) might be reasonably correlated with both need for uniqueness and belief in conspiracies?

Here's another study in the series:

The second study replicated this finding with a further 465 Mechanical Turk participants based in the US, but this time half the sample read a list of the five most well known conspiracy theories and the five least known ones, whereas the other half of the group read the five most popular conspiracy theories and the five least popular. Again, self-reported Need For Uniqueness correlated with stronger agreement with the various conspiracy theories.

f) Is the second study correlational or experimental? Why? Sketch the results in a well-labeled scatterplot.

g) Does this second study rule out any of the third variables you came up with when answering question e) above?

Here's a final study:

Note, the conspiracy theory that featured in this final experiment was entirely made-up by the researchers. ...The conspiracy theory was about smoke detectors and the claim was that they produce dangerous hypersound. The researchers led half of 290 participants on Amazon’s Mechanical Turk to believe that this was a popular conspiracy theory in Germany where it was alleged to be believed by 81 per cent of Germans. The rest of the participants were led to believe that the theory was doubted by 81 per cent of Germans.

While information about the popularity of the theory didn’t affect participants overall, it did impact those who said that they tended to endorse a lot of conspiracy theories. Among these conspiracy-prone participants, their belief in the made-up smoke detector conspiracy was enhanced on average when the conspiracy was framed as a minority opinion.

h) Is the above study correlational or experimental? Why?

i) Sketch the results of the study. Hint: you might need to use a bar graph that includes two colors of bars.

j) To what extent does this final study support the claim that believing "conspiracy theories causes people's need for uniqueness to be satisfied"?

k) If the study doesn't support the claim given in j), is there another, modified causal claim that the study can support? What is it?

In the final section of the story, the journalist concludes, "Taken together, the findings provide convincing evidence that some people are motivated to agree with conspiracy theories with an aura of exclusiveness."

l) Should this final claim be classified as causal or association? To what extent is this claim supported by the studies?

My thanks to Marianne Lloyd of Seton Hall University for sharing this story!

Selected answers

d) This is a correlational study because both need for uniqueness and belief in conspiracy theories were measured variables.

e) In this correlational study, we can't know for sure whether Need for Uniqueness --> Belief in Conspiracy Theories, whether Belief in Conspiracy Theories --> Need for Uniqueness, or whether some third variable is associated with both. One possible third variable might be social anxiety: perhaps socially anxious people believe more conspiracy theories, and socially anxious people are also high in need for uniqueness. Being highly educated is associated with lower belief in conspiracy theories (according to this study), so if highly educated people are also low in need for uniqueness, then education is a potential third variable as well.

f) Both variables were measured again, so....correlational study!

g) You could raise the same objections here that you did in e).

h) This study is experimental because they manipulated the proportion of people who supposedly believed in the theory. There are two measured variables in this study as well; one is the tendency to believe in conspiracy theories (high or low), and the other is the extent of belief in the smoke detector theory.

j) No; in fact, the journalist didn't report need for uniqueness as a variable in this study at all, so the study probably can't support that claim.

k) I believe this study can support the following claim: Telling people that only a minority of Germans believe a conspiracy theory causes people to believe it more, but only if they are already the type of person who endorses conspiracy theories.

l) The final statement is an association claim. Because most of the studies were correlational, they do support it!

06/20/2017

The study found an association between screen time and language delay in a sample of 900 18-month olds. Photo: Maria Sbytova/Shutterstock

This CNN story reports on research presented at an academic pediatrics conference. According to the conference presentation,

[A] study found that the more time children between the ages of six months and two years spent using handheld screens such as smartphones, tablets and electronic games, the more likely they were to experience speech delays.

According to this description, the two main conceptual variables in the study are "time spent using a handheld screen" and "speech delay." Read on to find out how each variable was operationalized:

In the study, which involved nearly 900 children, parents reported the amount of time their children spent using screens in minutes per day at age 18 months. Researchers then used an infant toddler checklist, a validated screening tool, to assess the children's language development also at 18 months. They looked at a range of things, including whether the child uses sounds or words to get attention or help and puts words together, and how many words the child uses.

a) According to the text, how did the study operationalize the variable, time spent using a handheld screen? Do you think this was a valid measure? Why or why not?

b) According to the text, how did the study operationalize the variable, speech delay?

c) The passage describes the infant toddler checklist as "a validated screening tool"--how do you think this tool was validated? What data might they have collected to validate this measure? (apply concepts from Chapter 5)

d) Sketch a scatterplot of the association they reported in the first quoted section.

e) Does this association allow us to conclude that "exposure to handheld screens causes children to experience speech delays?" Why or why not?

Read the following description; identify the mediator that the speaker is proposing.

"We do know that young kids learn language best through interaction and engagement with other people, and we also know that children who hear less language in their homes have lower vocabularies." It may be the case that the more young children are engaged in screen time, then the less time they have to engage with caretakers, parents and siblings, said [an expert who commented on the findings].

f) Sketch the mediator pattern that is hypothesized above.

Suggested answers

a) This variable appears to have been operationalized via parents' reports of their children's screen time.

b) Speech delay may have been operationalized with multiple measures. It is not clear from the reporting whether the Infant Toddler Checklist is one measure, separate from whether the "child uses sounds or words to get attention or help and puts words together, and how many words the child uses," or whether these three components comprise the Infant Toddler Checklist.

c) One way to validate a measure such as the Infant Toddler Checklist would be through a known-groups paradigm. Professional speech-language pathologists might identify groups of children who either do or do not have a speech delay. Then all children in both groups would be administered the Checklist. If the Checklist is valid, the group diagnosed with speech delay should score lower on it than the group diagnosed as developing typically.
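A toy version of that known-groups check might look like the sketch below. All the numbers are invented for illustration (the real Checklist's scoring and group means are not described in the story): the point is simply that a valid measure should separate groups that experts have already classified.

```python
import random
import statistics

random.seed(1)

# Hypothetical checklist scores for two groups already diagnosed by
# speech-language pathologists; means and spreads are made up for this sketch.
delayed = [random.gauss(12, 3) for _ in range(200)]  # diagnosed with speech delay
typical = [random.gauss(20, 3) for _ in range(200)]  # developing typically

# A clearly positive gap (typical group scoring higher) is the known-groups
# evidence that the checklist tracks real differences in language development.
gap = statistics.mean(typical) - statistics.mean(delayed)
print(f"mean gap = {gap:.1f}")
```

If the two diagnosed groups scored about the same on the checklist, the known-groups evidence would fail and we would doubt the measure's validity.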

d) Your scatterplot should have "Time spent on screens" on one axis and a measure of language on the other. If you plot "Speech delay," the dots should slope upward from left to right; if you instead plot the Infant Toddler Checklist score (where higher scores mean stronger language development), the dots should slope downward.

e) The results show covariance: speech delay is associated with time on screens. The method does not allow us to establish temporal precedence, since both variables were apparently measured at the same time. We therefore do not know whether the screen time came first (perhaps inhibiting speech development) or whether children with speech delays are more likely to be drawn to screens (perhaps to cope with the frustration of not communicating easily). Internal validity is also not established. A reasonable third-variable explanation might be time in day care: it is possible that children in day care are less likely to be on screens (because most day cares have lots of other toys) and are also exposed to more language because there are more people around.

f) You'd draw this mediator pattern:

exposure to screens ---> less social engagement with caretakers ---> language delay

05/20/2017

Can you find North Korea on this map? Credit: ASDFGH/Wikimedia Commons

Readers can explore several frequency and association claims in this New York Times story headlined, "If Americans Can Find North Korea on a Map, They're More Likely to Prefer Diplomacy."

The story reports that when shown a map of East Asia (like the one shown here), only 36% of Americans could correctly locate the country of North Korea. The country has been in the news a lot lately because it has launched ballistic missiles. What should the American response be to such aggression?

An experiment led by Kyle Dropp of Morning Consult from April 27-29, conducted at the request of The New York Times, shows that respondents who could correctly identify North Korea tended to view diplomatic and nonmilitary strategies more favorably than those who could not. These strategies included imposing further economic sanctions, increasing pressure on China to influence North Korea and conducting cyberattacks against military targets in North Korea.

They also viewed direct military engagement – in particular, sending ground troops – much less favorably than those who failed to locate North Korea.

a) The finding that only 36% of Americans can locate North Korea on a map is what kind of claim? How would you state the variable(s) involved in this claim?

b) The report says "respondents who could correctly identify North Korea tended to view diplomatic and nonmilitary strategies more favorably than those who could not." What kind of claim is this? What are the variable(s) involved in this claim?

c) In the quoted text above, the journalist called this "An experiment." It's not. How come?

The lead image in the story (linked here) depicts people's guesses about where North Korea is located. Could you locate it?

d) Go to the story now and scroll down about half way to see how they have presented correlations between "Knowing where North Korea is" and their endorsement of various responses to North Korean aggression. What does a negative number mean in this table?

e) Make a prediction: Who do you think is more likely to know where North Korea is: Men or women? Democrats or Republicans? People who've visited another country or those who've never been abroad? Older or younger Americans?

Now, scroll down the story to the pink bar graphs and see if your predictions were supported.

If you’re a research methods instructor or student and would like us to consider your guest post for everydayresearchmethods.com, please contact Dr. Morling. If, as an instructor, you write your own critical thinking questions to accompany the entry, we will credit you as a guest blogger.