July 2018

07/20/2018

Is social media use responsible for depressed mood? Photo: Ian Allenden/Alamy stock

Do smartphones harm teenagers? If so, how much? In this blog, I've written before about the quasi-experimental and correlational designs used in research on screen time and well-being in teenagers. In that post you can practice identifying the different designs we can use to study this question.

Today's topic is more about the size of the effect in studies that have been published. A recent Wired story tried to put the effect size in perspective.

One side of the argument, as presented by Robbie Gonzalez in Wired, scares us into seeing social media as dangerous.

For example, first

...there were the books. Well-publicized. Scary-sounding. Several, really, but two in particular. The first, Irresistible: The Rise of Addictive Technology and the Business of Keeping Us Hooked, by NYU psychologist Adam Alter, was released March 2, 2017. The second, iGen: Why Today's Super-Connected Kids are Growing Up Less Rebellious, More Tolerant, Less Happy – and Completely Unprepared for Adulthood – and What That Means for the Rest of Us, by San Diego State University psychologist Jean Twenge, hit stores five months later.

In addition,

...Former employees and executives from companies like Facebook worried openly to the media about the monsters they helped create.

But is worry over phone use warranted? Here's what Gonzalez wrote after talking to more researchers:

When Twenge and her colleagues analyzed data from two nationally representative surveys of hundreds of thousands of kids, they calculated that social media exposure could explain 0.36 percent of the covariance for depressive symptoms in girls.

But those results didn’t hold for the boys in the dataset. What's more, that 0.36 percent means that 99.64 percent of the group’s depressive symptoms had nothing to do with social media use. Przybylski puts it another way: "I have the data set they used open in front of me, and I submit to you that, based on that same data set, eating potatoes has the exact same negative effect on depression. That the negative impact of listening to music is 13 times larger than the effect of social media."

In datasets as large as these, it's easy for weak correlational signals to emerge from the noise. And a correlation tells us nothing about whether new-media screen time actually causes sadness or depression.

There are several things to notice in the extended quote above. First, let's unpack what it means to "explain 0.36% of the covariance." Sometimes researchers will square the correlation coefficient r to create the value R². The R² tells you the proportion of variance in one variable that is explained by the other (incidentally, researchers usually say "percent of the variance" rather than "percent of covariance"). In this case, it tells you how much of the variance in depressive symptoms is explained by social media time (and, by elimination, what percentage is attributable to something else). We can take the square root of 0.0036 (that's 0.36% written as a proportion) to get the size of the original r between depressive symptoms and social media use. It's r = .06.
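The arithmetic in the paragraph above can be checked in a few lines. This is a minimal sketch in Python; the function name is made up for illustration. Squaring a correlation discards its sign, so the square root recovers only the size of r, not its direction.

```python
import math

def r_magnitude_from_percent(percent_explained):
    """Convert a 'percent of variance explained' figure into the size of r."""
    proportion = percent_explained / 100  # 0.36 percent -> 0.0036
    return math.sqrt(proportion)          # the square root of R-squared gives |r|

r = r_magnitude_from_percent(0.36)
print(f"|r| = {r:.2f}")
print(f"percent of variance left unexplained = {100 - 0.36:.2f}")
```

Running it reproduces the two figures discussed above: a correlation of about .06, and 99.64 percent of the variance left to other factors.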

Questions

a) Based on the guidelines you learned in Chapter 8, is an r of .06 small, medium, or large?

b) Przybylski claims that the effect of social media use on depression is the same size as eating potatoes. On what data might he be basing this claim? Illustrate your answer with two well-labelled scatterplots, one for social media and the other for potatoes. Now add a third scatterplot, showing listening to music.

c) When the Wired story says that the correlation held for the girls but not for the boys, what kind of model is that? (Your choices: moderation, mediation, or a third-variable problem.)

d) Finally, the story notes that in large data sets, it's easy for weak correlational signals to emerge from the noise. What statistical concepts are being applied here?

e) Chapter 8 presents another example of a large data set that found a weak (but statistically significant) correlation. What is it?

f) The discussion above between Gonzalez and Przybylski concerns which of the four big validities?

Suggested answers

a) An r of .06 is probably going to be characterized as "small" or "very small" or even "trivial." That's what the "potatoes" point is trying to illustrate, in a more concrete way.

b) One scatterplot should be labeled with "potato eating" on the x axis and "depression symptoms" on the y axis. The second scatterplot should be labeled with "social media use" on the x axis and "depression symptoms" on the y axis. These first two plots should show a positive slope, with the points very spread out to indicate the weakness of the association. The spread of the first two scatterplots should be almost the same, to represent the claim that the two relationships are equal in magnitude. The third scatterplot should be labeled with "listening to music" on the x axis and "depression symptoms" on the y axis, and this plot should show a much stronger positive correlation (a tighter cloud of points).

c) It is a moderator. Gender moderates (changes) the relationship between screen use and depression.

d) Very large data sets have a lot of statistical power. Therefore, large data sets can show statistical significance for even very, very small correlations--even correlations that are not of much practical interest. A researcher might report a "statistically significant" correlation, but it's essential to also ask about the effect size and its practical value (the potatoes argument). Note: you can see the r = .06 value in the original empirical article here, on p. 9.

e) The example in Chapter 8 is the one about meeting one's spouse online and having a happier marriage--that was a statistically significant relationship, but r was only .03. That didn't stop the media from hyping it up, however.

f) Statistical validity

g) The research on smartphones and depressive symptoms is correlational, making causal claims (and causal language) inappropriate. That means that we can't be sure if social media is leading to the (slight) increase in depressive symptoms, or if people who have more depressive symptoms end up using more social media, or if there's some third variable responsible for both social media use and depressive symptoms. As the Wired article states,

...research on the link between technology and wellbeing, attention, and addiction finds itself in need of similar initiatives. They need randomized controlled trials, to establish stronger correlations between the architecture of our interfaces and their impacts; and funding for long-term, rigorously performed research.
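The point in answer d), that a very large sample can make even a tiny correlation statistically significant, can be illustrated with a rough calculation. The sketch below uses the Fisher z approximation to get a two-sided p-value for r = .06. The sample sizes are hypothetical and chosen only for illustration; they are not the sample sizes of the surveys discussed above. The function name is also made up for this example.

```python
import math
from statistics import NormalDist

def approx_p_value(r, n):
    """Approximate two-sided p-value for testing whether a correlation is zero."""
    # Fisher z: atanh(r) * sqrt(n - 3) is roughly standard normal when the
    # true correlation is zero.
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * NormalDist().cdf(-abs(z))

r = 0.06  # the correlation discussed above
for n in (100, 1_000, 10_000, 100_000):  # hypothetical sample sizes
    p = approx_p_value(r, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {n:>7,}: p = {p:.4f} ({verdict} at the .05 level)")
```

The correlation is the same in every row, so its practical importance is unchanged. Only the p-value changes as the sample grows, which is why effect size has to be judged separately from statistical significance.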

07/10/2018

Her school achievement later in life can be predicted from her ability to wait for a treat (or by her family's SES). Photo: Manley099/Getty Images

There's a new replication study about the famous "marshmallow study", and it's all over the popular press. You've probably heard of the original research: Kids are asked to sit alone in a room with a single marshmallow (or some other treat they like, such as pretzels). If the child can wait for up to 15 minutes until the experimenter comes back, they receive two marshmallows. But if they eat the first one early, they don't. As part of the original study, kids were tracked over several years. One of the key findings was that the longer children were able to wait at age 4, the better they were doing in school as teenagers. Psychologists have often used this study as an illustration of how self-control is related to important life outcomes.

The press coverage of this year's replication study illustrates at least two things. First, it's a nice example of multiple regression. Second, it's an example of how different media outlets assign catchy--but sometimes erroneous--headlines to the same study.

First, let's talk about the multiple regression piece. Regression analyses often try to understand a core bivariate relationship more fully. In this case, the core relationship they start with is between the two variables, "length of time kids waited at age 4" and "test performance at age 15." Here's how it was described by Payne and Sheeran in the online magazine Behavioral Scientist:

The result? Kids who resisted temptation longer on the marshmallow test had higher achievement later in life. The correlation was in the same direction as in Mischel’s early study. It was statistically significant, like the original study. The correlation was somewhat smaller, and this smaller association is probably the more accurate estimate, because the sample size in the new study was larger than the original. Still, this finding says that observing a child for seven minutes with candy can tell you something remarkable about how well the child is likely to do in high school.

a) Sketch a well-labelled scatterplot of the relationship described above. What direction will the dots slope? Will they be fairly tight to a straight line, or spread out?

b) The writers (Payne and Sheeran) suggest that a larger sample size leads to a more accurate estimate of a correlation. Can you explain why a large sample size might give a more accurate statistical estimate? (Hint: Chapter 8 talks about outliers and sample size--see Figures 8.10 and 8.11.)
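One way to explore question b) is to look at how the margin of error around a correlation shrinks as the sample grows. The sketch below computes an approximate 95% confidence interval using the Fisher z method. The correlation of .30 and the sample sizes are hypothetical, picked only to show the pattern; they are not values from the original study or the replication. The function name is likewise invented for this example.

```python
import math

def correlation_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation (Fisher z method)."""
    z = math.atanh(r)            # transform r to the z scale
    se = 1 / math.sqrt(n - 3)    # the standard error shrinks as n grows
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

r = 0.30  # hypothetical correlation, not a value from either study
for n in (50, 500, 5_000):  # hypothetical sample sizes
    low, high = correlation_ci(r, n)
    print(f"n = {n:>5,}: 95% CI from {low:.2f} to {high:.2f} (width {high - low:.2f})")
```

The interval narrows as n increases, which is one sense in which a larger sample gives a more precise estimate. It complements the outlier argument in the Chapter 8 hint rather than replacing it.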

Now here's more about the study:

The researchers next added a series of “control variables” using regression analysis. This statistical technique removes whatever factors the control variables and the marshmallow test have in common. These controls included measures of the child’s socioeconomic status, intelligence, personality, and behavior problems. As more and more factors were controlled for, the association between marshmallow waiting and academic achievement as a teenager became nonsignificant.

c) What's proposed above is that social class is a third variable ("C") that might be associated with both waiting time ("A") and school achievement ("B"). Using Figure 8.15, draw this proposal. Think about it, too: Why does it make sense that lower SES might go with lower waiting time (A)? Why might lower SES go with lower school achievement (B)?

d) Now create a mockup regression table that might fit the pattern of results being described above. Put the DV at the top (what is the DV?), then list the predictor variables underneath, starting with Waiting time at Age 4, and including things like Child's Socioeconomic Status and Intelligence. Which betas should be significant? Which should not?

Basically, here we have a core bivariate relationship (between wait time and later achievement), and then a critic suggests a possible third variable (SES). The researchers used regression to see whether the core relationship was still there when the third variable was controlled for. The core relationship went away, suggesting that SES is a third variable that can help explain why kids who wait longer do better in school later on.
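The logic of controlling for a third variable can be sketched with simulated data. Everything below is made up for illustration: the data are randomly generated so that SES influences both waiting time and achievement while waiting time has no direct effect of its own, and the sample size and effect strengths are arbitrary assumptions. None of these numbers come from the replication study. The betas use the textbook formula for standardized coefficients in a regression with two predictors.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(2018)  # fixed seed so the run is repeatable
n = 5_000                  # hypothetical sample size

# Simulated data: SES drives both variables; waiting has no direct effect.
ses = [rng.gauss(0, 1) for _ in range(n)]
wait = [0.6 * c + rng.gauss(0, 0.8) for c in ses]
achievement = [0.6 * c + rng.gauss(0, 0.8) for c in ses]

r_wa = pearson_r(wait, achievement)  # the core bivariate relationship
r_ws = pearson_r(wait, ses)
r_sa = pearson_r(ses, achievement)

# Standardized betas for a regression predicting achievement from wait time and SES.
beta_wait = (r_wa - r_ws * r_sa) / (1 - r_ws ** 2)
beta_ses = (r_sa - r_ws * r_wa) / (1 - r_ws ** 2)

print(f"bivariate r (wait, achievement): {r_wa:.2f}")
print(f"beta for wait time, controlling for SES: {beta_wait:.2f}")
print(f"beta for SES, controlling for wait time: {beta_ses:.2f}")
```

In this artificial setup the bivariate correlation is clearly positive, but the beta for waiting time falls close to zero once SES is in the model. That is the same pattern the mockup regression table in question d) is meant to capture. Real data are messier, and the study described above added several control variables, not just one.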

Next, let's talk about some of the hype around this replication study. The Behavioral Scientist piece (quoted above) is one of the more balanced descriptions. Its headline was "Try to Resist Misinterpreting the Marshmallow Test." It emphasized that the core relationship was replicated. It also explained in some detail why SES is related to self-control, and how the two probably cannot be meaningfully separated--it's a nuanced report. But other press coverage had a doomsday feel:

One person on Twitter even wrote, "The marshmallow/delayed gratification study always felt 'wrong' to me - this year it was reported to be hopelessly flawed."

Are these headlines and comments fair? Probably not. As Payne and Sheeran write in Behavioral Scientist,

The problem is that scholars have known for decades that affluence and poverty shape the ability to delay gratification. Writing in 1974, Mischel observed that waiting for the larger reward was not only a trait of the individual but also depended on people’s expectancies and experience. If researchers were unreliable in their promise to return with two marshmallows, anyone would soon learn to seize the moment and eat the treat. He illustrated this with an example of lower-class black residents in Trinidad who fared poorly on the test when it was administered by white people, who had a history of breaking their promises. Following this logic, multiple studies over the years have confirmed that people living in poverty or who experience chaotic futures tend to prefer the sure thing now over waiting for a larger reward that might never come. But if this has been known for years, where is the replication crisis?
