QRock 100.7 was one of several news outlets that had fun describing this study for its readers.

Let's find out what kind of study was conducted to test the claim. The Q100.7 journalist wrote:

A new study found loud music makes us more likely to order unhealthy food when we’re dining out. A new study in Sweden found loud music in restaurants makes us more likely to choose unhealthy menu options. And we’re more likely to go with something healthy like a salad when the music ISN’T so loud.

Researchers went to a café and played music at different decibel levels to see how it affected what people ordered. Either 55 decibels, which is like background chatter or the hum from a refrigerator . . . or 70 decibels, which is closer to a vacuum cleaner.

And when they cranked it up to 70, people were 20% more likely to order something unhealthy, like a burger and fries.

They did it over the course of several days and kept getting the same results. So the study seems pretty legit.
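
One note about the operationalization before the questions: decibels are a logarithmic scale, so the jump from 55 to 70 dB is larger than the raw numbers suggest. Here is a minimal sketch (plain Python, using only the two levels reported in the story) of what that difference means in terms of sound intensity:

```python
# Decibels are logarithmic: every +10 dB is a 10x increase in sound intensity
# (and, as a common rule of thumb, roughly a doubling of perceived loudness).
quiet_db = 55   # roughly background chatter or a refrigerator hum
loud_db = 70    # roughly a vacuum cleaner

intensity_ratio = 10 ** ((loud_db - quiet_db) / 10)
print(f"A {loud_db - quiet_db} dB increase is about {intensity_ratio:.0f}x the sound intensity")
# A 15 dB increase is about 32x the sound intensity
```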

a) OK, go: What seems to be the independent variable in this study? What were its levels? How was it operationalized?

b) What seems to be the dependent variable? How was it operationalized? Think specifically about how they might have operationalized the concept "unhealthy."

c) Do you think this study counts as an experiment or a quasi-experiment? Explain your answer.

d) This study can be called a "field study" or perhaps a "field experiment". Why?

e) To what extent can this study support the claim that loud music makes you eat bad food? Apply covariance, temporal precedence, and internal validity to your response.

f) If you were manipulating the loudness of the music for a study like this, how might you do so in order to ensure that the music, and not other restaurant factors, was responsible for the increase in ordering "unhealthy" food?

g) The Q100.7 journalist argues that the study seems "pretty legit." What do you think the journalist meant by this phrase?

h) The study on food and music volume is summarized in an open-access conference abstract, published here. You might be surprised to read, contrary to the journalist's report, that the field study was conducted on only two days--with one day at 50 dB and the other at 70 dB. How does this change your thoughts about the study?

i) Conference presentations are not quite the same as peer-reviewed journal publications. Take a moment (and use your PsycINFO skills) to decide if the authors, Biswas, Lund, and Szocs, have published this work yet in a peer-reviewed journal. Why might journalists choose to cover a story that has only been presented at a conference rather than published in a peer-reviewed journal? Is this a good practice in general?

As you can see, these journalists (or their editors) attached extremely strong titles to their science articles! An actual scientist wouldn't describe the results of a study with such strong terms as "prove" or "They work." That's because research in science is a steady accumulation of evidence--each study teaches us a little bit more, but no study can "prove" a theory or a claim.

The "study" mentioned in the three headlines above was actually a meta-analysis of 522 clinical trials (that is, randomized controlled studies) of antidepressants. Here's a summary and interpretation according to the Neuroskeptic blog:

...the authors, Andrea Cipriani et al., conducted a meta-analysis of 522 clinical trials looking at 21 antidepressants in adults. They conclude that “all antidepressants were more effective than placebo”, but the benefits compared to placebo were “mostly modest”. Using the Standardized Mean Difference (SMD) measure of effect size, Cipriani et al. found an effect of 0.30, on a scale where 0.2 is considered ‘small’ and 0.5 ‘medium’.
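
If it helps to see what an SMD of 0.30 represents, here is a minimal sketch in Python. The group means and standard deviations below are hypothetical, invented only to show the formula: the difference between the two group means divided by the pooled standard deviation (essentially Cohen's d).

```python
import math

# Hypothetical summary statistics for a drug group and a placebo group on a
# depression rating scale (lower = fewer symptoms). These numbers are made up
# purely to illustrate the calculation.
mean_drug, sd_drug, n_drug = 12.0, 10.0, 100
mean_placebo, sd_placebo, n_placebo = 15.0, 10.0, 100

# Pooled standard deviation across the two groups
sd_pooled = math.sqrt(((n_drug - 1) * sd_drug**2 + (n_placebo - 1) * sd_placebo**2)
                      / (n_drug + n_placebo - 2))

# Standardized mean difference (Cohen's d): a 3-point advantage over placebo
smd = (mean_placebo - mean_drug) / sd_pooled
print(f"SMD = {smd:.2f}")  # SMD = 0.30
```

By the conventions quoted above (0.2 "small," 0.5 "medium"), an SMD of 0.30 sits between small and medium.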

a) Review: What does a meta-analysis do? Why might we value a meta-analysis over a single study?

b) When Neuroskeptic describes the Standardized Mean Difference (SMD), they are referring to a statistic very much like Cohen's d. As you can see, the conventions for SMD are the same as for Cohen's d. Do you agree that the effect size of .30 could be considered "modest" according to these conventions?

c) I wrote above that "no study can 'prove' a theory or a claim." But what about a meta-analysis--do you think meta-analyses are more likely to be able to prove a theory? Are they definitive? (Why or why not?).

The Neuroskeptic criticized the media's coverage of this meta-analysis on a couple of grounds. First, they pointed out that the results of the new study are almost exactly the same as those of several older studies, suggesting that the new study is not particularly groundbreaking:

The thing is, “effective but only modestly” has been the established view on antidepressants for at least 10 years. Just to mention one prior study, the Turner et al. (2008) meta-analysis found the overall effect size of antidepressants to be a modest SMD=0.31 – almost exactly the same as the new estimate.

Second, the Neuroskeptic cleverly points out that, a few years ago, the media assigned the opposite headline to virtually the same result:

Cipriani et al.’s estimate of the benefit of antidepressants is also very similar to the estimate found in the notorious Kirsch et al. (2008) “antidepressants don’t work” paper! Almost exactly a decade ago, Irving Kirsch et al. found the effect of antidepressants over placebo to be SMD=0.32, a finding which was, inaccurately, greeted by headlines such as “Anti-depressants ‘no better than dummy pills’”.

d) What is a placebo, and why might it be important to use one in a study of antidepressants?

e) Why do you think the media wrote such different headlines about similar meta-analytic results?

Finally, here are some important additional comments from the Neuroskeptic article:

I’m not criticizing Cipriani et al.’s study, which is a huge achievement. It’s the largest antidepressant meta-analysis to date, including an unparalleled number of difficult-to-find unpublished studies (although both Turner et al. and Kirsch et al. did include some.) It includes a broader range of drugs than previous work, although it’s not quite comprehensive: there are no MAOIs, for instance, and in general older drugs are under-represented.

Even so, Cipriani et al. meta-analyzed the evidence on all of the most commonly prescribed drugs, and they were able to produce a comparative ranking of the different medications in terms of effectiveness and side-effects, which is likely to be useful.

f) Explain why Neuroskeptic is praising Cipriani et al.'s study for its use of "difficult-to-find unpublished studies." Why is this important in meta-analysis?

11/20/2017

The sun sets in Amarillo, TX an hour later than it does in Huntsville, AL, even though the two cities are in the same time zone. Amarillo residents get less sleep and earn less money: Is there a causal connection? Photo: Creativeedits/Wikimedia Commons

Sleep is an essential human function, and getting more sleep is associated with improved mood, cognitive performance, and physical performance. Therefore, it might make sense that sleep would improve people's productivity and ability to earn money. That's the topic of a Freakonomics episode on the "Economics of Sleep." You can read the transcript or listen to the 45-minute episode here. (The section I focus on starts around minute 10.)

Freakonomics' hosts interviewed a set of economists (including Matthew Gibson, Jeff Shrader, Dan Hamermesh, and Jeff Biddle) about their research on sleep, work hours, and income. The economists mentioned that, in order to establish a causal link between sleep and income:

What we need is something like an experiment for sleep. Almost as though we go out in the United States and force people to sleep different amounts and then watch what the outcome is on their wages.

While it is theoretically possible to conduct such an experiment, it is practically difficult to assign people to different sleep conditions for a long enough period of time to notice an impact on their wages. So the economists took an alternative path and used quasi-experimental data. In a creative twist, they compared wages at two ends of a single American time zone. The example they gave is Huntsville, AL and Amarillo, TX. Here's why. Gibson stated:

It turns out that ever since we’ve put time zones into place, we’ve basically been running just that sort of giant experiment on everyone in America.

The story continued. You'll see the transcript version quoted below:

Consider two places like Huntsville, Alabama — which is near the eastern edge of the Central Time Zone — and Amarillo, Texas, near the western edge of the Central zone. [...]

...even though Amarillo and Huntsville share a time zone, the sun sets about an hour later in Amarillo, according to the clock, and since the two cities are at roughly the same latitude as well, they get roughly the same amount of daylight too.

So you’ve got two cities on either end of a time zone, roughly the same size — just under 200,000 people each — where, according to the clock time, sunset is an hour apart. Now, what good is that to a pair of economists interested in sleep research?

GIBSON: It turns out that the human body, our sleep cycle responds more strongly to the sun than it does to the clock. People who live in Huntsville and experience this earlier sunset go to bed earlier.

GIBSON: If we plot the average bedtime for people as a function of how far east they are within a time zone, we see this very nice, clean nice straight line with earlier bedtime for people at the more eastern location.

But since Huntsville and Amarillo are in the same time zone, people start work at roughly the same time, which means alarm clocks go off at roughly the same time.

GIBSON: That means if you go to bed earlier in Huntsville, you sleep longer.

The economists didn't use only Huntsville and Amarillo--they also compared multiple pairs of cities around the U.S. that were similarly located at opposite ends of a single time zone. Using "city of residence" as their quasi-experimental operationalization of "amount of sleep," the economists were ready to report the results for wages:

So now Gibson and Shrader plugged in wage data for Huntsville vs. Amarillo and other pairs of cities that had a similar sleep gap.

GIBSON: We find that permanently increasing sleep by an hour per week for everybody in a city, increases the wages in that location by about 4.5 percent.

Four and a half percent — that’s a pretty good payout for just one extra hour of sleep per week. If you get an extra hour per night, Gibson and Shrader discovered — here, let me quote you their paper: “Our main result is that sleeping one extra hour per night on average increases wages by 16%, highlighting the importance of restedness to human productivity.”
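
To make the logic of that comparison concrete, here is a minimal sketch with entirely hypothetical city pairs. Only the structure comes from the story: each pair contrasts an eastern-edge city (earlier sunset, more sleep) with a western-edge city (later sunset, less sleep) in the same time zone, and asks how much higher wages are per extra hour of weekly sleep. The numbers below are invented for illustration; the real analysis used survey and wage data from many such pairs.

```python
# Hypothetical city pairs: (label, extra hours of sleep per week in the eastern
# city, ratio of eastern-city wages to western-city wages). Invented numbers.
city_pairs = [
    ("Pair A", 1.0, 1.045),
    ("Pair B", 0.8, 1.035),
    ("Pair C", 1.2, 1.055),
]

for label, extra_sleep_hours, wage_ratio in city_pairs:
    pct_per_hour = (wage_ratio - 1) / extra_sleep_hours * 100
    print(f"{label}: ~{pct_per_hour:.1f}% higher wages per extra hour of weekly sleep")
```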

Questions:

a) What is the independent variable in this time zone and wages study? What is the dependent variable?

b) Is the IV independent groups or within groups?

c) Which of the four quasi-experimental designs is this? Non-equivalent control group posttest only, Non-equivalent control group pretest-posttest, Interrupted time series, or Non-equivalent control group interrupted time series?

d) The economists asserted, "sleeping one extra hour per night on average increases wages by 16%" (italics added). What do you think? Can their study support this claim? Apply the three causal rules, especially taking note of internal validity issues that this study might have.

e) If you consider only one pair of cities, there are multiple alternative explanations, besides sleep, that can account for wage differences. Name two or three such threats (considering Huntsville and Amarillo as an example). Now consider, how might many of these internal validity threats be reduced by conducting the same analysis over many other city pairs?

f) This Freakonomics episode was aired in 2015, but the study (about time zones) they reviewed is not yet published. What do you think about that?

Answers to selected questions

a) The IV is "Hours of sleep" (but you could also call it "location on the time zone: East or West") and the DV is "Wages".

b) The IV is independent-groups.

c) Non-equivalent control group posttest only.

d & e) The results of the study support covariance: People in cities in the eastern portion of time zones get more sleep and have higher wages than people in the western portions. Temporal precedence is unclear, I think: Because the data were collected at the same time, it's not clear whether the time zone position came first, leading to more sleep and higher wages, or whether people began to earn higher wages first and then systematically moved eastward. (However, the second direction certainly seems less plausible than the first.)

As for internal validity, if we consider only the city pair of Huntsville and Amarillo, we could come up with several alternative explanations. The two cities have different historical trajectories and different ethnic diversities; they are in two different states that have different fiscal policies and industry bases. Perhaps Amarillo has poorer wages in general and people are losing out on sleep there because they are working more than one job. However, these internal validity threats become less of an issue when you consider multiple pairs of cities. It is less plausible that internal validity threats that apply to one city pair would also, coincidentally, apply to all the other city pairs that are at opposite ends of a time zone.

Even though the method is fairly strong, psychologists would be unlikely to make a strong causal claim simply from quasi-experimental data like these, because the independent variable is not truly manipulated. Nevertheless, the method and results of this quasi-experiment are certainly consistent with the argument that getting more sleep may be a factor in earning higher wages.

The year 2016 brought many references to implicit and explicit racial bias, especially in politics. So you might be wondering: What does it mean to hold "implicit biases"? Why are people biased against some ethnic groups, and what can we do about it?

It turns out there is a strong research tradition concerned with measuring and correcting implicit bias. There's a series of short videos grouped under the title, What, me biased? Each presents a real-world situation relevant to racial bias and discusses a research study.

a) In the opening minute, TV host Heather McGhee shares with the caller a theory about how to reduce racism. What is the theory? How did the researchers use data to test the theory?

b) What was the independent (manipulated) variable in the study? What was the dependent variable?

c) How do you know the study was an experiment? Was it an independent groups or within groups design?

d) Sketch a graph of the result, labelling your axes mindfully.

e) Work through the theory-data cycle: Did the data support Ms. McGhee's theory, or not?

More Resources for Instructors:

There are seven videos in this series, providing other opportunities to practice research methods concepts. For example, students can practice an individualized version of the theory-data cycle (Chapter 1), where Dr. Dolly Chugh discusses the idea of an "audit" in the video, Check Our Bias to Wreck Our Bias.

10/10/2016

Playing checkers won't make you smarter in general, but it will make you smarter at checkers!

Photo: Shutterstock / Jaren Jai Wicklun

A variety of research methods concepts are illustrated in this NPR story, Brain Games Fail a Big Scientific Test. Let's go through the journalist's story, connecting the material to course concepts.

The article starts with the conclusion: Brain games do not make you smarter.

That's the conclusion of an exhaustive evaluation of the scientific literature on brain training games and programs. It was published Monday in the journal Psychological Science in the Public Interest.

"It would be really nice if you could play some games and have it radically change your cognitive abilities," Simons says. "But the studies don't show that on objectively measured real-world outcomes."

a) When the journalist writes, "exhaustive evaluation of the scientific literature," what kind of journal article is he probably describing--an empirical journal article, or a review journal article (Ch 2)?

Next, here's the backstory of the journal article in question:

In October 2014, more than 70 scientists published an open letter objecting to marketing claims made by brain training companies.

Pretty soon, another group, with more than 100 scientists, published a rebuttal saying brain training has a solid scientific base. ...

In an effort to clarify the issue, Simons and six other scientists reviewed more than 130 studies of brain games and other forms of cognitive training. The evaluation included studies of products from industry giant Lumosity....

After combining the results of all the studies (and, notably, after separating the well-conducted studies from the poorly-conducted ones), the researchers concluded that brain games do not make you smarter. Here's a statement of the results:

There were some good studies, Simons says. And they showed that brain games do help people get better at a specific task. "You can practice, for example, scanning baggage at an airport and looking for a knife," he says. "And you get really, really good at spotting that knife."

But there was less evidence that people got better at related tasks, like spotting other suspicious items, Simons says. And there was no strong evidence that practicing a narrow skill led to overall improvements in memory or thinking.

That's disappointing, Simons says, because "what you want to do is be better able to function at work or at school."

b) This example shows the theory-data cycle in action (Ch. 1). What were the two competing theories this literature review was testing? How do the data support one theory and lead us to reject the other one?

Next, as the researchers collected the studies, they found that some of them were flawed in design:

The scientists found that "many of the studies did not really adhere to what we think of as the best practices," Simons says. Some of the studies included only a few participants. Others lacked adequate control groups or failed to account for the placebo effect, which causes people to improve on a test simply because they are trying harder or are more confident.

c) How might you design a really strong study to test the effects of a brain training game? What controls would be necessary?

d) What is the downside to including only a few participants in a study? (Ch 10)

Suggested Answers

a) This was almost certainly a review journal article.

b) One theory said that brain-training games improve your general intelligence; the other theory said that brain-training games do not improve general intelligence. The review of all the studies combined is consistent with the second theory.

c) Your study designs will vary.

d) The main downside to conducting a study with only a few participants is that small samples are more likely to produce unusual or extreme results, because a few outliers in the data can have an undue influence on the overall pattern. Studies based on larger samples are usually more stable, and we can be more confident that their results will replicate.
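
To see why small samples are riskier, here is a minimal simulation sketch in Python (the population mean and standard deviation are arbitrary). It draws many samples of size 10 and many samples of size 200 from the same population and compares how much the sample means bounce around.

```python
import random
import statistics

random.seed(1)

def sample_means(n_per_sample, n_samples=1000, mu=100, sigma=15):
    """Repeatedly sample from the same normal population; return each sample's mean."""
    return [statistics.mean(random.gauss(mu, sigma) for _ in range(n_per_sample))
            for _ in range(n_samples)]

small = sample_means(n_per_sample=10)
large = sample_means(n_per_sample=200)

print(f"SD of sample means, n = 10:  {statistics.stdev(small):.2f}")  # bounces around a lot
print(f"SD of sample means, n = 200: {statistics.stdev(large):.2f}")  # much more stable
```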

The story begins by explaining how difficult it can be for families to handle the stresses of living in poverty.

“When families are living in poverty, the infants need extra-sensitive parents and it’s harder for the parents to give that extra-sensitive parenting,” said Anne Heller, founder of Power of Two, which opened in Brownsville, Brooklyn, last fall and has worked to help nearly 100 families become more sensitive to their children’s needs.

The idea is that forming strong attachments in the first years of life can buffer children from stress over their lifetimes, and also lead to better academic achievement. Without these attachments, the theory goes, some of these children will already be at a disadvantage by preschool.

The Power of Two's founder, Anne Heller, wanted to find a way to help families overcome the obstacles of poverty and homelessness. She went to the psychological literature on child development to find evidence-based interventions. She found a program called Attachment and Biobehavioral Catch-up (ABC)...

...developed by Mary Dozier, a professor at the University of Delaware as part of her work on the Infant Caregiver Project. The group’s research found that children who received the ABC intervention had more secure attachments than those in a control group.

The researchers also looked at levels of cortisol, the hormone produced by the body in response to stress. Children who received ABC had a healthy rhythm of cortisol levels, starting high in the morning and low in the evening, but the control group had a blunted pattern, which researchers said has been linked to behavior and attention problems.

The program Power of Two is an excellent illustration of how we can use consumer-of-research skills to find evidence-based programs that help improve people's lives. In this case, it's an evidence-based program that can help people become better parents. When the Power of Two started the program, they applied for funding, and then,

Together, Heller and Bernard hired 12 coaches and recruited families to participate in the program at schools and hospital fairs. The children currently served by the Power of Two range from six months to two years old. [So far, 61] families, including Dawkins and her daughter Maya, have finished the program. [about 100 families total have been involved.]

Here's an example of what ABC looks like in practice, as described by the journalist:

At age 2, Maya turns to her mother, Taneice Dawkins, to show off her every move. She pounds a spoon on a table, prompting her mother to exclaim, “Oh, it’s a loud banging spoon.” The little girl hands her mother a cup, a piece of paper, a plastic banana. Each time, the 36-year-old Brooklyn mother responds, “Thank you, Maya,” a phrase the toddler repeats.

...

During 10 home sessions, ...coaches [had] worked with Dawkins — as they do with all parents in the program — on three areas: how to better nurture the child, how to avoid frightening her and how to follow the child’s lead. For example, nurturing can be picking a child up immediately after a fall. Sounds simple, but many families worry this will spoil the child, a myth coaches try to dispel. Behavior that is frightening to a child can range from nonstop tickling to a parent’s fits of rage. And following a baby’s lead can be the trickiest area to learn, forcing the parent to pick up on subtle cues such as throwing a toy when the child wants to play alone.

Chapter 1 in the text describes the benefits of being a good consumer of information. Anne Heller of Power of Two certainly applied her skills as a consumer of information to research the types of programs that would be most likely to help children living in poverty.

Questions

a) What makes the ABC (Attachment and Biobehavioral Catch-up) program "evidence-based"?

b) When the journalist mentions that "children who received ABC intervention had more secure attachments than those in a control group," what does that tell you about the research behind the ABC intervention?

c) Why do you think Anne Heller chose to look for empirically-supported programs, rather than simply trying programs she'd heard about, or programs that sounded good to her?

d) Does research on the ABC program count as applied research, basic research, or translational research?

Suggested answers

a) Since ABC interventions have been tested empirically by researchers and published in peer-reviewed journals, we can be reasonably sure that the intervention actually works. It is supported by research. Psychologists believe that all interventions, such as therapies and social services programs, should be evaluated through well-conducted studies.

b) This means that ABC has been tested in an experiment, probably through random assignment to the ABC program or to a control program. In this experiment, the independent variable would have been type of program (ABC or control), and the dependent variables would have been the children's attachment style and their cortisol levels.

c) Answers may vary, but you might have considered that programs that simply sound good may not actually work. When it comes to behavioral interventions, it is essential to be sure the programs we are using have been shown to actually benefit people.

d) I think this counts as translational research, because it uses lessons from basic research (in this case, on attachment theory) to develop and test an intervention. The ABC approach bridges basic research on attachment and the applied context of parenting while coping with the stressors of poverty.

However, if the Power of Two program were to attempt to follow the families who have gone through the ABC training in their community to see the effectiveness of the program in the real world, that would count as applied research.

02/20/2016

How fast do people talk? How wordy are they? Do they swear a lot? Photo credit: Eugenio Marongiu/Shutterstock

There's a new piece in the Atlantic summarizing some research on phone calls. The journalist's piece is tantalizing, but it doesn't provide some of the information you need to evaluate the research behind it.

According to the article, a data analytics firm analyzed a bunch of recorded phone calls--about 4 million of them--from customer service interactions. According to the journalist, the firm counted how fast people talked, how many words they used, and even how long they would wait on hold before hanging up! The journalist reported:

In some sense, [the] findings hew to cultural stereotypes. The fast-talkers are concentrated in the North; the slow-talkers are concentrated in the South.

Specifically, the fastest talkers were found in Oregon, Minnesota, Massachusetts, Kansas, and Iowa. The slowest were in North Carolina, Alabama, South Carolina, Louisiana, and Mississippi.

The Atlantic piece includes some colorful maps of the USA, indicating the fast, slow, and medium-talking states. Check them out!

a) While you're there, see if you can figure out how they defined "speed of talking" in this research. Also see if you can find out how much faster the Oregonians talk, compared to the Mississippians (how big is the effect? Is it statistically significant?). Before playing up these regional differences in a magazine article, this seems like important stuff to know, right?

The Atlantic piece goes on to discuss how "wordy" the phone calls were. Here the results were different. The journalist explains:

...speedy speech doesn’t necessarily equate to dense speech. [The analytics firm] also used its dataset to analyze the wordiest speakers, state by state—the callers who, regardless of their tempo, used the most words during their interactions with customer service agents.

and also:

Some of the slower-talking states (Texas, New Mexico, Virginia, etc.) are also some of the wordier, suggesting a premium on connection over efficiency. Some of the fastest-talking states (Idaho, Wyoming, New Hampshire) are also some of the least talkative, suggesting the get-down-to-business mentality commonly associated with those states.

b) Can you find out, from the article, how much wordier the "wordy" states are compared to the less wordy states? How big are the effects?

c) And can you figure out from the Atlantic article exactly how the researchers operationalized "being wordy" and "talking fast?" How are these two variables different, exactly? (I couldn't find it from the article--let me know if you can!)

I'm being critical of the journalist's coverage here, but I'm also keeping in mind that it might not be her fault. The source of the data is not a peer-reviewed journal article (a format that would have required these details to be reported). Instead, the source was a report by the for-profit analytics firm that had collected the data. Perhaps this firm did not report enough methodological and statistical details to allow careful reporting by the journalist.

Specifically, you learn that talking speed was operationalized as "words per minute". And the differences between states were described this way:

For every 5 words a slow talking state utters, a fast talking state will utter 6.

You also learn that wordiness was operationalized as "total words in a phone conversation," and:

How big is the difference? A New Yorker will use 62% more words than someone from Iowa to have the same conversation with a business,
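
If it helps to put those two figures on a common scale, here is a small sketch (just arithmetic on the ratios quoted above):

```python
# The two effect descriptions from the report, converted to percentages.
speed_ratio = 6 / 5      # "for every 5 words a slow talking state utters, a fast talking state will utter 6"
wordiness_ratio = 1.62   # "a New Yorker will use 62% more words than someone from Iowa"

print(f"Talking speed: fast states are ~{(speed_ratio - 1) * 100:.0f}% faster (words per minute)")
print(f"Wordiness: ~{(wordiness_ratio - 1) * 100:.0f}% more total words in the same conversation")
```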

d) What do you think of these operationalizations of talking speed and wordiness? And what do you think of the magnitudes of the effects? Are these large differences--worthy of a news story on differences between states? If you were the journalist, would you have incorporated this information, or not?

You might also be interested in a second Atlantic story about which states are the "sweariest." Compared to the story on talking fast (and being wordy), this story is more informative. For one, it describes how swearing was operationalized. They

...examin[ed] more than 600,000 phone calls from the past 12 months—calls placed by consumers to businesses across 30 different industries. It then used call mining technology to isolate the curses therein, cross-referencing them against the state the calls were placed from.

The journalist also opined, (probably correctly):

...cursing is also conveniently specific as a data set; you've got your f-bombs and your double hockey sticks and your bodily functions, and, factoring in their permutations, you're good to go. Plus, you don't need much sophisticated sentiment analysis to ensure that your data are accurate: An f-bomb is pretty much an f-bomb, regardless of the contextual subtleties.

In the swearing story, the journalist also described the data better, explaining how the rates of swearing in the highest-swearing state and the lowest state compare:

People in Ohio cursed the most as compared to every other state in the Union: They swore in one out of about every 150 phone conversations. Ohio was followed, respectively, by Maryland, New Jersey, Louisiana, and Illinois.

And who swore the least? Washingtonians. They cursed, on average, during one out of every 300 conversations. (Yes, this means that Ohioans swear at more than twice the rate of Washingtonians....)

e) Now you have a better picture of the data. What do you think of the sizes of these effects? Big or small? Important, or not?

f) The headline of this story announces that Ohio "is the sweariest state in the Union." What do you think of that headline? Specifically, does that headline overstate the results? Does it overgeneralize from the measures they used?

09/10/2015

Why might it be difficult to study the long-term impact of quality preschool programs? Photo: Shutterstock

An NPR feature story recently covered Tulsa, Oklahoma's free preschool program, which began about 10 years ago. Like some other cities around the U.S., Tulsa introduced a free pre-K program, with the goal of helping children from all economic backgrounds get ready to succeed in school. Developmental psychologist Deborah Phillips of Georgetown University has tracked children who participated in the Tulsa program, and is especially interested in how the kids are doing as they start high school. According to the story:

"These children did show huge gains in early math and early literacy skills," says Deborah Phillips, a developmental psychologist at Georgetown University who has been overseeing the study. "They were more likely to be engaged in school, less timid in the classroom and more attentive."

Phillips says preschool gave them a good, strong boost into elementary school. Today, as eighth-graders, says Phillips, most of these kids are still doing really well.

Phillips didn't just look at grades and test scores. Her team looked at student mobility, whether kids were in advanced or special education classes. They examined retention rates, absenteeism, and they even surveyed students' attitudes about school.

Researchers then compared these eighth-graders to a large sample of Tulsa eighth- and seventh-graders who did not attend preschool. They found that those students were not doing nearly as well.

a) What kind of study is this? What are its independent and dependent variables? This is one of those studies that has multiple dependent variables. Make sure you include all of them in your answer!

Although Dr. Phillips' conclusion is that Tulsa's preschool improved kids' abilities to do well in school, the NPR piece interviewed some folks who objected. Here's one example:

....Russ Whitehurst, senior fellow with the Center on Children and Families at the Brookings Institution, a Washington, D.C., think tank....says he's looked closely at the Tulsa study and takes issue with the way researchers compared kids who were in the program with those who were not.

"What Dr. Phillips and her colleagues have done is scrounge up a bunch of kids who for whatever reason — and they don't know that reason — did not attend pre-K at all," he says.

"We don't know if they were similar to the kids who went to pre-K," Whitehurst adds. "That's why the design [raises] question marks about the ability to conclude that pre-K had the affects attributed by Dr. Phillips."

b) What kind of criticism is Dr. Whitehurst raising about this study? (What terms from the text can you apply here?) What, if anything, could have been done to prevent the problem? What are the practical and ethical limitations of such research?

Here's another example of a person who is skeptical of the study's results:

...[high school] principal Nanette Coleman ... says she doesn't know how many of her ninth-graders this year attended Tulsa's preschool program, and has not seen the research. But she has a hard time believing that preschool, no matter how good the program, is going to have an impact on a student 10 years later.

"They're going to struggle when they get to me because there are so many outliers that can have a student not be successful," she says. "Let me be clear," she adds, "I've never made a direct linkage between a pre-K program and their high school success."

c) On what source of information (from Chapter 2) is Principal Coleman basing her beliefs? What are some problems with that source of information?

Let's take a look at the study behind the headline. The journalist's story covers a recent publication in the journal, Computers in Human Behavior. As the study's author told the journalist,

"We all have watched a cat video online, but there is really little empirical work done on why so many of us do this, or what effects it might have on us," added Myrick, who owns a pug but no cats. "As a media researcher and online cat video viewer, I felt compelled to gather some data about this pop culture phenomenon."

So far, so good. Empiricism is a great way to answer questions about pop culture phenomena. So what did they do?

The study, by assistant professor Jessica Gall Myrick, surveyed almost 7,000 people about their viewing of cat videos and how it affects their moods. It was published in the latest issue of Computers in Human Behavior. Lil Bub's owner, Mike Bridavsky, who lives in Bloomington, helped distribute the survey via social media.

A survey, then. Here are some of the results:

Participants in Myrick's study reported that:

They were more energetic and felt more positive after watching cat-related online media than before

The pleasure they got from watching cat videos outweighed any guilt they felt about procrastinating

About 25 percent of the cat videos they watched were sought out; the rest were ones they happened upon

Time to analyze this causal claim.

a) What are the variables in the journalist's causal headline, "viewing cat videos boosts energy and mood?"

b) What kind of a study does it take to establish this causality? How might you have designed such a study?

c) What kind of study seems to have been conducted here? Do the results support the causal claim? If not, what would a more appropriate headline be for a story on this study?

Finally, the journalist mentions that Lil Bub's owner advertised the study on his website, and that for each person who took the survey, he donated money to the ASPCA, which supports animal welfare. Over 7,000 people completed it.

d) One result from the study is that "about 36 percent described themselves as a cat person, while about 60 percent said they like both cats and dogs." Can we assume from the results and the methodology that 96% of people are animal (either cat or dog) lovers?

07/10/2015

There's a fun interactive data graphic on gallup.com. It's called "State of the States." You can select a polling variable, such as "overall well-being," "support for Obama," or "religiosity," and it will show you how each U.S. state scores on that variable.

Feel free to take a minute to play with the interactive right now. (I'll wait.)

I've pasted a screen shot from the "well-being" results below. Take a look at it, and consider the questions that follow.

a) In the figure above, the variable I selected was "Well being." The thermometer below indicates that darker states are higher in well-being than lighter states. Using that rule, which states are the highest in well-being? Which are the lowest?

b) You might notice that South Dakota is higher in well-being than North Dakota--their shades of green are noticeably different. In fact, you might even imagine a news story in which a reporter suggests that South Dakotans are "happier." But I want you to consider the effect size of the difference. About how much happier are South Dakotans, according to the scale?

Now consider the next screen map (below). This one shows religiosity, indicating the percentage of state residents who consider themselves "Very religious":

c) As before, the thermometer below indicates that darker states are higher in saying they are "very religious" compared to lighter states. Using that rule, what states are the highest in religiosity? Which are the lowest?

d) Take a look at the scale for this variable--what do you notice about the range for Religiosity compared to the range for well-being?

e) On the map, the states of Utah and Idaho are about the same shades of green as South and North Dakota were on the well-being variable. Indeed, the shades of green for Utah and Idaho are noticeably different. In fact, you might now imagine a news story in which a reporter suggests that Utahans are "more religious." Once again, I want you to consider the effect size of the difference. How much more religious are Utahans, according to the scale?

f) What do you think? How is Gallup using these shades of green in this interactive data map? Is their use misleading? If so, what might be better?

If you’re a research methods instructor or student and would like us to consider your guest post for everydayresearchmethods.com, please contact Dr. Morling. If, as an instructor, you write your own critical thinking questions to accompany the entry, we will credit you as a guest blogger.