Complex Experiments (Factorial Designs)

10/20/2017

Should she have put her phone away in the next room? That depends. Photo: Suwat Sirivutcharungchit/Shutterstock

Students, if you're not familiar with the study tips on The Learning Scientists website, you should be. This page in particular sums up six evidence-based things you should be doing while you study (spoiler alert: The list does not include highlighting!).

The Learning Scientists' latest blog post sums up the results of an experimental study on where your phone should be while you engage in cognitive tasks. It's titled, "Separation from your cellphone boosts your cognitive capacity." Take a look at the description to get an overview of the research design:

They invited students to participate in an experiment where students were randomly assigned into one of three conditions.

In the "other room" condition, students were asked to leave their belongings (including their cellphones) in the lobby before coming into the room where the experiment would take place.

In the other two conditions, students were asked to take their belongings with them to the experiment room, and were either told to leave the cellphone out of sight, e.g., in their bags or pockets (bag/pocket condition) or place it face down on the desk within sight (desk condition).

Then, participants worked on two cognitive tasks: One working memory task – called Automated Operation Span task (OSpan) – where people are asked to actively process information while holding other information in mind....For the other task – the Raven’s Standard Progressive Matrices (RSPM) – participants had to identify the missing piece in a matrix pattern. This test is used to assess fluid intelligence and your performance depends to a large extent on the available attentional capacities to identify the underlying rule of the pattern matrix.

a) Based on the description, what kind of experiment was this: Concurrent measures? Repeated measures? Posttest-only? or pretest/posttest?

b) What is the independent variable here? There are two dependent variables in this design. What are they? (Note: You might recognize the OSpan task from Chapter 8; it was used in a correlational study about ability to multitask.)

c) What results would you predict from this study? Take a moment to sketch your prediction in graph form. Then click over to the blog post and scroll to the graphs they've made of the results. Do they match your own prediction?

You can stop working here if you're studying Chapter 10. But if you're studying Chapter 12, keep reading, because there's more! The second part of the blog post is headed "Cellphone dependence as moderator". Get ready for a factorial design.

The researchers separated people into two new participant variable (PV) groups: Those who reported feeling dependent on their cellphone throughout the day, and those who did not. They then used this PV in combination with the IV of the original design.

d) Given the description, how would you state this design? (Use the form: __ X __ factorial.)

Here are the results:

For people who reported a strong dependence, putting the cellphone in the bag or leaving it in another room made a tremendous difference for their cognitive capacity: They performed much better in these two conditions compared to the one where the phone was on the desk.

For people who reported a weaker dependence, it made no difference where the phone was. Thus, their performance was not affected by the location of the phone.

e) Sketch a bar graph or line graph of the factorial results described above. You can do it either way, but I'd recommend putting the "cell phone condition" IV on the x-axis.

f) Do you see an interaction in the results? (You should, because the term "moderator" is a sign of an interaction.)

g) Let's return to the headline, "Separation from your cellphone boosts your cognitive capacity." Does the headline seem appropriate for this study? Why or why not?

Good news: The published article on which this blog post was based is open access! You can view it here.

04/20/2017

What resume cues did the researchers use to manipulate gender? Social class? Photo: Andrey Popov/Shutterstock

The Harvard Business Review described a study that tested biases among people who read internship applications. Not just any internships, either: These are internships at prestigious law firms. The HBR reports that these internships...

... open doors to even more lucrative employment in the private sector as well as prestigious judiciary and government roles. For these reasons, employment in top law firms has been called the legal profession’s 1%.

Here's a preview of the design of the study:

Now imagine four applicants, all of whom attend the same, selective second-tier law school. They all have phenomenal grade point averages, are on law review, and have identical, highly relevant work experiences. The only differences are whether they are male or female and if their extracurricular activities suggest they come from a higher-class or lower-class background. Who gets invited to interview?

a) Given this brief description, identify the design of the study. What are the independent variables? What are the levels of each IV? What is the dependent variable? Do you think the IVs were manipulated as independent groups or within groups?

Now, here's some more detail:

We uncovered this through a field experiment with the country’s largest law firms. Specifically, we used a technique — known as the resume audit method — that is widely seen as the gold standard for measuring employment discrimination. This method involves randomly assigning different items to the resumes and sending applications to real employers to see how they affect the probability of being called back for a job interview. All in all, we sent fictitious resumes to 316 offices of 147 top law firms in 14 cities, from candidates who were supposedly trying to land a summer internship position. All applicants were in the top 1% of their class and were on law review, but came from second-tier law schools.

We signaled gender by varying the applicant’s first name (James or Julia).

...to capture the economic component of class, our lower-class applicants received an award for student-athletes on financial aid. To incorporate its educational component, they listed being a peer tutor for fellow first-generation college students. By contrast, our higher class candidate pursued traditionally upper-class hobbies and sports, such as sailing, polo, and classical music, while the lower-class candidate participated in activities with lower financial barriers to entry (e.g., pick-up soccer, track and field team) and those distinctly rejected by higher-class individuals (e.g., country music). But crucially, all educational, academic, and work-related achievements were identical between our four fictitious candidates.

Now here are the results:

Even though all educational and work-related histories were the same, employers overwhelmingly favored the higher-class man. He had a callback rate more than four times of other applicants and received more invitations to interview than all other applicants in our study combined. But most strikingly, he did significantly better than the higher-class woman, whose resume was identical to his, other than the first name.

c) Interrogate the construct validity of the operationalization of social class. Do you think this manipulation was accurate and valid?

d) Using the three criteria for causation, interrogate the internal validity of the study. What controls do you notice that reduce design confounds? Is there any concern about selection effects? Why or why not?

e) What interaction in the results shows a moderator?

f) Using the data provided in the story, do you think that the study showed a main effect for gender? A main effect for social class? How about an interaction? Describe each of these effects.

g) Finally, the researchers dug deeper by conducting a follow-up study to investigate why the law firms were biased. Read this section of the article and decide: What mediating theory does the follow-up study propose?

Why did the higher-class man do so much better than the higher-class woman? To further explore this issue, we conducted a follow-up experiment with a sample of 210 practicing attorneys from around the country. We asked each attorney to evaluate one of the same resumes we used in our field experiment and...rate their candidate on factors proven to influence how favorably people view job candidates but that vary between men and women. These included perceptions of the candidate’s competence, likability, fit with an organization’s culture and clientele, and career commitment.

Just like the employers in our audit study, the attorneys we surveyed favored interviewing the higher-class man above all applicants, including the higher-class woman. This time, though, we were able to understand why. Attorneys viewed higher-class candidates of either gender as being better fits with the culture and clientele of large law firms; lower-class candidates were seen as misfits and rejected. In fact, some attorneys even steered the lower-class candidates to less prestigious and lucrative sectors of legal practice, such as government and nonprofit roles, positions that tend to be more socioeconomically diverse than jobs at top law firms.

Thanks to Dr. Kathleen Lewis of Point Park University for sending this example my way and for writing the questions!

09/20/2014

When teachers wrote encouraging notes (rather than neutral notes) on returned papers, students were motivated to revise them. Photo: Fuse/Getty Images

How do we get students to work harder? Standards in schools may be rising, which means that students are more likely to encounter failures along the road to mastery of a subject. What motivates them to keep getting up, after academically falling down?

This piece in the Atlantic describes the impact of different motivational interventions on students from different backgrounds. In the study, students wrote an essay for their teachers, and the teachers graded their essays like they normally would, adding comments to the essay about what the students need to revise. But the researchers intervened before the essays got handed back. This passage from the journalist's story describes a 2x2 factorial design. See if you can identify the IV, PV, and DV in this example.

... the researchers randomly attached one of two sticky notes to each essay. None of the students were aware that they were part of a study and thought their teachers had written the notes. Half of them received a bland message saying, "I'm giving you these comments so that you'll have feedback on your paper." The other half received a note saying, “I’m giving you these comments because I have very high expectations and I know you can reach them”—a comment intended to signal teachers' investment in their students' success.

Then teachers offered the students an opportunity to revise their essays.

The results were striking. Among white students, 87 percent of those who received the encouraging teacher message turned in new essays, compared to 62 percent of those who got the bland note. Among African American students, the effect was even greater, with 72 percent in the encouraged group doing the revision, compared to only 17 percent of those randomly chosen to get the bland message. And the revised essays received higher scores from both the students' teachers and outside graders hired for the study.

a) List this study's IV, PV, and DV.

b) For each IV and PV, indicate: Is it within groups or independent groups? Is it manipulated or measured?

c) Using the results reported in the quoted passage, create a graph of the results.

d) Using marginal means or using your visual inspection of the graph you made in (c), estimate and describe this study's main effects and interactions.

e) Challenge question: Like most experimental studies, this one used additional dependent variables other than the primary one described above. In the study described, the researchers also measured students' trust in their teachers. What results do you think they may have found? Sketch a graph showing the results you'd hypothesize for this DV.
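For part (d), the arithmetic behind marginal means and the interaction can be sketched in a few lines of Python. This is only a rough check using the four percentages from the quoted passage. It assumes unweighted marginal means (the passage does not report how many students were in each cell), the variable names are illustrative, and it says nothing about statistical significance.

```python
# Percent of students who turned in a revised essay, taken from the quoted passage.
# Keys are (student group, note type).
revised = {
    ("white", "encouraging"): 87,
    ("white", "bland"): 62,
    ("african_american", "encouraging"): 72,
    ("african_american", "bland"): 17,
}


def marginal_mean(factor_index, level):
    """Unweighted average of all cells that share one level of a factor.

    Assumption: equal weighting, because cell sizes are not given in the passage.
    """
    cells = [pct for key, pct in revised.items() if key[factor_index] == level]
    return sum(cells) / len(cells)


# Marginal means for the note type (IV) and for student group (PV).
note_means = {lvl: marginal_mean(1, lvl) for lvl in ("encouraging", "bland")}
group_means = {lvl: marginal_mean(0, lvl) for lvl in ("white", "african_american")}

# Simple effect of the encouraging note within each student group.
note_effect = {
    group: revised[(group, "encouraging")] - revised[(group, "bland")]
    for group in ("white", "african_american")
}

# The interaction as a "difference in differences", in percentage points.
interaction = note_effect["african_american"] - note_effect["white"]

print("Note-type marginal means:", note_means)
print("Student-group marginal means:", group_means)
print("Effect of encouraging note by group:", note_effect)
print("Difference in differences:", interaction)
```

If the two simple effects were equal, the difference in differences would be zero and the lines on your graph would be parallel.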

06/10/2014

You might not know the name "vocal fry," but if you've been on a college campus lately, you've definitely heard it: Vocal fry is a creaky, low sound that up to 2/3 of college women put at the end of sentences. A friend once called it the "sticky voice."

Researchers at the University of Miami and Duke University asked seven male and seven female young people to say the phrase “Thank you for considering me for this opportunity” in both a normal tone and in vocal fry. Then, 800 men and women of a variety of ages were invited through an online survey to listen to the samples and [to rate each] speaker (normal or fry) [on how] educated, competent, trustworthy, attractive, and appealing [they were] as a job candidate.

a) This is a factorial design. What are the independent variables? For each independent variable, indicate if it is independent groups or within groups. Then state the design as precisely as you can.

b) Note that this study has more than one dependent variable. Name at least three dependent variables.

Here's a summary of the results:

For each trait, the listeners preferred the normal voice to the fry voice for both the male and female speakers. They were less likely to say they'd want to hire the person with the fry voice, [and] they found them to be less trustworthy. When making hiring judgments, people preferred a normal voice 86 percent of the time for female speakers and 83 percent of the time for male speakers. Women using fry were viewed more negatively than men doing so, and the negative perceptions were stronger when the listener was also a woman.

c) Graph the results for the dependent variable, "rated negativity," described in the last sentence above.

d) In the graph that you drew, is there an interaction? (There'd better be!)

e) What about main effects? Based on your inspection of the graph you made, decide whether there will be main effects for the two independent variables in this study.

f) Given the results and method of the study, did the study design and results allow the journalist to say that "Vocal fry may hurt women's job prospects"? How might you improve this headline to make it more faithful to the actual study?

Read this study description and identify the three independent variables in the design. Identify the dependent variable, too.

The study...included 18 kids ages 6 to 10, whose levels of introversion and extroversion were rated on a scale by teachers and counselors.

On one day, the kids were served breakfast by adults; they were given a large [or a small] bowl, and then told the adults how much cereal and milk they wanted to have. On another day, the kids served themselves breakfast [after being given a large or small bowl]. The amount of food served -- whether by the adults, or the kids themselves -- was secretly weighed by scales hidden in the tables.

a) What are the three IV's and the DV in this study?

b) Try graphing the result. Here's your hint: you'll have two graphs: one for kids being served by adults, and the other for kids serving themselves. Each graph will have the other two IV's on it. Use this description of results to estimate the pattern:

Extroverted kids served themselves 33.1 percent more breakfast when they had the larger bowl, compared with introverted kids, who only served themselves 5.6 percent more when they had a larger bowl.

When the adults were serving for them, both extroverted and introverted kids asked for more than 50 percent more when they had a bigger bowl.

Suggested answers

a) The three IVs are:

Personality type: Extroversion/introversion (this is actually a participant variable, not a true IV). This was between subjects.

Bowl size: Large or small, between subjects.

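Server: Self or adult. Because the same kids were served by adults on one day and served themselves on another day, this variable appears to be within subjects.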

b) You can check your graph against the 2x2x2 bar graph in the original paper. Click here, and scroll to Figure 1.
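If it helps to see why this is a 2x2x2 design, here is a minimal Python sketch that lists every cell and then works out the "difference in differences" for the self-serve condition. It uses only the two percentages quoted above. The factor and variable names are illustrative, and the adult-served condition is left out of the arithmetic because the article gives only "more than 50 percent" rather than exact values.

```python
from itertools import product

# The three factors and their levels, as described in the suggested answer.
factors = {
    "personality": ("extroverted", "introverted"),
    "bowl_size": ("large", "small"),
    "server": ("self", "adult"),
}

# Every combination of levels is one cell of the factorial design.
cells = list(product(*factors.values()))
print(f"{len(cells)} cells in the design:")
for cell in cells:
    print("  ", dict(zip(factors, cell)))

# Percent more breakfast taken with the large bowl than with the small bowl,
# for kids serving themselves (figures quoted from the article).
bowl_effect_self = {"extroverted": 33.1, "introverted": 5.6}

# How much bigger the bowl-size effect is for extroverts than for introverts.
# Rounded to one decimal place to avoid floating-point noise.
diff_in_diff_self = round(
    bowl_effect_self["extroverted"] - bowl_effect_self["introverted"], 1
)
print("Self-serve difference in differences (percentage points):", diff_in_diff_self)
```

A gap like this in one graph (self-serve) and little or no gap in the other (adult-served) is the kind of pattern that suggests the two-way interaction itself depends on the third variable.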

11/10/2013

Wired magazine presents a set of datagraphics depicting their ingenious analysis of the corpus of recipes on the website foodnetwork.com. I bet you'll have fun looking through the graphics. They correlated the ingredients list in each recipe with the average number of five-star rating points the recipe has received from readers.

For example, one graph shows that the more ingredients a recipe has, the higher it gets rated (Do more ingredients taste better? Or do people just have to justify all that prep time?)

The lead graphic shows the five-star ratings of recipes that either contain bacon or not. You can check it out here. Across almost all categories (sandwiches, asparagus recipes, kale recipes, and spinach salad recipes), those containing bacon are rated higher than those that don't.

But there are two exceptions to this pattern: bacon as an ingredient is associated with lower ratings of pasta and dessert recipes. Though bacon generally is associated with higher recipe ratings, this doesn't apply to pasta and dessert, apparently. (Just say no to bacon ice cream.)

In these data, recipe, not person, is the unit of analysis. But your research methods lessons still come in handy. For example:

a) Can we conclude from Wired's data that adding bacon will cause a sandwich to be rated more positively? Why or why not?

b) Do you see what I see--that there's an interaction between the bacon variable (bacon or no bacon) and the type of recipe variable? Whether bacon recipes get higher ratings depends on what kind of recipe it is.

c) How might you extend these correlational data into an experiment? Which variable would you manipulate? Which one would you measure?

05/10/2013

How would you respond if you knew that people might stereotype you negatively? That's what one study recently asked. They wondered how people would approach a social situation if they were worried about the stereotypes the other person might have about them.

The studies in question can provide you with some handy practice on factorial designs.

Scientific American wrote about the study by Rebecca Neel and her fellow researchers. They explain that they recruited 75 college students, some of whom were overweight and some of whom were not.

In the first study, the participants answered questions about obese people and other stereotyped groups. Scientific American continues:

The students also were asked to envision meeting someone new and then to choose how they'd make a good impression from options such as arriving on time, wearing clean clothes, smiling and looking relaxed. Some students answered the group [stereotype] questions first so they'd have group-related stereotypes in mind when they got to the first-impressions' questions. Others completed the study the other way around.

The results showed that thinking about stereotyping changed people's behavior. Overweight students who'd first answered questions about obese people were more likely than other participants to rank "wearing clean clothes" as a very important way to make a good first impression. Normal-weight students and overweight students who hadn't been primed to think of stereotypes were more likely to prioritize arriving on time.

a) For the study above, the dependent variable could be described as the degree to which participants prioritized wearing clean clothes (over arriving on time). In addition to this DV, there is an independent variable and a participant variable in this factorial design. What are they?

b) How many levels are there in each of the IV's/PV's? Are these variables independent groups or within-groups? What kind of design is it (use this format: ___ x ___)?

c) Sketch a graph of the results, as they are described in the paragraph above. What main effects and interactions are present in the graph that you sketched?

The second study reported is also a factorial design. It is more difficult to find the IV's in this description, but give it a shot:

In a second study, researchers repeated the test with overweight men and black men. When prompted to think of stereotypes, overweight men ranked wearing clean clothes as the most important step toward making a good first impression. Black men, who are often stereotyped as violent and anti-social, prioritized smiling.

d) What are the IV's/PV's in this study? What is the dependent variable? (There seem to be two key dependent variables in this version of the study.)

e) Make two graphs of the 2x2 design above. One graph is for the first DV, and another graph should have the same IV's/PV's as the first graph, but will have the second DV on its y-axis.

02/13/2013

You don't see people blogging about interaction effects, at least not every day. I just ran into this 2011 Ben Goldacre column about interactions, which he describes in my favorite way, as a "difference in differences." Take a look here. Instructors, this might be a handy example for teaching Chapter 11.

01/10/2013

Another article from the mindfulness desk, this time graphable as an interaction. Read on.

A recent study by psychologist Richard Petty and his colleagues investigated the situations in which people's thoughts have more power over them. Specifically, they wanted to know if thoughts would affect people less when they wrote them down and then threw them away--as in literally, in a trash can! Here's how the study was covered by the Huffington Post.

This study is a factorial design. Read the journalist's description below very carefully, attempting to locate the two independent variables and the dependent variable.

[The first experiment] included 83 high-schoolers in Spain who were given three minutes to write their negative or positive thoughts about their own body image.

After writing down these thoughts, all of them were asked to read them back over and think about them. Half of them were then asked to throw away those written thoughts in the trash, while the others were not instructed to throw away their thoughts and were instead asked to proof-read what they had written. Then, researchers had the study participants rate their attitudes on their own body image on a scale -- for example, if they liked or disliked their bodies, thought they were attractive or unattractive, etc.

Researchers found that for the students who were not asked to throw away their written thoughts on their self-body image, what they had written down seemed to have an effect on how they rated their body image afterward. For example, someone who wrote down a lot of positive thoughts about themselves were likelier to rate themselves higher on the body image scale.

However, for the students who were asked to throw away their written thoughts, what they wrote down didn't seem to have any effect on how they rated themselves afterward.

a) What are the two IV's in this study? Name each IV. Then list the levels of each IV. Are the levels of each IV manipulated as independent groups or within-groups?

b) What is the main DV in this study?

c) Sketch a graph of the outcome of this study.

d) Estimate the main effects and interactions that Petty and his colleagues probably obtained.

The journalist also describes a second study in the paper. Here's the journalist's description:

Some of the study participants were then asked to drag that file into the computer's recycling bin; others were instructed to just drag the file to a storage disk. Some of them were also asked to just imagine that the file was moved to the recycling bin.

The researchers found that those who actually dragged the file to the recycling bin were less affected by the thoughts they'd typed out, compared with the ones who just saved them to another disk, or those who just imagined moving them to the recycling bin.

e) The study above is a conceptual replication of the first. Instead of using a real trash can to operationalize the concept of "trashing," they used a virtual one--on a computer. Can you think of another way to operationalize "trashing?"

f) The blogger's description of this second study is much less detailed than the blogger's description of the first. What details are left out of this description?

Suggested answers

a) The first IV is whether people wrote about positive or negative thoughts. It has two levels (positive or negative) and is independent groups.

The second IV is whether people proofread what they wrote or threw it in the trashcan. It has two levels (proofread or trash) and is independent groups.

b) The DV is body image, operationalized by a body image scale.

c) Based on my reading of the results, the graph might look something like this:

If your library has access to the journal Psychological Science, then you can look up the real graph of the results for this study. The study shows them in Figure 1.

d) The graph I sketched shows a probable main effect for the variable thought type, such that positive thoughts lead to higher body image than negative thoughts. There is probably no main effect for the trashed vs. proofread IV. There is an interaction, such that when people trashed their thoughts, positive messages led to only a little more positive body image than negative ones; but when people did not trash their thoughts, positive messages led to a lot more positive body image than negative ones.

e) Another way to operationalize "trashing" might be with a paper shredder, or by burning the paper you write your thoughts on. What are your ideas?

f) The second study description doesn't mention what topic participants wrote about (body image again? or something new?). The article mentions one IV (which has three levels--whether people dragged the file to the recycle bin, dragged it to the save icon, or just imagined dragging it to the recycle bin). But the article doesn't mention if there was another IV (such as positive vs. negative thoughts). It also doesn't mention how they operationalized the DV--they just say it was about being "affected by the thoughts." You can read the original published article in Psychological Science to look up these details (see p. 5).

02/20/2012

This MSNBC article discusses how a person's life focus (future-focused versus present-focused) affects how much they retaliate while intoxicated.

The average age of the study's 495 volunteers was 23, all of whom described themselves as social drinkers and none of whom had any past or present drug, alcohol, or psychiatric-related problems. They each took a questionnaire designed to measure which of the participants were future-focused and which were more impulsive.

a.) Why do you think the researchers made sure the participants did not have any previous drug, alcohol, or psychological issues?

b.) There were 495 volunteers. Do you think they used a biased or unbiased sampling method?

The researchers gave half the volunteers alcohol and the other half no alcohol. Each participant then played a speed reaction game with a confederate where the winner shocked the opponent. “As the game wore on, the shocks got longer and more intense, making it seem like the opponent was getting meaner and meaner with every win.” Those who were drunk retaliated more than those who were not drunk. And finally, those who were drunk and impulsive had the highest likelihood of retaliation.

The article quoted the study’s author, Brad Bushman, who summarized the results:

"The less people thought about the future, the more likely they were to retaliate, but especially when they were drunk. People who were present-focused and drunk shocked their opponents longer and harder than anyone else in the study,” Bushman explained. "Alcohol didn’t have much effect on the aggressiveness of people who were future-focused."

c.) What are the independent variables and the dependent variable? What is the design of this study?

d.) Sketch a graph of the results, according to Bushman’s quoted description above. Could you say if there are main effects or interactions?

Suggested answers

a.) For one thing, it’s probably not a good idea to get people drunk when they have a history of substance abuse. But also, such participant variables could possibly create unsystematic variability. This unsystematic variability would get in the way of the researchers' ability to detect a true difference between groups.

b.) It’s difficult to tell from the journalist’s description alone, but it seems like the participants self-selected for the study, a method that leads to external validity issues. This is a biased sampling method. However, the study was focused more on internal validity (comparing the drunk and sober conditions) than on external validity.

d.) Your graph would have “degree of retaliation” on the y-axis. One of the IV’s would be on the x-axis, and the other would be the color of the lines or bars.

It seems that there is a main effect for drinking alcohol, such that being drunk leads to greater retaliation than being sober. There is a main effect for life focus, such that those who are present-focused (impulsive) retaliate more than those who are future-focused. And it seems that there is an interaction: For future-focused people, alcohol does not make a difference, but for those who are present-focused (impulsive), being drunk causes greater retaliation.

If you’re a research methods instructor or student and would like us to consider your guest post for everydayresearchmethods.com, please contact Dr. Morling. If, as an instructor, you write your own critical thinking questions to accompany the entry, we will credit you as a guest blogger.