Materials
The authors provided us with the original stimuli. The original study employed 20 comparative statements that were translated from Dutch to English by a native Dutch speaker and proofread by a native English speaker. Because of differences between the two languages, some statements had to be slightly adjusted (e.g., “mannen zijn egoistischer” was translated to “men are more egoistic”). Of these 20 statements, 5 were about typically positive female traits, 5 about typically negative female traits, 5 about positive male traits, and 5 about negative male traits.

Procedure
The study used a 2 (Framing: “more than” vs. “less than”) × 2 (Stereotype-Consistency: consistent vs. inconsistent) × 2 (Desirability: desirable vs. undesirable) design. Framing and consistency were manipulated between-subjects, while desirability was varied within-subjects. Participants were randomly assigned to one of the four between-subjects conditions and asked to indicate how much they agreed with each of the 20 statements on a scale of 1 (fully disagree) to 7 (fully agree).
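As an illustration of the between-subjects part of this design, the four conditions are the crossing of the two framing levels with the two consistency levels. The sketch below enumerates those cells and draws one at random per participant. It assumes simple (unrestricted) randomization with a fixed seed; the report does not state how assignment was actually implemented, and the names CELLS and assign_conditions are hypothetical.

```python
import itertools
import random

# Between-subjects factors as described in the Procedure section.
FRAMING = ("more than", "less than")
CONSISTENCY = ("consistent", "inconsistent")

# The four between-subjects cells (2 framing levels x 2 consistency levels).
CELLS = list(itertools.product(FRAMING, CONSISTENCY))


def assign_conditions(n_participants, seed=0):
    """Assign each participant to one of the four cells at random.

    Assumption: simple randomization, so cell sizes may be unequal.
    """
    rng = random.Random(seed)
    return [rng.choice(CELLS) for _ in range(n_participants)]


if __name__ == "__main__":
    for participant, cell in enumerate(assign_conditions(8), start=1):
        print(participant, cell)
```

Desirability is not part of the assignment because it varies within-subjects: every participant rates both desirable and undesirable items.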

Results and Discussion
The main finding of the original study was a main effect of more-than framing when comparing the average agreement with all 20 statements across conditions. We replicate the main effect of framing with the same effect size. Participants agreed more with the “more than” statements (M = 3.67, SD = 1.27) than with the “less than” statements (M = 2.85, SD = 1.03; F(1, 168) = 25.82, p < .001, η2p = 0.133). Just like the original study, we also found the expected main effect of consistency, with participants agreeing more with stereotype-consistent statements (M = 3.89, SD = 1.29) than with inconsistent statements (M = 2.76, SD = 0.89; F(1, 168) = 45.75, p < .001, η2p = 0.214). Unlike the original study, we also found a significant interaction between framing and consistency, F(1, 168) = 5.58, p = 0.02, η2p = 0.032, such that the effect of more-than vs. less-than statements was larger in the consistent conditions than in the inconsistent conditions (see Figure 1).
We did not find an interaction between framing and desirability, F(1, 170) = 0.013, p = 0.91, η2p = 0.000. For the desirable items, people agreed more with more-than statements (M = 3.60, SD = 1.20) than with less-than statements (M = 2.77, SD = 1.02). For the undesirable items, people likewise agreed more with more-than statements (M = 3.74, SD = 1.38) than with less-than statements (M = 2.92, SD = 1.09).
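The partial eta squared values reported above can be checked against the F statistics and their degrees of freedom using the standard identity η2p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the four tests in this section. The function name partial_eta_squared and the dictionary labels are ours, introduced only for illustration; the F values and degrees of freedom are copied from the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) as reported in the Results section.
reported_tests = {
    "framing": (25.82, 1, 168),
    "consistency": (45.75, 1, 168),
    "framing x consistency": (5.58, 1, 168),
    "framing x desirability": (0.013, 1, 170),
}

for effect, (f_value, df_effect, df_error) in reported_tests.items():
    value = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{effect}: partial eta squared = {value:.3f}")
```

Under this identity, the four tests give 0.133, 0.214, 0.032, and 0.000, matching the values reported in the text.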

Replication data can be downloaded from: https://sites.google.com/site/erkevers/home

Any Known Methodological Differences (between original and present study)?

*The original study was a pencil-and-paper study; our study was conducted online.
*The original study used Belgian participants; ours were American.
*The original study presented the statements in Dutch; we presented them in English.

The effect-sizes of the original paper and our replication are very similar.
*The original paper reported a partial eta squared of 0.14 for framing; we find an eta squared of 0.14 as well.
*The effect of consistency was .49 in the original study, whereas it is .21 in our replication.
*The original study did not find an interaction between consistency and framing (p = .49, partial eta squared = .01); we do (p = .02, partial eta squared = .03).

I have complied with ethical standards for experimentation on human beings and, if necessary, have
obtained appropriate permission from an Institutional Review Board or other oversight group.