[ODP] Country of origin and use of social benefits: A pilot study of stereotype accuracy

3) "There are several measures of inter-rater consistency. Perhaps the simplest is to calculate the mean correlation between raters. Figure 1 shows the distribution of rater intercorrelations."

No explanation is given of what is being correlated here.

4) "One method is to correlate the estimates with the real values (Jussim, 2012, p. 205)."

I would clarify, "... to correlate participants' estimates of group values with the real group values..."

5) "Using the Pearson correlation as the measure of accuracy, the raters' accuracy scores are .78 and .89, and their inter-correlation is .51 (individual-level)."

Isn't that .51 just the inter-rater correlation, which tells us nothing about accuracy? What is it doing there? Very confusing. Also, I would use values in that example that yield accuracy scores more similar to those in the real data (.78 and .89 are much higher than the real accuracy scores).
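To make the distinction in the example concrete, here is a minimal sketch (with made-up numbers, not the study's data) of how the two quantities differ: each rater's accuracy is the Pearson correlation between that rater's estimates and the real group values, while the inter-rater correlation is the Pearson correlation between the two raters' estimate vectors and says nothing about the truth.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    return float(np.corrcoef(np.asarray(x, float), np.asarray(y, float))[0, 1])

# Hypothetical real group values for 5 groups (illustrative only).
real = [10, 20, 30, 40, 50]

# Two hypothetical raters' estimates of those same group values.
rater_a = [12, 18, 35, 38, 55]
rater_b = [15, 25, 28, 45, 48]

acc_a = pearson(rater_a, real)      # rater A's accuracy score
acc_b = pearson(rater_b, real)      # rater B's accuracy score
inter = pearson(rater_a, rater_b)   # inter-rater correlation (consistency, not accuracy)

print(round(acc_a, 2), round(acc_b, 2), round(inter, 2))
```

Two raters could agree strongly with each other (high inter-rater correlation) while both being far from the real values, which is why the .51 needs to be clearly labeled as consistency rather than accuracy.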

6) "Figure 2 shows the distribution of Pearson correlations."

Clarify, e.g., "... of Pearson correlations, each data point being the correlation between a participant's estimates of the group values and the real group values." (Assuming I understand this correctly.)

7) Regarding Table 2, why do you report correlations between individual accuracy measures rather than the accuracy measures themselves? I would merge section 5.2 to the beginning of section 5 so that at least mean/median individual accuracy measures would be presented before correlations between them.

Thank you for another set of good suggestions. We have implemented most of them.

1)
Fixed.

2)
Switched to using "scale".

3)
Changed to "There are several measures of inter-rater consistency. Perhaps the simplest is to calculate all the correlations between raters' estimates."
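For the revised wording, "all the correlations between raters' estimates" can be sketched as follows (hypothetical data, not the study's): with each row holding one rater's estimates across the groups, compute the Pearson correlation for every pair of raters.

```python
import numpy as np
from itertools import combinations

# Hypothetical estimates: each row is one rater's estimates for 4 groups.
estimates = np.array([
    [10.0, 20.0, 30.0, 40.0],
    [12.0, 18.0, 33.0, 39.0],
    [30.0, 10.0, 25.0, 20.0],
])

# All pairwise correlations between raters' estimate vectors;
# n raters give n*(n-1)/2 correlations.
pairwise = [
    np.corrcoef(estimates[i], estimates[j])[0, 1]
    for i, j in combinations(range(len(estimates)), 2)
]
print([round(r, 2) for r in pairwise])
```

The distribution of these pairwise correlations is presumably what Figure 1 plots.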

4)
Added your version.

5)
Yes. It is there for comparison purposes.

I have changed the numbers so they are less accurate: mean correlation = .51, aggregate = .68.

6)
Used your version.

7)
I don't understand what you want. We have 48 subjects in the final sample, so reporting the accuracy scores by subject would result in a table with 48 rows.

The reason the summary statistics are presented at the end is that after inspecting the Pearson distribution plot, one further outlier is removed. If the summary statistics were presented first, they would not reflect this extra exclusion.

8)
Fixed.

9)
Fixed.

10)
Moved to footnote. Removed link to Peter's blog.

11)
Added "may".

---

Sean has informed me by email that he approves of the paper after having reviewed the questionnaire. He should make a post here soon about it.

Thank you Emil for posting a translated version of the questionnaire. I have reviewed the items and I am recommending publication of the article.

My only real concern was simply being thorough. As my previous review stated, I feel the authors did a good job being measured in their conclusions, given that the study was a pilot test. I feel it is important for us as a field to publish most, if not all, of the studies we as researchers conduct. This helps to reduce publication bias and the inflation of effect sizes in meta-analyses.