How We Perceive Racial Demographics

Last year I conducted a short online survey to (attempt to) answer a simple question: How accurately do people know the racial demographics of their neighborhood?

This was prompted by overhearing a great many generalizations about the racial composition of Seattle, and the UW in particular. The survey was straightforward: simply provide your guesses for the % of each race in your neighborhood, as well as a few details about yourself (age, gender, race, and, most importantly, your US ZIP code). The ZIP code was used to compare the user-estimated percentages to data from the 2010 US Census.

I'd like to share a bit of what I learned...

1. Respondents, or, The Kindness of Strangers

I used Reddit, Facebook, Twitter, and a link on my own sidebar to advertise the survey. The survey stayed open about 2 weeks, and gathered 757 respondents. After removing nonsense answers I had 748. Pretty good!

(For comparison: my other notable survey, on the frequency of washing trousers, received 833 hits in only 5 days.)

Respondents were 63% male, 82% white, and largely from big cities. This is a byproduct of where I recruited my respondents. I reject the notion, presented by one snarky commenter, that such a sample makes my results "worthless". Instead, I think this is a useful preliminary look at something I've not seen investigated. (I'm not a social psychologist or geographer, but I would love to chat with some if you know of any who are interested in developing this with me!)

2. Perceived Demographics

Previously I showed some of the initial results of the survey (with only 380 respondents), which suggested that people tend to exaggerate minority demographics, but overall get the trend correct. Here is the correlation between perception and reality...

The grey line is 1-to-1 agreement; the orange line is the median (calculated as a function of the % guessed, hence the funny shape in the middle panel). Here is my interpretation of these results:

In areas with slightly lower white populations, people can vastly under-predict the % of the white population. Over-predicting the white population is rare.

Conversely, people frequently over-predict the prevalence of non-whites, particularly the black population in their neighborhoods. (Here I've only shown black and asian populations, because they were the best sampled in parameter space.)

Broadly, the trend is actually pretty good. Despite the systematic offsets, the answers do roughly follow the 1-to-1 line, indicating that people have some sense of reality. If they were wildly guessing, or completely out of touch, these would just be random scatter plots!
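
For the curious, here is a rough sketch of how a running median like the orange line could be computed. This is my reading of "median as a function of the % guessed", not the exact code behind the plots: the function name, the 10-point bin width, and the sample numbers are all made up for illustration.

```python
import math
from collections import defaultdict
from statistics import median

def binned_median(guessed, actual, bin_width=10.0):
    """Median of the actual % within each bin of guessed %.

    bin_width is a hypothetical choice, not taken from the survey analysis.
    Returns a dict mapping each bin's center to the median actual %.
    """
    n_bins = math.ceil(100.0 / bin_width)
    bins = defaultdict(list)
    for g, a in zip(guessed, actual):
        # A guess of exactly 100% is folded into the last bin.
        idx = min(int(g // bin_width), n_bins - 1)
        bins[idx].append(a)
    return {(i + 0.5) * bin_width: median(vals) for i, vals in sorted(bins.items())}

# Made-up example data: guessed % and census % for six respondents.
guessed = [5.0, 8.0, 12.0, 18.0, 25.0, 40.0]
actual = [2.0, 4.0, 6.0, 10.0, 20.0, 30.0]
trend = binned_median(guessed, actual)
for center, med in trend.items():
    print(center, med)
```

Bins with few respondents give a jumpy median, which may be part of why the line looks odd where the sample is sparse.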

3. Results by Age, Gender, and Race

Next I want to show some histograms indicating how close to the actual demographic composition people got, and break them up by sub-groups of respondents. The goal here is to answer the secondary question: which group of people get closest to the actual racial demographics?

First, how well does everyone together do?

The histogram above shows the distribution of "total % incorrect", which is simply the Euclidean distance from the "line of reality". In other words, find the difference between % guessed and % actual for each race, square the differences, add them, and take the square root. The point: people usually get the total racial composition correct to within 10-15 percentage points. I found this very encouraging!
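
As a minimal sketch of that calculation (the function name and the example percentages are invented for illustration, not taken from the survey data):

```python
import math

def total_percent_incorrect(guessed, actual):
    """Square root of the summed squared differences between guessed and actual %."""
    if guessed.keys() != actual.keys():
        raise ValueError("guessed and actual must cover the same categories")
    return math.sqrt(sum((guessed[race] - actual[race]) ** 2 for race in guessed))

# Made-up respondent: over-guesses one group by 10 points, under-guesses another by 10.
guess = {"white": 60.0, "black": 20.0, "asian": 15.0, "other": 5.0}
census = {"white": 70.0, "black": 10.0, "asian": 15.0, "other": 5.0}
print(round(total_percent_incorrect(guess, census), 1))  # prints 14.1
```

Note that two 10-point misses combine to about 14, not 20, because the differences add in quadrature.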

Next, I compared young (age < 25yrs) to "old" (age > 25). I chose these bins because they had roughly equal numbers of respondents. These two distributions look basically identical.

Then we compare male versus female. Again these two distributions look basically identical...

I'm not claiming anything robust with these numbers, and haven't run any kind of population comparison tests (e.g. K-S test). These are just a handful of ways to cut this sample up, but unfortunately (or fortunately perhaps from a social standpoint) any underlying difference in perception between sub-groups is small. Further study is definitely needed.
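
Since I mention the K-S test without running it, here is a small sketch of what the two-sample K-S statistic measures: the largest gap between the two groups' cumulative distributions. The sample values below are invented. For a real analysis I would reach for an existing routine such as `scipy.stats.ks_2samp`, which also returns a p-value.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is always reached at one of the observed data points.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap

# Invented "total % incorrect" values for two sub-groups.
young = [4.0, 9.0, 11.0, 14.0, 22.0]
old = [5.0, 8.0, 12.0, 15.0, 30.0]
print(ks_statistic(young, old))
```

A statistic near 0 means the two distributions are nearly indistinguishable; a value of 1 means they do not overlap at all.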

4. Studying the Respondents

Finally, and most amusingly, about 32% of respondents provided me their email address. I included this as an optional response box, hoping to mostly attract some people to read my blog. I made no prediction on how many people would actually provide it! There must be existing research on the expected return of such optional questions...

For laughs, here are some (non-robust) statistics on who provided me their email: 35% of females versus 30% of males, 33% of whites versus 28% of non-whites. Lastly, the population surrounding respondents who provided their email was on average 10% larger than for those who did not! In other words, the people most likely to provide their emails were white females in more urbanized areas.

Again, the significance of this final result is highly dubious. It leaves us with the most common answer in science: we need more data.

I once had a very smart teacher who said:

people can't reliably differentiate better than 10% by eye.

It seems he was right, and that people understand racial demographics overall to about 10%. This study raised in my mind as many questions as it answered, which I found entertaining. I'd love to redo this study with a (much) larger sample, and more controls.