Questions on the 2010 Census

The 2010 US Census form had arrived in my mail while I was in India. Now back in the US, I took out the form and was immediately struck by several things.

The form is simplicity itself – there are just ten questions (repeated for each person) but the questions are interesting for what is asked and what’s unasked.

I’ve noticed that others have commented about the race question, which categorizes “Asian Indians” as a separate race, distinct from “Other Asian” (which the form helpfully mentions as including Pakistanis, and presumably also Sri Lankans and Bangladeshis). Yes, I didn’t know that Indians were a separate race, as distinct from Pakistanis, and I also didn’t know that Thai and Cambodian and Laotian were races rather than nationalities.

I didn’t know, either, that all Indians were a homogeneous race – that, say, the Manipuris, the Tamilians and the Punjabis were all racially identical (and completely distinct, mind you, from the Burmese, the Sri Lankans and the Pakistanis they are geographically closer to).

There have also been comments about how to categorize people of mixed racial descent – what if you had both Indian and Chinese origins? Answer: you mark both boxes – the form allows you to mark as many boxes as you want, and safely leave it to the Census Gods to interpret the results.

But what I don’t understand is this – why are two of the ten questions in the Census about race? I notice that the Census does not ask you about (a) your marital status, (b) number of children (including any you may only have visitation rights to), (c) education levels, (d) annual income, (e) occupation, (f) religion if any. In other words, they ask for practically no demographic data. It is just a basic counting of heads.

I mean, people answer more detailed questions when they are opening a bank account. And I always thought the Census was the one great opportunity for the Govt. to get as much demographic info as possible.

But the Census Gods do want to know if you own your home or rent it, and what your race is.

On the race question, the census website says:

Asked since 1790. Race is key to implementing many federal laws and is needed to monitor compliance with the Voting Rights Act and the Civil Rights Act. State governments use the data to determine congressional, state and local voting districts. Race data are also used to assess fairness of employment practices, to monitor racial disparities in characteristics such as health and education and to plan and obtain funds for public services.

Which still doesn’t tell me anything. I’m not even sure how the Census Gods can monitor “racial disparities in health and education” when they only ask me about my race, not about either my health or my education. Besides, how do they know that any disparities in education between say, Indians and Pakistanis or Indians and Chinese are racial?

As I mentioned earlier, there are just 10 questions in the form, and four of those questions are devoted to asking your name, age, sex and telephone number. But two of the remaining questions ask about your race/ethnicity.

The goal of the census is to get an accurate headcount of every individual in the country. All of the questions on the form are aimed at that purpose – that’s why there’s such an emphasis on the living place and the relationships among the people living in the household. Race, gender, and age are key identifiers that allow the Census to analyze the data to make sure they’re not double-counting any individual.

Adding more questions to the census decreases the rate of response, which is basically why the Census is going with a very short form and isn’t sending out any of the “long forms” of previous censuses. Instead, the Census conducts the American Community Survey by taking a sample of households in the US every month and asking them an extremely extensive list of questions. The monthly format means that they have a continuously evolving snapshot of American demographics, and it helps keep the costs down quite a bit (since you just have a dedicated group working on that, rather than needing to rehire workers every 10 years).

First of all, I thought the goal of the census was to, you know, collect data, not to create jobs. If you want to create jobs, I imagine there are about a thousand ways to do that more effectively than to train people to collect census data, and then fire them a year later when all the census data is collected. I’m reminded of the apocryphal tale of the businessman in China:

While touring China, he came upon a team of nearly 100 workers building an earthen dam with shovels. The businessman commented to a local official that, with an earth-moving machine, a single worker could create the dam in an afternoon. The official’s curious response was, “Yes, but think of all the unemployment that would create.” “Oh,” said the businessman, “I thought you were building a dam. If it’s jobs you want to create, then take away their shovels and give them spoons!”

Secondly, I’m not suggesting that the census be replaced by anything. As an economist, I’m a huge fan of the census! But in social science you always have trade-offs in data collection between getting very detailed information from a smaller sample and getting less detailed information from a larger sample, given your budget constraint. The ACS is the Census Bureau’s attempt at solving this problem – they conduct one comprehensive count to get an accurate headcount regularly, and have a continuously revolving, extremely detailed sample. Previously, they sent a “long form” census to one in every six households – i.e. they used a representative sample – but the costs of trying to get all of those questions answered and returned without biasing the sample (given that different types of households had different rates of response) were too high. They have never been able to get detailed, non-biased information from the entire population because it’s simply cost-prohibitive.

Finally, you misunderstand what I mean by “key identifier”. If the Madison Jones in Fairfax VA checks that she’s female and black and born on 5-23-80, and the Madison Jones in Arlington VA checks that he’s male and both black and white and born on 4-28-84, you have enough identifiers to determine pretty clearly that this is not the same Madison Jones. If, on the other hand, you have a Jeremy Johnson who checks that he’s male, both Samoan and black, and born on 6-23-05 on two separate households’ forms, you have reason to think that it might be the same kid (who happens to shuttle between his divorced parents’ houses, or whatever). Then you can follow up by calling the phone number of the households in question, or sending a Census worker out there to inquire about it, so that he doesn’t get double-counted.
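To make the “key identifier” idea concrete, here is a minimal sketch in Python of how matching records across households could flag a possible double count. This is only an illustration under my own assumptions, not how the Census Bureau actually processes forms: the `PersonRecord` fields, the household labels, and the `find_possible_duplicates` helper are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonRecord:
    """One person line from a returned form (hypothetical layout)."""
    household_id: str
    name: str
    sex: str
    races: frozenset   # every race box the person marked
    birth_date: str    # ISO format, e.g. "2005-06-23"


def find_possible_duplicates(records):
    """Group records that agree on every identifier but come from different households."""
    groups = defaultdict(list)
    for record in records:
        key = (record.name.lower(), record.sex, record.races, record.birth_date)
        groups[key].append(record)
    # Only flag a group when the same identifiers show up in more than one household.
    return [
        group for group in groups.values()
        if len({record.household_id for record in group}) > 1
    ]


records = [
    PersonRecord("fairfax-1", "Madison Jones", "F", frozenset({"Black"}), "1980-05-23"),
    PersonRecord("arlington-1", "Madison Jones", "M", frozenset({"Black", "White"}), "1984-04-28"),
    PersonRecord("house-a", "Jeremy Johnson", "M", frozenset({"Samoan", "Black"}), "2005-06-23"),
    PersonRecord("house-b", "Jeremy Johnson", "M", frozenset({"Samoan", "Black"}), "2005-06-23"),
]

flagged = find_possible_duplicates(records)
for group in flagged:
    print(group[0].name, "appears in", sorted(r.household_id for r in group))
```

The two Madison Joneses differ on sex, race and birth date, so they are left alone; only the Jeremy Johnson pair is flagged, and a flag just means “follow up”, not “delete one”.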

I really appreciate your quick and detailed responses to my comments. My point was that creating hundreds of jobs is not a bad thing. Sure, the Census’s purpose is to collect data, and my point is – as much of it as possible, not just a headcount.

I agree that the rationale for a smaller questionnaire was that more people would respond. But I really don’t understand why the ease of data collection should even figure into the questionnaire. It’s basically because the govt. wants to limit going door to door and would rather try to get whatever little data they can get mailed in. It’s ironic that in India, a country with a billion people where it is even more difficult, logistically, to collect Census data, the info collected is much more comprehensive.

What if the Madison Jones in question checks “Asian Indian” in one questionnaire, “Other Asian” in another? Or, using your example, “Samoan” in one, “Black” in the other?

Creating hundreds of short-term jobs actually is a bad thing – if you’re giving up the opportunity to create hundreds of long-term jobs or the opportunity to create thousands of short-term jobs. Just saying that the government should create jobs – without considering what things you’re giving up to fund that process – is not good policy planning.

Specifically regarding the census – suppose you decide that you’d like to create jobs, given the recession, so you kill the ACS and have detailed census questions instead (requiring more workers to go door-to-door, since the rate of form return would drop tremendously). What happens in ten years when you don’t have a recession, but you need to collect data again? Hire a bunch of people at much higher wages (since otherwise they aren’t interested in the job b/c the economy is booming)? What about in five years, when you need an update of information, but you don’t have the funds to hire a ton of workers? Like I said – figure out what your goal is – in this case, the goal of the Census Bureau is to collect data – not to decrease unemployment.

But I really don’t understand why the ease of data collection should even figure into the questionnaire. It’s basically because the govt. wants to limit going door to door and would rather try to get whatever little data they can get mailed in.

Yes, because the government is also concerned with cost. Which is a good thing. One difference between India and the US is that in the US the cost of labor is much higher, which means that going door to door gets expensive quickly. As it is, it’s expected to cost $14.5 billion just to conduct the census.

Also, the expectations regarding quality of data are much higher in the US than in India. It’s understood that the data in India won’t be as good as you’d expect in a developed nation. I’m not trying to malign India with that comment – it’s just the truth, one that everyone who works in or is interested in development economics is (or should be) aware of. Frankly, I expect the true (which may not be the same as government-reported) margins of error from the India census to be worse than the margins of error you’re going to get using the ACS sample data and extrapolating to the entire population. A very good representative sample is generally a much better choice than attempting (and failing) to do a complete detailed census of the population.
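A rough way to see the sample-versus-failed-census point is to compare the sampling error of a clean random sample with the bias you get when different groups return their forms at different rates. The sketch below uses the standard margin-of-error formula for a proportion; every number in it (the 30% true share, the 60% and 90% response rates, the sample of 10,000) is made up for illustration and is not Census or ACS data.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)


def observed_share(true_share, response_in_group, response_elsewhere):
    """Share a group appears to have among returned forms when response rates differ."""
    responded_in = true_share * response_in_group
    responded_out = (1 - true_share) * response_elsewhere
    return responded_in / (responded_in + responded_out)


# Hypothetical numbers, chosen only to illustrate the trade-off.
true_share = 0.30                                     # group's real share of the population
sample_moe = margin_of_error(true_share, n=10_000)    # well-run random sample
biased = observed_share(true_share, 0.60, 0.90)       # "complete" count with uneven response
bias = abs(biased - true_share)

print(f"sample margin of error: {sample_moe:.4f}")
print(f"bias from uneven response: {bias:.4f}")
```

With these made-up inputs the bias from uneven response is several times larger than the sample’s margin of error, and unlike sampling error it does not shrink just because you mailed out more forms.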

What if the Madison Jones in question checks “Asian Indian” in one questionnaire, “Other Asian” in another? Or, using your example, “Samoan” in one, “Black” in the other?

There are the other questions – gender, birthday – that might help clue you in on whether this is a situation with two different individuals or the same individual. However, one expects that generally Madison Jones doesn’t think of himself as black while filling in one census form, and then have an identity crisis and identify as Samoan in the second form he fills out. Generally people identify themselves as something, find the box to check that they best fit into, and then continue to check that box. Obviously, you can screw with the census form – you could check all the boxes and pretend to be the most multiracial individual EVER. But most people aren’t dicks, so they do their best to be accurate & consistent in their answers. These checks that the census does are simply to help keep mistakes from getting through the system.

The US is continuing to have a census every ten years while also having the ACS.

The US is not having an extremely detailed census as well as an extremely detailed ACS because the marginal gain in data quality from that does not warrant the high cost of doing so.

I guess I don’t see your point on the Madison Jones thing – are you trying to suggest that there may, occasionally, be mistakes made in the census? Of course there will be! The goal is to keep the mistakes as low as possible, given the constraints of the budget.

You can end the discussion if you like, but I didn’t mean anything regarding the “most people aren’t dicks” comment, other than what I said – most people aren’t dicks. They’re going to fill out the census as best as they can, ’cause they’ve been told it’s important (after being reminded multiple times, possibly), and the Census will do the best it can to keep mistakes to an absolute minimum. Some people will be missed, and some people will be double-counted, but that’s life.

Oh, and regarding why race is broken up the way it is on the census… well, the categories are basically just simplifications based on historical immigration and societal trends. Hispanic is considered an ethnicity rather than a race, because “race” in America is typically thought of as a descriptor of what you look like – and you can have very fair or dark-skinned Hispanics, who will likely face differing amounts of discrimination due to their skin tones. South Asians are grouped together for a number of reasons: they look alike to most Americans, which means they’ll face similar sorts of discrimination; some aggregation is required to keep the number of boxes to check down to a minimum; and you need workable data at the end of the day (since the ACS uses the same racial groups, and samples such a relatively small group, you need enough of each group to make statistically significant inferences). Some groups are split off (native Hawaiians, Samoans, etc) because of government programs, historical legacy, or because in the testing of the census questions people were getting confused and marking the wrong answers.

But that’s the irony – South Asians are not grouped together. And if Hispanics are an ethnicity – why not South Asians? The Census offers the rationale that this question has been asked since 1790, in effect saying “this is how it’s always been done.”

I hadn’t realized that there was a write-in section for the Asian category on the census – which is even better than just grouping all South Asians together (obviously – finer details are always better). As for why Asian Indians get their own group, and Pakistanis don’t, there are about 2.7 million Asian Indians in the US, and only about 200,000 Pakistanis (as of 2005). Other South Asian groups are too small for the Census to disaggregate on their website, and I don’t feel like trawling through their data for the info.

As for why Hispanics get their own ethnicity and South Asians don’t – there are around 3 million South Asians, and around 42 million Hispanics, for starters. More importantly, failure to separate out Hispanic/non-Hispanic and the racial categories leads to bad data, as people get confused. If you don’t have a Hispanic category, people who identify that way write it in, or mark several things such as white, black, and Native American in an attempt to make their heritage clear. Or they just mark according to their skin tone. If you prefer, you can think of the Hispanic ethnic category as an “Are you or your ancestors originally from a country south of the Continental US?” question.

In contrast, if you (or your ancestors) are from Asia, there is little confusion about what to mark – Asian (either one of the specific Asian categories laid out, or write-in on the “other” part). There’s no confusion, so there’s no need to make the question more specific by adding more ethnicities or anything else.

The main reason the government asks questions about race for their databases is to find out how many people identify in the various racial/ethnic groups that are considered relevant for social programs and research. It’s not like the US govt goes, “Oh, there are one million Indian people living in California. Cool!” and leaves it at that. The information is useful to researchers trying to understand these populations, and to government programs aimed at helping these populations.

The reason these specific categories (white, black, Asian Indian, etc) are chosen is because they’re the functionally relevant ones. And, yes, that can seem like “this is how it’s always been done” – but that’s because demographics don’t change that quickly, so categories remain relevant. It simply doesn’t matter to the US govt how many Bengalis there are in the US vs. Punjabis, because there are no government programs specifically for these groups, and there aren’t enough of either group to make it seem worthwhile to start one. There’s no particular historical problem with either group (as there is with, say, American Samoans, a legacy of the US’s brief flirtation with colonizing, or with Native Americans) which might be cause to be concerned with exactly how many there are and where they live. In short, the goal is to break down the categories as much as they need to be, and no further – because more details increase the likelihood of mistakes and bad data. That’s the goal of pretty much every effort to collect quantifiable demographic data.

Because it’s not just region of origin of you or your parents that matters in the US – or else Obama would have checked both white & black on his Census form. “Race” (as the census is using the term) is a sociological construct, and it differs from society to society. In the US, our understanding of race is informed particularly by our immigration trends, and, of course, our history. Obama identifies as black, despite his half-white heritage, in part because that’s how society views him.

The first census in the US wanted to know about the number of free white people, the number of free people of any other race, and the number of slaves – because that’s what mattered then. That’s the mentality people used to categorize other people. During the wave of immigration from Southern Europe, Italians and Greeks weren’t considered white by the then WASPy majority – now they are firmly in the white category. Demographers suspect that a large portion of the white Hispanic community will just be identifying as “white” in 30 or so years, as cultural ties to their heritage fade and since “whiteness” has higher status in the US than “Hispaniness” does.

In other societies, other racial/ethnic breakdowns of individuals matter more. In India you have the various ethnic groups, most with their own language and cultural practices. In Brazil, you basically identify by skin tone – it’s white, black, brown, yellow, and indigenous, never mind where you or your parents are from. In Germany, until recently, you weren’t a citizen unless your parents were German, so societal breakdown is by the region your family is from + all the foreigners, who couldn’t become citizens. Etc, etc.

Interesting that the author complains about a mix-up over race and national origin and then comes up with a non-existent word called “Tamilian”. I think you will find the people you refer to are Tamil.

It’s not really a non-existent word, you know. Apart from the fact that millions of Tamilians use it every day, I can also point you to Wikipedia, which says the Tamil people are “also called Tamils or Tamilians”. I rest my case.

Also, I prefer using “Tamilian” because it is the closest English usage to the actual Tamil word “Thamizhan”, which is used to refer to the Tamil people.