Who Am I?

Aatish Bhatia is a science writer and an Associate Director, Engineering Education, in Princeton University's Council on Science and Technology. He holds a Ph.D. in physics from Rutgers University and a B.A. from Swarthmore College.

Category: Social Science

Update (8 January 2013): After I wrote this article, I heard that Mother Jones put their data on US mass shootings online. Going through this data, I realized that I made a number of errors in transcribing the data from their website. I have corrected the numbers and graphs in the plots below. These changes actually make the data fit more poorly to a Poisson distribution, weakening my original claim. I apologize for my sloppiness in this regard.

In the wake of the tragic massacre at Sandy Hook Elementary School, there’s been a lot of discussion about whether mass shootings in the United States are on the rise. Some sources argue that mass shootings are on the rise, while others argue that the rate has stayed more or less constant.

It’s not clear whether we’re seeing a real uptick, or just a cluster of events that are more or less distributed at random. You’ve got to remember – random events will occur in clusters just by sheer chance. So we don’t really know whether the fact that there are many of them in the year 2012 represents a trend or just a very unlucky year.

I recently wrote a post about randomness and rare events. The main lesson from that article is that randomness isn’t the same thing as uniformity. For example, if on average, sharks attack swimmers 3 times a year, then just by chance, you will expect to see years in which no swimmers are attacked, and years in which 7 swimmers are attacked. To our eyes, streaks like this don’t seem random. But, as I argue in my previous post, we are typically not good judges of randomness. In particular, we vastly underestimate the likelihood of such streaks. And so the question is, how can you test whether a set of events is random?

Here’s how. There is a formula that tells you how many times you expect to see streaks arise from a random process. It’s called the Poisson distribution, and it assumes that your events are rare, have a fixed average rate, and are independent (i.e. that events are just as likely to occur at any time). You can then compare the number of predicted streaks to the real number of streaks in your data, and mathematically test whether a set of events is random or not.
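As a rough illustration of that formula, here is a minimal Python sketch. It assumes the standard Poisson probability P(k) = rate^k × e^(-rate) / k!, where rate is the average number of events per year. The function name poisson_pmf is just a label chosen for this sketch, and the rate of 3 per year is the hypothetical shark example, not real data.

```python
import math

def poisson_pmf(k, rate):
    """Probability of exactly k events in a year when the yearly average is `rate`."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

# Hypothetical shark example: an average of 3 attacks per year.
rate = 3
for k in (0, 3, 7):
    print(f"P({k} attacks in a year) = {poisson_pmf(k, rate):.3f}")
```

Under these assumptions, a year with no attacks should turn up about 5% of the time, and a year with 7 attacks about 2% of the time. Rare, but not shocking over a few decades.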

To summarize: if the incidences of mass shootings in the US match a Poisson distribution, then this argues that the streaks (years with unusually high number of shootings) are expected due to chance. If the data doesn’t fit a Poisson distribution, then this suggests that it violates one of the assumptions – either mass shootings are not independent events, or the rate is falling, or it’s on the rise.

The data. I downloaded data for mass shootings in the United States occurring from 1982 to 2012, from this comprehensive Mother Jones article on mass shootings. I used their numbers because they compiled information from multiple credible sources, and they clearly outlined the criteria they used to classify a crime as a mass shooting. (Update: this link has the data in easily accessible formats)

Their data shows a total of 62 mass shootings in 31 years – an average of 2 mass shootings per year. However, 2012 was the most violent year on record, clocking in 7 mass shootings. Is this an outlier, or would you expect to see streaks this large, simply due to chance?

To get at this question, I counted years in which there were 0 mass shootings, 1 mass shooting, 2 mass shootings, and so on.

Number of Mass Shootings in a Year    Number of Years
0                                     3
1                                     13
2                                     5
3                                     5
4                                     3
5                                     1
6                                     0
7                                     1

Out of 31 years of data, we find one year with 7 mass shootings, and three years with no mass shootings. Are these values consistent with an average of 2 mass shootings a year?

To find out, we can compare these counts to a Poisson distribution with an average value of 2.

In the graph above, the blue bars represent the observed instances of 0, 1, 2, 3, ... mass shootings in a year. For example, the long blue bar tells us that there were 13 years with one mass shooting per year. The red dotted curve is the Poisson distribution – these are the outcomes that one expects from a random process with an average value of 2 per year. To my eye, the red curve sort of fits the data, but not quite.

Number of mass shootings in a year    Observed number of years    Expected number of years (Poisson)
0                                     3                           4.2
1                                     13                          8.39
2                                     5                           8.39
3                                     5                           5.59
4                                     3                           2.8
5                                     1                           1.12
6                                     0                           0.37
7                                     1                           0.11
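The expected values can be reproduced with a few lines of Python. This is a minimal sketch, assuming the expected number of years is simply 31 times the Poisson probability of seeing k shootings in a year at an average rate of 2. The variable and function names are labels chosen for this sketch.

```python
import math

YEARS = 31             # 1982 to 2012 inclusive
TOTAL_SHOOTINGS = 62   # total in the Mother Jones data
rate = TOTAL_SHOOTINGS / YEARS  # average of 2 per year

def poisson_pmf(k, rate):
    """Probability of exactly k events in a year when the yearly average is `rate`."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

# Expected number of years (out of 31) with k = 0..7 mass shootings.
expected_years = [YEARS * poisson_pmf(k, rate) for k in range(8)]
for k, e in enumerate(expected_years):
    print(f"{k} shootings: {e:.2f} expected years")
```

Rounded to two decimal places, this gives 4.20, 8.39, 8.39, 5.59, 2.80, 1.12, 0.37 and 0.11, matching the expected counts listed for 0 through 7 shootings.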

But instead of trusting my eye, we can use statistics to compare these two curves. I used a chi-squared test to test whether the two distributions were significantly different, and found a p-value of 0.09. What does this mean? It suggests that there isn’t strong evidence of clustering beyond what you would expect from a random process. In other words, the occurrences of mass shootings from 1982-2012 are not inconsistent with the assumption that shootings are independent events, occurring at an average rate of 2 per year. However, a p-value of 0.09 is not particularly high, and if we see another year as extreme as 2012, it’s likely that this will rule out the hypothesis that mass shootings are random events.
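One way to set up such a test is sketched below in plain Python. It is a sketch under stated assumptions, and may differ in detail from the calculation behind the number quoted above: it uses the eight bins for 0 to 7 shootings, treats the rate of 2 per year as known rather than fitted (so the degrees of freedom are bins minus one), and computes the chi-squared tail probability with a hand-written series, chi2_sf, instead of a statistics library. Several expected counts are small, so the chi-squared approximation is rough here.

```python
import math

observed = [3, 13, 5, 5, 3, 1, 0, 1]  # years with 0..7 mass shootings
years = sum(observed)                  # 31
rate = 2.0                             # average shootings per year

def poisson_pmf(k, rate):
    """Probability of exactly k events in a year when the yearly average is `rate`."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable (for x > 0), using the
    series expansion of the regularized lower incomplete gamma function."""
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - lower

expected = [years * poisson_pmf(k, rate) for k in range(len(observed))]
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # assumption: the rate is treated as known, not fitted
p_value = chi2_sf(statistic, df)
print(f"chi-squared = {statistic:.2f}, df = {df}, p = {p_value:.2f}")
```

With these choices the statistic comes out near 12.2 and the p-value near 0.09. Most of the statistic comes from the single year with 7 shootings, which is why one more year like 2012 could tip the result.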

What do I conclude from this? If mass shootings are really occurring at random, then this suggests that they are extreme, hard-to-predict events, and are perhaps not the most relevant measure of the overall harm caused by gun violence. (Update: That last claim is my deduction and not a conclusion of the above analysis – In response to some of the comments at hackernews, I wanted to clarify this point.) I agree with Steven Pinker’s take, and with this analysis by Chris Uggen, who says:

a narrow focus on stopping mass shootings is less likely to produce beneficial changes than a broader-based effort to reduce homicide and other violence. We can and should take steps to prevent mass shootings, of course, but these rare and terrible crimes are like rare and terrible diseases — and a strategy to address them is best considered within the context of more common and deadlier threats to population health.

We are compelled to pay attention to extreme events. In the words of Steven Pinker, “we estimate risk with vivid examples that we recall”. But as much as we should try to prevent these horrific extreme events from taking place, we should not use them as the sole basis for making inferences that determine policy. The outliers are a tragic part of the overall story, but we also need to pay attention to the rest of the distribution.

Update: This post was an Editor’s pick by Cristy Gelling at Science Seeker, and was included in Bora Zivkovic’s top 10 science blog posts of the week.

Lately, I’ve got colors on the brain. In part I of this post I talked about the common roads that different cultures travel down as they name the colors in their world. And I came across the idea that color names are, in some sense, culturally universal. The way that languages carve up the visual spectrum isn’t arbitrary. Different cultures with independent histories often end up with the same colors in their vocabulary. Of course, the word that they use for red might be quite different – red, rouge, laal, whatever. Yet the concept of redness, that vivid region of the visual spectrum that we associate with fire, strawberries, blood or ketchup, is something that most cultures share.

So what? Does any of this really matter, when it comes to actually navigating the world? Shakespeare famously said that a rose by any other name smells just as sweet. So does red by another name look just as deep? And what if you didn’t have a name for red? Would it lose any of its luster? Would it be any harder to spot those red berries in the bush?

Rose coloured glasses by jan_clickr

This question goes back to an idea by the American linguist Benjamin Whorf, who suggested that our language determines how we perceive the world. In his own words,

We cut nature up, organize it into concepts, and ascribe significances as we do, largely because we are parties to an agreement to organize it in this way—an agreement that holds throughout our speech community and is codified in the patterns of our language […] all observers are not led by the same physical evidence to the same picture of the universe, unless their linguistic backgrounds are similar

This idea is known as linguistic relativity, and is commonly described by the blatantly false adage that Eskimos have a truckload of words to describe snow. (The number of Eskimo words for snow probably tells you more about gullibility and sloppy fact-checking than it does about language.)

Hyperbole aside, color actually provides a neat way to test Whorf’s hypothesis. A study in 1984 by Paul Kay and colleagues compared English speakers to members of the Tarahumara tribe of Northwest Mexico. The Tarahumara language falls into the Uto-Aztecan language family, a Native American language family spoken near the mountains of North America. And like most world languages, the Tarahumara language doesn’t distinguish blue from green.

The researchers discovered that, compared to the Tarahumara, English speakers do indeed see blue and green as more distinct. Having a word for blue seems to make the color ‘pop’ a little more in our minds. But it was a fragile effect, and any verbal distraction would make it disappear. The implication is that language may affect how we see the world. Somehow, the linguistic distinction between blue and green may heighten the perceived difference between them. Smells like Whorf’s idea to me.

Do you see what I see? A young girl from the Tarahumara tribe, whose language doesn’t distinguish green from blue. Photo credit: Fano Quiriego

That was 1984. What have we learnt since? In 2006, a study led by Aubrey Gilbert made a rather surprising discovery. Imagine that you’re a subject in their experiment. You’re asked to stare at the cross in the middle of the screen. A circle of colored tiles appears. One of the tiles is different from the others. Sometimes it will be on the left, and other times on the right. Your task is to spot whether the odd-color-out is on the left or on the right. Keep your eyes on the cross.

That’s easy enough. What’s the catch?

Well, sometimes you’ll also get a picture that looks like this.

See the difference? In one case, English speakers have different words for the two colors, blue and green. So there’s a concept that builds a wall between them. But in other cases like above, the two colors are conceptually the same.

Here’s what the researchers wanted to know. If you have a word to distinguish two colors, does that make you any better at telling them apart? More generally, does the linguistic baggage that we carry affect how we perceive the world? This study was designed to address Whorf’s idea head on.

A friend of mine recently pointed me towards an incredible resource. It’s called the Annual Status of Education Report (or ASER, which means impact in Hindi). ASER is an ambitious survey of the state of Indian rural education, conducted yearly since 2005, and their 2011 report came out a few days ago.

The level of organization here is truly impressive. It’s the largest survey conducted outside the government, combining the efforts of over 25,000 young volunteers from local organizations. Together, they survey nearly 300,000 households in over 16,000 villages in all states of India, and conduct basic level reading and numeracy tests on over 700,000 children.

Behind this coordinated effort is a simple and powerful idea, that effective policy needs to be based on evidence. The report takes a refreshingly no-nonsense approach. Rather than starting off with a long list of dignitaries to thank and lofty goals to implement, ASER gets right down to the point, with figures and tables. They focus on two basic goals. How many children are enrolled in schools (and what kind of school)? And are these children learning the very basics of reading and numeracy? By comparing trends of schooling and learning in different states, they have put together the most detailed picture so far of what’s working and what isn’t in rural education. The general picture that is emerging is one of rising enrollment but declining learning outcomes, from levels that were already low.

So let’s get down to the data. While reading through the report, some surprising facts and numbers jumped out at me.

More kids are going to school than ever before. Among 6 to 14 year olds in rural India, 97% are attending school. The toughest demographic to keep in school is 11 to 14 year-old girls, and even here the numbers are improving. Attendance in this age range has gone up from 90% to 95%. This is a remarkable achievement, and a necessary first step towards a right to education.

The graph shows the percentage of children who are NOT in school. Attendance is on the rise, so these numbers are falling.

Over a quarter of these children are now enrolled in private schools. With the new Right to Education Act, government schools are now free and, according to the statistics, are performing better than rural private schools. Nonetheless, private school education is on the rise, suggesting that there is still not enough access to the government school network.

Teachers are attending school regularly. Their attendance is at 87% (on the day of the survey). Gujarat is doing particularly well with 96% of teachers attending, and ten states have greater than 90% teacher attendance. However, as these results are based on a single day of measurement, you should take them with a grain of salt.

But the students aren’t. Student attendance is at 71%, a number that has dropped in the last four years. Some states have dropped over 10 percent here. Bihar is at the bottom of the list here, with 50% student attendance.

A quarter of all students are attending school in a language they don’t speak at home.

Half of all rural schools do not have a functioning toilet. Nearly a quarter do not have separate girls’ toilets. A quarter do not have access to drinking water. Adequate drinking water and functioning, separate toilets for boys and girls are now mandated by the Right to Education Act that came into effect in 2010.

More than half of students in the fifth grade can’t read at second grade level. Similar statistics arise for basic math levels. The ability to read complete sentences or add and subtract numbers is not a very ambitious standard for learning, and Indian schools are failing to achieve even this.

The percentage of fifth graders who can't perform at second grade level is on the rise.

What’s more, the math and reading levels are falling further. Learning outcomes have fallen over the last six years. Some states have dropped by over 10 percent in the last year alone.

Girls from the Birbhum village district in West Bengal, escaping the summer sun. Image credit: basoo!

I’m back at home in India, and visited my local toy store today, looking for a science kit for a wide-eyed young friend. A woman walks in, seeking a toy for a one-year-old child. “A boy, not a girl”, she hastens to add. The shopkeeper smiles, and says that at one year of age there isn’t really a difference. “I know”, replies the woman, “but I don’t want you to pick out a doll.”

This is a small example, but I find it sad how we impose these gender roles onto infants. You don’t need to be a sociologist to realize that much of one’s gender identity depends on society. If you ask an adolescent girl growing up in the United States what she wants to be when she grows up, her answer will be quite different from that of a girl in India or Afghanistan. Every society creates certain expectations for its children, and this affects the kinds of educational opportunities and careers they aspire towards. Crucially, study after study has shown that these ambitions really matter. What a child believes about their capabilities has a strong bearing on what they will actually achieve.

In the developing world, girls are routinely subject to lower expectations than boys. This bias creates an inequality in educational and societal opportunities. This raises an important question. Is it possible to reduce the gender gap in a society by changing the beliefs of individuals? A clever new study to be published in Science argues that in rural India, the answer is yes. The authors argue that the presence of a prominent female role model in an Indian village reduces the gender gap in that village.

Think for a moment about how you would test such a claim. Well, you’d have to randomly divide villages into two different groups. In the first set of villages, you make no change. In the second set, you put a woman in charge of each village. Then you wait and see how things change. It’s important that you choose the villages at random, because it ensures that there won’t be any other difference between the two groups. If you do see a difference develop, you can conclude that it must be caused by the change you made – in this case, the presence of a woman leader.

The insight by the authors was that India has already implemented such an experiment.