Friday, May 31, 2013

As I understand it, the recent scandal involving the IRS is about applications for a particular kind of tax-exempt status, not about audits. But people have started to talk about audits. For example, at the recent Republican state party convention in Virginia, Bobby Jindal said "By being here today, every one of you has just signed up for an audit by the I.R.S." That was probably meant as a joke, but Peggy Noonan didn't seem to be joking when she said "The second part of the scandal is the auditing of political activists who have opposed the administration." Of course, there's no data on the political views of people who have been audited, but I thought it would be interesting to see if conservatives were more likely to think that they might be audited. It turns out I had to go back to 1997 to find a survey with suitable data (sponsored by CBS News and the New York Times). This survey asked people whether they thought it was "very likely," "somewhat likely," "not very likely," or "not at all likely" that they would be audited by the IRS in their lifetime. (About 6% of people said that they had already been audited--there was no significant difference by political views, which isn't surprising given the small numbers involved.)

There was a pretty strong correlation between general views of the tax system and the perceived chance of being audited: 15% of the people who thought that the system was "quite unfair to people like yourself" thought they were very likely to be audited, while only 2% of the few who thought it was "quite fair" thought they were very likely to be audited. However, there was no significant correlation between general political views, views of Bill Clinton, or vote in the 1996 election and the subjective probability of being audited.

Since nothing much seemed to be coming up with political views, I looked at a few other variables. The survey, which was taken between April 2 and 5, asked people if they had filed their taxes yet. This turned out to be a strong predictor: people who hadn't filed yet were more likely to think that they would be audited. In fact, when you controlled for views about the general fairness of the system, whether you paid more than your fair share in taxes, and whether you itemized deductions, it was still a statistically significant predictor. Of course, that raises the question of causal order--did people who thought they were likely to be audited put off filing to delay the unpleasantness? Or maybe both the expectation of being audited and late filing reflected having complicated finances.

Saturday, May 25, 2013

A couple of months ago, Ken Stern had an article in the Atlantic called "Why the rich don't give to charity." He said "In 2011, the wealthiest Americans—those with earnings in the top 20 percent—contributed on average 1.3 percent of their income to charity. By comparison, Americans at the base of the income pyramid—those in the bottom 20 percent—donated 3.2 percent of their income." He didn't give any details about the source of these figures, but in a letter to this month's Atlantic, Jonathan Meer (a professor of economics at Texas A&M) made a convincing case that they couldn't be believed: basically, the pattern followed from the way the sample was selected. In 1996, the General Social Survey asked a representative sample of Americans about how much they gave to various charitable causes. I combined those causes into two groups, religious and non-religious (which included health, education, human services, the arts, and others), and divided the totals by reported income. On the average, people reported giving about 9% of their income to religious causes and 5% to non-religious. These figures seem likely to be overestimates. However, reported giving was very skewed: the median contribution to both was 0%, while a few people claimed to give over 100% of their income. I suspect that most of the very high figures were people who did donate generously but inadvertently exaggerated or counted one donation in several categories. I then divided people into groups of approximately the lowest 20%, middle 60%, and highest 20% of household income. The results:

The means don't show a clear pattern, but they are strongly affected by the few people who claimed extremely large donations relative to their income. The medians and 75th percentiles both show a clear tendency for more affluent people to give a larger share of their income. I also did an analysis using a more sophisticated technique that essentially involves predicting ranks (proportional hazards regression). According to that, more affluent people definitely gave a larger share of their income to non-religious causes (a t-ratio of 3.7); there was some evidence, but not definitive evidence, that they gave a larger share to religious causes (t-ratio of 1.9). The measurement of income in the GSS isn't precise enough to distinguish the truly rich (the top category was $75,000 and up, which would be about $111,000 in today's dollars). But it's pretty clear that "affluent" people give more of their incomes to charity than poor people do. Of course, this doesn't necessarily mean that they are more generous: you could certainly argue that a person who gives 1% of $10,000 is making a bigger sacrifice than someone who gives 5% of $100,000 (and gets a tax deduction). But more affluent people have more money to spare after paying the bills.
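The percentile comparison can be sketched in a few lines. The donation shares below are invented for illustration (the GSS microdata are not reproduced here); they are chosen only to mimic the skew described above, with medians near zero and a few very large values.

```python
from statistics import quantiles

# Illustrative only: made-up shares of income given, by income group.
donations = {
    "bottom 20%": [0, 0, 0, 0, 0.01, 0.02, 0.05, 0.30],
    "middle 60%": [0, 0, 0, 0.01, 0.02, 0.03, 0.06, 0.10],
    "top 20%":    [0, 0, 0.01, 0.02, 0.03, 0.05, 0.08, 0.12],
}

def summarize(shares):
    """Return (median, 75th percentile) of a list of donation shares."""
    q = quantiles(shares, n=4, method="inclusive")
    return q[1], q[2]

for group, shares in donations.items():
    med, p75 = summarize(shares)
    print(f"{group}: median={med:.3f}, 75th percentile={p75:.3f}")
```

With skewed data like these, the median and upper quartile are far more stable than the mean, which is why the text above leans on them.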

Monday, May 20, 2013

There has been a lot of comparative research on "subjective well-being." Usually it is measured by a question about how happy you are, using a three- or four-category scale ("very happy," "pretty happy," "not too happy," and sometimes "not happy at all"). Sometimes people are asked to rate their satisfaction with life. A less common way is to ask people whether they have felt various emotions recently. The ratio or difference between the prevalence of positive and negative emotions can then be used to construct a measure of subjective well-being. The 1990 World Values Survey asked all three kinds of questions: for the exact questions about emotions, see this post. They fell pretty clearly into two groups: excited, proud, pleased about accomplishing something, "on top of the world," and "things are going your way" versus restless, bored, lonely, depressed, and upset at criticism. I took the logarithm of the ratio of positive to negative emotions.
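The affect-balance measure described above can be computed as follows. Adding 1 to each count is my own smoothing choice to avoid taking the log of zero; the post does not say how respondents reporting no emotions in one group were handled.

```python
import math

# The two emotion groups named above.
POSITIVE = {"excited", "proud", "pleased about accomplishing something",
            "on top of the world", "things are going your way"}
NEGATIVE = {"restless", "bored", "lonely", "depressed",
            "upset at criticism"}

def affect_balance(felt):
    """Log of the ratio of positive to negative emotions reported.
    Counts are smoothed by adding 1 (an assumption, not from the post)."""
    pos = sum(1 for e in felt if e in POSITIVE)
    neg = sum(1 for e in felt if e in NEGATIVE)
    return math.log((pos + 1) / (neg + 1))

print(affect_balance(["excited", "proud", "bored"]))
```

A positive value means more positive than negative emotions were reported; zero means they balance.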

At the national level, there are substantial correlations among the three measures, but they are far from identical. Ratings of happiness and satisfaction have a higher correlation than either one has with the measure of good to bad emotions. This figure shows the averages for happiness (higher values indicate more unhappiness) vs. emotions. There are a number of cases for which they don't match up that well. The biggest discrepancies are for Slovenia, Bulgaria, and Latvia, which are among the least happy countries but towards the middle in terms of emotions, and Turkey and Chile, which are about average in happiness but well below average in terms of emotions.

Putting all three indicators together, the countries that were highest in subjective well-being are Switzerland, Sweden, Iceland, Denmark, and Ireland. Going on self-rated happiness alone, the leaders were the Netherlands, Iceland, Ireland, Denmark, and Belgium (Switzerland was seventh). Either way, countries of the former Soviet Union and eastern Europe dominate the lower ranks.

Monday, May 13, 2013

In 1949, the Gallup Poll asked "Do you think a cure for cancer will be found in the next fifty years?" 88% thought that a cure would be found, and only 7% did not. The same basic question has been asked a number of times since then with only minor differences in wording, but with time horizons varying from 10 years to a century. I estimated the relationship between the time horizon and the percent agreeing (the reciprocal seemed to give the best fit--that is, going from 10 to 20 years made about as much difference as going from 50 to 100). The adjusted percentages of people agreeing (as a percentage of those who agreed or disagreed) are shown in the figure. The adjustment estimates the percent that would have agreed if the question had asked about the next 50 years. There is a definite downward trend, but a substantial majority still expects to see a cure.
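The adjustment described above can be sketched as an ordinary least squares fit of percent agreeing on the reciprocal of the time horizon, followed by shifting each observation to a common 50-year horizon. The (horizon, percent) pairs below are invented, not the actual poll results.

```python
# Invented (horizon_in_years, percent_agreeing) pairs standing in for
# the actual polls, which are not listed in the post.
polls = [(10, 60.0), (25, 75.0), (50, 88.0), (100, 92.0)]

# Ordinary least squares for: percent = a + b * (1 / horizon)
xs = [1 / h for h, _ in polls]
ys = [p for _, p in polls]
n = len(polls)
xbar = sum(xs) / n
ybar = sum(ys) / n
b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
     / sum((x - xbar) ** 2 for x in xs))
a = ybar - b * xbar

def adjusted_to_50_years(percent, horizon):
    """Shift an observed percent to the level the fit predicts for a
    50-year horizon, holding everything else constant."""
    return percent - b * (1 / horizon - 1 / 50)

print(f"slope on 1/horizon: {b:.1f}")
print(f"60% at a 10-year horizon adjusts to "
      f"{adjusted_to_50_years(60.0, 10):.1f}%")
```

Note that going from 10 to 20 years changes 1/horizon by .05, about the same as going from 50 years to infinity, which is what makes the reciprocal a natural scale here.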

Saturday, May 11, 2013

This is a departure from my usual topics, but it's something that I wanted to get off my chest. Last week, the New York Times had an article on the economist Larry Summers, which quoted him talking to the Harvard basketball team: "Then Summers paused and asked the assembled players a rhetorical question: Did they believe a shooter could get a 'hot hand' and go on a streak in which he made shot after shot after shot? All the players nodded uniformly. Summers paused again, relishing the moment. 'The answer is no,' he said. 'People apply patterns to random data.' A statistical analysis of player performance reveals that streaks are random events." That reminded me of something I've noticed before: many sophisticated observers seem to think that the "hot hand" has been definitively refuted, not just called into question. A few days later, the Times had an article on some research suggesting that hot hands do exist (at least in free-throw shooting and bowling). But even that referred to "the magisterial past research on hot hands."

In fact, the research has a serious flaw, at least as applied to things like field goal shooting in basketball. This is the assumption that "chance" can be represented by a model in which the probability of success in each trial is independent and identically distributed. For example, if a player is a 50% shooter, he has a probability of .5 of making every shot.

The probability of success is clearly not identically distributed: there are easy shots and hard shots. The paper that started the "hot hand" research (Gilovich, Vallone, and Tversky, "The Hot Hand in Basketball," Cognitive Science 1985) acknowledged this point but said that the model of an identical distribution was "indistinguishable" from a more realistic model in which "each player has an ensemble of shots that vary in difficulty . . . and each shot is randomly selected from that ensemble."

But in fact, random variation in the difficulty of shots will tend to hide any evidence of a "hot hand." For example, one way of trying to detect the hot hand is to look at the occurrence of streaks: are they more common than would be predicted by the model of an identical distribution? Suppose that someone takes two shots and has a .5 chance of making each one. If the chance of making the second is independent of success on the first, there is a .25 chance of two misses, and a .25 chance of two hits. Suppose that someone takes two shots, an easy one (.9 probability of success) and a difficult one (.1 probability). Then he has a .09 chance of making both and a .09 chance of missing both. That is, the chance of a "streak" is .5 if the shots don't vary in difficulty and .18 if they do, even though the average chance of success is the same in both cases. *
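The two-shot arithmetic in this paragraph can be checked directly (see the May 31 note at the end, which qualifies the inference drawn from it):

```python
def p_both_same(p1, p2):
    """Probability that two independent shots have the same outcome
    (both made or both missed), given their success probabilities."""
    return p1 * p2 + (1 - p1) * (1 - p2)

equal = p_both_same(0.5, 0.5)   # two shots of identical difficulty
varied = p_both_same(0.9, 0.1)  # one easy shot, one hard shot
print(equal, varied)
```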

Random variation in the difficulty of shots will also tend to drive any correlation between success in successive shots toward zero. I did a simulation in which a player had three states: "hot" (field goal percentage of about 75%), ordinary (.5), and "cold" (about 25%). There is a 90% chance that a player will be in the same state as he was on the last shot; otherwise, he randomly shifts to a new state with probabilities of 80% normal, 10% hot, and 10% cold. This degree of variation seems large enough to make a practical difference, but it turns out to be hard to detect if you allow for random variation in shot difficulty.

I assumed that shot difficulty followed a normal distribution with mean 0 and standard deviation of 1, and that the probability of making a shot was exp(x+s)/(1+exp(x+s)), where x is the difficulty variable and s is -1, 0, or 1 depending on whether a player is cold, normal, or hot. Then the correlation between success and success in the previous shot is only about .02; the correlation between success and the number of successes in the three previous shots is a little less than .04. It takes a sample of about 3,000 shots to have a 50% chance of getting a statistically significant association between success and the number of successes in the last three (around 10,000 to have a 50% chance of a statistically significant association between success in two successive shots).
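A minimal version of this simulation can be written in pure Python. The 200,000-shot run length and the random seed are my choices, not from the post; the lag-1 correlation it produces lands in the small neighborhood reported above.

```python
import math
import random

random.seed(0)  # arbitrary seed; any run gives a similar answer

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    sa = math.sqrt(sum((x - ma) ** 2 for x in a) / n)
    sb = math.sqrt(sum((x - mb) ** 2 for x in b) / n)
    return cov / (sa * sb)

def simulate(n_shots):
    """Three-state hot-hand model: return each shot's outcome (1 = make)."""
    effect = {"cold": -1.0, "normal": 0.0, "hot": 1.0}
    state = "normal"
    shots = []
    for _ in range(n_shots):
        # 90% chance of keeping the current state; otherwise redraw with
        # probabilities 80% normal, 10% hot, 10% cold.
        if random.random() >= 0.9:
            state = random.choices(["normal", "hot", "cold"],
                                   weights=[0.8, 0.1, 0.1])[0]
        x = random.gauss(0.0, 1.0)  # shot difficulty, N(0, 1)
        p = math.exp(x + effect[state]) / (1 + math.exp(x + effect[state]))
        shots.append(1 if random.random() < p else 0)
    return shots

shots = simulate(200_000)
r = pearson(shots[:-1], shots[1:])  # correlation with the previous shot
print(f"lag-1 correlation: {r:.3f}")
```

Even with a fairly dramatic hot/cold effect built in, the observable lag-1 correlation is tiny, which is the point about sample size: a season's worth of shots is far too few to detect it reliably.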

Of course, I have no particular reason to think that the difficulty of successive shots is normally distributed with a standard deviation of 1, but that's the general problem: results are sensitive to the assumptions we make about an unobserved variable. And this is assuming that the difficulty of successive shots is independent. Suppose there's a slight negative correlation between them (maybe a player who just missed a shot gets more selective). That would make it even harder to detect a hot hand.

The research questioning hot hands made a valid point: that people have a tendency to overinterpret data and think there's a reason for differences that are really just due to chance. But I think the conviction that hot (and cold) hands simply don't exist reflects social scientists' love for reaching counter-intuitive conclusions rather than a justified inference from the data.

*[Note of May 31] On further review, I realize that this paragraph is wrong: the probability of a streak of given length is not affected by random variation in the probability of success. So the strategy of comparing the length of actual streaks to the expected length under the assumption of independent and identically distributed events is a valid one. I stand by my general point about needing a lot of data to identify "hot hands."

Thursday, May 2, 2013

In June 1968, a Gallup Poll had the following question: "Some people are calling this country a 'sick society.' Do you agree or disagree with them?" 36% said they agreed, 58% disagreed, and 6% weren't sure. In December 1985, a Los Angeles Times poll asked the same question: 39% agreed and 55% disagreed, again with 6% not sure. This is a case where the absence of a difference is interesting. The 1968 poll was taken just a week after the assassination of Robert Kennedy and a few months after the assassination of Martin Luther King, with the Vietnam war at its height. In contrast, 1985 was "morning in America": Ronald Reagan had been re-elected in a landslide, the economy was doing well, and tensions between the United States and Soviet Union seemed to be easing.

The question in the LA Times survey followed a fairly long series of questions about AIDS, which may have led people to focus on the negative. But it still seems surprising.

About Me

I am a professor of sociology at the University of Connecticut, and editor of the journal Comparative Sociology. My book, Hypothesis Testing and Model Selection in the Social Sciences, was published by The Guilford Press in April 2016.