One of the important questions for team management is how to put fans in the stands. Obviously winning ballgames helps, but how much, and in what way? Using data going back to 1950, I set out to create a model to help answer this question.

Attendance has varied wildly from baseball's inception. In 1950, the St. Louis Browns drew just 3,300 fans per game. Meanwhile, teams now routinely draw more than 10 times that amount. Clearly the shape of this data is not going to be linear. To deal with this, I transformed the data by taking the log of the per game attendance and used that to build my models.
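
As a rough illustration of why the log scale helps (a sketch only, not the code used for the model), the snippet below compares the Browns figure quoted above with a hypothetical team drawing ten times as many fans. The variable names and the 10% bump are assumptions chosen for the example.

```python
import math

browns_1950 = 3_300                 # per-game attendance quoted above
modern_team = 10 * browns_1950      # hypothetical team drawing ten times as many

# The same 10% bump is a very different number of fans for each team...
raw_gain_browns = browns_1950 * 0.10
raw_gain_modern = modern_team * 0.10

# ...but on the log scale it is the same additive shift for both.
log_gain_browns = math.log(browns_1950 * 1.10) - math.log(browns_1950)
log_gain_modern = math.log(modern_team * 1.10) - math.log(modern_team)

print(raw_gain_browns, raw_gain_modern)
print(round(log_gain_browns, 4), round(log_gain_modern, 4))
```

Modeling log attendance therefore treats effects as percentage changes, which is why the fan counts quoted in the rest of the article are not strictly linear.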

There were several things I wanted to test out. For one, what is the relationship between attendance and WPCT in a particular year? What relationship is there for the previous year? Did it matter if the team made the playoffs? How about if they made the playoffs the year before? How about if they had recently won the World Series? How much did a new park affect attendance, and how long did it take before this effect wore off? Likewise with a team that just moved to a new city?

These were all questions I wanted to answer. I decided to get a little more rigorous than usual and create a mixed model to address some of the assumptions that might be violated by a plain old general linear model. Obviously attendance varies by team as well as by the factors above. However, to get at the questions above we don't really care exactly what the effect sizes are for each team. The mixed model allows us to model this "random effect" of different attendance baselines by team. The mixed model also accounts for the fact that the errors are likely to be correlated from year to year within a team. By accounting for this autocorrelation, we can get a better model. These changes have the advantage of more properly estimating the variance and standard errors of each of the variables that we care about.
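
For readers who want to see the structure being described, here is a minimal Python sketch that simulates data with a team-level random intercept and year-to-year correlated errors. It is not the fitted model. The baseline of 24,500 comes from the article. The WPCT slope and team standard deviation are rough values backed out of figures quoted later in the article. The error correlation (RHO), the residual spread (NOISE_SD), and the function name are purely hypothetical.

```python
import math
import random

BASELINE = 24_500      # average .500 team, from the article
WPCT_SLOPE = 1.99      # assumed: backed out of the article's .400/.600 figures
TEAM_SD = 0.16         # assumed: log-scale SD of the team random intercept
RHO = 0.5              # hypothetical year-to-year error correlation
NOISE_SD = 0.10        # hypothetical residual SD on the log scale


def simulate_team(wpcts, rng):
    """Simulate per-game attendance for one team over consecutive seasons."""
    team_effect = rng.gauss(0.0, TEAM_SD)   # random intercept, drawn once per team
    error = rng.gauss(0.0, NOISE_SD)        # start the AR(1) error at its stationary SD
    innovation_sd = NOISE_SD * math.sqrt(1.0 - RHO ** 2)
    attendance = []
    for wpct in wpcts:
        log_att = math.log(BASELINE) + WPCT_SLOPE * (wpct - 0.5) + team_effect + error
        attendance.append(math.exp(log_att))
        # Next year's error carries over a fraction RHO of this year's error.
        error = RHO * error + rng.gauss(0.0, innovation_sd)
    return attendance


rng = random.Random(0)
seasons = simulate_team([0.450, 0.500, 0.550, 0.600], rng)
print([round(a) for a in seasons])
```

The point of the sketch is that two seasons from the same team share both the team draw and part of the previous year's error, so treating them as independent observations would understate the standard errors.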

So enough with the stats, what was important?

Winning Games

For one, not surprisingly, team winning percentage is very important. The average .500, non-playoff team that does not have a new park or any other advantages draws about 24,500 fans per game. Every extra game won adds about 300 fans per game. Of course, the relationship is not linear, but that's a reasonable approximation. All else being equal, a .400 team will draw about 20,100 fans, while a .600 team will draw 29,900 - a difference of about 10,000 fans per game. Obviously, winning teams draw more fans and the effect is quite large.
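
The "300 fans per win" figure can be checked against the other numbers in this section. The sketch below backs out the implied slope on the log scale from the .400 and .600 figures and converts it to fans per extra win for a .500 team. The 162-game schedule is an assumption (earlier seasons in the data were shorter), and the variable names are illustrative.

```python
import math

att_400, att_500, att_600 = 20_100, 24_500, 29_900   # figures quoted in the article

# Implied change in log(attendance) per unit of WPCT.
slope = (math.log(att_600) - math.log(att_400)) / (0.600 - 0.400)

GAMES = 162  # assumption: modern schedule length

# One extra win moves WPCT by 1/GAMES; convert that to a percentage change.
per_win_pct = math.exp(slope / GAMES) - 1
fans_per_win = att_500 * per_win_pct

print(round(slope, 2), round(fans_per_win))
```

Under these assumptions the result lands right around 300 fans per game for each additional win, in line with the figure quoted above.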

But, that's not even the half of it. As you might expect, the team WPCT from the year before also has a very large effect. This effect is not as large, but a .500 team who was a .400 team the year before draws 22,700, while a .500 team who was a .600 team the year before draws 26,400. This "year before" effect makes sense. At the beginning of the season, fans don't really have an idea if their team will be good, so it makes sense that they use last year's performance as a guide. The previous season's success draws fans back to the park, even if that success isn't repeated the following year.

The effect of winning continues up to two seasons later. As you would expect, however, the effect is smaller. Having a .600 team three seasons ago only puts about an extra 500-600 fans in the seats per game. However, this effect is statistically significant.

The chart below shows how the modeled effect of a team's WPCT is diminished as time goes on. The chart assumes that the team went .500 in all other years and that the team did not make the playoffs.

While WPCT obviously helped attendance overall, I was also curious to see whether the slope of the lines changed depending on whether the team was over or under .500. Did an extra win provide different attendance value to a .450 team vs. a .550 team? I refit the model to test this out and found there was no significant difference. The relationship between WPCT and attendance was the same whether the team was good or bad. While of course an extra few wins won't help a poor team get any closer to a championship, they will help at the box office, and the attendance effect of those wins is just as important to poor teams as to good teams. Next time you deride a bad team for "wasting" its money by signing a free agent when it has no chance of winning anything, realize that the difference between stinky and mediocre can have a strong effect at the box office, even if it won't win any flags.

Making the Playoffs

Of course, that's without accounting for the attendance draw of being a playoff team. How does making the playoffs affect attendance? In general, a .550 team that did not make the playoffs can expect to draw 27,000, while a .550 team that did make the playoffs can expect to draw an extra 1,800 fans. Of course, this isn't an exact correlation. Whether a team makes the playoffs on the last day of the season has little effect on its average attendance for the year. However, the playoff effect is a proxy for the excitement surrounding the team being involved in a pennant race, and that's why it's included in the model.

How about the effect of making the playoffs the year before? As you might think, that has an even greater effect. Making the playoffs the year before raises a .500 team's attendance by about 3,000 fans per game - a major boost. Obviously making the playoffs raises hype around the team, and this appears to manifest itself in the form of increased attendance.

There are also significant interaction effects surrounding making the playoffs. Here we see evidence of significant diminishing returns for playoff appearances. If you made the playoffs last year, making the playoffs this year won't create the same excitement as it might have had the team been experiencing success for the first time. The graph below shows expected attendance depending on when teams made the playoffs.

The first four bars on the graph above give the expected results. The effect of having made the playoffs diminishes over time. However, when teams make the playoffs multiple times in a short span, the results can be counterintuitive. The biggest oddity is that among teams who made the playoffs last year, attendance is higher for teams who miss rather than make the playoffs the following year. To me this doesn't make a lot of sense. The only explanation I can come up with is that a team coasting to a second straight playoff appearance might draw less because its fans will think "I'll watch 'em in the postseason", while fans may be more apt to come support a contender who fights for, but ultimately fails to earn, a postseason spot. However, this explanation seems like a reach.

While I'm skeptical that making the playoffs can actually ever be a hindrance to attendance, the data definitely support the conclusion that multiple playoff appearances don't help attendance much more than just one playoff appearance. Simply put, making the playoffs isn't such a big deal after it's been done already. The data are an indication that fans can become bored with persistent winners (are you listening Braves?) and that excitement reaches a kind of maximum level which can't be exceeded no matter how successful the team has been in recent years. Of course, winning multiple years in a row is still better than never making the playoffs at all, as the chart above shows.

Winning the World Series

Winning the World Series has an effect, but it's not as strong as you might think. A .500 team which finished with a .600 WPCT last year and made the playoffs is expected to draw 28,100 fans. If that team also won the World Series last year, the expected attendance increases to 29,700. This increase of about 1,600 fans is significant, but it won't make or break the franchise. Additionally, the World Series effect lasts only for the year directly following the World Series victory. While winning a championship is every team's ultimate goal, it appears that making the playoffs has a stronger attendance effect than actually winning it all. As for appearing in the World Series, but not winning it, no statistically significant effect was found. Likewise for advancing beyond the first round of the playoffs.

A New Ballpark

None of these effects, however, is as large as the new park effect. A look at its effect on attendance shows why every team has been clamoring for a new stadium. A regular, .500, non-playoff team usually draws 24,500. When the same team gets a new stadium, its expected attendance increases to 33,600. That's an increase of more than 9,000 fans per game! It's also a gift that keeps giving. The new park effect remains statistically significant for the next 10 years. The graph below shows the new park's effect on attendance. Adding all of the expected attendance increases over 10 years shows that the new park boosts attendance by about 4.5 million visitors. No wonder Selig and company have been so obsessed with building new ballparks.
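
To put the new park numbers on a common footing, the sketch below works through the arithmetic using only the figures quoted above. The 81 home games per season and the 10-season window are assumptions made for the example, and the variable names are illustrative.

```python
base_per_game = 24_500        # regular .500, non-playoff team (from the article)
new_park_per_game = 33_600    # same team in its first year in a new park
ten_year_total = 4_500_000    # cumulative boost quoted in the article

HOME_GAMES = 81               # assumption: modern 162-game schedule
YEARS = 10                    # assumption: the boost is summed over 10 seasons

first_year_per_game = new_park_per_game - base_per_game
first_year_pct = new_park_per_game / base_per_game - 1
first_year_total = first_year_per_game * HOME_GAMES
avg_per_game_over_decade = ten_year_total / (YEARS * HOME_GAMES)

print(first_year_per_game, round(first_year_pct * 100, 1))
print(first_year_total, round(avg_per_game_over_decade))
```

Under those assumptions the first season accounts for roughly 737,000 of the 4.5 million extra visitors, and the average boost across the decade is around 5,600 fans per game, which fits an effect that starts large and fades.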

One other caveat is that the effect of winning is diminished during the first year of a new ballpark. Fans are apt to come out to see the park whether the team is good or bad, and the effect of winning (or losing) games is only about half as important as usual.

A New Team

The effect is even larger for a new expansion team (or a team moving to a new city). A .500 expansion team can expect to draw over 10,000 more fans per game than another .500 team would. However, the expansion effect is shorter-lived than the new park effect, lasting about four or five years before the novelty wears off. The chart below shows the expansion effects.

The Team Brand

Another important factor is the team "brand". The mixed model treated the team itself as a random effect. In essence, it assumed that there may be differences between teams' inherent abilities to draw fans. For instance, the Dodgers and Pirates may draw very different crowds even if the product on the field is the same. The model assumed that this core ability to draw crowds was a random, normally distributed variable. One of the tests the model performed was whether or not there really was a difference in inherent ability to draw fans between teams.

As you would expect, the model did indeed identify that this was the case. This inherent difference could be due to factors such as the team history, the city itself, the type of fans it has, the market size, and how well run the team is.

According to the model, the average team playing .500 ball, not making the playoffs, etc., would draw 24,500 fans. However, some teams have an inherent ability to draw better than others. Teams that are one standard deviation better in this regard will have a "base" attendance level of 28,700. Teams two standard deviations above the norm have a base drawing level of 33,800. On the other end of the spectrum, teams one SD below the norm draw 20,700, while teams two SDs below the norm draw 17,500. As you can see, this inherent difference in team brand can have a major effect on attendance. Even when all other factors are equal, the actual team playing makes a big difference. Next week I'll be delving into specific teams and whether they are over- or underperforming with regard to fan attendance.
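
Because the model works on log attendance, the brand figures above can be used to back out a rough size for one standard deviation of the team effect. The sketch below does that from the four quoted attendance levels. It is a back-of-the-envelope check with illustrative names, not the model's actual variance estimate.

```python
import math

base = 24_500
# Base attendance at +/-1 and +/-2 standard deviations, as quoted in the article.
quoted = {+2: 33_800, +1: 28_700, -1: 20_700, -2: 17_500}

# Each figure implies a log-scale SD of log(attendance / base) / (number of SDs).
implied_sd = {k: math.log(v / base) / k for k, v in quoted.items()}
mean_sd = sum(implied_sd.values()) / len(implied_sd)

for k in sorted(implied_sd):
    print(k, round(implied_sd[k], 3))
print(round(mean_sd, 3))
```

The four figures imply slightly different values, presumably because of rounding in the quoted attendance levels, but all of them fall close to 0.16 on the log scale.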

Another team-specific factor we might be interested in is whether the effects of winning differ by team. By making team WPCT a random effect rather than a fixed effect, we can test this out. Do some fans respond to winning better than others? In Chicago, it's thought that Cubs fans come out no matter how the team is playing, while White Sox fans are more apt to support the team in proportion to its success. Do we really find this kind of varying effect of team success? The model finds no such thing. Each team's fans show about the same "fair-weatheredness", and each city responds to winning and losing in about the same way. How about when we do this same test for the "playoff" variable? Again, no effect. Each team's fans seem to respond to making the playoffs in about the same way.

Conclusions

The conclusions have significant implications for team-building strategies as well as for what teams should expect at the gate. While it's every team's goal to win a World Series, teams must turn a profit and keep their fans happy as well. Attendance certainly drives the bottom line and is also a key measure of fan happiness. Hopefully this model sheds some light on what the main drivers of attendance happen to be. The full model is at the link below. If there's interest, hopefully I can publish a little calculator that can be used to predict attendance based on the key variables.

Comments

Nice work. For the new park graph, it might be interesting to also see the last year(s) in the old park. This would show whether attendance ultimately reverts to the old park's levels (all else being equal).

Posted by: schmenkman at March 2, 2010 11:21 AM

A good start, but there is a large hole in your analysis, so large as to make the entire analysis suspect. What about ticket prices? You are looking at what may drive demand for tickets without the most obvious component. It's as if you are analyzing what makes people buy different models of cars without factoring in the price of each car. Sure, everyone would drive a Lexus if they could, but some make do with a Corolla since they must.

Consider the effect of a team which has made the playoffs multiple times. What happens? Generally, the stadium may reach its theoretical maximum capacity each game (Boston is an example) and therefore can and will raise ticket prices. True, their attendance may suffer slightly, but their overall revenues have increased significantly due to winning more games.

If you factor in price increases and decreases, I suspect you will find a clearer relationship between winning, winning in the playoffs, winning consecutive seasons, and so on than you do in the above analysis.

Posted by: Dan at March 2, 2010 12:41 PM

This is great. Thanks. The one other thing I've always wondered is whether teams have a market-specific attendance floor, a season total which they're nearly guaranteed. Obviously, teams have an attendance ceiling based on stadium capacity, but, for example, the Pirates' poor record over the last 17 years suggests their attendance isn't likely to fall below about 19,000 fans per game.

Posted by: NaOH at March 2, 2010 1:04 PM

Schmenkman, After 10 years, it does revert back to the same levels as a non-new park. I guess what you're asking is whether attendance is down at really old parks that are about to be replaced? I'm not really sure - I'll have to look into that!

Dan, You make a point that ticket prices have an effect as well. However, there's been some other research I believe showing that prices have little effect on attendance. For most of history teams have made prices affordable so if you wanted to go to the park you could. That, and the fact that teams didn't change their prices a whole lot.

As for teams that sell out, there's only been a handful of those, and they were not included in the analysis dataset. I think you are right that in a perfect world that controlled for ticket prices, we would see a somewhat stronger effect, but I think it would be very slight.

Posted by: Sky Andrecheck at March 2, 2010 1:13 PM

NaOH, I think you are correct. An average team (with no new park) that plays .380 ball for three consecutive seasons can expect to draw about 17,000. Now, if the team "brand" is also in bad shape - bad city, bad ownership, etc - then that drops down to about 12,000.

Posted by: Sky Andrecheck at March 2, 2010 1:19 PM

Sky, you may be right; however, the idea that ticket prices have little effect on ticket demand runs contrary to everything I've ever learned about economics. It might be you would have to factor in the price of parking, concessions, etc., to get a true average price per patron, but I cannot imagine that total average price has no real bearing on the demand for tickets. Otherwise every park would charge $1,000 per ticket!

Management will make ticket prices affordable - affordable enough to maximize their revenues! If that means making them $20 per ticket, they will do that. If that means making them $100 per ticket, they will do that. If that means making them $1,000 per ticket, they will call themselves the New York Yankees. :)

Posted by: Dan at March 2, 2010 2:30 PM

Dan, you are definitely right that prices can have an impact, especially if you take it to the extremes. However, teams historically have not changed their prices to adjust for demand (though they are starting to now). If the team is good or the team stinks, the price is the same. Because of this, it's not going to confound the analysis very much.

Posted by: Sky Andrecheck at March 2, 2010 2:52 PM

A couple of comments.

First, the analysis is interesting, and the results are more-or-less consistent with a lot of published attendance studies. But it's not clear whether your analysis is a multiple regression analysis or a series of bivariate analyses. The results can change as you include more explanatory variables, for any number of reasons. (For example, multicollinearity between the explanatory variables.)

Second, the "ticket price effect" is complicated, because determining an appropriate ticket price is not straightforward. There are, essentially, two approaches. The first takes annual ticket revenue and divides it by annual attendance (defined as tickets sold, not people who actually show up). The second constructs a weighted average ticket price, where the weights are the percentage of seats available at a particular price, regardless of whether they sell or not. These will almost certainly yield different "ticket prices." I believe the second approach is theoretically preferable, but there's a lot of disagreement about this.

The first approach (ticket price as average ticket revenue per attendance), for example, is likely to yield "ticket prices" that decline as attendance rises, if fans tend to buy the best available seats first. So the declining-ticket-price-is-associated-with-higher-attendance finding is NOT lower-ticket-prices-lead-to-higher-attendance, BUT rather, higher-attendance-leads-to-lower-priced-tickets-being-purchased...the causation is from attendance to ticket prices, not from ticket prices to attendance.

The other ticket price weirdness is that a lot of published studies find that higher attendance is associated with higher, not lower, ticket prices...

Posted by: Donald A. Coffin at March 2, 2010 4:08 PM

Sky,

What about the attendance of teams with small market payrolls? Yes, most of those teams don't win as often, but there are examples of teams with small market payrolls who have won in the past (Oakland, Minnesota, Tampa, Florida) but still struggle to put butts in seats.

I think one of the Texas Rangers' biggest problems over the last 5-6 years has been an owner who put the number 5 media market team on a bottom 10 payroll.

The '08 team was over .500 in early August and on the fringe of contending for the Wild Card but that season saw the worst attendance the Rangers had seen in 20 years.

Thoughts?

Posted by: Josey Wales at March 2, 2010 6:28 PM

Obviously ticket prices matter. Ultimately, it seems that the most important issue to owners is revenue, not attendance. I think running the study based on gross ticket revenue (presumably parking and concession information is much harder to come by) would yield far more interesting results and be more informative as to whether investing in improved players has an adequate return.

Posted by: Doug B. at March 3, 2010 12:33 PM

Donald, it is indeed just one multi-variate analysis - sorry if there was any confusion.

Josey, teams vary with respect to their inherent ability to draw fans. Some small market teams are undoubtedly on the short end of that stick. As for the Rangers, it's a mystery why the Rangers' attendance went down - my model doesn't explain everything unfortunately, but your hypothesis may be correct.

Doug, the fact is that individual teams historically haven't changed the price of their tickets significantly based on demand. Because this is the case, the # of seats sold is a good estimator of demand. I agree that using your approach would be cool, although I'm not sure where I can get that data. Also, I just read your article from a few years back - very nice.

Posted by: Sky Andrecheck at March 3, 2010 12:58 PM

Thanks for the shout out on the article. (Trading Economics 101). Been waiting to find time to write something new.

I looked and couldn't find information on gate receipts, which seems like information that would be in the public domain. Absent that, attendance is the next best proxy.

Posted by: Doug B. at March 4, 2010 12:33 PM

Sky, may I ask what stat package you used for the mixed model?

Posted by: Dave at March 9, 2010 9:52 PM

Sure - I used Proc MIXED in SAS.

Posted by: Sky Andrecheck at March 9, 2010 10:19 PM

I had looked at how attendance appeared to be affected by the relocation of franchises a few years ago, in response to a suggestion that northern New Jersey could host a third MLB team in the NY metro area:

http://www.boyofsummer.net/2004/03/i-got-ya-expos-right-here.html

I saw a high correlation with winning percentage as well, and that any effect of relocation, either into or out of a market, had a half life of about two or three years at most.