Tuesday, May 17, 2011

Winning causes payroll: study

Suppose you're a Martian who has just immigrated to North America, and you have no idea how baseball works. All you've got is a database full of statistics, and a black-box graph theory algorithm to try to figure out cause-and-effect relationships. What would you conclude?

In any case, after that, the author looks at how an increase in payroll or wins affects the future. He finds:

-- If you bump a team's payroll 10%, it wins an extra 2.5 games this year, but returns to normal afterwards.

-- If you bump a team's payroll 10%, payroll drops slowly from +10% to +2% over the next 10 years.

-- If you bump a team's winning percentage by 10%, it returns to baseline in subsequent years.

These are no big deal. However:

-- If you bump a team's winning percentage by 10%, it bumps the team's payroll by 10% immediately. Then payroll rises to +25% over the next three years, settling back down to +10% by year 10.

So, according to the algorithm, payroll doesn't seem to have long-lasting effects on winning. But winning appears to have long-lasting effects on payroll! Therefore:

"... while we found some evidence that winning affects payroll and payroll affects winning, the evidence suggests the effect of winning on payroll is the more direct, larger, and more lasting in magnitude one."

In summary: winning causes payroll.

That's what the black-box algorithm says. But, to any member of the Martian-American community who may be reading this, I would respectfully suggest: you're better off not putting too much faith in the results of this particular paper.

6 Comments:

As an NBA fan, "winning causes payroll" makes some sense. Teams feel pressured not to break up their recent "success". See how the Bucks, Hawks, & Grizzlies talk themselves into re-signing players to bloated contracts, hoping that they will grow into contenders.

Suppose a second Martian came to North America to verify the results of the first one. Having some time on her hands, she takes the unprecedented step of actually watching a baseball game. When, in the first innings, Ichiro gets a hit, she thinks that it makes a lot more sense to say that ABs cause singles, rather than that singles cause ABs as the black box claimed. But as the innings continues, she sees that singles can cause ABs as well: by extending the innings. Somewhere around the bottom of the third, she realises that there is no way the black box can work out these sorts of causal relationships based on annual data: you would need gamelogs and a super-black-box.

Since even Martians have not yet developed super-black-boxes, she simplifies the problems. She throws out all the performance variables. Instead, for each team-year, she only considers two variables:

- team payroll (standardised in some way, perhaps fraction of total payroll);
- team wins.

She then builds two predictive models:

- year N wins as a function of year N payroll (if you wanted to be careful, you'd use preseason payroll)
- year (N+1) payroll as a function of year N wins and year N payroll.

These are not definitive models. It could be that good managers can talk their bosses into high payrolls, and, over and above that, are good at player selection. Furthermore, there are different ways to change payroll. Increasing your payroll by acquiring free agents will have a different effect from giving your existing players a pay rise. Finally, all the box score variables we dropped might matter after all: perhaps wins due to "luck" affect payroll differently from wins due to improved performance. The black box, however, either ignores these problems or solves them in obviously wrong ways. It's better to directly model the relationships you care about.
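
As a rough sketch of what the two models above could look like: the comment doesn't specify a functional form, so this assumes plain linear least squares, and every team-year number below is invented purely for illustration. The names `solve`, `fit_ols`, and `seasons` are hypothetical helpers, not anything from the paper.

```python
# Sketch only: linear least squares is an assumption, and the data are made up.

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Clear this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, targets):
    """Return [intercept, b1, b2, ...] from the least-squares normal equations."""
    x = [[1.0] + list(r) for r in rows]  # prepend a constant term
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * t for row, t in zip(x, targets)) for i in range(k)]
    return solve(xtx, xty)


# (team, year) -> (payroll as a share of league total, wins); hypothetical values
seasons = {
    ("A", 1): (0.040, 90), ("A", 2): (0.045, 95), ("A", 3): (0.050, 88),
    ("B", 1): (0.030, 78), ("B", 2): (0.029, 81), ("B", 3): (0.031, 75),
    ("C", 1): (0.025, 70), ("C", 2): (0.024, 85), ("C", 3): (0.030, 80),
}

# Model 1: year N wins as a function of year N payroll.
m1 = fit_ols([(p,) for p, _ in seasons.values()],
             [w for _, w in seasons.values()])

# Model 2: year N+1 payroll as a function of year N wins and year N payroll.
pairs = [(seasons[(t, y)], seasons[(t, y + 1)])
         for (t, y) in seasons if (t, y + 1) in seasons]
m2 = fit_ols([(w, p) for (p, w), _ in pairs],
             [nxt[0] for _, nxt in pairs])

print("wins ~ payroll:", m1)
print("next payroll ~ wins + payroll:", m2)
```

In the second model, the coefficient on wins is the quantity of interest: it asks whether winning predicts next year's payroll once this year's payroll is already accounted for. With real data you would want many more team-years than this toy table has, and none of the caveats above go away.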

Winning -> Payroll might happen even in the absence of bad contracts: it might just be a natural result of having to pay one's young stars what they're worth after two or three years of the league minimum.