Wednesday, October 26, 2011

A team is overdue if the number of years since their last appearance is greater than the number of teams in their league. The table below lists all such teams as of now in no particular order. Once we get into 2012, the Indians will be 15 years from their last appearance.
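The overdue rule is simple enough to write as a one-line check. This is only a minimal sketch: the function name is made up for illustration, and the 14-team league size is the 2011 American League count, used here for the Indians example.

```python
def is_overdue(years_since_last_appearance: int, teams_in_league: int) -> bool:
    """True when the drought is longer than the number of teams in the league."""
    return years_since_last_appearance > teams_in_league


# The AL had 14 teams in 2011, so a 15-year drought crosses the line in 2012.
indians_2012 = is_overdue(15, 14)
print(indians_2012)
```

A drought exactly equal to the league size does not count, because the rule says "greater than."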

Some are very overdue. Some not only have not been making it to the World Series, but they have been doing poorly otherwise, too. This may seem like a lot of teams, but at the conclusion of the 1958 season, 11 of the 16 teams were overdue.

We have always had hapless teams. The Phillies went 35 years without a pennant before finally getting one in 1950 but then went another 30 years. The original Senators did not win one in their last 27 years in Washington. The St. Louis Browns went only once to the Series in over 50 years before moving to Baltimore. The White Sox went 40 years without one until 1959 and then went another 46 years.

It is probably hard to quantify "overdueness" over baseball's history to accurately assess how bad things are now. In the pre-1960 years, you could be overdue in just 8 years, so it may not have felt so bad to the fans. Now it means a longer time period. Right now, it would take 7 years to clear out the backlog of overdue teams, and it would require all of the hapless teams to get in over this stretch (which does not seem likely).

The following table shows the teams that are at least halfway to being overdue.

we get the Rangers having a 57% chance of winning any given game (this leaves out home field advantage).

The Rangers were even better in September, with a differential of .298 (.916 - .618). The Cards had .125 (.807 - .682).

The Rangers, however, have a negative differential in the playoffs so far of -.017 (.764 - .781) while the Cards have .096 (.793 - .699). Combining the September differential and the playoff differential in a weighted average by games gives the Rangers a differential of .208 and the Cards .116.
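The combination described above is an ordinary games-weighted average. Here is a rough sketch of it. The post does not say how many games went into each piece, so the game counts below are placeholders chosen only for illustration, and the function name is hypothetical.

```python
def weighted_differential(diff_a: float, games_a: int,
                          diff_b: float, games_b: int) -> float:
    """Average two OPS differentials, weighting each by its number of games."""
    total_games = games_a + games_b
    if total_games <= 0:
        raise ValueError("need at least one game")
    return (diff_a * games_a + diff_b * games_b) / total_games


# Game counts are placeholders (assumptions), not figures from the post.
SEPT_GAMES = 25      # hypothetical
PLAYOFF_GAMES = 10   # hypothetical
rangers = weighted_differential(0.298, SEPT_GAMES, -0.017, PLAYOFF_GAMES)
print(round(rangers, 3))
```

Because September has more games than the playoffs so far, the combined number stays much closer to the September differential.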

Sunday, October 16, 2011

In all playoff games, the AL teams averaged 4.87 runs per game. The AL runs per game this year was 4.46 during the regular season.

The NL average so far is 4.63 (not counting tonight's game with the Cards leading 11-6 in the bottom of the 7th). If it ends up with that score, the NL average will be 4.88. If the Brewers could win 12-11 and tomorrow's game ends up 1-0, the NL average will be 4.82. The NL runs per game this year was 4.13 during the regular season.

Only 10 of the 30 games so far had both teams scoring 4 or fewer runs. Only 5 games had both teams scoring 3 or fewer runs.

The table below shows his GDPs and GDP rate for each year of his career. 1967 was much lower than any other year. I also show his SO rate for each year (using PA - IBB - SH). Data from Baseball Reference. I thought maybe if his strikeout rate had been a lot higher that year it would account for the lower DP rate. But it does not look that way. The one thing Yaz did that year that he never did before was hit a lot of HRs, 44. His previous high had been 20. So maybe putting the ball in the air more helped. Baseball Reference does not have his FB/GB ratios for any year. But in other years when he hit 40+ HRs, the GDP rate was not so low (1969 & 1970). Maybe he was just faster that year.

How many runs did he save by hitting into fewer DPs? Averaging his rate over 1966 and 1968, I get about 9 DPs saved. What was it worth to not hit into a DP? Using Tangotiger's Run Expectancy Matrix, 1950-2010, my guess is that it would be worth between .122 and .198 runs in each case. Using the 1950-68 matrix, with a man on 1st and 2 outs, the run expectancy is .264. With 2 outs and none on, it is .066. So if there is a man on 1st and no outs and Yaz strikes out or beats the throw to first, you save .198 runs. Doing something similar with a man on 1st and 1 out gets us .122 runs saved.

The average of those two is .16 and over 9 DPs avoided, it is just 1.44 runs. That may not be much but when the Red Sox and Twins played the last game of the season tied for first, the lower DP total might have mattered.

If I used run values from the 1999 Big Bad Baseball Annual, a GDP was -.37 runs and other outs were worth -.09. So not hitting into a DP and making another kind of out saves .28 runs for a total of 2.52 runs over 9 DPs avoided.
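The two estimates of runs saved are just arithmetic on the numbers quoted above, and can be checked with a short sketch. The variable names are illustrative; the values are the ones given in the post.

```python
DPS_AVOIDED = 9

# Run-expectancy approach (values as quoted in the post).
saved_no_outs = 0.264 - 0.066   # man on 1st, no outs: about .198
saved_one_out = 0.122           # man on 1st, 1 out
avg_saved_per_dp = (saved_no_outs + saved_one_out) / 2
runs_saved_re = DPS_AVOIDED * avg_saved_per_dp

# Linear-weights approach: a GDP is worth -.37 runs, another kind of out -.09.
GDP_VALUE = -0.37
OTHER_OUT_VALUE = -0.09
saved_per_dp_lw = OTHER_OUT_VALUE - GDP_VALUE   # about .28
runs_saved_lw = DPS_AVOIDED * saved_per_dp_lw

print(f"run expectancy: {runs_saved_re:.2f} runs")
print(f"linear weights: {runs_saved_lw:.2f} runs")
```

Either way the answer is a small number of runs, somewhere between about one and a half and two and a half.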

Now maybe Yaz got hits instead of hitting into DPs. But the two other years when he hit over .320 he had about a 10% GDP rate, still much higher. So we can't simply say he hit better that year and that caused the lower GDP rate.

Friday, October 7, 2011

Oakland A's general manager Billy Beane and author Bill James are entrepreneurs who created a whole new way of running baseball teams based on statistics, and this creative spirit is starting to have an impact in the business world.

The Nobel Prize-winning physicist Richard Feynman said that "science is the belief in the ignorance of the experts." By challenging the experts in baseball, Beane and James were true scientists, asking questions and looking at data in new ways. Excerpts:

"JOSHUA MILBERG has plenty of business cred: an M.B.A. from Yale, experience in the mayor’s office in Chicago, a job as a vice president for an energy consulting firm. But all of that, Mr. Milberg says, matters less than his reputation as “the data guy” — someone who can offer insights through statistical analysis. And for that, he and a growing number of young executives can credit none other than “Moneyball: The Art of Winning an Unfair Game,” by Michael Lewis."

The book "...examines how the Oakland Athletics achieved an amazing winning streak while having the smallest player payroll in Major League Baseball. (Short answer: creative use of data.)

These managers are savvier with data and more welcomed in business circles in part because of the book."

"At its heart, of course, “Moneyball” isn’t about baseball. It’s not even about statistics. Rather, it’s about challenging conventional wisdom with data."

"This evangelism has created opportunities for the analytically minded."

"“Moneyball” traces Billy Beane’s use of unorthodox analytics to the work of Bill James. Working as a baseball outsider, Mr. James began self-publishing his analysis and commentary in 1977 and built a passionate following."

"Once people see the value of a batter’s O.P.S. — on-base plus slugging percentage, a key measure in the book — it’s a short step to applying similar principles in their own organizations."

"Generation Moneyball isn’t yet in charge. But as the Nobel laureate Max Planck once said, “A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.”"

Monday, October 3, 2011

This analysis is at the team level. Let's start with runs scored. The following regression equation shows the relationship between team runs per game and team OPS:

R/G = 13.02*OPS - 5.04

I used all teams from 1996-2009. For OBP I used (H + BB)/(AB + BB). The r-squared was .897 and the standard error was .1608. That works out to 26.05 runs per season.

Now when I used 1.8*OBP + SLG (call it adjusted OPS or ADJOPS) instead of OPS, the equation was

R/G = 10.26*ADJOPS - 5.68

The r-squared was .907 and the standard error was .153. That works out to 24.77 runs per season. So using 1.8*OBP + SLG is a bit better. The standard error is 5% lower. It is also 1.28 runs lower. That would be worth about a tenth of a win.
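To see how the two run equations behave, here is a small sketch that applies both to the same team line and converts the per-game standard errors to a season. The OBP and SLG values are a hypothetical team, not data from the regression, and the 162-game season is an assumption that matches the per-season figures above.

```python
GAMES_PER_SEASON = 162  # assumed schedule length


def rpg_from_ops(obp: float, slg: float) -> float:
    """Runs per game from plain OPS: R/G = 13.02*OPS - 5.04."""
    return 13.02 * (obp + slg) - 5.04


def rpg_from_adjops(obp: float, slg: float) -> float:
    """Runs per game from ADJOPS = 1.8*OBP + SLG: R/G = 10.26*ADJOPS - 5.68."""
    return 10.26 * (1.8 * obp + slg) - 5.68


def per_season(per_game_value: float) -> float:
    """Scale a per-game figure to a full season."""
    return per_game_value * GAMES_PER_SEASON


obp, slg = 0.340, 0.430  # hypothetical team, for illustration only
print(round(rpg_from_ops(obp, slg), 2), round(rpg_from_adjops(obp, slg), 2))
print(round(per_season(0.1608), 2), round(per_season(0.153), 2))
```

The season figure for the second standard error comes out a hair above 24.77 here because .153 is itself a rounded value.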

I have also used a team's OPS differential to predict winning percentage. That is, its hitting OPS - the OPS it allows its opponents. Here is the regression equation:

Pct = .5 + 1.26*OPSDIFF

The r-squared was .798 and the standard error was .0311. That works out to 5.04 wins per season. I used all teams from 1989-2002.

Now the same analysis with the differential using 1.8*OBP + SLG (call it the ADJOPS differential):

Pct = .5 + .986*ADJOPSDIFF

The r-squared was .815 and the standard error was .0298. That works out to 4.83 wins per season. So again, as in the analysis of runs scored, 1.8*OBP + SLG does just a bit better.
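The two winning-percentage equations can be put side by side the same way. This is a sketch under assumptions: the team and opponent lines are invented for illustration, the function names are hypothetical, and 162 games is assumed when turning a standard error into wins.

```python
GAMES_PER_SEASON = 162  # assumed schedule length


def pct_from_ops_diff(ops_for: float, ops_against: float) -> float:
    """Pct = .5 + 1.26*OPSDIFF."""
    return 0.5 + 1.26 * (ops_for - ops_against)


def pct_from_adjops_diff(obp_for: float, slg_for: float,
                         obp_against: float, slg_against: float) -> float:
    """Pct = .5 + .986*ADJOPSDIFF, with ADJOPS = 1.8*OBP + SLG."""
    adj_for = 1.8 * obp_for + slg_for
    adj_against = 1.8 * obp_against + slg_against
    return 0.5 + 0.986 * (adj_for - adj_against)


# Hypothetical team: .340/.430 at the plate, .325/.410 allowed.
pct_ops = pct_from_ops_diff(0.340 + 0.430, 0.325 + 0.410)
pct_adj = pct_from_adjops_diff(0.340, 0.430, 0.325, 0.410)
print(round(pct_ops, 3), round(pct_adj, 3))

# Standard errors converted to wins per season.
print(round(0.0311 * GAMES_PER_SEASON, 2), round(0.0298 * GAMES_PER_SEASON, 2))
```

For a typical team the two equations give nearly the same answer, which fits the small gap between the two standard errors.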

One reason why 1.8*OBP + SLG does only slightly better is probably that the range of OBP and SLG across teams is not that great. For individual players, the range is much wider, so it might make more sense to use 1.8*OBP + SLG instead of OPS in those cases.

Saturday, October 1, 2011

"The Diamondbacks might not have the Brewers’ marquee names. But the numbers, and the results, show two teams that are surprisingly similar going into Saturday’s Game 1 of the NL division series."

This sounds like an interesting question, yet the article presents very few numbers, and none that statistically compare the two teams. It turns out that, by luck, the Diamondbacks overachieved with runners on base, a trend that no team can keep up.

It looks like the Brewers are much better. Their hitters had an OPS of .750 while their pitchers allowed a .689. That gives them a differential of .061.

The Diamondbacks hitters had an OPS of .736 while their pitchers allowed a .725 OPS. That gives them a differential of .011, well below the Brewers'.

Based on some regression analysis I have done, we can project winning pct with the following equation:

Pct = .5 + 1.3*OPSDIFF

This gives the Brewers a pct of .579 or 93.85 wins. The Diamondbacks get .514 or about 83.32 wins. This seems like a very big difference.
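The projection is easy to reproduce from the OPS figures above. A minimal sketch, assuming a 162-game season; the function and variable names are illustrative.

```python
GAMES_PER_SEASON = 162  # assumed schedule length


def projected_pct(ops_for: float, ops_against: float, slope: float = 1.3) -> float:
    """Pct = .5 + 1.3*OPSDIFF, the equation used in this post."""
    return 0.5 + slope * (ops_for - ops_against)


brewers_pct = projected_pct(0.750, 0.689)
dbacks_pct = projected_pct(0.736, 0.725)

brewers_wins = brewers_pct * GAMES_PER_SEASON
dbacks_wins = dbacks_pct * GAMES_PER_SEASON

print(f"Brewers:      {brewers_pct:.3f}, {brewers_wins:.2f} wins")
print(f"Diamondbacks: {dbacks_pct:.3f}, {dbacks_wins:.2f} wins")
```

The gap between the two projections is a little over 10 wins.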

Yet the Brewers actually won only 2 more games (96 vs. 94). And the Brewers' run differential (721 - 638 = 83) is only slightly higher than the Diamondbacks' run differential of 69 (731 - 662).

What allowed the Diamondbacks to match the Brewers, at least on the surface? Some very good luck. They overachieved with runners on base. The Diamondbacks hitters had an OPS of .770 with runners on base (ROB) while their pitchers allowed .731. That gives them a differential of .039.

The Brewer hitters had an OPS with ROB of .745 while their pitchers allowed .719 for a differential of .026.

This seems to be something in favor of the Diamondbacks, but it really isn't.

Doing better with runners on base will improve your chances of winning. You will score more runs than expected and give up fewer runs than expected. Yet, in the long run, individual players and pitchers (and teams) end up performing about the same in clutch situations (like ROB) as they do overall. Given a large enough number of games, even the best clutch hitters and pitchers do just a little bit better in clutch situations than they normally do.

For the Diamondbacks to even look like they are as good as the Brewers, they had to be lucky. We cannot expect them to continue to overachieve so much with runners on base.