Monday, July 27, 2009

Disclaimer: This post doesn't really have any direction; it wanders around to no real end. It also seemed a lot more interesting when it was in my head than it did after it was on the screen.

As you know, between 1900 and 1960, each major league without exception consisted of eight teams, with the team holding the best regular season record taking the pennant. The expression "first division", still in limited but diluted use today, referred to the top four teams in the circuit. Over the course of those six decades, there was one league-season that clearly stood out from the others in terms of the gap between the first division and the second division.

If one endeavors to quantify the amount of balance in a league's standings, there are a number of different possible approaches. The statistically-minded among us might immediately think about measuring the standard deviation of team winning percentage in each league-season, for instance.
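As a minimal sketch of that calculation, the snippet below computes the standard deviation of team W% for one league-season. The records are the 1950 AL final standings, supplied here for illustration rather than copied from a table in this post. The team labels and function names are made up for the example, and using the population standard deviation rather than the sample version is an assumption.

```python
from statistics import pstdev

# 1950 AL final records as (wins, losses). Illustrative input supplied for
# this sketch; the team labels are arbitrary.
AL_1950 = {
    "NY": (98, 56),
    "DET": (95, 59),
    "BOS": (94, 60),
    "CLE": (92, 62),
    "WAS": (67, 87),
    "CHI": (60, 94),
    "STL": (58, 96),
    "PHI": (52, 102),
}


def win_pct(wins, losses):
    """Winning percentage, ignoring ties."""
    return wins / (wins + losses)


def wpct_std_dev(standings):
    """Population standard deviation of team W% (assumed variant)."""
    return pstdev([win_pct(w, l) for w, l in standings.values()])


if __name__ == "__main__":
    print(f"SD of W%: {wpct_std_dev(AL_1950):.3f}")
```

Swapping `pstdev` for `statistics.stdev` would give the sample version, which changes the values slightly but should not change how league-seasons rank against one another when each has eight teams.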

But what about measures that would specifically capture the discrepancy in quality between the first division and the second division? Of course there are a number of different routes that one could take, but among the most obvious simple approaches are:

1. the fifth place team's games behind the pennant winner--this will tell you how large the gap was between the top of the first division and the top of the second division. I'll call this GB(5).

2. the winning percentage of the fourth place team--this is the target you're shooting for if you want to be a first division team. I'll call it W%(4).

3. the winning percentage of the fifth place team--this tells you where the second division begins in terms of wins. I'll call it W%(5).

4. the fifth place team's games behind the fourth place team--this is the direct gap between the first and second divisions. I'll call this GB(4-5).

5. the aggregate winning percentage of the first division--or of the second division, but it doesn't matter, as these two mathematically must be complements. I'll call this FD%.

These five indicators are all related to the quality gap between the first and second divisions, but come at it from slightly different perspectives. The first four approaches all ignore the entirety of the second division except for the fifth place team, but by doing so they establish the boundary between the two divisions. The last combines each group, but does nothing to temper the influence of outliers (on the high or low ends).
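The five measures listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, teams are ranked by W% with no special handling of ties in the standings, games behind uses the usual average of the win gap and the loss gap, and the 1950 AL records are supplied as sample input rather than taken from a table in this post.

```python
def games_behind(leader, trailer):
    """Games behind: the average of the win gap and the loss gap."""
    (lead_w, lead_l), (trail_w, trail_l) = leader, trailer
    return ((lead_w - trail_w) + (trail_l - lead_l)) / 2


def division_measures(records):
    """Return the five measures for one eight-team league-season.

    records: iterable of (wins, losses) tuples in any order.
    """
    # Rank teams from best to worst W%; ties are left in input order.
    ranked = sorted(records, key=lambda r: r[0] / (r[0] + r[1]), reverse=True)
    first, fourth, fifth = ranked[0], ranked[3], ranked[4]
    fd_wins = sum(w for w, _ in ranked[:4])
    fd_losses = sum(l for _, l in ranked[:4])
    return {
        "GB(5)": games_behind(first, fifth),
        "W%(4)": fourth[0] / sum(fourth),
        "W%(5)": fifth[0] / sum(fifth),
        "GB(4-5)": games_behind(fourth, fifth),
        "FD%": fd_wins / (fd_wins + fd_losses),
    }


# 1950 AL final records as (wins, losses); illustrative input for this sketch.
AL_1950 = [(98, 56), (95, 59), (94, 60), (92, 62),
           (67, 87), (60, 94), (58, 96), (52, 102)]

if __name__ == "__main__":
    for name, value in division_measures(AL_1950).items():
        print(f"{name}: {value:.3f}")
```

With these records the sketch gives a GB(5) of 31 games and a GB(4-5) of 25 games, and the W%(4) it reports is Cleveland's 92-62 mark.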

You could of course come up with other measures, but this is not intended to be a rigorous statistical examination--I just want to be able to establish that the league-season in question was somewhat unusual, and these rudimentary measures are sufficient for that purpose.

This is not a trivia post--if you want to guess, do it now, because the league in question is the 1950 AL. Here are the standings:

The imbalance, perfectly divided into two groups, jumps right off the page. The Yankees captured their second straight pennant by three games over the Tigers, with the Red Sox and Indians also in the hunt. But the second division lagged far behind, with the Browns and A's each losing the equivalent of at least 100 games in a 162-game schedule.

You certainly don't need to formally look at any data to know that these are odd standings. Oddities like this are what can make flipping through a baseball encyclopedia so rewarding for those who dabble in statistics. There is a silly but tangible sense of discovery when you find some unusual statistical line or set of standings that you had not been previously aware of.

Anyway, this is a sabermetric blog, so you're going to be stuck with some pseudo-analysis rather than just a "gee whiz!". First let's look at the standard deviation of W%, which as mentioned above speaks to the balance of wins in the league but not specifically to the chasm between the first and second divisions. Still, out of the 123 major league-seasons in the 1900-1960 period, the 1950 AL ranks 14th in standard deviation of W%:

Many of the highest standard deviations occurred in the first twenty years of the period, so the 1950 AL ranks second in the post-war, pre-expansion era. Even so, it doesn't stand out as anything remarkable in this respect, as just four years later the standard deviation in the junior circuit would be greater. Here are the 1954 standings:

In this case, the chasm was between third and fourth place rather than fourth and fifth, and the standard deviation was high largely due to the Indians' rampage coupled with a strong effort from the Yankees (their 103 wins would have won the AL pennant in any other season from 1947 to 1960).

Moving on to the measures that specifically address the difference between the first and second division, we first have GB(5), which tells us how far the top of the second division finished from the pennant. The 1950 AL was well above average but not remarkable in this regard, ranking in a tie for sixteenth-highest with the 1934 AL:

Of course GB(5) is strongly related to the performance of the pennant winner. You can see from the table that many of the league-seasons featured the great teams of the period...the 1906-07 Cubs, 1954 Indians, 1927 Yankees, 1931 A's, and the like. So I also looked at GB(5) divided by wins of the pennant winner, which bumps the 1950 AL up to thirteenth place.
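The adjustment described above is a single division, sketched here under the same kind of assumptions as before: the function names are hypothetical, teams are ranked by W%, and the 1950 AL records are supplied as sample input rather than taken from a table in this post.

```python
def games_behind(leader, trailer):
    """Games behind: the average of the win gap and the loss gap."""
    (lead_w, lead_l), (trail_w, trail_l) = leader, trailer
    return ((lead_w - trail_w) + (trail_l - lead_l)) / 2


def scaled_gb5(records):
    """GB(5) divided by the pennant winner's win total."""
    ranked = sorted(records, key=lambda r: r[0] / (r[0] + r[1]), reverse=True)
    first, fifth = ranked[0], ranked[4]
    return games_behind(first, fifth) / first[0]


# 1950 AL final records as (wins, losses); illustrative input for this sketch.
AL_1950 = [(98, 56), (95, 59), (94, 60), (92, 62),
           (67, 87), (60, 94), (58, 96), (52, 102)]

if __name__ == "__main__":
    print(f"GB(5) / pennant winner wins: {scaled_gb5(AL_1950):.3f}")
```

Dividing by the leader's wins shrinks the figure for league-seasons headed by an unusually strong pennant winner, which is why a league like the 1950 AL moves up the list relative to raw GB(5).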

Next we have W%(4), the floor of the first division, and this is where the unique nature of the 1950 AL starts to shine through. Cleveland, in fourth place at 92-62 (.597), had the highest W% of any fourth place finisher of the period, and it wasn't even close:

As you can see, it was by no means the first time that the Indians had a standout record for a fourth-place finisher.

In terms of W%(5), the ceiling of the second division, the 1950 AL made the bottom three (lowest W% by a fifth-place team):

Here it's the 1931 AL that leads the way, with St. Louis topping the second division with a 63-91 mark. As you might imagine, the race for fifth was very close, with Boston just one game back and Detroit two.

So it is no surprise that when we look at GB(4-5), the direct gap between the first and second divisions, no league managed to come within five games of the 1950 AL:

Finally, we have FD%, which is the aggregate W% of the first division clubs. The 1950 AL comes in sixth:

As you can see, the 1950 AL leads the way among all post-1932 leagues, with the aforementioned 1954 AL next in line.

I hope that the combination of the "look test" and the data above will be enough to demonstrate that the 1950 AL was a uniquely two-tiered circuit. For those of you who write good history articles, I think that a brief history of how the AL franchises' fortunes ebbed and flowed so as to create the conditions necessary for this historic imbalance would be a very interesting piece.

I don't write good history articles, so the Cliff's Notes (and potentially misleading) summary of the second division could be as follows:

* The White Sox never really recovered from the Black Sox scandal, with eight games back in 1940 the closest they got to a pennant.

* The Senators were solid contenders in the mid-20s and early 30s, winning three pennants, but outside of that, there's a reason "First in war, first in peace, last in the American League" was in use.

* The Browns had to share St. Louis with the Cardinals, and while neither team was strong in the first twenty years of the century, the Cards blew by them in on-field success by the late-20s and became the more popular draw, despite Bill Veeck's desperate efforts to win the patronage of the city's fans.

* Connie Mack was never able to rebuild the A's all the way again after selling off his stars of the 1930s--they had been respectable in 1947-49, but 1950 saw them collapse.

What would make such a piece more interesting is the fact that the form of 1950 generally held throughout the rest of the pre-expansion period. The degree of polarization between the strong and weak teams was not nearly as extreme, of course, but with one major exception, the teams essentially stayed in their divisions throughout the decade.

The table below gives the finish for each franchise (sticking with their 1950 abbreviation in the case of Philadelphia (Kansas City) and St. Louis (Baltimore)); the first division finishes have been bolded:

As you can see, the Yankees stayed in the first division for the next ten years, while the Indians missed just once and the Red Sox thrice. The Senators stayed in the second division the whole time, with the A's escaping just once and the Browns franchise just once.

There was one significant change from the standings of 1950, and that was the reversal of fortune for the Tigers and White Sox. The Tigers would make it back into the first division just twice over the period, while the White Sox joined the Yankees in never missing over the next ten years.

Not only did the 1950 AL feature a huge gap between the first and second divisions, but the second division also represented a sort of permanent league underclass and the first division a permanent group of contenders, with the aforementioned exceptions of the White Sox and Tigers, respectively. (Of course, the large gap does indicate that the second division teams had a lot of work to do, so this is not entirely surprising.) The second division saw three of its four members move to greener pastures, while the teams of the first division all remain in the same place today. With the exception of the White Sox, it would be fifteen years before a second division team of 1950 was able to win a pennant (the '65 Senators (Twins)).

After that, things got better quickly for the underclass, as the Orioles would emerge as the most consistent AL franchise of the next twenty years and the A's, after another move, would become just the second franchise to win three consecutive World Series. Meanwhile, the first division Indians tumbled into thirty years of hopelessness, emulating the historical examples of the Browns and Senators. But the particular state of imbalance in the AL, best demonstrated in the standings of 1950, had held for a long time.