Future Shock

Does Size Matter?

Will Carroll's experiment with using comments as something more than just comments was a rousing success, and I'd like to try a version of my own. I don't do a lot of quantitative work; it's not that I'm incapable of it, but I often end up with a lot of information and have difficulty separating the signal from the noise. So I have a lot of data here, but no real conclusions. What I'm going to do is throw it all on the table, and then we can discuss it together in the comment thread and see what comes of it.

He is listed at 5'7". I remember the bromide, 'teams have a bias against short right-handed pitchers,' something the Astros tried to leverage over the years. But 'short' meant 5'11" or 6'0" (Roy Oswalt), not 5'7" (David Eckstein). Who was the last effective major league pitcher to be that short? I'm not saying this is a silly signing or Lee can't be successful (he was very good in the Olympics and can apparently throw 94 mph), but I'm honestly curious what other truly short pitchers there have been.

This prompted me to start thinking about pitcher size. I write a lot about ideal pitcher frames, and I complain about short pitchers as much as the next guy, but does it really matter? I'm not convinced of it, but here's some work I've done, without coming to any real conclusions, that I hope will spur some discussion below.

The first thing I did was to call the Indians in order to get more information on Lee, and I was able to reach a team official who informed me that Lee is actually five-foot-eleven. "Look, I've stood next to the guy, and he's taller than me," said the official, who added, "I don't think we'd be giving that kind of money to a pitcher who was five-foot-seven." In addition, Lee is different from most pitchers in the sense that he's a side-armer. A converted shortstop who did not begin pitching until he was 14 years old, Lee's fastball sits at 88-92 mph coming from that lower angle, and he also features a solid slider. He'll likely begin next year in the rotation at High-A Kinston in order to get him innings.

So, let's shift our focus and switch to the major league starters. Thanks to some great work from our data guru, William Burke, we can examine some pitcher profiles and see what they look like. First we have the performances by height of every starting pitcher this year:

The average height for a starting pitcher is 6'2¾", and the mid-point of the values lies just barely over 6'2"; the taller pitchers throw off the average significantly. There are 279 starts made by players five or more inches over the 6'2" average, but none five inches or more under it. While they're not the largest groups, note that the only heights with ERAs under four are the cadres comprising those who are 6'4" and 6'5", sizes generally associated with the classic power-pitcher's build.

Build connotes both height and weight, though, so I took this one step further. Using the top 40 starting pitchers this year as measured by VORP, I calculated the BMI (body mass index) for each player. Now BMI is a pretty silly system when you check out how it's used; by this measurement, Dan Haren and Jamie Moyer are overweight, while Matt Cain is obese. But if we can ignore the labels, it does give us a good sense of the player's bulkiness.

The average weight of a starting pitcher this year is 213.05 pounds. Combined with our average height, that gives us a BMI of 26.8. From there I developed a matrix using standard deviations from these average heights and BMIs, with anything within 0.5 standard deviations of the average considered normal, 0.5 to 1.5 from the average significant, and more than 1.5 extreme. Thus, we have:
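As a rough check on the arithmetic, here is a minimal sketch of the BMI calculation and the standard-deviation buckets. The imperial BMI formula (703 × pounds / inches²) is standard, and the category names are the ones used in this piece. The two standard deviation values are assumptions chosen for illustration, since the actual figures are not given here, and `bucket` and `classify` are hypothetical helper names.

```python
def bmi(weight_lb: float, height_in: float) -> float:
    """Body mass index from pounds and inches (standard imperial formula)."""
    return 703.0 * weight_lb / height_in ** 2


def bucket(value: float, mean: float, sd: float, labels: tuple) -> str:
    """Classify a value by its distance from the mean in standard deviations.

    labels is ordered (extreme low, low, normal, high, extreme high).
    """
    z = (value - mean) / sd
    if z < -1.5:
        return labels[0]
    if z < -0.5:
        return labels[1]
    if z <= 0.5:
        return labels[2]
    if z <= 1.5:
        return labels[3]
    return labels[4]


HEIGHT_LABELS = ("Diminutive", "Short", "Normal", "Tall", "Skyscraper")
BMI_LABELS = ("Skinny", "Thin", "Normal", "Beefy", "Fat")

MEAN_HEIGHT_IN = 74.75   # 6'2 3/4", from the article
MEAN_WEIGHT_LB = 213.05  # from the article
MEAN_BMI = bmi(MEAN_WEIGHT_LB, MEAN_HEIGHT_IN)

# Assumed standard deviations, for illustration only; the article does not list them.
HEIGHT_SD_IN = 2.0
BMI_SD = 2.0


def classify(weight_lb: float, height_in: float) -> str:
    """Return a 'Height/BMI' label such as 'Normal/Thin'."""
    h = bucket(height_in, MEAN_HEIGHT_IN, HEIGHT_SD_IN, HEIGHT_LABELS)
    b = bucket(bmi(weight_lb, height_in), MEAN_BMI, BMI_SD, BMI_LABELS)
    return f"{h}/{b}"


print(round(MEAN_BMI, 1))  # 26.8, matching the figure in the article
```

With different standard deviations the same pitcher can land in a different cell, so the matrix is only as good as the spread it is built on.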

The next thing I did was to play Olympic diving judge and get rid of the highs and lows, so anyone in the extreme categories is out. Before we do that, however, let's quickly pay homage to the two opposite ends of the spectrum. There is only one Skyscraper/Fat pitcher (CC Sabathia), and only one Diminutive/Skinny pitcher (Tim Lincecum); both are among the best in the game. I'm not sure that tells us anything other than that there are no absolutes. So focusing on the remaining nine categories, we end up with the following.

Look at these lists, combine them mentally by height and BMI, and you start to see some trends. Which group is the best? Which group would you think is more likely to give you 225 innings? Which group has the best health record? There are some interesting answers here. Normal/Thin is the most impressive list overall, but looking at the beefy list gives me far more confidence with regard to durability.

Now back to the original e-mail, which asks about five-foot-seven pitchers (even though we now know that Lee is merely short, as opposed to off-the-charts small). With no pitcher under five-foot-ten starting a game this year, William Burke compiled the top seasons by pitchers under that height in the modern-modern era (since 1969), and the list is dominated by two names:

Hi Kevin, have you tried looking at the data in relation to the fastball velocity data on Gameday? I think it would be interesting to note which guys throw harder and whether the tall guys do get more downward movement on their pitches.

Size shouldn't matter if they manage to get to the Majors. What about the guys that failed to reach the Majors? Perhaps the bias against short pitchers is that they don't make it to the Majors as often as taller pitchers do.

If there is a bias against short pitchers in that way, you'd see short pitchers outperforming taller pitchers on average (since the bias creates a stronger filter against the worst short pitchers). I think this actually accounts for why Normal/Thin is such a strong list: these guys don't get chances just because of their build, but there's not a huge mechanical disadvantage to having the build they have.
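The filtering argument can be illustrated with a toy simulation. It is only a sketch under strong assumptions: talent is modeled as a standard normal score, both groups draw from the same distribution, and the cutoffs (2.0 and 2.5) are arbitrary numbers picked to represent a higher bar for short pitchers. None of it is fitted to real data.

```python
import random
import statistics


def surviving_talent(n: int, cutoff: float, rng: random.Random) -> list:
    """Draw n talent scores from a standard normal and keep those above the cutoff."""
    return [t for t in (rng.gauss(0.0, 1.0) for _ in range(n)) if t > cutoff]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: both groups share the same underlying talent distribution,
# but the short group must clear a higher bar to get a chance.
tall = surviving_talent(100_000, cutoff=2.0, rng=rng)
short = surviving_talent(100_000, cutoff=2.5, rng=rng)

# Print how many survive the filter and how good the survivors are on average.
print("tall: ", len(tall), round(statistics.mean(tall), 2))
print("short:", len(short), round(statistics.mean(short), 2))
```

Under these assumptions fewer short pitchers survive, but the survivors average higher than the tall group, which is the pattern described above.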

You have a small error - "note that the only heights with ERAs under four are the cadres comprising those who are 6'4" and 6'5"." According to the chart, the ERAs under four belong to the 6'6" and 6'7"ers.

The reason that most pitchers are tall is the same reason that most basketball players and swimmers are tall, and most gymnasts and jockeys are short.

I truly believe that baseball is a Darwinian process. If a pitcher has the talent, then he will find somewhere to pitch regardless of his size. And if he puts up the numbers, he will get a shot at the next level.

Your data shows the result of a filtering process called High School, College, Indy Leagues and Minor Leagues. Scouts pick from the pool of available talent. And most of the talent is tall.

You ascribe far too great an efficiency to your 'filtering system.' How many potential professional pitchers never made it because they were developed as position players? I don't know, but I would guess there are at least a few. A 16 year old with the potential to pitch at the major league level will have a much harder time getting there if that potential is not identified by his high school coach.

Also, compared to basketball, swimming, and um, jockeying(?), size has much less of an impact on the success or failure of an individual in baseball. Therefore, other factors dominate height in the Darwinian process.

I agree that there are inefficiencies associated with kids getting into the system. All I'm saying is that the population of pitchers has been skewed toward taller players long before these kids become professionals.

I disagree. Bias can lead to taller players (in this case) getting more opportunities to develop. Even if a smaller player has more raw talent, it may be hard for him to overcome the difference in resources. It is more like Darwinism when a subset of the population is given guardian angels. Just because a group survives doesn't mean it is the strongest.

There very well may be a bias against short pitchers at the professional level, however I think the bias at the amateur level has much greater impact on the number of short pitchers we see at the professional/major league level. How many hard throwing 'short' 14 year olds, showing up for their first day of high school practice, get a chance to develop on the mound? It's more likely that they are ushered to the middle infield (assuming they are athletic) or given shin-guards and a mask.

You need two groups: the first group would have the top 40 (or so) pitchers by VORP (or another metric), and the second group would have everybody else. Then take the heights of the players and compare the means. It should give you a pretty clear answer.

I'll see if I can't dig something up because I need practice doing this anyway.
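A minimal sketch of that two-group comparison, using Welch's t statistic (which does not assume equal variances) and only the Python standard library. The heights below are made-up numbers that only show the call, and `welch_t` is a hypothetical helper name.

```python
import math
import statistics


def welch_t(a: list, b: list) -> tuple:
    """Return (difference in means, Welch t statistic, approximate degrees of freedom)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # sample variances
    se_a, se_b = var_a / len(a), var_b / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return mean_a - mean_b, t, df


# Made-up heights in inches, purely to show the call; not real player data.
top_group = [75, 76, 74, 77, 73, 75, 76, 74]
everyone_else = [74, 75, 73, 76, 74, 75, 72, 74, 75, 73]

diff, t, df = welch_t(top_group, everyone_else)
print(f"difference in means: {diff:.2f} in, t = {t:.2f}, df = {df:.1f}")
```

A small t with only 40 pitchers in the top group would not settle much either way, and the comparison says nothing about pitchers who never reached the majors.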

Before you start any major analysis, I think it is important to pin down the exact question you are trying to answer. If body type matters, how? Do taller pitchers have better pure stuff? Does their delivery throw off hitters? Are they more durable? Each one of these questions will require a different set of data, and I am not even sure it would be possible to collect anything that resembles clean data (as a result of biases that extend to high school or before).

One thing to keep in mind when doing any analysis on this topic is that the population of 6'2" people is considerably larger than the population of 6'6" people, more so than the height data of MLB pitchers will reflect. Because of this, comparisons get hard. MLB "Skyscrapers" may represent the top .00001 percentile of all "Skyscrapers" in the world, while MLB "Normals" may represent the top .0000000000001 percent of "Normals" in the world. Comparing the average of one group's "very good" to another group's "elite" is not going to be very informative.

Interesting findings, but two points to consider.
1) As others have already commented, much of the natural selection takes place at the amateur level - just like with our lack of left-handed catchers

2) Your first height/production chart does not account for small-sample-size issues. I think it would be more instructive to look only at starting pitchers who pitched at least 50 innings as starters. It might also be instructive to view the same issue with relievers and then compare the charts; I would make the minimum cutoff around 30 appearances. Why should Danny Ray Herrera skew the numbers all by his lonesome?
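One way to apply such a cutoff is sketched below with invented rows. Innings and earned runs are pooled within each height, so each pitcher counts in proportion to his workload, and anyone under the minimum is dropped. Innings are assumed to be true decimals (60.0, not box-score notation like 60.1), and `era_by_height` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical rows: (pitcher, height in inches, innings as a starter, earned runs).
rows = [
    ("A", 74, 180.0, 80),
    ("B", 74, 12.0, 14),   # small sample, dropped by the cutoff
    ("C", 77, 150.0, 60),
    ("D", 77, 60.0, 30),
]


def era_by_height(rows, min_ip: float = 50.0) -> dict:
    """Group ERA by height, pooling innings and earned runs for qualifying pitchers."""
    ip = defaultdict(float)
    er = defaultdict(float)
    for _name, height, innings, earned_runs in rows:
        if innings < min_ip:
            continue  # skip pitchers below the innings cutoff
        ip[height] += innings
        er[height] += earned_runs
    # ERA is earned runs per nine innings
    return {h: 9.0 * er[h] / ip[h] for h in ip}


for height, era in sorted(era_by_height(rows).items()):
    print(height, round(era, 2))
```

Setting `min_ip=0` shows how much one short stint can move a thin height group.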

An interesting comparison is that the average American adult is about 5'9.5", 189 lbs, 27.5 BMI. (Source: http://www.cdc.gov/nchs/data/ad/ad347.pdf)

That means ALL Major League pitchers are above average height, even the "diminutive" Lincecum. That seems extreme enough to be more than just bias.

However, how accurate are players' reported heights? Certified doctors and nurses have measured my height with an error of +/- THREE inches. If there is a bias for taller pitchers, wouldn't most teams tend to err on the plus side and throw off the data?

Kevin- Interesting piece, but the one thought I had regarding this assessment is that these pitchers have already defied the odds associated with smaller players by making it to the Major Leagues. Because of this, it is hard to assess the relative success of these players, because they are the enigmas that defied the stereotypes. Does that make sense?

I'd like to see him do a similar study on the heights of successful catchers. I tend to believe that a guy like Angel Salome can succeed despite protests from prospect experts who claim he is simply too short.

Height and BMI don't tell the full story about pitching anatomy (nor did Kevin, or anyone else knowledgeable, say they do). If the data are available -- and they may not be -- I'd be very interested to see this analysis repeated using wingspan rather than height. Many of the greats of the past (a good example was Walter Johnson) attracted notice for their exceptionally long arms. Pitching has changed a lot, of course, but I'm still curious whether that's true today.

I do think that the questions you asked, like "Which group would you think is more likely to give you 225 innings? Which group has the best health record?", are good questions if you consider the role of pitchers the way an NFL coach would analyze the role of the WR. Some obviously make better slot receivers, others are better for blocking purposes, and others are useful for diversions.

In baseball, at least in today's game, you have pitching roles. So maybe this is one way to set up what location in the rotation you should place a pitcher. A frame capable of higher workloads would go towards the front of the rotation while other pitchers should be relievers instead of starters. However, there are probably many who would mess with the roles because of their own better mechanics. There are exceptions that dismiss general rules all the time.

But what we may be able to develop from this sort of data, with the right questions, is probabilities of efficiency. I would define efficiency in this context as the maximum output for the longest period of time. At some pitch count or inning count on the season, certain body types may prove to become less efficient in certain roles. But if a pitcher, because of his body type, were in another role, perhaps his effectiveness could be sustained for the full season rather than decreasing when the pennant race is in full swing, namely, when it is most important to have effectiveness. I also believe this would help evaluate whether certain pitchers are worth more than other pitchers of equal body type. Financially you want the most for your money, and if you get above-average efficiency then you get more for your money.

First the research would need to be done on a broad scale (beyond top 40 VORP) to see the average pitch count or inning count at which each body type begins to decline. From there you should be able to determine roles, or which pitchers are the sounder financial investment compared with others of the same body type. I think you'd also need to consider some age analysis. Same-age and same-body-type comparisons would help. The age at which a pitcher declines might vary by body type. Perhaps the tall-thin/normal crowd remains efficient longer in life than other body types?

Does this make sense? There are a lot of ways this can go to help a baseball team, whether in the front office, or managing in game scenarios over the course of a season.

Another thought comes to mind. Since pitching is so much a product of leg strength, rather than measuring BMI and total height of a pitcher, perhaps the length of their legs is more important than overall height. Some pitchers with longer legs and a higher BMI may be just as durable as pitchers with shorter legs and a lower BMI. Leg strength is what needs to be measured... the "rubber leg" to be more exact, since some pitchers have one leg longer than the other!

Another thing: I think this may be helpful in analyzing when a pitcher begins to peak. Pitchers don't all peak at the same time. One may be good and then all of a sudden have more velocity on his pitches because he bulked up. I know from my own experience that I grew tall just as I entered high school. My friends did not hit their growth spurt until late in their sophomore year. It was not an age issue, just different timing. The same thing happened in my 20s. I was pretty skinny, and as much as I worked out I could not build big muscle. Then I started to gain weight on my frame and am now able to build bigger muscles. It was a timing issue unique to me. My BMI changed at a different point than others'. If a pitcher has not come to that point yet, an organization may need to realize that the pitcher, while good enough for AAA, just needs to hit that growth spurt before making the jump to MLB. A patient organization may be rewarded, while another one capitalizes on an impatient organization. I'm not sure what this might be called in scouting circles, or whether it is discussed at all.

Just running some quick numbers (some of the averages are not precise because I didn't scale for IP...but close enough for ballpark):
Looking at 3 year averages, only for pitchers with at least 20 starts:
Best VORP by subgroup is Short/Beefy (Santana seems to dominate the 3-year averages). Best VORP by group is Short, with Beefy and Thin groups also being close.
Highest average IP by subgroup is Tall/Thin, but it's one data point. Tall/Beefy is also high on the list. But what's surprising is that the highest average IP by group is Thin, with Beefy and Tall groups also being close.
Best ERA by subgroup is also Short/Beefy. Best ERA by group is Short.

In general, the biggest outlier is the Normal weight. Everything else looks like random noise punctuated by certain data points (Haren and Santana, once again on opposite ends of the spectrum).

If you break it down to just this year, where the data is more complete because the group of pitchers was sorted by this year:
Best VORP is still Short/Beefy, but best group is Thin or Short.
Best ERA is also Short/Beefy, but best group is Short or Thin.

The lack of data points makes the data skew quite a bit, with Zambrano hurting the Tall/Beefy categories and Haren and Santana helping the Tall/Thin and Short/Beefy categories. But it still falls out that Normal seems to have the worst performance in general.

There are two major faults with trying to ascribe any meaning to this data:

1) Listed height/weight for professional athletes is notoriously inaccurate. You say that no pitcher listed at 5-10 has made a start this year. Well, I've stood next to Radhames Liz and he's about as 5-10 as it gets. Unfortunately, the O's list him at 6-0 or 6-1 or something.

2) As others have mentioned, there is a selection bias to the data. Tim Lincecum and CC Sabathia are at the extremes and they both are in the data pool because of their exceptional talent. Their talent led them to be put on the mound as children, they would've had extra chances because they had large signing bonuses (not that they needed them), etc. How many other 5-11 or 300 lb guys can we say that about?

So what do I suggest?

Well, there's nothing we can do about these faults in the data points, but we can try to get some conclusions from them anyway. For instance, we could see HOW a 300 lb man or a skinny 5-11 kid succeeds. Certainly, they utilize different arsenals, arm speeds, release points, and other variables to be among the most effective pitchers in the world. If CC tried to pitch like Tiny Tim, or vice versa, I'm sure their effectiveness would be compromised.

I guess I am saying that the issue isn't what an ideal pitcher's frame is; it's more like how pitchers with certain frames tend to find more success. If there were an easy way to quantify the variables I listed above, it actually wouldn't be that hard to throw together.

I ump'd some Little League this year ... and it's my general observation that plenty of short players had an opportunity to pitch ... and many of them seemed "above average". Yes, there's a bias for tall players ... but that seems more than justified by reality (especially by the observation that *all* MLB pitchers are taller than average). Are there so few innings available to HS freshman and JV and varsity teams that a good pitcher who is short will not get them?

Perhaps the college ranks are a good place to look ... as their objective is "winning" rather than "development for MLB" ...

I've seen many good pitchers who are short ... such as Lincecum, obviously ... the other example who comes to mind is former Oregon St closer Kevin Gunderson ... http://itmightbedangerous.blogspot.com/2008/09/organizational-consistency.html ... listed as 5'10" ... hmmm, I'll bet he's not 5'10" :-)

How can it be that guys like him don't make MLB because of "bias" ... as opposed to "results"? He's getting enough of an opportunity to make it or not based on results, imo.

Suppose you were a college and had to pick between a short pitcher who has slightly better stuff now and a tall pitcher who is a little farther behind. I imagine most schools will go with the tall pitcher. I don't think most high schools or little leagues have the luxury of being confronted with such choices.

But then the short pitcher with better stuff *will* get an opportunity at a different college ... even if it's down a level ... and so he'll get an opportunity to put up results to justify further opportunity.

And I guess that what I'm saying is that even D1 college programs don't have the luxury of making the choice you are proposing.

I think the biggest impact of the "bias" is that it results in mis-reporting actual/true player heights ...

Noticing that the "Normal" groups tend to have sub-par comparisons to the more extreme groups leads to a question of cause and effect:
If there is a selection bias, where amateur managers/scouts/general managers/etc. tend to look for pitchers with a specific body type, the players that make up that body type will not necessarily represent the best available talent, because different selection criteria are applied to them. But a pitcher who makes the major leagues with an extreme body type (Sabathia/Lincecum, even Santana/Haren) will have to have a talent level that overcomes the selection bias. So rather than seeing a normal distribution of talent around a normal distribution of body types, you will see an inverse distribution of talent. Which is essentially what you see (albeit with the small sample size/rough analysis caveat).

My suggestion to test it with a real sample size is to use PECOTA. Take all of the pitchers in the pool and give them a height/weight in the middle of each of the subgroups and re-run PECOTA. Since PECOTA takes these factors into account, you will see an increase or decrease in projected value IF the data is more than just noise. I will bet that you will find that putting more normal body types into the projection will cause a very conservative projection, whereas each of the extremes will cause some increase in the projection.

drmboat, you make an interesting point. The fact that the normal body frame will be more average makes sense too, especially if more pitchers make up that group. Over large numbers, the people in that body frame will regress to the mean. Plus, those who stand out and have a different body frame will obviously need to be good enough to buck the norm of those scouts' preconceived judgments. Interesting.

I am not sure your data analysis reflected your query. You really only looked at pitchers that have 'made it'. Lee doesn't fit in that category, yet. And really, if Lee turns into just about any player on your list, the Indians will be ecstatic.

The analysis should really be looking at domestic players drafted (and a controlled subset of foreign-born undrafted players to avoid the Nomos and Dice-Ks of the world) to see whether height impacted whether they 'made it'. If in the first four rounds of the draft 70% of pitchers over 6'3" make it, but that number drops to 40% for pitchers over 5'11", that would be telling data, particularly as the pitchers taken in similar rounds will carry similar expectations.

It would also be useful if we had reliable measurements, but you work with what you get.
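To make the draft comparison concrete, here is a small sketch of a two-proportion z statistic. The counts are invented to match the hypothetical 70% and 40% rates in this comment, and the group sizes (100 and 50) are assumptions; `two_proportion_z` is a hypothetical helper name.

```python
import math


def two_proportion_z(made_a: int, n_a: int, made_b: int, n_b: int) -> float:
    """z statistic for the difference between two success rates, using the pooled rate."""
    p_a, p_b = made_a / n_a, made_b / n_b
    pooled = (made_a + made_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se


# Invented counts matching the 70% and 40% rates in the comment; not real draft data.
z = two_proportion_z(made_a=70, n_a=100, made_b=20, n_b=50)
print(round(z, 2))
```

With groups this size, a 30-point gap would be well outside what chance alone usually produces. With only a handful of short pitchers drafted early, the same gap could easily be noise.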

"If in the first four rounds of the draft 70% of pitchers over 6'3" make it, but that number drops to 40% for pitchers over 5'11", that would be telling data, particularly as the pitchers taken in similar rounds will carry similar expectations. "

I have to disagree. The qualifier of *in the first four rounds* means you have just introduced a new variable. One might hypothesize that shorter pitchers have trouble getting drafted early, due to a lack of projection. That could certainly impact the results beyond a simple "Tall pitchers succeed/fail more often" type of conclusion based on that data set. All it would tell us is that tall/short pitchers drafted early succeed/fail at a higher rate. That could just end up telling us, for instance, that clubs are bad at identifying which short pitchers to draft early.

Again, I think answering a question on ideal body type is pretty impossible given the data to work with and the amount of noise you're going to get with all the variables everyone has come up with.

The better approach is to quantify pitch velo/movement with pitchfx, find a simple way to quantify release point (mlb.tv pixels?) and arm speed (you could probably use max velocity as a proxy for now), and run some multivariate testing within each body type group.

You wouldn't find an ideal body type, but you might find some pitching styles that are more likely to be successful within each group. For instance, you might find that over-the-top deliveries are more successful as player height increases (just speculating).

The more I think about this, the more this kind of study seems feasible.

It would be interesting to explore whether the preference for a certain frame at the amateur level is an effective strategy. For example, the primary reasons a taller frame is desired are 1) potential for future velocity gains, 2) better downward angle, and 3) ability to handle larger workloads.

As many have mentioned, the "shorter" pitchers at the ML level have been selected out b/c they are special. Therefore they don't necessarily serve as the best data points. The real question is whether the selection bias at the lower levels is appropriate-- a tough question to answer quantitatively.

There is another variable to consider in this discussion...there is a much stronger bias among college coaches and professional scouts against the "little righthander". They are much more apt to give an opportunity to a "little lefty". It's one of those old school "truisms".

I'd be interested to know the percentage of "short" lefties vs. righties who have been successful. Granted there are lots of selection biases based on population (short people who are left handed are more unique than other demographics). My hunch is that most of the successful, short pitchers are lefties.

I agree with ostrowj1(8095): "Before you start any major analysis, I think it is important to pin down the exact question you are trying to answer."

The most relevant questions this type of investigation could answer that I can think of offhand are:

1. Do teams over- or under-draft pitchers of a certain build?

2. Do pitchers of a particular build take longer to reach their peak . . .

3. . . . develop further . . .

4. . . . more durable . . .

5. . . . have longer lasting careers . . .

6. Taking 1976reds' question a step further, is there a difference at any height between the development of lefties vs. righties? There is a myth begging for verification.

The questions about fooling the hitters and pitch movement really boil down to success, which is ultimately all that matters.

The first question above regarding under/over-drafting would obviously be looked at from the point of view of the draft as suggested by tballgame. Some fair way to fudge the pitchers drafted lower due to unaffordable expected salary demands would be helpful. A control would have to be in place to make sure high school pitchers are not shorter or taller than pitchers drafted out of college. And, yes, as mikehollman points out, excluding lowest draftees might have some bias as well. However, I think some cut off could be allowed for those drafted so low that there is very little chance of their reaching the majors.

Questions 2-5 above are somewhat related. As drmboat suggests PECOTA could be a useful tool. As PECOTA is based on historical data, I don't understand Corkedbat's objection.

I generally agree with the comments warning about sample size, but I think some generalities might be found if a broad enough perspective is taken.

I'm going with the overly general belief that there is a bias against shorter pitchers because they generally have less total muscle mass, and thus less velocity. Less velocity means less opportunity. It's essentially the same reason there's a bias against shorter position players: because they can't hit the ball as hard.