Sunday, April 21, 2013

Tigers Pitcher Similarity Scores

The concept of similarity scores for both batters and pitchers has been around for a while. Bill James first introduced them in 1986 in his annual Baseball Abstract, and Baseball Prospectus uses them as part of its PECOTA system. The idea is that players are compared according to their career statistics, and each pair of players gets a score that tells how similar they are. For pitchers, the comparison includes metrics such as innings pitched, ERA, strikeouts and walks and, in the case of PECOTA, characteristics such as height and weight.

These similarity scores are useful for seeing how closely your favorite players compare to players from different eras. For example, the Bill James similarity scores at Baseball-Reference suggest that Tigers ace Justin Verlander is most similar to Phillies left-hander Cliff Lee and Angels righty Jered Weaver. They also indicate that he is similar to old-time pitchers such as Sal Maglie and Gary Nolan.

These similarity scores are fun to look at but they don't say much about what kind of pitches a pitcher throws. If you were to compare a prospect to a major league pitcher to give you an idea of what kind of pitcher he might be in the future, it does not make a lot of sense to use career statistics. It would be more useful to compare his pitching arsenal.

To that end, Stephen Loftus of Beyond the Box Score recently published a method for comparing pitchers based on their "stuff" rather than their performances. It's a work in progress, but it's some of the most interesting and potentially valuable baseball research I've seen in a while. His article is math-heavy but well worth reading, even if you need to skip over some of the math.

Very simply, what Mr. Loftus did was compare pitchers using the following PITCHf/x data:

Pitch types

Pitch velocity

Pitch break (horizontal and vertical)

Release point (horizontal and vertical)

Pitch location

For each pair of pitchers with over 1,000 pitches thrown in 2012, he calculated a similarity score between 0 and 1, with the most similar pitchers close to 1. These scores are broken down into categories such as 0.8-1.0 (extremely similar) and 0.6-0.8 (reasonably similar). The results for all pairs of pitchers can be found in an Excel spreadsheet linked at the bottom of his article.
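To make the idea of a 0-to-1 score concrete, here is a toy sketch in Python. It is not Loftus' method, which is considerably more involved; it just takes two hypothetical pitch profiles, measures how far apart they are, and squeezes that distance into a score where identical profiles get 1. The trait names, the scale values and the numbers for the two pitchers are all invented for illustration. The two category labels come from the ranges above; the label for anything under 0.6 and the handling of scores exactly on a boundary are my assumptions.

```python
import math

def toy_similarity(a, b, scales):
    """Turn the scaled distance between two pitch profiles into a score in (0, 1].

    a, b   -- dicts of average pitch traits (hypothetical layout)
    scales -- dict giving an assumed "typical spread" for each trait,
              so that mph and inches are put on a comparable footing
    """
    squared = sum(((a[k] - b[k]) / scales[k]) ** 2 for k in scales)
    distance = math.sqrt(squared)
    # exp(-distance) is 1 when the profiles match and shrinks toward 0 as they diverge
    return math.exp(-distance)

def label(score):
    """Bucket a score using the two ranges quoted in the article."""
    if score >= 0.8:
        return "extremely similar"
    if score >= 0.6:
        return "reasonably similar"
    return "less similar"  # assumed catch-all for scores below 0.6

# Invented numbers for two made-up pitchers
pitcher_a = {"velocity_mph": 93.0, "horizontal_break_in": -5.0}
pitcher_b = {"velocity_mph": 92.0, "horizontal_break_in": -5.0}
scales = {"velocity_mph": 5.0, "horizontal_break_in": 5.0}

score = toy_similarity(pitcher_a, pitcher_b, scales)
print(round(score, 3), label(score))
```

A real comparison would need to handle pitchers who throw different sets of pitches, which is one of the harder parts of the problem and is not attempted here.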

It turns out that the two most similar pitchers in baseball in 2012 were Anibal Sanchez of the Marlins and Tigers and Anthony Bass of the Padres. A closer look at the data for Sanchez and Bass at Brooks Baseball tells us why they had such a high score. As Loftus points out, they shared three pitches - fastball, slider and change-up. Sanchez also threw a curve, but not that often. What makes them so close, though, is that they threw these pitches with remarkably similar velocity, break and release point.

I took Loftus' spreadsheet, determined the top three comparisons for every pitcher and put them in a Google spreadsheet. The results for the Tigers are shown in Table 1 below. The fact that Verlander was most similar to thirty-year-old Orioles right-hander Jason Hammel may seem more amusing than useful given their results, but Hammel was pretty effective last year. They might not have been so similar in past years.
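For anyone who wants to do the same kind of lookup, the top-three step can be sketched in a few lines of Python. This assumes the pairwise scores have already been read into a dictionary keyed by pitcher pairs, with each pair listed once; that layout is my assumption and not necessarily how the spreadsheet is organized. The names and scores below are placeholders, not real results.

```python
from collections import defaultdict

def top_matches(pair_scores, n=3):
    """Return each pitcher's n most similar pitchers, best first.

    pair_scores -- dict mapping (pitcher, other_pitcher) to a score,
                   with each pair appearing once (assumed layout)
    """
    by_pitcher = defaultdict(list)
    for (first, second), score in pair_scores.items():
        # A pair's score counts for both pitchers in the pair
        by_pitcher[first].append((second, score))
        by_pitcher[second].append((first, score))
    return {
        pitcher: sorted(matches, key=lambda m: m[1], reverse=True)[:n]
        for pitcher, matches in by_pitcher.items()
    }

# Placeholder pitchers and invented scores
example_scores = {
    ("Pitcher A", "Pitcher B"): 0.9,
    ("Pitcher A", "Pitcher C"): 0.7,
    ("Pitcher A", "Pitcher D"): 0.4,
    ("Pitcher B", "Pitcher C"): 0.5,
}

for pitcher, matches in top_matches(example_scores, n=2).items():
    print(pitcher, matches)
```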

This research is just a start and will become more useful when more years are available, and especially when we are able to compare minor leaguers to major leaguers. Other data, such as pitch sequencing, could also be added to give better comparisons. I look forward to seeing what Loftus and others can do with this concept going forward.