The much-maligned Rawlings Gold Glove awards have long been decided by the managers and coaches around the league, even though such people are left with little enough time to watch other teams when they have to deal with their own for 162 games per season.

As part of the multi-year collaboration beginning with the 2013 season, SABR will develop an expanded statistical resource guide that will accompany the Rawlings Gold Glove Award ballots sent to managers and coaches each year. In addition, SABR will immediately establish a new Fielding Research Committee tasked to develop a proprietary new defensive analytic called the SABR Defensive Index™, or SDI™. The SDI will serve as an “apples-to-apples” metric to help determine the best defensive players in baseball exclusively for the Rawlings Gold Glove Award and Rawlings Platinum Glove Award selection processes. The collaboration also installs SABR as the presenting sponsor of the Rawlings Platinum Glove Award. …

Beginning in 2013, the managers/coaches vote will constitute a majority of the Rawlings Gold Glove Award winners’ selection tally, with the new SDI comprising the remainder of the overall total. The exact breakdown of the selection criteria will be announced once the SDI is created later this summer.

One imagines this new addition will ensure Derek Jeter fails to add a sixth Gold Glove to his current total. Ideally, it means that offense, always a huge factor in the actual voting, will begin to mean less. As is, it’s far too difficult for below-average hitters to win Gold Gloves.

Give Rawlings some credit here. This is the second big, positive change in the last few years after the 2011 switch to break up the outfield spots (previously, it was typical for three center fielders to win Gold Gloves in each league). Not only will they add data to the voting, but they’ll give that same data to the managers and coaches, which could then be used to make better choices.

Wait, so they’re finally including sabermetric data in an awards process, but they decide to do it for the area that’s the least reliable and most prone to inexplicable year-to-year variation? I’m all for sabermetrics in awards voting, but this seems a bit odd.

No stat, be it offensive or defensive, is perfect. When you look at some previous “winners” (Pouliot mentioned Jeter, Palmeiro comes to mind), it’s clear that the Gold Glove needs something to help the voters better decide.

Sure, but there are arguments about whether the way defense is measured for sabermetric purposes is fundamentally or not–the data has improved, but there’s a real possibility that the inputs are junk. It’s impossible for a stat to be meaningful if it’s built from flawed data. People can argue about FIP, wOBA, or whatever their stat of choice is, but no one argues that offensive stats or pitching stats suffer from junk inputs. Defensive stats might.

Ben - Mar 8, 2013 at 4:38 PM

*fundamentally flawed

Ben - Mar 8, 2013 at 4:40 PM

Check out Up and In podcast episode 8, for example–http://www.baseballprospectus.com/podcast/episode8.mp3
Colin Wyers over at Baseball Prospectus argues that defensive stats are junk because the inputs are junk. It’s possible (likely) that it’s changed since then; it is an older episode.

Defensive stats are only getting more accurate as technology improves.

albertmn - Mar 8, 2013 at 4:45 PM

Not only did familiar names win, especially for pitchers, but if two guys were close, the better hitter would win every time. But, I can understand why voting didn’t always make sense in the past, with no fielding stats other than fielding percentage for the longest time. That left too much to the eye test, and could be skewed for some voters just because you saw Team A 16 times in a season and Team B only 9 times (or whatever the exact numbers were in a particular season).

Ben - Mar 8, 2013 at 4:46 PM

I agree, they are getting better, and that’s awesome. The data will only get better as FieldFX gets better too. But there is no other area of baseball statistics where there are substantive questions about the quality of the data.

paperlions - Mar 8, 2013 at 4:52 PM

Well then, by that definition:

RBI is a junk stat (team dependent)
Batting average is a junk stat (uses wrong denominator and numerator)
Pitcher wins is a junk stat (team dependent)
ERA is a junk stat (ERA belongs to both a pitcher and his fielders, not just the pitcher)

There are plenty of junk stats that people still use to form opinions about pitching and hitting. Imperfect doesn’t mean the same as junk (= useless).

Current fielding stats are still far better than what most managers/coaches rely on (errors and fielding %).

Ben - Mar 8, 2013 at 4:56 PM

You’re misinterpreting the problem.

Batting average might be a junk stat if we couldn’t count on the official scorer to accurately judge what’s a hit, and then record it in the official record. We would worry about the veracity of batting average data if we wondered whether pop-outs were being recorded as hits. That’s the problem with defensive data. The fundamental inputs are flawed because they often aren’t or can’t yet be accurately judged. The input data itself is bad. We don’t worry about the veracity of batted ball data because baseball solved that problem decades ago. That problem has not yet been solved for defensive data.

churchoftheperpetuallyoutraged - Mar 8, 2013 at 6:14 PM

That problem has not yet been solved for defensive data

This isn’t necessarily true. With the introduction of Field F/X data, it’s possible that all the problems for the public defensive stats have been removed, but none of us has access to it.

Ben - Mar 8, 2013 at 6:29 PM

Right. But none of us have access to it. A statistic is only as good as its data.

paperlions - Mar 8, 2013 at 7:39 PM

Ben, pop ups are regularly counted as hits….so are ground balls that should have been outs, so are fly balls that should have been outs….you (and most people) are just so inured to it that it seems “right” and acceptable because it has always been that way. The problems you have with current public defensive stats are exactly the same.

A pop up that is misplayed is a hit (often a double). A flyball that is misplayed is an XBH. A ground ball that should be an out becomes a hit because the defender has no range. Any time these things happen (and they happen repeatedly every single game), they affect every offensive and pitching stat.

In addition, umpires miss a lot of calls on balls and strikes…a missed ball or strike call can have a huge effect on how the rest of the PA progresses, causing a gap between actual pitching performance and recorded pitching performance.

In short, everything people complain about in defensive stats already exists in hitting and pitching stats….people are just accustomed to those “problems” and understand that they are part of the system and generally will “even out” over time….sometimes they do, sometimes they do not.

churchoftheperpetuallyoutraged - Mar 8, 2013 at 6:11 PM

and most prone to inexplicable year-to-year variation?

Why is this an issue? Barry Bonds one year had a .688 SLG and the next had a .863. No one says that SLG is bad because it’s prone to year to year fluctuation.

I will like the addition of the metrics, if they remain a percentage, and not the entire determining factor. Metrics, while not perfect, are helpful and will help balance out those that use the eye test, but don’t apply it well.

If you had watched many Cardinal broadcasts, you’d know. The first year they mentioned it all the time…..there is a platinum patch on Yadi’s glove, like the gold ones on the gloves of GG winners….so any time they showed a close up of him in his gear in which you could see the patch, they would mention it.

It’s only one metric, but that’s Fangraphs’ UZR primer. BIS data is charted by interns off video (presumably DVR). While we don’t know all of the inputs, there are things we can reasonably speculate on based off what we know about UZR. For instance, the primer indicates that batted ball type and location are inputs, each composed of component pieces.

“One of the differences between UZR and linear weights is that with UZR the amount of credit that the fielder receives on each play, positive (if he makes an out) or negative (if he allows a hit or an ROE), depends on how often that particular kind of batted ball, in terms of its location, speed and several other factors […]”.

Without going into the “other factors”, we can talk about batted ball type, location and speed [velocity]. BIS breaks down batted ball type into GB, IFFB, LD, “Fliner”, and FB. These are subjective judgments with a high susceptibility to human error. Further, not all batted balls of each type are created equal. Location is essentially plotted on a graph by the BIS interns with their software. There’s human error here as well. Further, the difficulty of, say, catching a line drive, can’t be accurately deduced from the batted ball input (a line drive is likely plotted where it lands or is caught; either BIS is projecting the landing location on line drives for balls that are caught, or the plotted locations are different based on whether the defender touches the ball or not). For velocity, BIS isn’t taking objective exit velocity to my knowledge. While there are ways of deducing batted ball velocity, they require input closer to what a company like Sportvision (Pitch, Hit, and Field FX using camera systems) or Trackman (cameras, sensor measurements using the Doppler effect) provides. To my knowledge, BIS doesn’t have the hardware, access, or a partnership with a company that has the hardware to give accurate velocity readings on batted balls. While I don’t know enough about their processes to completely rule out accurate velocity readings, it’s strongly in doubt for me given the technical difficulties of getting either the velocities or data from which the velocity can be deduced.

I could continue on the validity and reliability of defensive metrics, but I think I’ve given enough examples to explain why defensive metrics are viewed as being on shakier ground than offensive or pitching metrics. I’m not making the argument that defensive metrics are without merit (I don’t believe Ben was either), only that there are significant questions over the validity and reliability of defensive data. Ultimately, I think adding the defensive metrics will help the Gold Glove voting process (they’ve pretty clearly been a joke for a long time), but as Ben said, it is surprising to see a defensive award be the first to embrace sabermetric methods.

Too bad they don’t have a similar standard for evaluating baseball writers…if they did our friend M. Pouliot would most likely find himself employed writing prices on a fast food “Daily Special” chalkboard.

It never ceases to amaze me the depths Yankee haters will go to in order to discredit any accolade associated with any aspect of, or player on, the club.
I realize it is some sort of adolescent hormonal feeling of inadequacy caused by home team inconsistent performance issues…suffered in all but a handful of MLB cities around the US. NYC doesn’t seem to have that problem with their American League franchise.
Most of these pissers and moaners should just pull up their big boy pants, let go of Mommy’s left breast, wipe their chins and give the devil his due.