Wednesday, June 02, 2010

We were discussing defensive analysis recently: me, AROM, CW, Tango, and Peter. We talked about lots of things, one of which is making any stat that people want to use in discussions regularly and readily available. Sean Forman recently added AROM’s TotalZone to his site, and you can check out the leaders daily. That’s always nice. People have been enjoying Fangraphs’ defensive leaderboards. Both of these resources will also show you offensive leaders. Offensive stats rarely face the same level of skepticism, and hopefully we’ll get there with defensive stats as well.

I can’t offer you anything special with offensive stats, so I am not going to. I could offer you something different, and maybe I will, but that’s pretty complicated. The fact is that daily defensive stats are complicated. Fortunately, SG of Replacement Level Yankee Weblog is a genius. His hard work has enabled us to generate my defensive system, DRS (the one before Bill James’), on a daily basis. Naturally, I cannot thank SG enough, and I am very appreciative of all he’s done, both with this and his historical file at RLYW.

With the various critiques you hear all the time of the existing methodologies, TZ and UZR, having another option may give you more data to feel better about your players. Plus, DRS comes from a different stat source, and that will allow us to see where some differences exist. For those who are not familiar with my methodology, here is the method. Now, that is dated 2005, but I really developed it a lot earlier, and that’s where I tried to refine it. The Mets newsgroup got the actual developmental changes.

So when you hear about people developing methods, offensive or defensive, recall that people were doing it before there was really an internet with all these websites. I wanted to link those so you could see that not only do I rely on SG, I didn’t develop this without lots of critique and feedback from smarter people like Ron Johnson and Dale Stephenson. Read through those threads and you can see some interesting Defensive Average work from Dale and others. I enjoy reading those old threads.

At any rate, thanks to SG’s Excel skills, I hope to update this spreadsheet often. I think weekly right now. It can be more often, but I have a job, and a scatterbrain. Please let me know if you see something that doesn’t seem to make sense - somehow a player is missing, or is on the wrong team. It shouldn’t happen, but I know that spreadsheets aren’t perfect.

Reader Comments and Retorts


So it's like I've always said, the Braves' three best defensive players are Matt Diaz, Chipper Jones, and Troy Glaus, in that order. Apparently being able to move laterally is not highly valued in this system.

#8 Back when we were discussing ZR vs DA, the consensus was that ZR placed a high premium on good hands. A real quick and dirty study I did found that this appeared to be true for middle infielders at least. (When I attempted to model ZR using the basic fielding stats, I found error rate carried more signal than assists and/or putouts per defensive inning. Not that the model was worth much.)

OK, we're on the same page as for precision. How about presentation? I think presenting without decimal places makes it much more appealing to look through. What say you others? If I'm in the minority and others want to see +3.27 instead of +3 I'll shut up.

OK, we're on the same page as for precision. How about presentation? I think presenting without decimal places makes it much more appealing to look through. What say you others? If I'm in the minority and others want to see +3.27 instead of +3 I'll shut up.

I tried a few times to get it to one decimal in Google Docs and it wouldn't take. I don't like no decimals because false precision or not, I like having a feel for "just shy of three, and nearing four."

No big deal Chris. And we should all be happy that you are willing to take on this task. I didn't realize it was problematic on the technology side.

Do you have the access to load an HTML file to BTF? Because if you can do that, I could write an excel vba program that will take the results in your spreadsheet and automatically create an html page. I could probably integrate it into SG's macro for getting this data. All you'd have to do is hit one more button, and then when you get a new page load it to the website when it's done, maybe something like: baseballthinkfactory.com/dialed_in/drs.htm. Just a thought.
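For anyone curious what the spreadsheet-to-HTML step might look like, here is a minimal sketch in Python rather than the Excel VBA proposed above. It is not the program being offered in this thread. The function name `render_leaderboard`, the three-column layout (player, team, runs), and the sample rows are all assumptions made up for illustration.

```python
from html import escape


def render_leaderboard(rows):
    """Return a simple HTML page for rows of (player, team, runs_saved).

    The column layout is hypothetical; the real spreadsheet may differ.
    """
    body = []
    for player, team, runs in rows:
        # Escape text cells and show runs as a signed whole number.
        body.append(
            "<tr><td>{}</td><td>{}</td><td>{:+d}</td></tr>".format(
                escape(player), escape(team), round(runs)
            )
        )
    return (
        "<html><body><table>\n"
        "<tr><th>Player</th><th>Team</th><th>Runs</th></tr>\n"
        + "\n".join(body)
        + "\n</table></body></html>\n"
    )


# Made-up rows, for illustration only.
sample_rows = [("Player A", "AAA", 3.27), ("Player B", "BBB", -1.2)]
page = render_leaderboard(sample_rows)
print(page)
```

The resulting string could be written to a file and uploaded by hand, which matches the "hit one more button, then load the page" workflow described above.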

I agree that showing any of the runs numbers (offense or defense) to one decimal is showing false precision. It drove me crazy when MSM was quoting Teix's UZR as "-1.2 runs". That -1.2 could just as well have been +1.2 and it would mean virtually the same thing.

The runs numbers are really +/- 5 runs, so I would even be in favor of rounding to those points.

My idea would be that I give you an excel program that you can run, I wouldn't have to do anything after the initial setup. Once I get my projections into my excel macro, the time to write the web pages takes about 3 seconds. So frequency will be totally up to you.

If you want to give it a try just email me the latest spreadsheet you're using to grab the data, and this weekend I should be able to put together a program. Do you use excel 2007? I am, but if you are on 2003 I think I'll still be able to save it in a format you can use, but no guarantees. Excel is usually backwards compatible but sometimes the vba language isn't.

The runs numbers are really +/- 5 runs, so I would even be in favor of rounding to those points

That's not realistic Tango. Almost every player would be -5, 0, or +5 for most of the season. And is it really more accurate to say zero for +2.2, but +5 for +2.8? Your FIP metric gets reported to the hundredth of a run. Our stats are full of false specificity -- you just have to educate people and hope they use them in a reasonable way.

If you have 60 runs allowed in 150 IP and 61 runs allowed in 150 IP, then the ERA is shown as 3.60 and 3.66. I would be in favor of showing ERA to one decimal place. But, that ship has sailed.

If you were to show UZR runs saved per game, and you have one guy at +20 runs in 130 G and another guy at +21 runs in 130 G, then their numbers are +0.154 and +0.162. I'd show that as at most two decimal places. You can even make the case for 1.
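The arithmetic in the last two comments can be checked with a short sketch. The helper names `era` and `signed` are made up for illustration, and the ERA formula assumed here is the usual nine times runs divided by innings pitched.

```python
def era(runs, innings):
    """Runs allowed per nine innings (assumed formula: 9 * R / IP)."""
    return 9 * runs / innings


def signed(value, places):
    """Format a value with an explicit sign and a fixed number of decimals."""
    return f"{value:+.{places}f}"


# 60 vs 61 runs allowed in 150 IP
era_a, era_b = era(60, 150), era(61, 150)
print(f"{era_a:.2f} vs {era_b:.2f}")  # two decimals
print(f"{era_a:.1f} vs {era_b:.1f}")  # one decimal

# +20 vs +21 runs saved in 130 G, expressed per game
per_game_a, per_game_b = 20 / 130, 21 / 130
for places in (3, 2, 1):
    print(places, signed(per_game_a, places), signed(per_game_b, places))
```

At three decimals the two fielders show as +0.154 and +0.162, at two decimals as +0.15 and +0.16, and at one decimal both show as +0.2.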

And is it really more accurate to say zero for +2.2, but +5 for +2.8?

Yes, in the same way that rounding 5.49 to 5 and 5.51 to 6 is accepted. I don't understand the point. We always round to the level that the precision allows.

Right. Which was your choice, even though it's probably not accurate to more than a quarter run (depending on how one would choose to define FIP accuracy). But if you had rounded every pitcher to the nearest .25, there would be a lot less interest in the stat. So you compromised. And if Chris rounded to the nearest 5 runs, there would be exactly ZERO interest in his stat. Why should he do that?

As for rounding, the problem is that people use these to compare players. They will say the +5 player is "five runs better" than the zero player, when it may really be just 1 run. In that case, your rounding has created a less accurate perception by the casual reader than showing them as +2 and +3: rounding makes two roughly equal players appear qualitatively different.
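To make that concrete, here is a small sketch using the +2.2 and +2.8 figures from earlier in the thread. The helper `round_to_step` is a made-up name, and it leans on Python's built-in `round`, which sends exact ties to the nearest even number (no ties occur in this example).

```python
def round_to_step(value, step):
    """Round value to the nearest multiple of step."""
    return step * round(value / step)


player_a, player_b = 2.2, 2.8

actual_gap = round(player_b - player_a, 1)
gap_nearest_5 = round_to_step(player_b, 5) - round_to_step(player_a, 5)
gap_nearest_1 = round(player_b) - round(player_a)

print("actual gap:", actual_gap)
print("gap when rounded to nearest 5:", gap_nearest_5)
print("gap when rounded to nearest 1:", gap_nearest_1)
```

Rounded to the nearest 5, the two players display as 0 and +5, a 5-run gap. Rounded to the nearest run, they display as +2 and +3, a 1-run gap, which is much closer to the actual 0.6.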

I agree zero decimals is better than one. But not rounding to nearest 5.

I'm with Guy on this one. Give me zero decimal places. I can live with one, don't like having two. If you go to the nearest 5, there's no point in even posting anything until May or even June, and even then most players will still have a rating of zero.

Some might say that's a good thing, that two months of defensive data should be ignored. But people want baseball stats. And if your goal is to make this metric accessible, you aren't going to be able to compete with the metrics that do post daily/weekly updates from the start of the season.

Me either. But tough to tell without being able to look through the data and compare differences to video. I had read that BIS was tracking flyball hangtime, that would be an improvement. But as far as I know, that is not used in UZR or Dewan's ratings.

If you go to the nearest 5, there's no point in even posting anything until May or even June, and even then most players will still have a rating of zero.

Some might say that's a good thing, that two months of defensive data should be ignored.

Exactly my point. The level of precision only goes so far. Even for hitting stats by the way, which should probably be rounded to the closest 3 runs.

So, you can construct a reasonable argument for rounding to the closest 5, or 3, or even 1. But to the closest one-tenth of a run? There's no reasonable argument there. There may be an argument to be made, but it won't be based on reason.

I don't disagree with your argument on a larger scale, but I prefer to see unrounded numbers - I don't automatically assign precision based on that. The methodology and measuring practices define that for me, not the output.

So, you can construct a reasonable argument for rounding to the closest 5, or 3, or even 1.

No, there is no "reasonable argument" for rounding to the nearest 5 runs. Because that ensures no one will have any interest in this metric. You report WOWY over an entire career to the nearest run. You report FIP to two decimals, and Marcel ERAs as well. Marcel BA is calculated to three places. You don't hold yourself to this standard, so why here?

Easy...okay, I updated the defensive totals through yesterday (June 3). I added a team summary, and removed all decimals. Can you guys advise on the presentation? Oh, you still have to click the link in the middle of the article that says through May 31. I'll talk to Jim about a link on the front page so you don't have to read any of my ramblings and we just have a front page link to Defensive Stats updated with regularity.

Ron,
neither is too difficult. I can add innings, but it won't be as clean. Maybe. It also isn't any work to organize the order as you like. It's 30 seconds of work. The order they are in is alphabetical, which is just the easiest.