Reader Comments and Retorts

Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.

Has anyone spent much time with this data? I don’t have any context for how reliable/useful it is.

I have had a theory that Nick Markakis is unfairly treated by UZR because of the big wall and high scoreboard in RF at Camden Yards. He routinely rates in the -5 to -10 range for UZR. His rankings among 28 RFs with at least 1000 innings over the last two years:

Of course, Markakis is the kind of player that is sure-handed and does other things well (he has a strong arm), so maybe he is precisely the type of player who will have their range overrated by a scouting-based system.

I think one of the issues (for me) with this data is how incredibly skewed it is, and how it breaks down an already small sample size into even smaller bins. The way this graph is presented, one might get the impression that the data is uniform across the categories, but a huge percentage of plays made are in the 90-100% category.

Ichiro
239/240 in 90-100
9/10 in 60-80
7/11 in 40-60
4/8 in 10-40
4/12 in 1-10

Beltran
234/234 in 90-100
9/11 in 60-80
1/1 in 40-60
0/4 in 10-40
0/2 in 1-10

These seem like pretty small samples, and I don't have a good understanding of what separates a 60-80 play from 90-100 play, or the amount of subjectivity involved in doing so. Still, cool to see the fine detail.
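To put a rough number on "pretty small samples", here is a minimal sketch that puts an approximate 95% Wilson score interval around each of the rates quoted above. The counts are copied from this comment as posted (including the "60-80" label); the function names are my own, and the Wilson interval is just one reasonable choice, not anything the data provider uses as far as I know.

```python
import math

# (plays made, chances) per bin, exactly as quoted in the comment above.
ICHIRO = {"90-100": (239, 240), "60-80": (9, 10), "40-60": (7, 11),
          "10-40": (4, 8), "1-10": (4, 12)}
BELTRAN = {"90-100": (234, 234), "60-80": (9, 11), "40-60": (1, 1),
           "10-40": (0, 4), "1-10": (0, 2)}


def wilson_interval(made, chances, z=1.96):
    """Approximate 95% Wilson score interval for a success rate."""
    if chances == 0:
        return (0.0, 1.0)  # no information at all
    p = made / chances
    denom = 1 + z ** 2 / chances
    center = (p + z ** 2 / (2 * chances)) / denom
    half = z * math.sqrt(p * (1 - p) / chances
                         + z ** 2 / (4 * chances ** 2)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


def share_in_top_bin(counts):
    """Fraction of all chances that fall in the 90-100 bin."""
    total = sum(chances for _, chances in counts.values())
    return counts["90-100"][1] / total


for name, counts in (("Ichiro", ICHIRO), ("Beltran", BELTRAN)):
    print(f"{name}: {share_in_top_bin(counts):.0%} of chances in 90-100")
    for label, (made, chances) in counts.items():
        lo, hi = wilson_interval(made, chances)
        print(f"  {label:>6}: {made}/{chances}  interval {lo:.2f} to {hi:.2f}")
```

Outside the 90-100 bin the intervals come out very wide, which is the skew and small-bin problem in numeric form.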

EDIT: Wait, what am I doing wrong? These numbers look nothing like the graph.

"Degree of difficulty" axis seems backwards. I would expect a play with 100% difficulty to be nearly impossible to make. Once I got over that hump I wondered what time endpoints were imposed on this data set. One season? 4 seasons? I don't feel like reading TFA to find out.

I wonder if this type of analysis can be useful for nailing down a type of fielder if not the quality. Swisher's line certainly looks like the one oddly shaped one. I wonder if it's representative of a guy who catches everything he gets to but doesn't have great range or speed (which seems to jibe with what I know of Swisher).

So to ask a stupid question: if a player gets to, say, 14% of remote plays (1-10%), does that mean he is above average (ignoring sample size issues) at making remote plays?
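One hedged way to frame that question is to ask how often a fielder with a given true success rate would post a result at least this good by chance. The sketch below assumes the bin label is the league-wide probability of making the play and uses 10%, the top edge of the 1-10 bin, as a conservative baseline; both are my assumptions, not something stated in the article. The 4-of-12 line is Ichiro's remote-play count quoted earlier in the thread, and the 1-of-7 line is a purely hypothetical way of arriving at roughly 14%.

```python
from math import comb


def prob_at_least(made, chances, true_rate):
    """P(X >= made) when X ~ Binomial(chances, true_rate)."""
    below = sum(comb(chances, k) * true_rate ** k
                * (1 - true_rate) ** (chances - k)
                for k in range(made))
    return 1 - below


BASELINE = 0.10  # assumed: upper edge of the 1-10% bin

# Ichiro's 4 of 12 remote plays, as quoted earlier in the thread.
print(f"4 of 12: {prob_at_least(4, 12, BASELINE):.3f}")

# Hypothetical fielder who made 1 of 7 remote plays (about 14%).
print(f"1 of 7:  {prob_at_least(1, 7, BASELINE):.3f}")
```

Under those assumptions, 14% built on a handful of chances is easy to reach by luck alone, while the same kind of gap over a dozen or more chances starts to look like a real difference. So the percentage by itself doesn't answer the question without the number of chances behind it.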

Does anyone know how they chose the gaps (1-10, 10-40, 40-60, etc.)?

Alex Gordon has a pretty unusual looking distribution (in 2788 innings, the 6th largest sample available). 25% of remote plays (10th in the Majors over the last two years), 71.4% of unlikely plays (3rd), 68.8% of about even plays (25th), and then only 78.6% of likely plays (79th!). I have no idea what that says about him. Maybe great tools but has trouble making reads? But he's not a fast guy. So maybe he just goes balls to the wall on everything and misreads a lot of plays. Any thoughts from Royals fans?

Simplest explanation would seem to be positioning. A guy shaded heavily towards LCF (relative to the typical LF) should make more remote plays to LCF, fewer routine plays to straight LF and then stuff down the line he doesn't have a prayer on. (Or vice versa of course). The question then would be if he really gained anything -- maybe he makes more plays overall but maybe the ones he misses down the line are more damaging. Or he makes fewer plays overall but the remote plays he gets to would have been more damaging.

The Swisher line I would guess is good break, lousy speed. Maybe also good routes.

The vast majority of OFs make the play on virtually anything they get to, so it should mostly be a function of what you get to. That would seem to be determined by 4 factors: positioning, reaction, routes/reads, speed. So, for Gordon, if his reactions and routes were great, he wouldn't miss so many "likely" balls. But bad reactions, bad routes and OK speed don't get you to remote balls either. But an OF with average reaction, route and speed skills who is positioned much differently than the typical OF will have a much different distribution. The fancy new software should allow us to measure most of this stuff, most importantly (from a saber perspective)* starting position.

*I'm guessing. I mean in the end we care about what the guy gets to and not so much if his routes could use some work. The latter would of course be valuable to scouts/coaches. But just knowing position (and ball path which the new system also measures), we should be able to classify "likely" "remote" etc. relative to starting point. Then we can worry about the value of a play not made although that too must differ by starting position -- i.e. if Gordon has farther to run to the line, a triple becomes more likely.