Fantasy semantics

If you’ve never taken a course in econometrics, I encourage you to do so if you have the chance. Actually, any course that teaches statistical methodology will do. Even if you never want to “crunch numbers,” it will teach you how to think and read “probabilistically.” Since nature, life and fantasy baseball are inherently random, understanding the semantics of probability is essential, doubly so if you’re purporting to offer advice.

Things basic econometrics has taught me:

1. We can still say something about coin flips even though the outcome can be either heads or tails.

This may seem obvious, but I’ve seen educated people argue that just because we can’t be sure about the outcome (or because “the outcome could be anything”), it is useless to talk about forecasts or numbers or statistics. Obviously not. We can still talk about which outcomes are more likely than others. We can still talk about the probability of outcomes. We can still say that the probability of heads is 50 percent and that the outcome of a fair die roll is more likely to be between two and five (inclusive) than it is to be either one or six.
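The die claim is just arithmetic over six equally likely faces; a minimal sketch:

```python
from fractions import Fraction

# A fair six-sided die: each face has probability 1/6.
p = {face: Fraction(1, 6) for face in range(1, 7)}

# "Between two and five (inclusive)" versus "either one or six."
p_middle = sum(p[f] for f in (2, 3, 4, 5))   # 4/6 = 2/3
p_edges = sum(p[f] for f in (1, 6))          # 2/6 = 1/3

print(p_middle, p_edges, p_middle > p_edges)
```

Even though any individual roll could land anywhere, the 2–5 event is twice as probable as the 1-or-6 event.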

2. In an ideal world, tell me everything—give me the probability of everything.

Let’s say you meet God. God turns out “to play dice with the universe.” He doesn’t know whether Chipper Jones is going to retire midyear, but he does know the probability that it might happen. He knows his own dice.

Why ask God only how many home runs he “expects” (that is, the average number) Chipper to hit? Why not get more information from him? What’s the probability that he hits fewer than five (tantamount to asking the probability that Jones gets injured)? What’s the probability that he hits 10, 15, etc.? With this information, you’d have a better idea of how risky Jones is.
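To make the “ask for everything” idea concrete, here is a sketch with an invented probability distribution over Chipper’s home run total (the numbers are illustrative only, not a real projection):

```python
# Hypothetical distribution over Chipper Jones's season HR total.
# These probabilities are made up for illustration; they sum to 1.
hr_dist = {0: 0.05, 3: 0.05, 10: 0.10, 15: 0.15, 20: 0.30, 25: 0.25, 30: 0.10}

# The single number most projections report: the expected (average) total.
expected_hr = sum(hr * p for hr, p in hr_dist.items())

# Extra questions the full distribution can answer, which the average cannot:
p_injury = sum(p for hr, p in hr_dist.items() if hr < 5)     # P(HR < 5)
p_big_year = sum(p for hr, p in hr_dist.items() if hr >= 25)  # P(HR >= 25)

print(f"E[HR] = {expected_hr:.2f}, P(HR < 5) = {p_injury:.2f}, "
      f"P(HR >= 25) = {p_big_year:.2f}")
```

Two players can share the same expected total while carrying very different injury risk, which is exactly the information the point estimate hides.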

3. It can be very tempting to take shortcuts when writing.

Actually, I learned this writing about baseball. I like to keep my columns as simple as possible while still making my point. I try to avoid adverbs when possible (though I am rarely successful). Writing “probabilistically” without adverbs is difficult—words like “usually, probably, likely” are useful. I have the same problem with numerical information. Yes, in a perfect world, I would just give you my probability of everything when I talk about my forecasts for Chipper, but parsimony and limited attention spans demand that I give you only as much as I deem relevant and interesting.

On the relationship between “experts” and readers:

The key is trust and establishing consistency. It is possible for one expert (say, Ron Shandler) to use mostly intervals and another (say, Derek Carty) to provide mostly point estimates. Intervals are kind of nice, but they require more disclosure. It is OK for Shandler to prefer to say (paraphrasing) “Miguel Cabrera is likely to have a home run total in the 30s” as long as we know what he means by “likely”—40 percent? 90 percent? Similarly, it is OK for Carty to say “Cabrera is projected to hit 37 home runs.” If Carty gives us some interval around it, too (“standard error bands” in statistics speak), then Carty’s statement is very similar to Shandler’s even though they’ve used different words. (In fact, I am just sort of rephrasing what Carty wrote about on Tuesday. My problem with Shandler’s recent writing is that he forgot a version of Lesson One above.)

Readers and writers need to come to a sort of tacit understanding about language. More often than not, writers are not going to give numbers for everything. If a Shandler-esque writer wants to say “Cabrera is likely to hit around 35 home runs” instead of giving lots more numbers, then he should be consistent about what his words mean. Approximately what does “likely” mean?

On arguments within the expert community:

What goes for communication between adviser and advisee goes doubly for these blogged exchanges between experts. It is very hard to champion your cause against another “expert” in a venue designed to still be accessible to the layman reader. Actually, it is very hard to do it in any venue.

I’ll have more on the quants-versus-quaints debate (in case you can’t tell which side I’m on) in my next article; the tenor of the exchange has actually been very good, I think. Many expert exchanges are not nearly as interesting, in part because one expert will say something semi-informative but mostly obvious like “It is good to use statistics to forecast how many home runs Cabrera will hit,” and then the opposing expert will say something like “Statistical forecasts are always wrong. I prefer to go with my gut.”

My problem with the second statement is that it is literally true but practically false. Forecasts are always wrong, but they are still incredibly useful. Most experts, even those who haven’t taken econometrics, know this to be true. The more literally accurate statement, “I project that Cabrera will hit between 35 and 45 home runs with 95 percent certainty,” would be more bulletproof against these kinds of flatulent responses, but all of those numbers are superfluous to the argument. At some point it would be better if some details could be taken for granted.
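As an aside, a symmetric 95 percent interval like “between 35 and 45” pins down both a point estimate and, if you assume a normal distribution, an implied standard deviation. A quick sketch (the normality assumption is mine, not the author’s):

```python
import math

# Read "between 35 and 45 with 95 percent certainty" as a symmetric
# normal interval: the midpoint is the point estimate.
lo, hi = 35, 45
mean = (lo + hi) / 2          # 40 HR
z95 = 1.959964                # two-sided z-score for 95 percent coverage
sd = (hi - mean) / z95        # implied standard deviation, about 2.55 HR

def norm_cdf(x, mu, sigma):
    """Normal CDF via the error function (no external libraries needed)."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Sanity check: recover the interval's coverage from the normal CDF.
coverage = norm_cdf(hi, mean, sd) - norm_cdf(lo, mean, sd)
print(f"mean = {mean}, sd = {sd:.2f}, coverage = {coverage:.3f}")
```

So the interval statement and a point estimate with “standard error bands” really are the same information in different clothes, which is the Shandler/Carty point above.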

Comments

I was being totally sincere when I titled my article from two weeks ago. Many, many debates really boil down to nuanced differences between the way two “sides” use and interpret the use of a few key terms.

Of course, language, as wonderful and useful as it is, is woefully inadequate to capture the infinite complexity of human thought. And the chasm between the essence of my thought and my ability to communicate it is the source of many a disagreement. When I say “likely,” I perceive it to mean something very specific within the context of how I am using it, but there are infinite possible interpretations of that term, and each reader will filter my language through his or her own prism to derive a unique but (usually, though not always) similar comprehension.

It is incumbent upon a writer to be aware of this, to do his/her best to minimize it, to expect it, and to be able to distinguish criticisms that address his/her fundamental premises from those that derive from a communication gap.

Some of my friends will joke with me about my being “an expert” because I have this column. I often reply that I am one of many people who are highly adept at playing fantasy baseball, but that is only a prerequisite to being qualified to discuss it in a public forum. It is not the fact that I play fantasy baseball better than most that qualifies me; it is that I can write about playing fantasy baseball better than most. (Or at least that’s what the THT editorial staff must have felt.)

This distinction is important to the reader-versus-expert dynamic. Many readers judge the experts solely on the experts’ ability to play fantasy baseball, asking “why him and not me?” Many readers are ignorant of the fact that being good at the game is only one component of being good at being an expert.

To bring it back to the quants-and-geniuses discussion, a semantic issue that deserves some serious examination is what defines something as “a model.” A big part of the debate is that Liss thinks he’s doing something very different from what Bill’s model would try to do, while Bill argues that Chris is fundamentally attempting to do the same thing, but informally and sloppily. Well, is Chris using a model? I dunno; it depends on whether you define a model by its intent or by its means.

Apologies to Bill Phipps: the name “quants” is going to stick even though you claim (correctly so) that the term doesn’t accurately reflect the actual composition of your group.

I agree 100%. Econometrics is how I found myself falling down the rabbit hole known as sabermetrics. I wanted to do my final paper in the class on baseball performance. My metrics paper was a fairly flawed construct that looked only at free agents over a three-year period using stats like OPS, but that led to my Labor Economics capstone, which used FanGraphs WAR from ‘02-‘08 and all position players. Since then I’ve fallen further and further down the hole…

I hear ya. I know a number of people I would consider brilliant who have absolutely no interest in baseball. I’ve tried to tell them that they would absolutely fall in love with the game if they approached studying it as trying to understand and make sense of a complex system. I tell them that their ignorance of many of the game’s basics might actually work in their favor. Most of the game’s fans first have to unlearn the false knowledge they’ve been indoctrinated with by charlatans of the industry who are painted as gurus before they can begin to understand how things actually work. He who knows nothing is already far ahead of he who knows who Mike Francesa is but not Tom Tango.

I’ve said this before, but I think fantasy baseball’s destiny is to be the vehicle that shoots sabermetrics into the mainstream and finally exposes the Tim McCarvers and Joe Morgans of the world as the flat-earthers they are. Just watching and appreciating the game lends itself to the romance of the “clutch performance.” It privileges moments, especially anomalous ones, independent of historical context. But in fantasy baseball an owner is forced to be much more agnostic and exacting. Success lies in predicting, and thus in understanding how outcomes are achieved in order to gain insight into whether they are repeatable. And I hope that, through doing this, people begin to separate signal from noise and realize that what they are being told by the mainstream baseball press is really a set of anecdotes that fit a cultural mythology, belie reality, and disappear once any scrutiny is applied.

(Then, I hope those same people flip over to CNN and MSNBC and extrapolate the same lessons…)

It would be wonderful if the quants could get together and agree upon a standard notation for expressing confidence intervals around projections in text. Even something as simple as what is done with polling data (X +/- Y) would be a big step up.

Personally, I see “likely” as 1 SD, “very likely” as 2 SD, and “extremely likely” as 3 SD.

Of course, I think part of the reason this isn’t done is that we have a standard sample size and a standard source of variability, so we can develop a reasonable intuition about error bands. Give me an AVG or OBP and I’ll put 20 points on either side. Give me HR or RBI and I’ll put 20 percent on either side. I think most people do something like this subconsciously, if not consciously.
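Under a normal assumption, those SD labels map to concrete probabilities. A small sketch (my gloss on the commenter’s convention, not his own numbers):

```python
import math

def within_k_sd(k):
    """P(|X - mu| <= k * sigma) for a normal X, via the error function."""
    return math.erf(k / math.sqrt(2))

# The commenter's labels and the coverage each would imply:
for label, k in [("likely", 1), ("very likely", 2), ("extremely likely", 3)]:
    print(f"{label:>16}: +/- {k} SD covers {within_k_sd(k):.1%}")
```

That is the familiar 68/95/99.7 rule: “likely” would mean roughly a 68 percent chance, “very likely” about 95 percent, and “extremely likely” about 99.7 percent.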

I don’t know, RMR; a lot of people I talk to are willing to dismiss a projection system because one player got hurt and hit 15 home runs instead of 25. Or a healthy Juan Pierre hit .350 over a month when he was predicted to hit more like .280, etc., etc. Most of the people I interact with are either completely ignorant of statistics and econometrics or very nearly so.

And Derek, as long as a standard 5×5 uses R, HR, RBI, SB, AVG, W, SV, K, ERA, and WHIP as its stats, people aren’t going to develop a truly accurate sense of player performance. Hell, my primary fantasy league is called “advanced” because we had an auction draft and use OPS instead of AVG.

For me, the first-order improvements in performance come from understanding the basics, such as the fact that regression to the mean does not imply substandard future performance, just sub-current performance.
One can learn the basics just talking about stuff; no math necessary. I just find that many don’t see this stuff talked about unless they take a stats or metrics course. If at the end of such a course you walk out unable to compute a standard deviation but still remembering the basic principles, I think you’ve done yourself a great service anyway.
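The regression-to-the-mean point is easy to demonstrate with a toy simulation (the talent and noise levels below are invented, purely for illustration): players who top the leaderboard are usually both good and lucky, so next season they fall back toward their true talent, which is still well above the league average.

```python
import random

random.seed(7)

# 1,000 players: true batting-average "talent" plus independent season noise.
talent = [random.gauss(0.270, 0.015) for _ in range(1000)]
year1 = [t + random.gauss(0, 0.020) for t in talent]
year2 = [t + random.gauss(0, 0.020) for t in talent]

# Take the 50 best performers from year one and see how they do in year two.
top = sorted(range(1000), key=lambda i: year1[i], reverse=True)[:50]
avg_y1 = sum(year1[i] for i in top) / 50
avg_y2 = sum(year2[i] for i in top) / 50
league = sum(year2) / 1000

print(f"top 50, year one:  {avg_y1:.3f}")
print(f"same 50, year two: {avg_y2:.3f}")   # lower than year one...
print(f"league, year two:  {league:.3f}")   # ...but still above average
```

The breakout group regresses toward its mean, not toward mediocrity: sub-current performance, not substandard performance.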

All I know for sure is that I know less about baseball by learning more about metrics. Sure, xFIP, BABIP, HR/FB, Z-Swing% and the like can help one’s understanding of what can be expected and what is exceptional. But as the Sabermetric movement gathers more steam it becomes more full of hot air.

I am a relative newb, having only come across FanGraphs last summer, and I fell in love with baseball by virtue of the numbers, not the game. But along with me have come many, many more who have begun to adopt the SABR POV. This has muddied the landscape. I recognize that I am part of this tide of unwashed masses, but I also see what it has done. At least, it seems this way to me. Perhaps those who have been around longer could confirm this?

If not today, then soon, the market inefficiencies as well as the hidden gems will be exposed by faith without numbers. It has already been said by many that superior scouting is the new frontier now, not the old one.

Thanks for chiming in, Judas. I think you’re right that numbers can be misused (and are by some), although I do think they are absolutely a necessity when used correctly. That being said, scouting is also a very important thing (check out my bio line). Scouting is very different, though, from “faith without numbers.” Scouting is a very difficult thing to do and is very different from simply watching a player on TV and forming a vague impression of him (not to say that’s what you mean, just that “scouting” is a term that often gets tossed around to mean any non-quant analysis, when true scouting is its own thing entirely). Just thought that needed to be clarified.

It doesn’t really matter whether you continue to use triple crown stats in your leagues; that doesn’t preclude understanding the importance of peripherals and gaining insight into the basics of the discipline of statistics.

If I’m betting on a player’s AVG, I want to know his BABIP, his trajectory splits, etc. I want to know whether his past, or current, output is sustainable. Sabermetrics is not a specific tool kit of numbers and stats, it’s a way of approaching the process of deriving understanding from what you are seeing on the field and in the box scores.

Judas,

I don’t think you are describing anything particularly out of the ordinary. The sabermetric movement will attract random bandwagoneers who don’t know how to use data responsibly and do not contribute to moving the discussion forward. That’s fine; it’s a sign of “making it,” in a sense, even. The fact that the nature of the field imposes a higher intellectual barrier than just liking baseball doesn’t mean it is impervious to bunk.

What will always differentiate the influential theorists from the simple number crunchers is their grounding in applicability, understanding of the context (the game), and the associated nuance. When making this point, I always refer to the section in “The Book” where Tom discusses the benefits of bunting (or not). Tom marries number crunching with context and game theory in this section and notes that part of the advantage of not bunting in bunt situations is that a defense protecting against the bunt concedes an advantage to swinging away (it is easier to get the ball through a drawn-in infield, etc.). Therefore, even though the tables might dictate that you should almost never bunt, if you actually stopped bunting entirely, you’d begin to lose the advantage of swinging away in bunt situations, because game theory dictates that your opponent would simply cease to protect against the bunt. (You cause your opponent to change his strategy in a way that actually improves his play; bad move, says game theory.) The advantage of not bunting is heightened when the “threat” of the bunt is credible.

This is exactly the type of insight the spreadsheet alone cannot derive. This is what good sabermetrics is all about.

I suppose one way of guarding against the misuse of statistics would be to draw a line in the sand, so to speak, marking where and when we have enough data to make a solid assessment of what is going on. Where does small sample size (SSS) end and a useful amount of data begin? I know it varies from stat to stat.

I enjoyed your article Jonathan. THTfantasy is all over this discussion and that’s good for all of us involved in it and all of us playing the game, but I keep feeling that something is being missed. Chris’s polemic about his instinctive approach to drafting has turned what should be a discussion about what information is useful and what information is not (or less so) in winning fantasy baseball leagues, into an absurd caricature of quants versus the instinctives, in which the quants claim that anything that isn’t spreadsheetable is lazy and sloppy. It isn’t nearly so easy.

My goal, frustrated so far, has been to point out to Bill and Robert that, for example, they don’t really have the data to create meaningful confidence intervals. Those numbers they have are made up by them to reflect their evaluation of the risk. Which may be better or worse than someone else’s evaluation of the risk. That may be the best approach to analyzing the problem, heck, as Chris Liss has been saying for years (long before Blink), our minds are much more adept at processing various streams of information and making meaningful output than we give them credit for, but the analysis is still seriously subjective. To not cop to that seems to me grievously effete and dishonest. In other words, it burns my butt.

As I seem to have to do nowadays, I want to reiterate that I generate, process, use and analyze a lot of data in my work. It is essential, of course, since we’re playing a game of numbers, and because baseball is the one major game that can be measured state by state, which gives us rich veins of data for analysis. I agree with the econometric point here, that the data we have tells us stuff, even if it is fuzzy.

But somebody has to say it: Just because it can be measured, just because you can make a ratio or a calculation or devise some special formula doesn’t mean you’re getting closer to the solution to a problem.

I have yet to see Bill and Robert reveal any understanding of the game that wasn’t developed in public by John Benson, Ron Shandler, Alex Patton, Les Leopold, or John Mosey more than 15 years ago. Or Art McGee whenever he published (I don’t know when that was).

Is it churlish to expect more from those whose ambition is to beat the game with their numbers?