What is Sabermetrics supposed to be?

With the recent thread about Moneyball and Billy Beane being overrated, I got to thinking: what exactly is sabermetrics? Now, I think most of us have some intuitive, gut-feeling definition. Statistical analysis is often seen as equivalent to sabermetrics. When sabermetrics is mentioned, Bill James often comes to mind. But one person that I always think of is Craig Wright. He was the first person ever hired by a major league team (Texas Rangers, 1981) with the title of sabermetrician. Below is his first business card. Recently, I've been reading his website, The Diamond Appraised. In an article he talks about his experience working for major league teams. He said something that struck me as very interesting.

The first business card in baseball using the title “sabermetrician” caused quite a stir in those days. I stopped using that title around 1990 because the meaning had shifted too far from a scientific approach to baseball to one focused on statistical analysis of baseball.

The bolded part really caught my eye. He mentions this in several other articles on his website.

The first time I ever heard of Bill was when I read his 1980 Abstract. By then I had already been developing and using my scientific approach to the game for over a dozen years, and had already been pitching the idea to the ML clubs. Indeed, I had already begun my correspondence with Eddie Robinson that eventually led to my being hired by the Rangers after the 1981 strike ended.

I've no problem with opinions that Bill and I have a similar approach to the game. I share that opinion and said in The Diamond Appraised that I felt his work was far closer to my science and synthesis approach than the work of others who focused more on statistical analysis. I made that same point in my foreword to his 1985 Abstract. And I've no problem with people who would identify Bill and I as friends and/or colleagues. That is a demonstrable fact in every nuance.

It got me thinking: what is this science and synthesis approach that Craig talks about? How is it fundamentally different than the statistical analysis that seems to be the theme of sabermetrics today? Craig also mentions that in the foreword to the 1985 Bill James Abstract he gives his definition of sabermetrics. It is posted below. It seems Craig had a very different vision of what sabermetrics is supposed to be than what it has become today. Is that a good thing or a bad thing? Can sabermetrics return to what Craig Wright believes it should be?

what is this science and synthesis approach that Craig talks about. How is it fundamentally different than statistical analysis that seems to be the theme of sabermetrics today?

My guess is that it means stats are only one of the many tools needed in the science of studying baseball. For instance, if you were an astronomer, you wouldn't just rely on mathematical formulae, although that would certainly be an important element of that science. However, there are many other factors that go into it, including human observation and logical deduction.

If all the sciences relied only on infallible statistics, then we'd have no scientific theories like evolution -- only absolute laws like gravity.

And therein lies my biggest problem with some (not all) sabers: They act like their theories are absolute, 100 percent irrefutable laws, and that anyone who disagrees with them is a backward dinosaur.

To understand the relationship between science and sabrmetrics, of course we'd need to know what "science" is as well as what "sabrmetrics" is. Unfortunately, the former term is even more difficult than the latter. I myself try to the best of my ability, for example, to follow the rules of logic, stay within the bounds of common sense, yet question what seems obvious (I've actually defended Arthur Soden). Yet I never think of myself as really engaged in sabrmetrics and certainly not in science. What Wright describes as "science" is just clear, reasonably rigorous thinking.

Analytical skill, sharp observational ability, breadth of vision, depth of knowledge, talent as a writer are the qualities that have made James particularly successful, and I think they are what Wright is pointing to. But none of those are either peculiarly "scientific" or "sabrmetric" qualities. It seems to me that James has been successful, or at the very least he's been readable for people like me, not because he's a scientist, but precisely because he's always been more than a sabrmetrician.

I have some confidence (but only some) in my ability to define or at least say something meaningful about sabrmetrics. I do believe it consists essentially in statistical analysis, although certainly not all statistical analysis is sabrmetric. The fact that somebody like James can combine statistical analysis with qualitative observation -- well, good, combining qualitative and quantitative analysis is an excellent way to give research depth and vitality. But it doesn't help to muddle up the categories of qualitative and quantitative as Wright does here and then pile it all up under the rubric of "science," a label I am pretty sure Wright has chosen partly because it gets at the qualities of mind he aspires to but primarily because "scientist" is a term that has a cachet he wouldn't get out of calling himself, say, a "baseball humanist."

I've belabored the point and Wright himself more than a simple little introductory essay deserves, but I think we can speak more meaningfully and precisely about sabrmetrics. A more useful analogy than science, I think, would be the great ideological theories such as Marxism, psychotherapy, classical economics or Noam Chomsky's transformational grammar in linguistics, and there are others that could be added. These all consist of a battery of elaborate, often difficult, sometimes dauntingly arcane analytical techniques, but more importantly they are all founded on the basis of a reductive theoretical analysis that makes radically simplifying assumptions about the world or about what is knowable in it.

For Marxists, people are motivated by economic class and relations to the means of production, for psychotherapists they're motivated by sex. Classical economists assume we can understand human behavior if we assume that everyone acts as a rational self-interested agent. Transformational grammar holds that nothing meaningful can be said about any aspect of language except grammar; behaviorist psychology held that you can't say anything useful about what people think or feel, only what they do. In its humbler way, I submit that sabrmetrics is likewise founded on a fundamental simplifying assumption: "all value in baseball can be measured statistically."

I've gotten in way, way over my head in the previous two paragraphs, but I don't think I've mischaracterized any of these theories at a very broad level. The power of sabrmetrics, like all the rest, lies first of all not in its complicated and difficult technical superstructure but in its fundamental simplicity, which clears the world of its dense complexity long enough for us to get a glimpse of the forest instead of seeing only trees.

This is the great power of a theory like this, when it's done right, but of course, there is more in heaven and earth than is dreamt of in anyone's philosophy. The universe really is a very complicated place, and assuming otherwise will distort your thinking if you're not careful and you don't recognize that your theory does not provide the perfect answer to every question. Inevitably, though, a lot of less penetrating thinkers will be attracted to such a theory precisely because it does seem to contain a simple key that unlocks absolutely everything. And inevitably there will come a time when much of the potential of the bedrock assumption has been squeezed out, and more and more observers start to look at the assumption as less a guiding light than a straitjacket.

There are literary theorists who don't like reading literature and linguists who don't have much language sense. It wouldn't surprise me to see baseball statisticians who don't really like baseball, but I never actually encounter any. For all the complaints by those hostile to the discipline, I never see sabrmetricians who aren't also very great baseball fans in every sense of the word. I do think it's true that in the early days, twenty years ago, there were a lot of insufferable sabrmetric true believers of the sort that every theory of this sort will attract, people who were delighted to have found the One True Church and eager to heap scorn on those less enlightened -- and it didn't help that it was easier for them to copy James' sarcastic way with those who didn't see what he saw than to emulate his legitimately enviable qualities. There are still people like that around, naturally, but far, far fewer than in the early days of the sabrmetric revolution, and I think all the fulminating about "the sabrs" and their alleged dogmatism, arrogance and narrowness of vision is badly out of place at this late date.

Oh, I don't know; I can provide you links to several forums where people who use the RBI or pitcher win statistic are still sneered at by saber-types. And, certainly, as we've seen in the Moneyball thread, some sabers are willing to engage in the most absurd pretzel logic in order to cling to their pet theories, which to me is the antithesis of what sabermetrics aims to be.

That said, I found your response quite interesting and thought-provoking. But if sabermetrics is indeed defined by your characterization -- that "all value in baseball can be measured statistically" -- then it still begs the question: How did Wright define sabermetrics? Because he says his work didn't just use statistics, but used a broader method of looking at things.

Bill James coined the word, and he (originally?) called sabermetrics the search for objective knowledge about baseball.

I think that over the years the math guys have become so prevalent that sabermetrics has morphed into something a bit different. I like the examples of how different fields of study each relate their specialty to a world view. I think the old saying "if the only tool you have is a hammer everything looks like a nail" applies.

One somewhat esoteric example from my own experience is the field of high fidelity audio, which can be summarized as Engineers vs. Golden Ears. From an engineering perspective measurement is critical, and the prevailing attitude is "if it cannot be measured then it isn't important (or doesn't exist)". From the Golden Ear perspective it is listening, specifically to music and voice, that is crucial. It isn't unusual for equipment that measures well (sometimes spectacularly so) to leave them unsatisfied with the reproduced sound. Both camps tend to regard the other with suspicion, with engineers claiming that the Golden Ears are hearing their own expectations, and the Golden Ears telling the Engineers to actually listen to real music once in a while. There usually is a real failure to communicate between the two viewpoints, since Engineers talk in terms of distortion measurements, noisefloor, frequency response (all measurements), while the Golden Ears talk about things such as warmth or coolness or harshness of tone and the ability to locate specific instruments across the soundstage. A Golden Ear isn't usually able to describe what he is hearing in terms the Engineer can understand.

I think walks are overrated unless you can run. If you get a walk and put the pitcher in a stretch, that helps, but the guy who walks and can't run, most of the time he's clogging up the bases for somebody who can run. -- Dusty Baker.

Bill James coined the word, and he (originally?) called sabermetrics the search for objective knowledge about baseball.

I would consider research that delves into pitching workload for young pitchers, or the Mike Marshall work on developing a new pitching motion, to fall under the general definition of sabermetrics. I think SABR Matt once mentioned he had thought about or was working on research on how weather affects a batted ball or something like that. That fits under sabermetrics IMO as well.

An interesting example. Why can't a person be both an Engineer and a Golden Ears person? Why must one be either a traditional baseball man or a saber guy? Why can't one be both? In real life a lot of people are both. One of my favorite baseball writers is Keith Law. He's the former assistant GM of the Toronto Blue Jays. He does mostly scouting work these days. He has a solid grounding in sabermetrics. But over the past few years he's also delved into traditional scouting quite a bit. Over the past several years I've gotten into scouting stuff myself. I talk to several scouts and they have taught me how to scout players, how to look at tools. It's actually very interesting. It's increased my appreciation for baseball.

Sabermetrics is not science...I think it's important to remember that although a lot of sabermetric research is scientific, the group of research done in SABR is not all under the scientific heading. Some people find it just as interesting to study the game from a historical perspective using the data...this kind of thing is rarely strictly scientific. PCA, for example, is not science. It's just my best current attempt to put the game's metrics into a historical context so that players may be compared.

HOWEVER...to get employed in baseball today, a sabermetrician needs to be doing work that is actually scientific in approach. The Yankees aren't interested in uberstats or simple correlational analysis (FIP, e.g.). That stuff doesn't help them make money. They DO want to know, though, if you have proof that pitchers with a release point at X position relative to their CG are more or less likely to suffer arm injuries (I've seen a few initial attempts to use pitch F/X (but not the landing spot...the starting spot) to relate mechanics to injury). That sort of thing is what they want now. My thoughts about studying weather impacts...that is what they want. It might some day help managers leverage their pitchers better (gee...it looks like starting pitchers whose primary weapon is a curveball don't do well in warm, dry climates...if you have one, use him when it's cool or humid...that sort of thing). I mean, all the teams now use uberstats to get a quick glance at players they're considering acquiring, but that's just to build the list of potential targets...after that, they need predictive tools...will he continue to produce at that rate for me...given my home park, my team defense (or surrounding line-up mates), and his health history and age? Does his type of player age well? What can we learn about players with his mindset? (This is where scouting can sometimes help, BTW.) And they need deeper analysis...especially when it comes to pitcher injuries/fatigue and fielding skill.

Put simply...sabermetrics is not Saberology because it didn't start out as a purely scientific pursuit. It's a statistical field...purely statistical analysis is generally not science. But great science can be, and sometimes has been, done in this field.

I suspect he was talking about the difference between application and intent. Many sabes inquire through the data, searching for useful patterns, and upon finding an interesting apparent correlation, produce a way to use the correlation and call it progress. There's nothing inherently bad about that, but it's not science. Those correlation-driven studies only advance the field when someone applying the scientific method (hypothesize, experiment, infer and start again) asks the all-important question - WHY? Why do pitchers apparently not control BABIP? Why do pitchers have little control over the HR/OF Fly ratio? Why do certain weather patterns lead to big changes in run scoring rate? When you interrogate the data like a scientist...looking for reasons behind your observations, then the learning begins. Science (asking the right questions and targeting your experiments toward finding an answer) and synthesis (applying your newfound knowledge to your broader understanding of the game) is not what most sabermetricians do. I wish I had time to do more of that...that's what I do for a living after all (grad student in meteorology). Most sabes stop at the observation stage..."I see this pattern...let's use that and make some sweeping suggestions as to why it may be true without deeper examination"

It seems to me that sabermetrics is less like physics or statics or strength of materials (all "hard" sciences) and is more similar to a subject like economics. Mathematical modeling and statistics are applied to economics, but there is a significant human factor that makes mathematical results far less certain in economics than those derived in a "hard" science.

Very true, HWR...there is still, however, a gap between simply running statistical correlation studies and other forms of basic tests to interrogate your data and applying a scientific method of sorts to the construction of your studies that requires more from your conclusion than "this seems to work, so let's use it." A good social scientist never assumes his correlation makes sense just because it makes sense to HIM/HER.

Still, we do have to accept that baseball will never be 100% solvable the way that kinematics is.

Grant over at McCovey Chronicles is one of my favorite baseball writers. Here is his humorous take on Matt Cain and xFIP.

An Open Letter to the Viceroy of Stats
by Grant on Jan 14, 2011

Dear Viceroy of Stats,

First off, thank you for the stats. If I were to do a line graph comparing my love for baseball and the rise of the internet, the two lines would start rising dramatically around 1996 without a single dip. The stats are a big part of that. One of my favorite things in the world is feeling superior to other people. Now when someone references RBI, I know I'm objectively better than them in every capacity. You can't buy that feeling, and I have stats to thank. Plus, when people argue about "sabermetrics" vs. "sabremetrics", it reminds me of the Northern Conservative Baptist Great Lakes Region Council of 1912 joke, and that's always a good thing.

But I also remember those early days of the internet stats. No-hit, all-glove wizards were not tolerated. Teams and GMs who signed players like Royce Clayton, Rey Sanchez, and Mike Bordick were mocked without mercy. The new stats, though, tell us that some of those guys had pretty valuable seasons. Jose Vizcaino, for whom I had a strong distaste in 1997, was actually a 2+ win player that year. Well, I'll be. This isn't to suggest that because the methods of evaluation have changed, people should discount every innovation because it's likely to be considered wrong in a decade. Of course not.

It might not be a bad idea, though, always to assume that stats are likely to contain some measure of imperfection. When I see single-season WAR totals used with a dogmatic certainty, it makes me uneasy. I have a feeling that the formula for WAR will be updated and tinkered for years, if not decades, because it's surely tricky to combine hitting stats with something as variable as single-season fielding stats to produce a single number. Yet there's a small faction among us who likes to use single-season WAR as a blunt object. It feels like some folks -- certainly not most or all -- use the stat without the spirit of intellectual curiosity with which it was created.

So I've searched for the most diplomatic way to phrase this, and I think I've arrived at something that's fair, honest, and non-combative. Here goes: Matt Cain is good, and people who use xFIP as a blunt object can shut their yap holes. The idea of normalizing ERA to account for luck with balls put in play is a fine one. Trying to normalize home runs per fly ball is a good idea too. Assuming that the current construct will work as an infallible predictive tool for every single pitcher in professional baseball right now? Not my favorite idea.

Matt Cain has outperformed his FIP for four straight seasons. He has probably benefited from some measure of luck, especially in 2009, when he beat the mark by a full run. The traditional stat, ERA, indicates that Matt Cain is an elite pitcher. FIP suggests that Cain is merely very good. That's a fair debate. Pitchers can do that sort of thing for an entire career, but they're the exceptions, not the rules. The burden of proof would probably be on the person suggesting that Cain is elite.

However, xFIP suggests that Matt Cain is an innings-eater of the most ordinary capacity, like a Jon Garland or a Joe Blanton. Matt Cain's career xFIP is 4.43, and aaaaaaany day now, his ERA will regress to meet that mark. Some people pounce on that, and they froth at the mention of Matt Cain as a top pitcher. And I'm forced to react like a troglodyte, mentioning that a) I'VE TOTALLY WATCHED, LIKE, EVERY ONE OF HIS STARTS, AND MY EYES ARE MORE BETTER THAN YOUR STATS, and b) but his ERA! I don't like both of those arguments. I can link to a study by the wizard who actually invented FIP, which acknowledges that there could be outliers like Cain when calculating xFIP, but because the math hurts my brain, I can't do anything but appeal to his authority.

It feels like with some folks, you get "Matt Cain's xFIP is this. His ERA is that. The difference means there is something wrong with Matt Cain." I would like more, "Matt Cain's xFIP is this. His ERA is that. Maybe there's something that makes this happen every year." That's all. I would just like the small, vocal minority to use stats like WAR, FIP, and xFIP as useful tools, not divinely inspired scripture just yet. Please command them to do so with your powers as Viceroy of Stats.

I would like to end this open letter by noting that Matt Cain did not allow an earned run this postseason, and contrary to popular belief, that performance has tremendous predictive value. I predict that in 20 years, Matt Cain's performance in the 2010 playoffs will still have been totally awesome.
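
For anyone following along without the formulas in front of them, the two estimators the letter argues about can be sketched roughly as below. This is only a minimal sketch using the commonly published FIP weights (13, 3, 2). The names `FIP_CONSTANT` and `LEAGUE_HR_PER_FB` are assumed round values that change every season, and the stat line is made up for illustration; it is not Matt Cain's actual record.

```python
# Assumed, illustrative league values; the real ones are recomputed each season.
FIP_CONSTANT = 3.10        # shifts FIP onto the ERA scale (assumption)
LEAGUE_HR_PER_FB = 0.105   # league-average home runs per fly ball (assumption)


def fip(hr, bb, hbp, k, ip, constant=FIP_CONSTANT):
    """Fielding Independent Pitching from home runs, walks, hit batters and strikeouts."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, bb, hbp, k, ip,
         constant=FIP_CONSTANT, lg_hr_per_fb=LEAGUE_HR_PER_FB):
    """Like FIP, but replaces actual home runs with fly balls times the league HR/FB rate."""
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical fly-ball pitcher who allows fewer home runs than his fly balls "should" yield.
line = dict(bb=60, hbp=5, k=180, ip=220.0)
pitcher_fip = fip(hr=18, **line)
pitcher_xfip = xfip(fly_balls=280, **line)

print(f"FIP  = {pitcher_fip:.2f}")
print(f"xFIP = {pitcher_xfip:.2f}")
```

A pitcher who keeps allowing fewer home runs than his fly-ball total predicts will show a gap between the two numbers, which is the pattern the letter asks people to be curious about rather than dismiss.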

All stats are just tools, much like my high school crush's ex-boyfriends. And like those ex-boyfriends, only a few of them have any value whatsoever. It's not like there's one UBER-stat, but I find myself using only 5-7 on a regular basis.

"Read at your own risk. Baseball Fever shall not be responsible if you become clinically insane trying to make sense of this post. People under 18 must read in the presence of a parent, guardian, licensed professional, or Dr. Phil."

Only tools? Absolutely. But, only a few having value? That’s where I beg to differ most vociferously.

Every single metric ever devised has value. However, its value fluctuates with its audience. And, not only can it fluctuate with its audience, it can fluctuate in value to an individual. For instance, when I’m thinking about hitting, I couldn’t care less what individual pitchers’ WHIPs are, so WHIP has very little value at that time.

I’ve found that everyone does what you do, i.e. only regularly use a very small number of different metrics. But, in most cases, those limited metrics are seldom the same from person to person. However, no matter if a person likes just one or all metrics, sooner or later there’s a question that comes up that no metric known can provide the answer to. That’s when things get dicey.

The pitcher who's afraid to throw strikes will soon be standing in the shower with the hitter who's afraid to swing.

James writes that his goal is to seek the truth. He uses numbers a lot because (1) they collect a lot of useful information, and (2) though they have biases inherent in them, many of those biases can be found and adjusted for. Ideally, that is what sabermetrics is about.

My own experience with Japanese baseball may be instructive. I still can't read much Japanese, but I was able to figure out their baseball encyclopedia. I'd say 80-90% of what I know about Japanese ball, I learned from that encyclopedia. James has written that baseball statistics are a language all their own, and have the ability to tell stories. I can attest to the truth of that, having taken a lot of the stories I found in the Japanese baseball encyclopedia and turned them into words.

Seen on a bumper sticker: If only closed minds came with closed mouths. Some minds are like concrete--thoroughly mixed up and permanently set.
A. Lincoln: I don't think much of a man who is not wiser today than he was yesterday.

It sure is convenient when you can dismiss a whole class of people because you don't like the way they view the world and have therefore ascribed straw man attributes to them with a broad brush.

What, slipping in a dry sarcastic comment is disallowed now? It was my way of saying I'm not sure if anyone really knows what sabremetrics are, especially after reading that first post. Apparently there is a school of thought which believes sabremetrics is science, and another which thinks it has to do with statistical analysis. To me, it has to do with numbers and formulas determining which players are most effective in certain situations. Kind of like physics.

Sarcasm does not translate well online, and that comment sounded snarky and mean-spirited (allowed...but I'm also allowed to protest), rather than contributing something to the discussion...if you'd just said what you said in this last post first, we wouldn't be having this discussion.

BTW, your comparison to physics doesn't track for me. At least ideally speaking, physics is a pure scientific discipline...why would you compare physics to something just about numbers and formulas? That's not how I at least view physics.

Sabermetrics is a thought process.

Rather than advocating a belief, say, Barry Bonds is the greatest hitter of all time, facts are gathered and conclusions are derived. Adjustments are used in an attempt to normalize and center facts in ways that have probative value.

To wit: A fact that may be inflated in a certain environment is still strong relative to the environment it is in.
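
One way to picture that kind of adjustment is a simple index against the league environment. This is only a sketch: the function name `relative_index` and all the numbers are made up for illustration, and real adjusted stats also correct for park and other factors.

```python
def relative_index(player_rate, league_rate):
    """Express a rate as a percentage of the league average (100 = league average)."""
    if league_rate <= 0:
        raise ValueError("league_rate must be positive")
    return 100.0 * player_rate / league_rate


# Hypothetical hitters with the same raw rate in two different run environments.
high_offense_era = relative_index(player_rate=0.900, league_rate=0.780)
low_offense_era = relative_index(player_rate=0.900, league_rate=0.680)

print(round(high_offense_era))  # same raw number, smaller edge over the league
print(round(low_offense_era))   # same raw number, larger edge over the league
```

The raw number is identical in both cases; only its standing relative to the environment changes, and both hitters still come out well above average.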