Another thread about stats

So all this talk about stats and which ones are the 'best' and such has kind of got me thinking on my own.

I'm not the kind of guy who takes a lot of things just based on what I've been told; I tend to like to do research on my own and test things out myself.

I never really knew anything at all about wOBA before today, so thanks to eb45 for all of the work he did to educate us.

I decided to take four different 'rate' stats from all MLB teams in 2010 (I looked at BA, OBP, OPS, and wOBA). I wanted to look for myself and see how these four correlate with run production.

Not that complicated, but I graphed each team's run production in 2010 as a function of their BA, OBP, OPS, and wOBA, and I calculated r^2 for a linear regression each time. That just tells you how well a line 'fits' your data: the closer r^2 is to 1, the better the fit, ergo the MORE likely it is that the given stat gives you meaningful predictive power for run production.
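
For anyone who wants to reproduce this kind of fit outside a spreadsheet, here is a minimal Python sketch of the r^2 calculation for a one-variable linear regression. The function name `r_squared` and the sample numbers are made up for illustration; they are placeholders, not the 2010 team values.

```python
def r_squared(xs, ys):
    """r^2 of the least-squares line through (xs, ys), one predictor only."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # With a single predictor, r^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


# Placeholder inputs, NOT real team data: a rate stat and runs scored.
example_stat = [0.700, 0.720, 0.740, 0.760, 0.780]
example_runs = [640, 690, 700, 760, 790]
print(round(r_squared(example_stat, example_runs), 3))
```

Running the same function once per stat (BA, OBP, OPS, wOBA) against the same runs column gives the four r^2 values to compare.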

*This one I should probably ask for some further help on. I tried to calculate wOBA using FanGraphs' formula, but the values I got were slightly off for some teams compared to what FanGraphs had on their own site. The only thing I can think of is that maybe I used the wrong values for guys reaching base on an error. Where should I look for that? I used bb-reference, the "ROE" stat.
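
One way to track down a mismatch like this is to keep the weights separate from the calculation so they can be swapped. Below is a rough Python sketch of a linear-weights wOBA with the weights passed in as a parameter. The default weights are the commonly quoted ones from the original published version of the stat, which counts reaching on an error; treat them as an assumption, since the weights and denominator FanGraphs actually uses may differ. The names `ASSUMED_WEIGHTS` and `woba` and the event keys are hypothetical.

```python
# Assumed weights: the commonly quoted original linear weights, which include
# reaching on an error (RBOE). Swap in whatever weights your source publishes.
ASSUMED_WEIGHTS = {
    "NIBB": 0.72,  # non-intentional walks
    "HBP": 0.75,   # hit by pitch
    "1B": 0.90,
    "RBOE": 0.92,  # reached base on error
    "2B": 1.24,
    "3B": 1.56,
    "HR": 1.95,
}


def woba(counts, plate_appearances, weights=ASSUMED_WEIGHTS):
    """Weighted sum of offensive events divided by plate appearances."""
    total = sum(weight * counts.get(event, 0) for event, weight in weights.items())
    return total / plate_appearances
```

Calling `woba(team_counts, team_pa)` with and without the "RBOE" entry in `team_counts` shows how much of the gap the error term alone can explain.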

CONCLUSIONS

The findings pretty much confirm the suspicion I and most others had that batting average really is a pretty poor judge of a team's offense and their ability to score runs. The findings confirm that OBP is a better choice, but they SEEM to show that OPS and wOBA may be the best available.

I was surprised that OPS came out looking so good.

Anyway, I'm wondering, do these results match what is accepted among most in the saber crowd? Is one season too small of a sample size? I was thinking about looking at another season or so just to make sure.

But I thought there'd be enough saber guys on here that might find these results kind of interesting.

NERDS.

OPS comes out so well because it captures the largest components of run production.

But, the reason that wOBA looks relatively "worse" in terms of R^2 is because you are not comparing apples to apples. wOBA adjusts for park effects to "normalize" performance, whereas runs scored is just a raw unadjusted stat. Because your OPS figure is not adjusted for park effects, it looks relatively "better" in terms of predicting runs scored.

Also would be interesting to understand which teams fall above/below the respective regression lines. TAM, I suspect, will show a strong above the line variance because of their baserunning effects and a much higher OPS with runners on base than their full year avg.

I think reports of the death of OPS are greatly exaggerated. Can someone answer my earlier concerns about wOBA? I want to make sure I was calculating it right.

I don't know how to calculate wOBA, which is part of the problem with that stat: it's just too complicated for a quick calculation. Your regression analysis is pretty interesting, because even if you are doing wOBA wrong, it shows that team OPS at least comes close to being just as good as a predictor of runs scored. Considering how easy it is to calculate, it will remain a very useful, accessible stat.

Originally Posted by Frobby

I don't know how to calculate wOBA, which is part of the problem with that stat: it's just too complicated for a quick calculation. Your regression analysis is pretty interesting, because even if you are doing wOBA wrong, it shows that team OPS at least comes close to being just as good as a predictor of runs scored. Considering how easy it is to calculate, it will remain a very useful, accessible stat.

Here's a spreadsheet for wOBA calculation. I'm not sure if I've used this one, but there are a few floating around.

I had my friend take a look at this thread. He has a master's in statistics and a basic understanding of baseball, in that he is American and read Moneyball. But he didn't know much about OPS or wOBA. He made some very good points about this debate.

1. Frobby is right. The simplicity of OPS makes it useful because it is parsimonious.

Chat transcript:

from a stats point of view
parsimony
is often important
when predicting future outcomes
its very easy to include EVERYTHING
and say "yes, by knowing everything, we can predict what is most likely to come next"
but EVERYTHING is often expensive to collect record clean and maintain
so reducing it to the point where is quick and easy
is often important

...

Phil
saving the more complicated model for performance
for when its important
so on the surface
OPS seems much more parsimonious
but, it could be that the other stat includes something really important
but if the R-sq comes out the same most of the time
you might question why you use the complicated one
that said, there is probably a reason someone developed the complicated one
and thats stats in a nutshell - no clear answers

2. Two good rate stats are better than one. When wOBA and OPS tell you the same thing, be happy: we have more confidence in the predictive quality of the stats.

3. At the end of the day, if the outcomes of OPS and wOBA aren't incredibly different, then choose the simpler one, OPS.

Parsimony.

Phil
both OPS and wOBA high...good
both low
bad
significant difference between the two
proceed with caution
if two players have similar, hire the one with the better personality
boom

Good post, but here's an excellent discussion on wOBA and correlation to runs.

The methodology for measuring that correlation gets a bit sticky. Working right now so I can't expand much at the moment, but check out Tom's comments:

Instead of using my 'homemade' wOBA values, I simply took the values provided on FanGraphs' website.

Comparing OPS to wOBA vs. Runs Scored

in 2010

r^2 for OPS = .891
r^2 for wOBA = .908

in 2009

r^2 for OPS = .919
r^2 for wOBA = .923

So this seems to make a bit more sense to me I suppose. wOBA looks like it's a 'slight' improvement over OPS, but very slight in the grand scheme of things.

Also, I'm a bit concerned that I couldn't exactly replicate the FG values for wOBA, but anyway, all of this seems to show that OPS is a lot more interesting/useful than we might have previously anticipated.

It is important to define the unit of observation you are assessing with your r^2 statistic. So if the question is how well OPS correlates with runs scored, we must first ask: what is our unit of observation? It looks like you took a team's OPS for a season and regressed total runs scored against it. That will give you one answer.

If instead we created an OPS for each game and correlated that with runs scored in the game, we would get a different estimate. We would get a different one still if we used runs scored in each inning. And the differences are substantial:

Code:

Observation unit    r^2 between runs and OPS
team                0.889
games               0.678
inning              0.495

The problem with using summary statistics for inference is that conclusions could fall prey to the ecological fallacy. Interpret with caution.

In general, you want to use the finest level possible. But that's tricky in baseball because although some stats are based upon plate appearances (e.g. OBP), runs scored by plate appearances are not independent. Given that, it seems innings is the finest appropriate level, though the interpretation of a team's OBP or OPS in an inning is not often the parameter of interest (when, for example, you want to evaluate players).
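
To see why the unit of observation moves r^2 so much, here is a small Python simulation sketch. Everything in it is synthetic and in abstract units: the team "talent" spread, the game-to-game noise, and the names `simulate` and `r_squared` are all assumptions chosen for illustration, not baseball data. It only shows the general effect that averaging over many games cancels game-level noise, so the same underlying relationship produces a higher r^2 at the team level.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. r^2 of a one-predictor linear fit."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def simulate(n_teams=30, n_games=162, seed=0):
    """Return (game-level r^2, team-level r^2) for synthetic data."""
    rng = random.Random(seed)
    game_x, game_y, team_x, team_y = [], [], [], []
    for _ in range(n_teams):
        talent = rng.gauss(0.0, 0.5)  # assumed spread in team quality
        xs = [talent + rng.gauss(0.0, 1.0) for _ in range(n_games)]  # per-game stat
        ys = [x + rng.gauss(0.0, 2.0) for x in xs]  # per-game outcome with noise
        game_x.extend(xs)
        game_y.extend(ys)
        # Season-level values are just the averages over that team's games.
        team_x.append(sum(xs) / n_games)
        team_y.append(sum(ys) / n_games)
    return r_squared(game_x, game_y), r_squared(team_x, team_y)


game_r2, team_r2 = simulate()
print(f"game-level r^2: {game_r2:.3f}, team-level r^2: {team_r2:.3f}")
```

The per-game relationship is identical in both calculations; only the level of aggregation changes, which is why a high season-level r^2 should not be read as a statement about individual games or innings.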

OPS comes out so well because it captures the largest components of run production.

But, the reason that wOBA looks relatively "worse" in terms of R^2 is because you are not comparing apples to apples. wOBA adjusts for park effects to "normalize" performance, whereas runs scored is just a raw unadjusted stat. Because your OPS figure is not adjusted for park effects, it looks relatively "better" in terms of predicting runs scored.

Aglets, very interesting so far. Why don't you take account of the above and simply use OPS+ instead of OPS - or at least add it to the selection of stats you're comparing?

Ahhhhhhhhhh! As a statistician, I find it my obligation to chime in. Do not blindly use regression!!! Check your assumptions:

1) Is the relationship in your data linear? If not, is there a transformation that makes it linear (e.g., quadratic, inverse, square root, cubic, etc.)?
2) Are your residuals (errors) normally distributed? Construct a histogram of them in Excel. It should be roughly bell shaped, with no skew, roughly symmetric and unimodal.
3) Independence - this you probably satisfied, since I presume you did not count the same player twice in any instance.
4) Heteroscedasticity - Is the variation of the errors in your data constant? Note that a large error belongs to an observation (data point) that is far away from the least squares (regression) line. In contrast, a small error means that the data point lies very close to or on the regression line. What is important is that the spread of the errors is roughly the same across the whole range of the data. If it is borderline, you proceed with regression. You have a violation when, upon plotting the residuals (errors), you notice any suspicious patterns: cyclical, funneling out, funneling in, or parabolic.
5) Outliers. Understand if the outliers "make sense" in the context of your study. Note that it is generally a BAD practice to throw out outliers, without a sensible reason (such as incorrect or incomplete information). For example, if you were interested in OPS in the early 2000s, Barry Bonds would be a big-time outlier. However, throwing him out because it is convenient is a terrible practice; he is a legitimate member of the population we are studying. So what do we do in the case of outliers/non-normality? A transformation! I would start off playing around with a log transformation, since that would bring Barry back closer to the rest of the data.
6) Multicollinearity (same as collinearity) - Since you are performing a simple linear regression (meaning just one independent variable), this is not much of a concern for us. In the multiple linear regression case, we would have to make sure that none of the independent variables are strongly related to each other.

I applaud you for your effort and interest in statistics! However, I just want you to know that regression makes certain assumptions and if you fail to verify them and they are violated, your conclusions will most likely be very inaccurate. If you have any questions, PM me (after Wednesday since I have a graduate course exam that day).
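
A couple of the checks above can be sketched in a few lines of Python. This is a crude illustration, not a formal test: `fit_line`, `residuals`, and `spread_ratio` are hypothetical helper names, and comparing residual spread between the lower and upper half of x is just one simple way to spot funneling. A residual plot remains the better diagnostic.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(xs, ys):
    """Observed y minus fitted y for each point."""
    intercept, slope = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def spread_ratio(xs, ys):
    """Mean squared residual in the upper half of x over the lower half.

    A value near 1 is consistent with constant error spread; a value far
    from 1 hints at funneling out (large) or funneling in (small).
    """
    pairs = sorted(zip(xs, residuals(xs, ys)))  # order residuals by x
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[-half:]]
    mean_sq = lambda rs: sum(r * r for r in rs) / len(rs)
    return mean_sq(high) / mean_sq(low)
```

With only 30 team-seasons, a ratio somewhat above or below 1 can easily be noise, so treat this as a prompt to look at the residual plot rather than as a verdict.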