2 Answers

Without wanting to take credit for @ttnphns' answer, I wanted to move the answer out of the comments (particularly considering that the link to the article had died). Matt Krause's answer provides a useful discussion of the distinction between $R^2$ and $R^2_{adj}$ but it does not discuss the decision of which $R^2_{adj}$ formula to use in any given case.

As I discuss in this answer, Yin and Fan (2001) provide a good overview of the many different formulas for estimating population variance explained $\rho^2$, all of which could potentially be labelled a type of adjusted $R^2$.

They performed simulations to assess which of a wide range of adjusted $R^2$ formulas provides the least biased estimate of $\rho^2$ across different sample sizes, values of $\rho^2$, and predictor intercorrelations. They suggest that the Pratt formula may be a good option, but I don't think the study was definitive on the matter.

Update: Raju et al (1997) note that adjusted $R^2$ formulas differ based on whether they are designed to estimate $\rho^2$ under fixed-x or random-x predictors. Specifically, the Ezekiel formula is designed to estimate $\rho^2$ in the fixed-x context, and the Olkin-Pratt and Pratt formulas are designed to estimate $\rho^2$ in the random-x context; there's not much difference between the Olkin-Pratt and Pratt formulas. Fixed-x assumptions align with planned experiments; random-x assumptions align with situations where the observed values of the predictor variables are a sample of possible values, as is typically the case in observational studies. See this answer for further discussion. The difference between the two types of formulas also shrinks as sample size gets moderately large (see here for a discussion of the size of the difference).

Summary of Rules of Thumb

If you assume that your observations for predictor variables are a random sample from a population, and you want to estimate $\rho^2$ for the full population of both predictors and criterion (i.e., random-x assumption) then use the Olkin-Pratt formula (or the Pratt formula).

If you assume that your observations are fixed or you don't want to generalise beyond your observed levels of the predictor, then estimate $\rho^2$ with the Ezekiel formula.
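As a concrete sketch of what these estimators look like: the Ezekiel formula is the familiar textbook "adjusted $R^2$", and the Pratt formula is written here as I read it in Yin and Fan (2001), so treat its constants as an assumption on my part; the function names are my own.

```python
def ezekiel_adj_r2(r2, n, p):
    """Ezekiel estimate of rho^2 (the standard 'adjusted R^2'): fixed-x context."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def pratt_adj_r2(r2, n, p):
    """Pratt estimate of rho^2: random-x context (constants as reported in Yin & Fan, 2001)."""
    return 1 - ((n - 3) * (1 - r2) / (n - p - 1)) * (1 + 2 * (1 - r2) / (n - p - 2.3))

# Example: a sample R^2 of .30 from n = 50 observations and p = 5 predictors
print(round(ezekiel_adj_r2(0.30, 50, 5), 4))
print(round(pratt_adj_r2(0.30, 50, 5), 4))
```

Both estimates shrink the sample $R^2$ towards zero, and they converge as $n$ grows relative to $p$, which is why the choice matters mainly in small samples.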

If you want to know about out-of-sample prediction using the sample regression equation, then you should look into some form of cross-validation procedure.
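Note that the adjusted formulas above estimate a population quantity; they don't directly measure how well the fitted equation predicts new cases. A minimal leave-one-out cross-validation sketch on simulated data (everything here is illustrative, not a prescribed procedure):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)  # simulated data: true slope 2, unit noise

# Leave-one-out: refit on n-1 points, predict the held-out point
preds = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    b1, b0 = np.polyfit(x[mask], y[mask], 1)  # slope, intercept
    preds[i] = b0 + b1 * x[i]

# Cross-validated R^2: 1 - PRESS / total sum of squares
press = np.sum((y - preds) ** 2)
cv_r2 = 1 - press / np.sum((y - y.mean()) ** 2)
print(cv_r2)
```

For OLS the leave-one-out residuals are never smaller in aggregate than the ordinary residuals, so the cross-validated $R^2$ sits at or below the in-sample $R^2$, which is exactly the optimism it is meant to remove.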

The choice of $R^2$ or adjusted $R^2$ depends on what you're trying to do. In a regression context, regular $R^2$ is used as a measure of goodness of fit for your model. However, imagine you're comparing several models which have different numbers of parameters. All things being equal, the model with more parameters will more closely fit your observations. In the limit, you could have one parameter for every data point; this would give you a perfect fit on your observations, but it would be useless for new predictions, since it would capture both the underlying 'signal' AND any associated noise. Adjusted $R^2$ is an attempt to solve this problem by adjusting the $R^2$ value according to the number of parameters in the model.
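A toy illustration of that penalty, using simulated data (the variable names and setup are mine, not from any of the cited papers):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
x = rng.normal(size=n)
junk = rng.normal(size=(n, 5))  # 5 predictors with no relation to y
y = x + rng.normal(size=n)

def r_squared(X, y):
    """In-sample R^2 from an OLS fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def adjusted(r2, n, p):
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

r2_small = r_squared(x[:, None], y)                         # 1 real predictor
r2_big = r_squared(np.column_stack([x[:, None], junk]), y)  # plus 5 junk ones

# Plain R^2 can only go up when predictors are added;
# the adjustment charges for the extra parameters.
print(r2_small, r2_big)
print(adjusted(r2_small, n, 1), adjusted(r2_big, n, 6))
```

The unadjusted $R^2$ necessarily increases with the junk predictors; the adjusted values pay for the added parameters and so make the two models comparable.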

They therefore have slightly different purposes. $R^2$ describes how well different data sets fit a model. You might write something like "The model described above accurately predicts the performance of Part A ($R^2$=0.9), but not Widget B ($R^2$=0.05) under standard test conditions." Adjusted $R^2$ describes how well different models fit the same data (or similar data). For example, "Results from the short and long-form questionnaires predicted customers' annual spending equally well (adjusted $R^2$ = 0.8 for both)."

Thanks, I found that to be a very clear explanation of the difference between R-squared and adjusted R-squared. In your view, how does unbiased R-squared fit into this picture?
– user1205901, Mar 25 '12 at 5:10


There are indeed various formulas to estimate the population $R^2$. See for example studyforquals.pbworks.com/f/yin.pdf. Fisher's (= Wherry's) "adjusted $R^2$" is said to be slightly negatively biased (it is still dependent on sample size, while not dependent on the number of predictors), so the Olkin-Pratt version is probably somewhat better.
– ttnphns, Mar 25 '12 at 10:44


@ttnphns, maybe that should be an answer instead of a comment. To me, it seems to address the original question more than this answer does.
– gung♦, Mar 25 '12 at 14:48


The $R^2$ value computed from a sample will be slightly smaller than the "true" population value. The plot on page 6/138 of uv.es/psicologica/articulos1.03/9.ZUMBO.pdf shows how the bias varies with sample size and $R^2$ value. The Olkin-Pratt formula corrects for this sample-size bias. There seem to be two versions of the Olkin-Pratt formula floating around, one of which also corrects for the number of parameters (see ttnphns's link). In fact, that paper contains several tables which will help you choose a correction method for your specific application, so it's worth a look.
– Matt Krause, Mar 25 '12 at 18:32


@ttnphns, I agree with gung! You should write up an answer and take some credit. Also, can you confirm what I wrote? JSTOR is acting strange today and won't let me read the original Olkin and Pratt paper.
– Matt Krause, Mar 25 '12 at 18:35