When a researcher in economics or finance wants to apply a linear regression model but suspects a non-linear relationship between one of the regressors and the dependent variable, it is typical to include the square of that regressor as well, and then (perhaps, though usually not) run something like a Ramsey RESET test afterwards.

My question is: why square it? Why not take the exponent (call it $x$) to be some $x \in [1.5,2.5]$, for example? Getting the right "shape" of the line is important for making the assumption $E[\epsilon_i | \mathbf{X}]=0$ hold; there may be cases where $x=2$ does this well, but in others something like $x=1.8$ might be more sensible.

Here, of course, I'm talking about variables that take strictly positive values, so we don't get complex outcomes. An example is $Age$ in education$\to$income studies.
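To make the idea concrete, here's a made-up sketch in Python (simulated data, where the "true" curvature uses exponent $1.8$) comparing the conventional square against a non-integer power as the curvature term:

```python
# Hypothetical illustration with simulated data: the true curvature term
# uses exponent 1.8; we compare it against the conventional square.
import numpy as np

rng = np.random.default_rng(0)
age = rng.uniform(18, 65, size=500)              # strictly positive regressor
y = 1.0 + 0.8 * age - 0.009 * age**1.8 + rng.normal(0.0, 1.0, size=500)

def ssr(y, X):
    """OLS sum of squared residuals via least squares."""
    _, resid, *_ = np.linalg.lstsq(X, y, rcond=None)
    return resid[0]

for q in (2.0, 1.8):
    X = np.column_stack([np.ones_like(age), age, age**q])
    print(f"exponent {q}: SSR = {ssr(y, X):.1f}")
```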

@John I don't understand. Do you mean that it's like this because $2$ looks prettier than $x\in[1.5,2.5]$? Surely parsimony should be defined statistically rather than by how pretty something looks from a qualitative perspective.
– Jase, Dec 4 '12 at 22:50

Instead of performing a non-linear least squares routine, the researcher has effectively imposed constraints on the coefficient. They want to handle non-linearities without too many extra variables. So they just square it. Parsimony.
– John, Dec 4 '12 at 23:12

2 Answers

We want to model $y$ using a simple linear model; the most basic setup is
$$
y = c + \mathbf{X}\beta + \epsilon
$$
with $y$ the $N$-dimensional vector of observations, $c$ a constant, $\mathbf{X}$ the $N \times M$ matrix of regressors, $\beta$ an $M$-dimensional vector of coefficients and $\epsilon$ the error term. This model has $M$ parameters, the elements of $\beta$.
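A minimal sketch of this baseline with simulated data, just to fix the dimensions (all numbers made up):

```python
# Baseline model y = c + X @ beta + eps: N observations, M regressors.
import numpy as np

N, M = 200, 3
rng = np.random.default_rng(1)
X = np.abs(rng.normal(2.0, 1.0, size=(N, M)))    # strictly positive regressors
beta_true = np.array([1.5, -0.7, 0.3])
y = 0.5 + X @ beta_true + rng.normal(0.0, 0.5, size=N)

# Estimate c and beta by OLS on a design matrix with an intercept column.
design = np.column_stack([np.ones(N), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
c_hat, beta_hat = coef[0], coef[1:]
```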

The above model is estimated, the Ramsey RESET test finds the model to be misspecified, and the researcher wants to fix this. As you propose, the above model is easily extended to
$$
y = c + \mathbf{X}\beta + \tilde{\mathbf{X}}\gamma + \epsilon
$$
where $\tilde{\mathbf{X}}_{i, j} = \mathbf{X}_{i, j}^{e_j}$, $\mathbf{e}$ is an $M$-dimensional vector of exponents and $\gamma$ an $M$-dimensional vector of coefficients. This model has $3M$ parameters, the elements of $\beta$, $\gamma$ and $\mathbf{e}$, and is much harder to estimate because the exponents enter non-linearly.
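For concreteness, a sketch of the joint estimation problem, assuming SciPy is available (all data simulated): with the exponents free, OLS no longer applies and we need non-linear least squares over all $3M$ parameters plus the constant.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(2)
N, M = 300, 2
X = np.abs(rng.normal(2.0, 0.5, size=(N, M)))    # strictly positive
e_true = np.array([1.8, 2.2])
y = (1.0 + X @ np.array([0.5, -0.2])
     + (X ** e_true) @ np.array([0.3, 0.1])
     + rng.normal(0.0, 0.1, size=N))

def residuals(theta):
    c = theta[0]
    beta = theta[1:1 + M]
    gamma = theta[1 + M:1 + 2 * M]
    e = theta[1 + 2 * M:]
    return y - (c + X @ beta + (X ** e) @ gamma)

# Start the exponents at the conventional value 2. The optimisation surface
# is nasty (beta and gamma become nearly collinear as e approaches 1), which
# is exactly the estimation difficulty described above.
theta0 = np.r_[0.0, np.zeros(M), np.zeros(M), np.full(M, 2.0)]
fit = least_squares(residuals, theta0)
```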

This can be easily avoided by fixing all $e_i$ a priori, which raises another question: to which value do we fix them? As @pat notes, raising to a non-integer power is a bad idea in the general case. But, as you note, one could use the absolute value of the regressor raised to a rational exponent, since $f(q) = |a|^q$ is continuous and real for all real $q$. So why the insistence on integer-valued exponents?

One simple reason is laziness: it is much simpler to compute $x^2$ than $x^{1.95}$. A second reason is convention. A third reason is that small changes in the exponent have only a small impact on the model.

These arguments do not apply to the case where a rational exponent would yield a significant improvement. Unfortunately, that case has severe methodological problems: as argued above, making the exponents free parameters makes estimation much harder and, perhaps more importantly, reduces parsimony. The remaining option, fixing the exponent to a non-integer value a priori, is possible, but it would require a strong economic argument to defend that particular choice. If your application is such that it is absolutely clear that exponentiation with some $q \in \mathbb{Q}$ is justified, then you're free to do that; there are no methodological problems that I know of. But be prepared for critics who will notice and will require justification of your particular choice of $q$.
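A sketch of the "fix the exponent a priori" route with simulated data: profile an OLS fit over a grid of candidate exponents $q$ and compare in-sample fit.

```python
# Grid search over fixed exponents q (simulated data, true q = 1.8).
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(1.0, 10.0, size=400)
y = 2.0 + 0.5 * x + 0.05 * x**1.8 + rng.normal(0.0, 0.2, size=400)

def ssr_for(q):
    X = np.column_stack([np.ones_like(x), x, x**q])
    _, resid, *_ = np.linalg.lstsq(X, y, rcond=None)
    return resid[0]

grid = np.round(np.arange(1.5, 2.55, 0.05), 2)
ssrs = np.array([ssr_for(q) for q in grid])
print("best q on the grid:", grid[ssrs.argmin()])
# The SSR curve is typically quite flat in q, which is the "small changes
# in the exponent have a small impact" point made above.
```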

Another reason to choose $e_i = 2$ is the symmetry with taking cross products of the regressors: from this perspective, a square is simply the cross product of a regressor with itself.
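A sketch of this cross-product view, assuming scikit-learn: a degree-2 polynomial expansion generates $x_1 x_2$ alongside $x_1^2$ and $x_2^2$, so the square appears as the "diagonal" cross product.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0],
              [4.0, 5.0]])
poly = PolynomialFeatures(degree=2, include_bias=False)
Xp = poly.fit_transform(X)
print(poly.get_feature_names_out(["x1", "x2"]))
# -> ['x1' 'x2' 'x1^2' 'x1 x2' 'x2^2']
```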

The abstract of Royston and Altman (1994), "Regression Using Fractional Polynomials of Continuous Covariates: Parsimonious Parametric Modelling", *Applied Statistics* 43(3), 429–467, addresses exactly this question:

> The relationship between a response variable and one or more continuous covariates is often curved. Attempts to represent curvature in single- or multiple-regression models are usually made by means of polynomials of the covariates, typically quadratics. However, low order polynomials offer a limited family of shapes, and high order polynomials may fit poorly at the extreme values of the covariates. We propose an extended family of curves, which we call fractional polynomials, whose power terms are restricted to a small predefined set of integer and non-integer values. The powers are selected so that conventional polynomials are a subset of the family. Regression models using fractional polynomials of the covariates have appeared in the literature in an ad hoc fashion over a long period; we provide a unified description and a degree of formalization for them. They are shown to have considerable flexibility and are straightforward to fit using standard methods. We suggest an iterative algorithm for covariate selection and model fitting when several covariates are available. We give six examples of the use of fractional polynomial models in three types of regression analysis: normal errors, logistic and Cox regression.
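A minimal sketch of the fractional-polynomial idea with made-up data, assuming the usual Royston–Altman power set; by convention a power of $0$ denotes $\log(x)$.

```python
import numpy as np

POWERS = (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0)

def fp_term(x, p):
    """One fractional-polynomial basis term; x must be strictly positive."""
    return np.log(x) if p == 0 else x ** p

def best_fp1(x, y):
    """Choose the single power p minimising the OLS sum of squared residuals."""
    scores = {}
    for p in POWERS:
        X = np.column_stack([np.ones_like(x), fp_term(x, p)])
        _, resid, *_ = np.linalg.lstsq(X, y, rcond=None)
        scores[p] = resid[0]
    return min(scores, key=scores.get)

rng = np.random.default_rng(4)
x = rng.uniform(1.0, 20.0, size=500)
y = 3.0 + 2.0 * np.sqrt(x) + rng.normal(0.0, 0.3, size=500)
print(best_fp1(x, y))   # should pick p = 0.5 for this simulated data
```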