Standardized vs Unstandardized Regression Coefficient

In one of my predictive models, I found a variable whose unstandardized regression coefficient (aka beta or estimate) was close to zero (.0003), yet it was statistically significant (p-value < .05). If a variable is significant, its coefficient is significantly different from zero. So the question arises: "Why is the coefficient close to zero if the variable is significant?"

The answer lies in the difference between unstandardized and standardized coefficients.

If an independent variable is measured in large units, such as hundreds of thousands or millions of dollars (for example, $656,765), its unstandardized estimate can be close to zero, because the coefficient is the change in the dependent variable per single dollar. To make the coefficient more interpretable, rescale the variable by dividing it by 1,000 or 100,000 (depending on its magnitude), then rerun the regression with the transformed variable. The new coefficient is the old one multiplied by the same factor, so it is no longer close to zero. The t-statistic and p-value do not change; only the unit in which the coefficient is expressed does.
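A minimal sketch of this rescaling effect, using NumPy and synthetic data. The variable `income`, the helper `slope_and_t`, and all numbers are made up for illustration; they are not from the model discussed above.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
n = 200
income = rng.uniform(100_000, 900_000, size=n)          # hypothetical predictor, in dollars
y = 50 + 0.0003 * income + rng.normal(0, 20, size=n)    # synthetic response

def slope_and_t(x, y):
    """OLS slope and its t-statistic for a one-predictor model with intercept."""
    xc = x - x.mean()
    sxx = xc @ xc
    slope = xc @ (y - y.mean()) / sxx
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    sigma2 = resid @ resid / (len(y) - 2)   # residual variance
    se = np.sqrt(sigma2 / sxx)              # standard error of the slope
    return slope, slope / se

b_dollars, t_dollars = slope_and_t(income, y)
b_thousands, t_thousands = slope_and_t(income / 1000, y)   # income in thousands

print("per dollar:   b =", b_dollars, " t =", t_dollars)
print("per thousand: b =", b_thousands, " t =", t_thousands)
```

The slope for income in thousands should be 1,000 times the slope for income in dollars, while the t-statistic (and therefore the p-value) stays the same.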

Key Takeaway

An unstandardized coefficient should not be used to drop or rank predictors (aka independent variables), because its size depends on the unit of measurement.

But if a standardized beta is close to zero, the variable really does have a weak relationship with the dependent variable, and that is a REAL PROBLEM.

Detailed Explanation

The concept of standardization or standardized coefficients comes into the picture when predictors (aka independent variables) are expressed in different units. Suppose you have 3 independent variables: age, height and weight. Age is expressed in years, height in cm and weight in kg. Ranking these predictors by their unstandardized coefficients would not be a fair comparison, as the units of these variables are not the same.

Real Use of Standardized Coefficient

They are mainly used to rank predictors (or independent or explanatory variables), as they eliminate the units of measurement of the independent and dependent variables. We can rank independent variables by the absolute value of their standardized coefficients. The most important variable has the largest absolute standardized coefficient.
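A rough sketch of such a ranking with NumPy, using the age, height and weight example. The data are synthetic and the effect sizes are invented; they are constructed so that height has the largest raw coefficient while weight has the largest standardized one. Standardizing here means z-scoring each variable (subtract the mean, divide by the standard deviation).

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
age = rng.normal(40, 12, n)       # years
height = rng.normal(170, 8, n)    # cm
weight = rng.normal(70, 15, n)    # kg
# Synthetic outcome; these effect sizes are assumptions for illustration only.
y = 0.4 * age + 1.0 * height + 0.8 * weight + rng.normal(0, 5, n)

names = ["age", "height", "weight"]
X = np.column_stack([age, height, weight])

def zscore(a):
    """Subtract the mean and divide by the sample standard deviation."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

# Unstandardized fit: raw units, with an intercept column.
raw_coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X]), y, rcond=None)
raw_beta = raw_coef[1:]

# Standardized fit: every z-scored variable has mean 0, so no intercept is needed.
std_beta, *_ = np.linalg.lstsq(zscore(X), zscore(y), rcond=None)

def rank(betas):
    """Sort predictor names by absolute coefficient, largest first."""
    return sorted(zip(names, betas), key=lambda p: abs(p[1]), reverse=True)

raw_ranking = rank(raw_beta)
std_ranking = rank(std_beta)
print("raw ranking:         ", [name for name, _ in raw_ranking])
print("standardized ranking:", [name for name, _ in std_ranking])
```

The two rankings disagree because the raw coefficients are per year, per cm and per kg, whereas the standardized ones are all per standard deviation.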

Interpretation

In the next sections, we will discuss the interpretation of unstandardized and standardized coefficients in linear regression.

Linear Regression : Unstandardized Coefficient

It represents the amount by which the dependent variable changes when the independent variable changes by one unit, keeping the other independent variables constant.

Linear Regression : Standardized Coefficient

The standardized coefficient is measured in units of standard deviation. A beta value of 1.25 indicates that an increase of one standard deviation in the independent variable results in an increase of 1.25 standard deviations in the dependent variable, keeping the other independent variables constant.

Calculation of Standardized Coefficient for Linear Regression

Standardize both the dependent and independent variables, and use the standardized variables in the regression model to get standardized estimates. By 'standardize', I mean subtract the mean from each observation and divide the result by the standard deviation. The result is also called a z-score. A standardized variable has mean 0 and standard deviation 1.
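As a small illustration of the z-score step, here is a sketch with NumPy on a handful of made-up height values. It uses the sample standard deviation (`ddof=1`), which is an assumption; the population version works equally well for standardized coefficients as long as the same choice is applied to every variable.

```python
import numpy as np

heights_cm = np.array([150.0, 160.0, 165.0, 172.0, 180.0, 191.0])  # made-up values

# z-score: subtract the mean, then divide by the standard deviation
z = (heights_cm - heights_cm.mean()) / heights_cm.std(ddof=1)

print(z)
print("mean:", z.mean())        # approximately 0, up to floating-point rounding
print("sd:  ", z.std(ddof=1))   # approximately 1
```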

Another Approach

Standardized Coefficient for Linear Regression

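The standardized coefficient is found by multiplying the unstandardized coefficient by the ratio of the standard deviation of the independent variable to the standard deviation of the dependent variable. That is, standardized beta = unstandardized beta x (SD of the independent variable / SD of the dependent variable).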
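The sketch below checks, on synthetic data, that this formula gives the same numbers as regressing z-scored variables. The variables `x1` and `x2` and their effect sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 300
x1 = rng.normal(50, 10, n)
x2 = rng.normal(5, 2, n)
y = 3.0 + 0.2 * x1 + 1.5 * x2 + rng.normal(0, 2, n)   # made-up relationship

X = np.column_stack([x1, x2])

def zscore(a):
    """Subtract the mean and divide by the sample standard deviation."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

# Approach 1: regress z-scored y on z-scored predictors (no intercept needed).
beta_z, *_ = np.linalg.lstsq(zscore(X), zscore(y), rcond=None)

# Approach 2: fit in raw units, then rescale each slope by sd(x) / sd(y).
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X]), y, rcond=None)
b = coef[1:]                                            # drop the intercept
beta_formula = b * X.std(axis=0, ddof=1) / y.std(ddof=1)

print("z-score regression:", beta_z)
print("formula:           ", beta_formula)
```

Both approaches should agree up to floating-point error, so the formula is a convenient shortcut when the model has already been fitted in raw units.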

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like banking, Telecom, HR and Health Insurance.
