In our last chapter, we learned how to do ordinary linear regression with SPSS,
concluding with methods for examining the distribution of variables as a first
step in checking the assumptions of regression.
Without verifying that your data have met the regression assumptions, your results may
be misleading. This chapter will explore how you can use SPSS to test whether your
data meet the assumptions of linear regression. In particular, we will consider the
following assumptions.

Linearity - the relationships between the predictors and the outcome variable should be
linear

Normality - the errors should be normally distributed. Technically, normality is
necessary only for the t-tests to be valid; estimation of the coefficients requires
only that the errors be identically and independently distributed.

Homogeneity of variance (homoscedasticity) - the error variance should be constant

Independence - the errors associated with one observation are not correlated with the
errors of any other observation

Many graphical methods and numerical tests have been developed over the years for
regression diagnostics and SPSS makes many of these methods easy to access and
use. In this chapter, we will explore these
methods and show how to verify
regression assumptions and detect potential problems using SPSS.

2.1 Unusual and Influential data

A single observation that is substantially different from all other observations can
make a large difference in the results of your regression analysis. If a single
observation (or small group of observations) substantially changes your results, you would
want to know about this and investigate further. There are three ways that an
observation can be unusual.

Outliers: In linear regression, an outlier is an observation with a large
residual. In other words, it is an observation whose dependent-variable value is unusual
given its values on the predictor variables. An outlier may indicate a sample peculiarity
or may indicate a data entry error or other problem.

Leverage: An observation with an extreme value on a predictor variable
is called a point with high leverage. Leverage is a measure of how far an
observation deviates from the mean of that variable. These leverage points can have an unusually large effect on the estimate of
regression coefficients.

Influence: An observation is said to be influential if removing the observation
substantially changes the estimate of coefficients. Influence can be thought of as the
product of leverage and outlierness.

How can we identify these three types of observations? Let's look at an example dataset
called crime. This dataset appears in Statistical Methods for the Social
Sciences, Third Edition by Alan Agresti and Barbara Finlay (Prentice Hall, 1997). The
variables are state id (sid), state name (state), violent crimes per 100,000
people (crime), murders per 1,000,000 (murder), the percent of the
population living in metropolitan areas (pctmetro), the percent of the population
that is white (pctwhite), percent of population with a high school education or
above (pcths), percent of population living under poverty line (poverty),
and percent of population that are single parents (single).
Below we read in the file and do some descriptive statistics on these
variables. You can click crime.sav to access
this file, or see the Regression with SPSS page to
download all of the data files used in this book.
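
The syntax for reading the file and getting the descriptive statistics is sketched below; the file path is only an example and should point to wherever you saved crime.sav.

get file = 'c:\spssreg\crime.sav'.
descriptives variables = crime murder pctmetro pctwhite pcths poverty single.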

Let's say that we want to predict crime by pctmetro, poverty, and single.
That is to say, we want to build a linear regression model between the response
variable crime and the independent variables pctmetro, poverty and single.
We will first look at the scatter plots of crime against each of the predictor variables
before the regression analysis so we will have some ideas about potential problems. We can
create a scatterplot matrix of these variables as shown below.
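
One way to produce such a matrix is with the GRAPH command, sketched below.

graph
/scatterplot(matrix)=crime pctmetro poverty single.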

The graphs of crime with other variables show some potential problems.
In every plot, we see a data point that is far away from the rest of the data
points. Let's make individual graphs of crime with pctmetro, poverty, and single
so we can get a better view of these scatterplots. We will use BY
state(name) to plot the state name instead of a point.

GRAPH /SCATTERPLOT(BIVAR)=pctmetro WITH crime BY state(name) .

GRAPH /SCATTERPLOT(BIVAR)=poverty WITH crime BY state(name) .

GRAPH /SCATTERPLOT(BIVAR)=single WITH crime BY state(name) .

All the scatter plots suggest that the observation for state =
"dc" is a point
that requires extra attention since it stands out away from all of the other points. We
will keep it in mind when we do our regression analysis.

Now let's try the regression command predicting crime from pctmetro, poverty
and single. We will go step-by-step to identify all the potentially unusual
or influential points afterwards.

regression
/dependent crime
/method=enter pctmetro poverty single.

Variables Entered/Removed(b)

Model   Variables Entered               Variables Removed   Method
1       SINGLE, PCTMETRO, POVERTY(a)    .                   Enter

a  All requested variables entered.
b  Dependent Variable: CRIME

Model Summary(b)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .916(a)   .840       .830                182.068

a  Predictors: (Constant), SINGLE, PCTMETRO, POVERTY
b  Dependent Variable: CRIME

ANOVA(b)

Model           Sum of Squares   df   Mean Square   F        Sig.
1  Regression   8170480.211       3   2723493.404   82.160   .000(a)
   Residual     1557994.534      47     33148.820
   Total        9728474.745      50

a  Predictors: (Constant), SINGLE, PCTMETRO, POVERTY
b  Dependent Variable: CRIME

Coefficients(a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B           Std. Error        Beta        t         Sig.
1  (Constant)    -1666.436   147.852                       -11.271   .000
   PCTMETRO      7.829       1.255             .390        6.240     .000
   POVERTY       17.680      6.941             .184        2.547     .014
   SINGLE        132.408     15.503             .637       8.541     .000

a  Dependent Variable: CRIME

Let's examine the standardized residuals as a first means for identifying outliers.
Below we use the /residuals=histogram subcommand to request a histogram for
the standardized residuals. As you see, we get the standard output that we
got above, as well as a table with information about the smallest and largest
residuals, and a histogram of the standardized residuals. The histogram
indicates a couple of extreme residuals worthy of investigation.
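
The syntax for this step would look something like the following.

regression
/dependent crime
/method=enter pctmetro poverty single
/residuals=histogram(zresid).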

Let's now request the same kind of information, except for the studentized
deleted residual. The studentized deleted residual is the residual that
would be obtained if the regression were re-run omitting that observation from
the analysis. This is useful because some points are so influential that,
when they are included in the analysis, they can pull the regression line close
to that observation, making it appear as though it is not an outlier -- however,
when the observation is deleted it then becomes more obvious how outlying it is.
To save space, below we show just the output related to
the residual analysis.
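
A sketch of the syntax, this time requesting the histogram for the studentized deleted residuals:

regression
/dependent crime
/method=enter pctmetro poverty single
/residuals=histogram(sdresid).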

The histogram shows some possible outliers. We can use the outliers(sdresid)
and id(state) options to request the 10 most extreme values for the studentized deleted
residual to be displayed labeled by the state from which the observation
originated. Below we show the output generated by this
option, omitting all of the rest of the output to save space. You can see
that "dc" has the largest value (3.766) followed by "ms"
(-3.571) and "fl" (2.620).
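
The subcommand referenced above might look like this:

regression
/dependent crime
/method=enter pctmetro poverty single
/residuals=histogram(sdresid) outliers(sdresid) id(state).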

We can use the /casewise subcommand below to request a display of all
observations where the sdresid exceeds 2. To save space, we show
just the new output generated by the /casewise subcommand. This shows
us that Florida, Mississippi and Washington DC have sdresid values exceeding
2.
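
A sketch of the /casewise subcommand just described, added to the same regression:

regression
/dependent crime
/method=enter pctmetro poverty single
/casewise=plot(sdresid) outliers(2).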

Now let's look at the leverage values to identify observations that could have
potentially great influence on the regression coefficient estimates. We can
include lever with the histogram( ) and the outliers( )
options to get more information about observations with high leverage. We
show just the new output generated by these additional subcommands below.
Generally, a point with leverage greater than (2k+2)/n should be
carefully examined. Here k is the number of predictors and n is
the number of observations, so a value exceeding (2*3+2)/51 = .1568 would
be worthy of further investigation. As you see, there are 4 observations
that have leverage values higher than .1568.
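
Adding lever to the histogram( ) and outliers( ) options might look like this:

regression
/dependent crime
/method=enter pctmetro poverty single
/residuals=histogram(sdresid lever) outliers(sdresid lever) id(state).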

As we have seen, DC is an observation that has both a large residual and large
leverage. Such points are potentially the most influential. We can make a plot
that shows the leverage by the residual and look for observations that are high
in leverage and have a high residual. We can do this using the /scatterplot
subcommand as shown below. This is a quick way of checking for potentially influential observations and outliers at the
same time. Both types of points are of great concern for us. As we
see, "dc" is both a high residual and high leverage point, and
"ms" has an extremely negative residual but does not have such a high
leverage.

Now let's move on to overall measures of influence, specifically let's look at Cook's D,
which combines information on the residual and leverage. The lowest value that Cook's D can assume is zero, and the higher the Cook's D is, the
more influential the point is. The conventional cut-off point is 4/n, or in
this case 4/51 or .078. Below we add the cook keyword to the outliers( ) option
and to the /casewise subcommand. We see that for the 3 outliers flagged in the "Casewise
Diagnostics" table, the value of Cook's D exceeds this cutoff. And,
in the "Outlier Statistics" table, we see that "dc",
"ms", "fl" and "la" are the 4 states that exceed
this cutoff, all others falling below this threshold.
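
A sketch of the syntax with cook added to the outliers( ) option and to the /casewise subcommand:

regression
/dependent crime
/method=enter pctmetro poverty single
/residuals=outliers(sdresid lever cook) id(state)
/casewise=plot(sdresid) outliers(2) cook.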

Cook's D can be thought of as a general measure of influence. You can also consider more
specific measures of influence that assess how each coefficient is changed by
including the observation. Imagine that you compute the regression coefficients
for the regression model with a particular case excluded, then recompute the
model with the case included, and you observe the change in the regression
coefficients due to including that case in the model. This measure is called DFBETA, and
a DFBETA value can be computed for each observation for each predictor.
As shown below, we use the /save sdbeta(sdfb) subcommand to save the standardized DFBETA values for each of the predictors. This saves 4 variables into the
current data file, sdfb1, sdfb2, sdfb3 and sdfb4,
corresponding to the DFBETA for the Intercept and for pctmetro, poverty
and for single, respectively. We could replace sdfb with
anything we like, and the variables created would start with the prefix that we
provide.
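
The /save subcommand described above, added to the same regression:

regression
/dependent crime
/method=enter pctmetro poverty single
/save sdbeta(sdfb).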

The /save sdbeta(sdfb) subcommand does not produce any new output, but
we can see the variables it created for the first 10 cases using the list
command below. For example, by including the case for "ak" in
the regression analysis (as compared to excluding this case), the
coefficient for pctmetro would decrease by .106 standard errors. Likewise, by including the case for "ak" the
coefficient for poverty decreases by .131 standard errors, and the
coefficient for single increases by .145 standard errors (as compared to
a model excluding "ak"). Since the inclusion of an observation could either contribute to an
increase or decrease in a
regression coefficient, DFBETAs can be either positive or negative. A DFBETA value
in excess of 2/sqrt(n) merits further investigation. In this example, we
would be concerned about absolute values in excess of 2/sqrt(51) or .28.
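
The list command for the first 10 cases might look like this:

list variables=state sdfb1 sdfb2 sdfb3 sdfb4
/cases=from 1 to 10.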

We can plot all three DFBETA values for the 3 coefficients against the state id in one graph shown
below to help us see potentially troublesome observations. We changed
the value labels for sdfb1, sdfb2 and sdfb3 so they would be
shorter and more clearly labeled in the graph. We can see that the DFBETA
for single for "dc" is about 3, indicating that by including
"dc" in the regression model, the coefficient for single is 3
standard errors larger than it would have been if "dc" had been omitted.
This is yet another bit of evidence that the observation for "dc" is
very problematic.

The following table summarizes the general rules of thumb we use for the
measures we have discussed for identifying observations worthy of further investigation (where k is the number
of predictors and n is the number of observations).

Measure        Value
leverage       > (2k+2)/n
abs(SDRESID)   > 2
Cook's D       > 4/n
abs(DFBETA)    > 2/sqrt(n)

We have shown a few examples of the variables that you can refer to in the /residuals
, /casewise, /scatterplot and /save sdbeta( ) subcommands.
Here is a list of all of the variables that can be used on these subcommands;
however, not all variables can be used on each subcommand.

PRED       Unstandardized predicted values.
RESID      Unstandardized residuals.
DRESID     Deleted residuals.
ADJPRED    Adjusted predicted values.
ZPRED      Standardized predicted values.
ZRESID     Standardized residuals.
SRESID     Studentized residuals.
SDRESID    Studentized deleted residuals.
SEPRED     Standard errors of the predicted values.
MAHAL      Mahalanobis distances.
COOK       Cook's distances.
LEVER      Centered leverage values.
DFBETA     Change in the regression coefficient that results from the deletion
           of the ith case. A DFBETA value is computed for each case for each
           regression coefficient generated by a model.
SDBETA     Standardized DFBETA. An SDBETA value is computed for each case for
           each regression coefficient generated by a model.
DFFIT      Change in the predicted value when the ith case is deleted.
SDFIT      Standardized DFFIT.
COVRATIO   Ratio of the determinant of the covariance matrix with the ith case
           deleted to the determinant of the covariance matrix with all cases
           included.
MCIN       Lower and upper bounds for the prediction interval of the mean
           predicted response. A lower bound LMCIN and an upper bound UMCIN are
           generated. The default confidence interval is 95%. The confidence
           interval can be reset with the CIN subcommand.
ICIN       Lower and upper bounds for the prediction interval for a single
           observation. A lower bound LICIN and an upper bound UICIN are
           generated. The default confidence interval is 95%. The confidence
           interval can be reset with the CIN subcommand.

In addition to the numerical measures we have shown above, there are also several graphs that can be used to search for unusual and
influential observations. The partial-regression plot is very useful in identifying
influential points. For example, below we add the /partialplot subcommand
to produce partial-regression plots for all of the predictors.
For example, in the 3rd plot below you can see the partial-regression plot
showing crime by single after both crime and single have been
adjusted for all other predictors in the model. The line plotted has the same slope
as the coefficient for single. This plot shows how the observation for DC
influences the coefficient. You can see how the regression line is tugged upwards
trying to fit through the extreme value of DC. Alaska and West Virginia may also
exert substantial leverage on the coefficient of single.
These plots are useful for seeing how a single point may be influencing the
regression line, while taking other variables in the model into
account.
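
The /partialplot subcommand added to the regression might look like this:

regression
/dependent crime
/method=enter pctmetro poverty single
/partialplot=all.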

Note that the regression line is not automatically
produced in the graph. We double clicked on the graph, chose
"Chart" and then "Options", and then chose "Fit Line
Total" to add a regression line to each of the graphs below.

DC has appeared as an outlier as well as an influential point in every analysis.
Since DC is not really a state, we can use this to justify omitting it from the analysis,
saying that we really wish to analyze only states. First, let's repeat our analysis
including DC below.

regression
/dependent crime
/method=enter pctmetro poverty single.

<some output omitted to save space>

Coefficients(a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B           Std. Error        Beta        t         Sig.
1  (Constant)    -1666.436   147.852                       -11.271   .000
   PCTMETRO      7.829       1.255             .390        6.240     .000
   POVERTY       17.680      6.941             .184        2.547     .014
   SINGLE        132.408     15.503             .637       8.541     .000

a  Dependent Variable: CRIME

Now, let's run the analysis omitting DC by using the filter command to
omit "dc" from the analysis. As we expect, deleting DC made a large
change in the coefficient for single. The coefficient for single dropped
from 132.4 to 89.4. After having deleted DC, we would repeat the process we have
illustrated in this section to search for any other outlying and influential observations.
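
A minimal sketch of the filter approach; the filter variable name (filt) is arbitrary.

* Create a filter variable that is 0 for DC and 1 for all other cases.
compute filt = (state ne 'dc').
filter by filt.
regression
/dependent crime
/method=enter pctmetro poverty single.
filter off.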

In this section, we explored a number of methods of identifying outliers
and influential points. In a typical analysis, you would probably use only some of these
methods. Generally speaking, there are two types of methods for assessing
outliers: statistics such as residuals, leverage, and Cook's D, that
assess the overall impact of an observation on the regression results, and
statistics such as DFBETA that assess the specific impact of an observation on
the regression coefficients. In our example, we found out that DC was a point of major concern. We
performed a regression with it and without it and the regression equations were very
different. We can justify removing it from our analysis by reasoning that our model
is to predict crime rate for states not for metropolitan areas.

2.2 Tests for Normality of Residuals

One of the assumptions of linear regression analysis is that the residuals are normally
distributed. It is important to meet this assumption for the p-values for the t-tests
to be valid.
Let's use the elemapi2 data file we saw in Chapter 1 for these analyses.
Let's predict academic performance (api00) from percent receiving free meals (meals),
percent of English language learners (ell), and percent of teachers with emergency
credentials (emer). We then use the /save subcommand to generate
residuals.
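
A sketch of the commands for this model; the file path and the name given to the saved residuals (apires) are only examples.

get file = 'c:\spssreg\elemapi2.sav'.
regression
/dependent api00
/method=enter meals ell emer
/save resid(apires).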

We now use the examine command to look at the normality of these
residuals. All of the results from the examine command suggest that the
residuals are normally distributed -- the skewness and kurtosis are near 0,
the "tests of normality" are not significant, the histogram looks
normal, and the Q-Q plot looks normal. Based on these results, the
residuals from this regression appear to conform to the assumption of being
normally distributed.
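
The examine command might look like this, assuming the residuals were saved as apires above.

examine variables = apires
/plot = histogram npplot
/statistics = descriptives.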

2.3 Homoscedasticity

Another assumption of ordinary least squares regression is that the variance of the
residuals is homogeneous across levels of the predicted values, also known as
homoscedasticity. If the model is well-fitted, there should be no
pattern to the residuals plotted against the fitted values. If the variance of the
residuals is non-constant then the residual variance is said to be
"heteroscedastic." Below we illustrate graphical methods for detecting
heteroscedasticity. A commonly used graphical method is to plot the residuals
versus the fitted (predicted) values. Below we use the
/scatterplot subcommand to plot *zresid (standardized residuals) by
*pred (the predicted values). We see that the
pattern of the data points is getting a little narrower towards the right end, an
indication of mild heteroscedasticity.
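
The /scatterplot subcommand referenced above, added to the same regression:

regression
/dependent api00
/method=enter meals ell emer
/scatterplot=(*zresid, *pred).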

Let's run a model where we include just enroll as a predictor and show
the residual vs. predicted plot. As you can see, this plot shows serious
heteroscedasticity. The variability of the residuals when the predicted
value is around 700 is much larger than when the predicted value is 600 or when
the predicted value is 500.
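
A sketch of the model with enroll as the only predictor:

regression
/dependent api00
/method=enter enroll
/scatterplot=(*zresid, *pred).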

As we saw in Chapter 1, the variable enroll was skewed considerably to the right,
and we found that by taking a log transformation, the transformed variable was more
normally distributed. Below we transform enroll, run the regression and show the
residual versus fitted plot. The distribution of the residuals is much improved.
Certainly, this is not a perfect distribution of residuals, but it is much better than the
distribution with the untransformed variable.
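
A sketch of the transformation and regression; the name lenroll for the transformed variable is only an example.

* Log-transform enroll and refit the model using the transformed predictor.
compute lenroll = ln(enroll).
execute.
regression
/dependent api00
/method=enter lenroll
/scatterplot=(*zresid, *pred).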

Finally, let's revisit the model we used at the start of this section, predicting api00
from meals, ell and emer. Using this model, the distribution of
the residuals looked very nice and even across the fitted values. What if we add enroll
to this model? Will this automatically ruin the distribution of the residuals?
Let's add it and see.
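
Adding enroll to the earlier model would look something like this:

regression
/dependent api00
/method=enter meals ell emer enroll
/scatterplot=(*zresid, *pred).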

As you can see, the distribution of the residuals looks fine, even after we added the
variable enroll. When we had just the variable enroll in the model, we did a
log transformation to improve the distribution of the residuals, but when enroll was part
of a model with other variables, the residuals looked good so no transformation was
needed. This illustrates how the distribution of the residuals, not the distribution
of the predictor, was the guiding factor in determining whether a transformation was
needed.

2.4 Collinearity

When there is a perfect linear relationship among the predictors, the estimates for a
regression model cannot be uniquely computed. The term collinearity implies that two
variables are near perfect linear combinations of one another. When more than two
variables are involved it is often called multicollinearity, although the two terms are
often used interchangeably.

The primary concern is that as the degree of multicollinearity increases, the
regression model estimates of the coefficients become unstable and the standard errors for
the coefficients can get wildly inflated. In this section, we will explore some SPSS
commands that help to detect multicollinearity.

We can use the /statistics=defaults tol subcommand to request the display of "tolerance" and "VIF"
values for each predictor as a check for multicollinearity. The
"tolerance" is an indication of the percent of variance in the
predictor that cannot be accounted for by the other predictors, hence very
small values indicate that a predictor is redundant, and values that are less
than .10 may merit further investigation. The VIF, which stands for variance inflation factor,
is (1 / tolerance), and as a rule of thumb, a variable whose VIF
value is greater than 10 may merit further investigation. Let's first look at the regression we
did from the last section, the regression model predicting api00 from meals, ell
and emer using the /statistics=defaults tol
subcommand.
As you can see, the "tolerance" and "VIF" values are all
quite acceptable.
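
A sketch of the syntax with the /statistics=defaults tol subcommand:

regression
/statistics=defaults tol
/dependent api00
/method=enter meals ell emer.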

Now let's consider another example where the "tolerance" and "VIF"
values are more worrisome. In the regression analysis below, we use acs_k3
avg_ed grad_sch col_grad and some_col as predictors of api00.
As you see, the "tolerance" values for
avg_ed, grad_sch and col_grad are below .10,
and the tolerance for avg_ed is about 0.02, indicating that only about 2% of the variance
in avg_ed is not predictable given the other predictors in the model. All of these variables measure education of the
parents and the very low "tolerance" values indicate that these variables
contain redundant information. For example, after you know grad_sch and col_grad, you
probably can predict avg_ed very well. In this example, multicollinearity
arises because we have put in too many variables that measure the same thing, parent
education.
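
The syntax for this model might look like:

regression
/statistics=defaults tol
/dependent api00
/method=enter acs_k3 avg_ed grad_sch col_grad some_col.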

We also include the collin keyword on the /statistics subcommand, which produces the
"Collinearity Diagnostics" table below. The very low
eigenvalue for the 5th dimension (since there are 5 predictors) is another
indication of problems with multicollinearity. Likewise, the very high
"Condition Index" for dimension 5 similarly indicates problems with
multicollinearity with these predictors.
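
Adding collin to the /statistics subcommand, a sketch:

regression
/statistics=defaults tol collin
/dependent api00
/method=enter acs_k3 avg_ed grad_sch col_grad some_col.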

Let's omit one of the parent education variables, avg_ed. Note that the
VIF values in the analysis below appear much better. Also, note how the standard
errors are reduced for the parent education variables, grad_sch and col_grad.
This is because the high degree of collinearity caused the standard errors to be inflated.
With the multicollinearity eliminated, the coefficient for grad_sch, which
had been non-significant, is now significant.
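
The model without avg_ed, sketched:

regression
/statistics=defaults tol
/dependent api00
/method=enter acs_k3 grad_sch col_grad some_col.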

2.5 Checking Linearity

When we do linear regression, we assume that the relationship between the response
variable and the predictors is linear. If this
assumption is violated, the linear regression will try to fit a straight line to data that
do not follow a straight line. Checking the linearity assumption in the case of simple
regression is straightforward, since we only have one predictor. All we have to do is a
scatter plot between the response variable and the predictor to see if nonlinearity is
present, such as a curved band or a big wave-shaped curve. For example, let us
use a data file called nations.sav that has data about a number of
nations around the world. Let's look at the relationship between GNP per
capita (gnpcap) and births (birth). If we look at
the scatterplot between gnpcap and birth below, we can see that the
relationship between these two variables is quite non-linear. We added a
regression line to the chart by double clicking on it and choosing
"Chart" then "Options" and then "Fit Line Total"
and you can see how poorly the line fits this data. Also, if we look at the
residuals by predicted, we see that the residuals are not homoscedastic,
due to the non-linearity in the relationship between gnpcap and birth.
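
A sketch of the commands for this example; the file path is only an example.

get file = 'c:\spssreg\nations.sav'.
graph
/scatterplot(bivar)=gnpcap with birth.
regression
/dependent birth
/method=enter gnpcap
/scatterplot=(*zresid, *pred).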

We modified the above scatterplot changing the fit line from using linear
regression to using "lowess" by choosing "Chart" then
"Options" then choosing "Fit Options" and choosing "Lowess"
with the default smoothing parameters. As you can see, the "lowess"
smoothed curve fits substantially better than the linear regression, further
suggesting that the relationship between gnpcap and birth is not
linear.

We can see that the gnpcap scores are quite skewed with most values
being near 0, and a handful of values of 10,000 and higher. This suggests to us that some transformation of the variable
may be necessary. One commonly used transformation is a log transformation, so
let's try that. As you see, the scatterplot between gnpcap and birth
looks much better with the regression line going through the heart of the
data. Also, the plot of the residuals by predicted values looks much more
reasonable.
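
A sketch of the log transformation and the new plots; the variable name lgnpcap is only an example.

* Log-transform gnpcap and refit the model using the transformed predictor.
compute lgnpcap = ln(gnpcap).
execute.
graph
/scatterplot(bivar)=lgnpcap with birth.
regression
/dependent birth
/method=enter lgnpcap
/scatterplot=(*zresid, *pred).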

This section has shown how you can use scatterplots to diagnose problems of
non-linearity, both by looking at the scatterplots of the predictor and outcome
variable, as well as by examining the residuals by predicted values. These
examples have focused on simple regression, however similar techniques would be
useful in multiple regression. However, when using multiple regression, it
would be more useful to examine partial regression plots instead of the simple
scatterplots between the predictor variables and the outcome variable.

2.6 Model Specification

A model specification error can occur when one or more relevant variables are omitted
from the model or one or more irrelevant variables are included in the model. If relevant
variables are omitted from the model, the common variance they share with included
variables may be wrongly attributed to those variables, and the error term can
be inflated. On
the other hand, if irrelevant variables are included in the model, the common variance
they share with included variables may be wrongly attributed to them. Model specification
errors can substantially affect the estimate of regression coefficients.

Consider the model below. This regression suggests that as class size increases,
academic performance increases, with p=0.053. Before we publish results saying that increased class size
is associated with higher academic performance, let's check the model specification.

regression
/dependent api00
/method=enter acs_k3 full
/save pred(apipred).

<some output deleted to save space>

Coefficients(a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta        t         Sig.
1  (Constant)    32.213     84.075                         .383      .702
   ACS_K3        8.356      4.303              .080        1.942     .053
   FULL          5.390      .396               .564        13.598    .000

a  Dependent Variable: API00

SPSS does not have any tools that directly support the detection of specification
errors; however, you can check for omitted variables by using the procedure
below. As you noticed above, when we ran the regression we saved the
predicted value calling it apipred. If we use the predicted value
and the predicted value squared as predictors of the dependent variable, apipred
should be significant since it is the predicted value, but apipred squared
shouldn't be a significant predictor: if our model is specified correctly, the
squared predictions should not have much explanatory power above and beyond the
predicted value. In other words, we would not expect apipred squared to be a
significant predictor. Below we compute apipred2
as the squared value of apipred and then include apipred and apipred2
as predictors in our regression model, and we hope to find that apipred2
is not significant.
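
A sketch of the syntax for this check:

compute apipred2 = apipred**2.
execute.
regression
/dependent api00
/method=enter apipred apipred2.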

The above results show that apipred2 is significant, suggesting that
we may have omitted important variables in our regression. We therefore should
consider whether we should add any other variables to our model. Let's try adding the variable
meals to the above model. We see that meals is a significant
predictor, and we save the predicted value calling it preda for inclusion
in the next analysis for testing to see whether we have any additional important
omitted variables.
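
The syntax for the expanded model might look like this; preda is simply the name we give the newly saved predicted values.

regression
/dependent api00
/method=enter acs_k3 full meals
/save pred(preda).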

We now see that preda2 is not significant, so this test does not
suggest there are any other important omitted variables. Note that after including meals and full, the
coefficient for class size is no longer significant. While acs_k3 does have a
positive relationship with api00 when only full is included in the model,
when we also include (and hence control for) meals, acs_k3 is no
longer significantly related to api00 and its relationship with api00
is no longer positive.
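
A sketch of the follow-up check using preda:

compute preda2 = preda**2.
execute.
regression
/dependent api00
/method=enter preda preda2.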

2.7 Issues of Independence

The statement of this assumption is that the errors associated with one observation are not
correlated with the errors of any other observation. Violation of this
assumption can occur in a variety of situations.
Consider the case of collecting data from students in eight different elementary schools. It is
likely that the students within each school will tend to be more like one another than students
from different schools; that is, their errors are not independent.

Another way in which the assumption of independence can be broken is when data are collected on the
same variables over time. Let's say that we collect truancy data every semester for 12 years. In
this situation it is likely that the errors for observations between adjacent semesters will be
more highly correlated than for observations more separated in time -- this is known as
autocorrelation. When you have data that can be considered to be time-series you
can use the Durbin-Watson statistic to test for correlated residuals.

We don't have any time-series data, so we will use the elemapi2 dataset and
pretend that snum indicates the time at which the data were collected. We
will sort the data on snum to order the data according to our fake time
variable and then we can run the regression analysis with the durbin
option to request the Durbin-Watson test.
The Durbin-Watson statistic has a range from 0 to 4 with a midpoint of 2. The observed value in
our example is less than 2, which is not surprising since our data are not truly
time-series.
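
A sketch of the syntax for the Durbin-Watson test; the predictors shown (meals, ell and emer) are only an example of a model you might test.

sort cases by snum.
regression
/dependent api00
/method=enter meals ell emer
/residuals=durbin.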

2.8 Summary

This chapter has covered a variety of topics in assessing the assumptions of
regression using SPSS, and the consequences of violating these
assumptions. As we have seen, it is not sufficient to simply run a
regression analysis, but it is important to verify that the assumptions have
been met. If this verification stage is omitted and your data does not
meet the assumptions of linear regression, your results could be misleading and
your interpretation of your results could be in doubt. Without thoroughly
checking your data for problems, it is possible that another researcher could
analyze your data, uncover such problems, and question your results by showing an
improved analysis that contradicts your findings and undermines your
conclusions.

2.9 For more information

You can see the following web pages for more
information and resources on regression diagnostics in SPSS.