Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of UK Essays.

Introduction

The model to be developed explains real GDP in the UK. From such a series of real values, it is straightforward to calculate year-on-year growth of GDP.

Selection of variables

To model GDP, key factors identified by Easton (2004) include labour costs, the savings ratio, taxation, inflation and the terms of trade. However, many of these variables are not available for the required 40-year time span.

The variables eventually chosen and the justification were as follows:

GDP: the dependent variable, measured at 1950 prices. As GDP deflator figures were not available back to 1960, the eventual starting point of the analysis, the RPI inflation measure was used to convert the series into real prices.

EXIM: this variable is the sum of imports and exports, at constant 1950 prices. As a measure of trade volumes, EXIM would be expected to increase as GDP increases. The RPI deflator was also used for this series. Total trade was placed into one variable to abide by the constraint of no more than four independent variables.

Energy: energy consumption was calculated as production plus imports minus exports in tonnes of oil equivalent. As energy use increases, we would expect to see an increase in the proportion of GDP attributable to manufacturing.[1]

Labour: this variable is the total number of days lost through disputes. We would expect this variable to have a negative coefficient, since an increase in the number of days lost will lead to a reduction of GDP.
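The conversion to constant prices described for GDP and EXIM can be sketched as follows. This is a minimal illustration in Python, assuming the usual index-ratio form of deflation; the function name `to_real_prices` and every figure in it are hypothetical and are not taken from the data set used in this paper.

```python
def to_real_prices(nominal, price_index, base_year):
    """Deflate a nominal series to base-year prices using a price index.

    Both arguments are dicts keyed by year. The index may be quoted on any
    base, because only its ratio to the base-year value is used.
    """
    base = price_index[base_year]
    return {year: value * base / price_index[year]
            for year, value in nominal.items()}


# Hypothetical figures, for illustration only (not the essay's data).
nominal_gdp = {1950: 100.0, 1960: 180.0}
rpi = {1950: 10.0, 1960: 15.0}

real_gdp = to_real_prices(nominal_gdp, rpi, base_year=1950)
print(real_gdp)
```

In the base year the real and nominal values coincide, which is a quick check that the index has been applied the right way round.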

Scatter diagrams showing the relationship between the dependent variable GDP and each of the independent variables are shown in Appendix 1. These diagrams support each of the hypotheses outlined above.

Main results

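The regression equation produced by EViews, once the energy variable is excluded, is as follows (coefficients taken from the output table in this section):

GDP = -73223.22 + 1.062679 EXIM - 0.139105 LABOUR + 1.565374 POPN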

The adjusted R² is equal to 0.978; that is, 97.8% of the variation in GDP is accounted for by the variation in EXIM, LABOUR and POPN, after adjusting for the number of regressors.

Each of the coefficients of the three independent variables, EXIM, LABOUR and POPN, has a t-statistic sufficiently high to reject the null hypothesis that the coefficient is equal to zero (every p-value is at or below 0.001); in other words, each variable makes a significant contribution to the overall equation.

To test the overall fit of the equation, we use the F statistic. Its value of 704 similarly allows us to reject the hypothesis that the slope coefficients are simultaneously all equal to zero.

Dependent Variable: GDP
Method: Least Squares
Date: 04/15/08 Time: 09:10
Sample: 1960 2006
Included observations: 47

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -73223.22      23204.60      -3.155548      0.0029
EXIM        1.062679       0.117445      9.048297       0.0000
LABOUR      -0.139105      0.036951      -3.764585      0.0005
POPN        1.565374       0.443541      3.529270       0.0010

R-squared             0.980046     Mean dependent var       32813.25
Adjusted R-squared    0.978654     S.D. dependent var       10905.60
S.E. of regression    1593.331     Akaike info criterion    17.66631
Sum squared resid     1.09E+08     Schwarz criterion        17.82377
Log likelihood        -411.1582    F-statistic              703.9962
Durbin-Watson stat    0.746519     Prob(F-statistic)        0.000000
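As a cross-check on the output above, the adjusted R-squared and the F-statistic can both be recovered from the reported R-squared, the number of observations (47) and the number of slope coefficients (3). The sketch below uses the standard textbook formulas; the function names are illustrative, and small differences from the EViews figures are to be expected because the R-squared is only reported to six decimal places.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n observations and k slope coefficients."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2, n, k):
    """F-statistic for the null that all k slope coefficients are zero."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Values read from the regression output above.
n_obs, n_slopes, r_squared = 47, 3, 0.980046

adj_r2 = adjusted_r_squared(r_squared, n_obs, n_slopes)
f_stat = overall_f(r_squared, n_obs, n_slopes)
print(round(adj_r2, 6), round(f_stat, 2))
```

Both figures agree closely with the reported values of 0.978654 and 703.9962.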

The Akaike and Schwarz criteria are used principally to compare two or more models (a model with a lower value of either of these statistics is preferred). As we are analysing only one model here, we will not discuss these two further.

Using tables provided by Gujarati (2004), the upper and lower limits for the DW test are:

DL = 1.383 DU = 1.666

The DW statistic calculated by EViews is 0.746, which is below DL. This result leads us to reject the null hypothesis of no positive autocorrelation; in other words, there is evidence of positive first-order autocorrelation in the model. This is not a surprising result, given that we are dealing with increasing variables over time, and we shall examine the issue of autocorrelation in detail later on.
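The Durbin-Watson decision rule can be made explicit with a short sketch. The statistic is the sum of squared changes in successive residuals divided by the sum of squared residuals, and it is compared with the tabulated bounds DL and DU. The function names below are illustrative; the bounds and the statistic are the ones quoted above, since the residuals themselves are not reproduced in this paper.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def dw_decision(d, d_lower, d_upper):
    """Classify d against the tabulated lower and upper bounds."""
    if d < d_lower:
        return "positive autocorrelation"
    if d > 4 - d_lower:
        return "negative autocorrelation"
    if d_upper <= d <= 4 - d_upper:
        return "no evidence of autocorrelation"
    return "inconclusive"


# Statistic reported by EViews and bounds quoted from Gujarati (2004).
verdict = dw_decision(0.746519, d_lower=1.383, d_upper=1.666)
print(verdict)
```

A value near 2 indicates no first-order autocorrelation, values towards 0 indicate positive autocorrelation and values towards 4 indicate negative autocorrelation.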

Multicollinearity

Ideally, there should be little or no significant correlation between the independent variables; if two independent variables are perfectly correlated, then one variable is redundant and the OLS equations cannot be solved.

The correlation of variables table below shows that EXIM and POPN have a particularly high level of correlation (the removal of the ENERGY variable early on solved two other cases of multicollinearity).

It is important, however, to point out that multicollinearity does not violate any assumptions of the OLS process, and Gujarati points out that multicollinearity is a consequence of the data being observed (indeed, section 10.4 of his book is entitled “Multicollinearity: much ado about nothing?”).

Correlations of Variables

           GDP          EXIM         POPN         ENERGY
GDP        1.000000
EXIM       0.984644
POPN       0.960960     0.957558
ENERGY     0.835053     0.836279     0.914026
LABOUR     -0.380830    -0.320518    -0.259193    -0.166407
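One common way to gauge how serious a correlation of this size is, although it is not a calculation made in this paper, is the variance inflation factor (VIF). The full VIF for a regressor needs the R-squared from regressing it on all the other regressors, which is not reported here; the sketch below therefore computes only the lower bound implied by a single pairwise correlation. The function name is hypothetical.

```python
def vif_lower_bound(pairwise_r):
    """Lower bound on a regressor's VIF implied by one pairwise correlation.

    The true VIF is 1 / (1 - R_j^2), where R_j^2 comes from regressing the
    variable on all other regressors and is at least as large as any squared
    pairwise correlation, so this figure can only understate the true VIF.
    """
    return 1.0 / (1.0 - pairwise_r ** 2)


# Correlation between EXIM and POPN, read from the table above.
bound = vif_lower_bound(0.957558)
print(round(bound, 1))
```

The result is a little above 12, so the variance of the EXIM and POPN coefficient estimates is inflated at least twelvefold relative to the case of uncorrelated regressors.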

Analysis of Residuals

Overview

The following graph shows the relationship between actual, fitted and residual values. At first glance, the residuals appear to be reasonably well behaved; the values are not increasing over time and there are several points at which the residual switches from positive to negative. A more detailed tabular version of this graph may be found in Appendix 2.

Heteroscedasticity

To examine the issue of heteroscedasticity more closely, we will employ White’s test. As we are using a model with only three independent variables, we may use the version of the test which includes the cross-terms between the independent variables.

White Heteroskedasticity Test:

F-statistic      1.174056    Probability    0.339611
Obs*R-squared    10.44066    Probability    0.316002

Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Date: 04/16/08 Time: 08:24
Sample: 1960 2006
Included observations: 47

Variable       Coefficient    Std. Error    t-Statistic    Prob.
C              -2.99E+09      4.06E+09      -0.735744      0.4665
EXIM           -49439.98      45383.77      -1.089376      0.2830
EXIM^2         -0.175428      0.128496      -1.365249      0.1804
EXIM*LABOUR    -0.049223      0.047215      -1.042532      0.3039
EXIM*POPN      0.982165       0.879151      1.117174       0.2711
LABOUR         -18039.83      18496.29      -0.975322      0.3357
LABOUR^2       -0.018423      0.009986      -1.844849      0.0731
LABOUR*POPN    0.344698       0.336446      1.024526       0.3122
POPN           120773.0       157305.5      0.767761       0.4475
POPN^2         -1.217523      1.523271      -0.799282      0.4292

R-squared             0.222142     Mean dependent var       2322644.
Adjusted R-squared    0.032933     S.D. dependent var       3306810.
S.E. of regression    3251902.     Akaike info criterion    33.01368
Sum squared resid     3.91E+14     Schwarz criterion        33.40733
Log likelihood        -765.8215    F-statistic              1.174056
Durbin-Watson stat    1.306019     Prob(F-statistic)        0.339611

The 5% critical value for chi-squared with nine degrees of freedom is 16.919, whilst the computed value of White’s statistic is 10.44. As the computed value is below the critical value, we cannot reject the null hypothesis of homoscedasticity. We may therefore conclude that, on the basis of the White test, there is no evidence of heteroscedasticity.
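The arithmetic behind White's statistic can be reproduced from the auxiliary regression above. The statistic is the number of observations multiplied by the R-squared of the regression of the squared residuals on the regressors, their squares and their cross-products, and the degrees of freedom equal the number of those terms excluding the constant. The sketch below is illustrative only; the variable names are assumptions, and the critical value is the one quoted in the text rather than one computed here.

```python
# Terms in the auxiliary regression, excluding the constant.
aux_terms = ["EXIM", "EXIM^2", "EXIM*LABOUR", "EXIM*POPN", "LABOUR",
             "LABOUR^2", "LABOUR*POPN", "POPN", "POPN^2"]

n_obs = 47                 # included observations
aux_r_squared = 0.222142   # R-squared of the auxiliary regression
critical_5pct = 16.919     # chi-squared, 9 d.f., as quoted in the text

degrees_of_freedom = len(aux_terms)
white_stat = n_obs * aux_r_squared
reject_homoscedasticity = white_stat > critical_5pct

print(degrees_of_freedom, round(white_stat, 2), reject_homoscedasticity)
```

The product reproduces the Obs*R-squared figure reported by EViews, and the comparison with the critical value gives the same conclusion as the text.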

Autocorrelation

Autocorrelation exists in the model if the residuals are correlated with one another. In the context of a time series, we are particularly interested to see whether each residual is related to the residuals that precede it.

To choose the number of lags to examine, Gujarati’s rule of thumb of using between a quarter and a third of the length of the time series was applied. With 47 observations this suggests roughly 12 to 16 lags; in this particular case, a lag of 15 was selected.

Date: 04/16/08 Time: 08:05
Sample: 1960 2006
Included observations: 47

Autocorrelation    Partial Correlation    Lag    AC        PAC       Q-Stat    Prob
. |**** |          . |**** |              1      0.494     0.494     12.234    0.000
. |*** |           . |** |                2      0.423     0.237     21.409    0.000
. |*. |            .*| . |                3      0.155     -0.171    22.669    0.000
. | . |            .*| . |                4      0.007     -0.145    22.672    0.000
.*| . |            .*| . |                5      -0.109    -0.069    23.319    0.000
**| . |            .*| . |                6      -0.244    -0.160    26.674    0.000
**| . |            . | . |                7      -0.194    0.037     28.845    0.000
**| . |            . | . |                8      -0.202    -0.004    31.247    0.000
**| . |            .*| . |                9      -0.226    -0.162    34.344    0.000
**| . |            .*| . |                10     -0.269    -0.186    38.859    0.000
.*| . |            . |*. |                11     -0.134    0.122     40.013    0.000
.*| . |            . | . |                12     -0.079    0.047     40.428    0.000
.*| . |            .*| . |                13     -0.078    -0.151    40.837    0.000
. | . |            . | . |                14     0.013     0.029     40.849    0.000
. | . |            . | . |                15     0.041     0.018     40.970    0.000

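The Q statistics have p-values of 0.000 at every lag up to 15, so we reject the null hypothesis that the residuals are free of autocorrelation. This pattern is consistent with the underlying data being nonstationary; in other words, the mean and standard deviation of the data do indeed vary over time. This is not a surprising result, given growth in the UK’s economy and population since 1960.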
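The Q-Stat column can be approximately rebuilt from the AC column. The sketch below assumes that the reported Q statistic is the Ljung-Box form, Q = n(n+2) Σ r_k² / (n − k), cumulated over lags; that assumption, and the function name, are not stated in the output itself. Because the autocorrelations are only printed to three decimal places, the reconstruction matches the reported figures to within rounding error rather than exactly.

```python
def ljung_box_q(autocorrelations, n_obs):
    """Cumulative Ljung-Box Q statistics for lags 1..len(autocorrelations)."""
    q_values = []
    running_sum = 0.0
    for lag, r in enumerate(autocorrelations, start=1):
        running_sum += r * r / (n_obs - lag)
        q_values.append(n_obs * (n_obs + 2) * running_sum)
    return q_values


# AC column of the correlogram above, lags 1 to 15.
ac = [0.494, 0.423, 0.155, 0.007, -0.109, -0.244, -0.194, -0.202,
      -0.226, -0.269, -0.134, -0.079, -0.078, 0.013, 0.041]

q_stats = ljung_box_q(ac, n_obs=47)
print(round(q_stats[0], 2), round(q_stats[-1], 2))
```

The first and last values come out close to the reported 12.234 and 40.970.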

A further test available to test for autocorrelation is the Breusch-Godfrey test. The results of this test on the model are detailed below.

Breusch-Godfrey Serial Correlation LM Test:

F-statistic      15.53618    Probability    0.000010
Obs*R-squared    20.26299    Probability    0.000040

Test Equation:
Dependent Variable: RESID
Method: Least Squares
Date: 04/16/08 Time: 09:23
Presample missing value lagged residuals set to zero.

Variable     Coefficient    Std. Error    t-Statistic    Prob.
C            9294.879       18204.51      0.510581       0.6124
EXIM         0.047292       0.092176      0.513065       0.6107
LABOUR       0.039181       0.031072      1.260967       0.2144
POPN         -0.182287      0.348222      -0.523479      0.6035
RESID(-1)    0.788084       0.154144      5.112655       0.0000
RESID(-2)    -0.180226      0.160485      -1.123009      0.2680

R-squared             0.431127     Mean dependent var       0.000100
Adjusted R-squared    0.361753     S.D. dependent var       1540.499
S.E. of regression    1230.710     Akaike info criterion    17.18731
Sum squared resid     62100572     Schwarz criterion        17.42350
Log likelihood        -397.9019    F-statistic              6.214475
Durbin-Watson stat    1.734584     Prob(F-statistic)        0.000225

We can observe from the results above that RESID(-1) has a high t value (5.11, with a p-value of 0.0000). In other words, we would reject the hypothesis of no first-order autocorrelation. By contrast, the coefficient on RESID(-2) is not significant (t = -1.12, p-value 0.268), so second-order autocorrelation does not appear to be present in the model.
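The joint test reported at the top of the Breusch-Godfrey output can also be checked by hand. With two lagged residuals, the LM statistic (observations multiplied by the R-squared of the test equation) is compared with a chi-squared distribution with two degrees of freedom, for which the upper-tail probability has the simple closed form exp(-x/2). The sketch below relies on that closed form, which holds only for two degrees of freedom; the function name is illustrative.

```python
import math


def chi2_upper_tail_2df(x):
    """P(X > x) for a chi-squared variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2.0)


n_obs = 47                  # observations in the test equation
test_r_squared = 0.431127   # R-squared of the test equation

lm_stat = n_obs * test_r_squared
p_value = chi2_upper_tail_2df(lm_stat)
print(round(lm_stat, 3), f"{p_value:.6f}")
```

Both figures agree with the Obs*R-squared line of the output (20.26299, probability 0.000040), confirming that the hypothesis of no serial correlation up to order two is rejected.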

Overcoming serial correlation

A method to overcome the problem of nonstationarity is to take the first difference of the dependent variable (i.e. GDPyear1 – GDPyear0). An initial attempt to improve the equation by using this differencing method produced a very poor result, as can be seen below: the adjusted R-squared falls to 0.130 and none of the individual coefficients is significant at the 5% level.

Dependent Variable: GDPDIFF
Method: Least Squares
Date: 04/16/08 Time: 08:17
Sample: 1961 2006
Included observations: 46

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           14037.58       12694.29      1.105818       0.2753
EXIM        0.084287       0.052601      1.602398       0.1167
ENERGY      0.011470       0.011710      0.979487       0.3331
LABOUR      -0.004251      0.014304      -0.297230      0.7678
POPN        -0.300942      0.265082      -1.135279      0.2629

R-squared             0.207408     Mean dependent var       816.6959
Adjusted R-squared    0.130082     S.D. dependent var       657.1886
S.E. of regression    612.9557     Akaike info criterion    15.77678
Sum squared resid     15404304     Schwarz criterion        15.97555
Log likelihood        -357.8660    F-statistic              2.682255
Durbin-Watson stat    1.401626     Prob(F-statistic)        0.044754
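The differencing step itself is simple to express. The sketch below is a minimal illustration with hypothetical values, not the GDP series used in this paper; `first_difference` is an assumed helper name. It also shows why the differenced regression above has 46 observations (1961 to 2006) rather than 47: the first year has no predecessor to subtract.

```python
def first_difference(series):
    """Return the year-on-year changes x[t] - x[t-1] of a series."""
    return [current - previous
            for previous, current in zip(series, series[1:])]


# Hypothetical levels, for illustration only.
gdp_levels = [100.0, 103.0, 107.0, 112.0]
gdp_diff = first_difference(gdp_levels)
print(gdp_diff)

# A series of 47 annual observations yields 46 differences.
n_differences = len(first_difference(list(range(47))))
print(n_differences)
```

The same function could be applied to the independent variables if the whole equation, rather than only the dependent variable, were to be estimated in differences.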

Forecasting

The forecasts for the independent variables are based on Kirby (2008) and are presented below.

The calculation of EXIM for future years was based upon separate growth rates for exports (47% of the 2006 total) and imports (53%). The two streams were added together to give EXIM, which was then used to produce the GDP figure at 1950 prices, from which year-on-year increases in GDP could be calculated. The results of the forecast are shown below.

The 2008 figure was felt to be particularly unrealistic, so a sensitivity test was applied to EXIM (population growth is relatively certain in the short term and calculating a forecast of labour days lost is a particularly difficult challenge).

Instead of EXIM growing by an average of 1.7% per annum during the forecast period, its growth was constrained to 0.7%. As we can see from the “GDP2” column, forecast GDP growth is significantly lower in 2008 and 2009 as a result.
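The mechanics of the sensitivity test can be sketched as follows. Exports and imports are projected separately from their shares of the base-year total and then summed to give EXIM; because the model is linear, holding LABOUR and POPN fixed, the gap between two GDP forecasts equals the EXIM coefficient multiplied by the gap in EXIM. The base level of EXIM below is hypothetical, the function name is an assumption, and applying one growth rate to both trade streams is a simplification made only for this illustration.

```python
BETA_EXIM = 1.062679  # EXIM coefficient from the main regression


def project_exim(base_exim, export_growth, import_growth, years,
                 export_share=0.47):
    """Project exports and imports separately and return their yearly sum."""
    exports = base_exim * export_share
    imports = base_exim * (1.0 - export_share)
    path = []
    for _ in range(years):
        exports *= 1.0 + export_growth
        imports *= 1.0 + import_growth
        path.append(exports + imports)
    return path


base_exim = 30000.0  # hypothetical base-year level, not the essay's figure

central = project_exim(base_exim, 0.017, 0.017, years=3)
constrained = project_exim(base_exim, 0.007, 0.007, years=3)

# Difference in fitted GDP between the two scenarios, other inputs unchanged.
gdp_gap = [BETA_EXIM * (c - s) for c, s in zip(central, constrained)]
print([round(gap, 1) for gap in gdp_gap])
```

The gap widens each year because the two EXIM paths diverge, which is why the constrained scenario lowers forecast GDP by more in later years.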

Critical evaluation of the econometric approach to model building and forecasting

GDP is dependent on many factors, many of which were excluded from this analysis due to the unavailability of data covering forty years. Although the main regression results appear highly significant, there are many activities which should be trialled to try to improve the approach:

– a shorter time series with more available variables: using a shorter time series would enable a more intuitive set of variables to be trialled. For example, labour days lost is effectively a surrogate for productivity and cost per labour hour, but data on these are unavailable over 40 years;

– transformation of variables: a logarithmic or other transformation should be trialled to ascertain if some of the problems observed, such as autocorrelation, could be mitigated to any extent. The other, more relevant transformation is to undertake differencing of the data to remove autocorrelation; the one attempt made in this paper was particularly unsuccessful!
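The two transformations suggested above can be combined: the first difference of the logarithm of a series is approximately its year-on-year growth rate, so a regression on log differences models growth directly. The sketch below uses hypothetical values and assumed function names purely to illustrate how close the approximation is for growth rates of a few per cent.

```python
import math


def growth_rates(series):
    """Exact year-on-year proportional growth, x[t] / x[t-1] - 1."""
    return [current / previous - 1.0
            for previous, current in zip(series, series[1:])]


def log_differences(series):
    """First differences of the natural log, ln(x[t]) - ln(x[t-1])."""
    return [math.log(current) - math.log(previous)
            for previous, current in zip(series, series[1:])]


# Hypothetical levels growing by 2% and then 3%.
levels = [100.0, 102.0, 105.06]

exact = growth_rates(levels)
approx = log_differences(levels)
print([round(g, 4) for g in exact], [round(g, 4) for g in approx])
```

The approximation deteriorates as growth rates become larger, so it is best suited to series such as real GDP whose annual changes are modest.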