Abstract

The grey prediction model with convolution integral, GMC (1, n), is a multivariable grey model with an exact solution. To further improve prediction accuracy and better describe the relationship between cause and effect, we introduce nonlinear parameters into the GMC (1, n) model and retain the convolution integral, producing an improved forecasting model designated NGMC (1, n). The structure parameters of the model are estimated by the least-squares method, and the convolution integral is used to obtain an exact solution of the improved grey model. A nonlinear optimisation then takes the nonlinear parameters as decision variables, with the objective of minimising forecasting errors. The GMC (1, 2) and NGMC (1, 2) models are used to predict China’s industrial SO2 emissions, with economic output as the influencing factor. The results indicate that NGMC (1, 2) can effectively describe the nonlinear relationship between China’s economic output and SO2 emissions, with higher accuracy than the existing GMC (1, 2) model.

1. Introduction

Because the internal and external environmental factors of a real system are complex, its behavioural data are often sparse. Statistical analysis can address the problem effectively where datasets are large; for small sample sizes, a solution becomes difficult. Grey system theory, pioneered by Deng in 1982 [1], is an uncertainty theory dealing specifically with the analysis, modelling, prediction, and control of information-poor systems. Grey theory holds that although the objective system appears complex, with sparse data, it always has an innate overall governing relationship: the key to accurate forecasting is the choice of appropriate methods of data mining and utilisation. In grey theory [2, 3], every stochastic process has grey variables changing within a certain amplitude and periodicity. An accumulated generating operation (AGO) is a basic method to render a grey process white. With accumulation, the changing trend becomes more apparent and the innate governing relationship is revealed. On this basis, a grey differential equation is built to describe the dynamic law of the accumulated series. An inverse accumulated generating operation (IAGO) is then used to restore the modelled values to the scale of the original series. More accumulation does not, however, necessarily lead to better forecasts: Tien demonstrated that higher-order AGO data of the original series cannot be used as intermediate information in grey prediction model building [4]. In practical applications, the system generating the real data is considered grey; on that basis, grey models can serve both as simulators of the past and as predictors of the future.
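The accumulation and its inverse can be illustrated with a minimal Python sketch. The function names `ago` and `iago` and the sample series are illustrative assumptions, not taken from the cited works; the sketch only shows that first-order accumulation is a running sum and that the inverse operation recovers the original series by first differences.

```python
from itertools import accumulate


def ago(series):
    """1-AGO: running (cumulative) sum of the original series."""
    return list(accumulate(series))


def iago(accumulated):
    """1-IAGO: first differences, keeping the first entry unchanged."""
    return [accumulated[0]] + [
        later - earlier for earlier, later in zip(accumulated, accumulated[1:])
    ]


# Made-up illustrative data.
raw = [2.0, 3.0, 1.0, 4.0]
acc = ago(raw)       # the accumulated series is monotone for positive data
restored = iago(acc)  # applying 1-IAGO returns the original series
print(acc, restored)
```

For positive data the accumulated series is monotonically increasing, which is why its trend is easier to model than that of the raw series.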

In general, a grey model can be denoted by GM (m, n), where m is the order and n is the number of variables of the grey equation. GM (1, 1) is the most widely used grey prediction model and has the simplest structure [2]. Its core principle is to regard real systems as generalised energy systems with exponentially changing trends in the absence of any external influence. Therefore, accurate prediction by GM (1, 1) can be achieved only when the 1-AGO data of the original series are consistent with this exponential change. Many scholars have made useful improvements to the basic GM (1, 1) model [5–8]. Even so, GM (1, 1) is not appropriate for series that are themselves subject to external influences. The GM (1, n) model, with relative factors acting as the associated series, is a multivariable grey prediction model [2, 3]. This model can fully exploit the information provided by the associated series, unlike a GM (1, 1) model, which includes only information about the predicted series in its modelling process. From the point of view of supplementary information, the prediction accuracy of GM (1, n) should therefore be higher than that of GM (1, 1) under these conditions. However, the solution of the whitening differential equation of GM (1, n) is inexact and on occasion wrong, which produces significant practical forecasting errors [4]. The model has therefore found few real-world applications to date [9–11].

The grey prediction model with convolution integral, GMC (1, n), proposed by Tien [12], is a new model that improves upon the traditional GM (1, n). The values modelled by GMC (1, n) are theoretically the exact solution of the traditional GM (1, n) model, and a grey control parameter like that of GM (1, 1) is introduced into the model. GMC (1, n) reduces to GM (1, 1) for n = 1. The GMC (1, n) model improved the prediction accuracy of multivariable grey models and has been applied successfully in different areas [12–14]. Tien proposed three improved models over the basic GMC (1, n) model to meet various application requirements: the deterministic model DGDMC (1, n) [15], the interval model IGDMC (1, n) [16], and the first pair-of-data model FGMC (1, n) [17]. The first derivative of the 1-AGO data of each associated series is introduced into the DGDMC (1, n) model [15] to strengthen the indicative significance, while its 1-AGO predicted data are evaluated by a convolution integral. The IGDMC (1, n) model [16] borrows the interval prediction method of linear regression, extending the prediction method of DGDMC (1, n) from point to interval prediction. Certain components of the model may be removed to better satisfy hypothesis tests based upon system parameters. The building of the improved model FGMC (1, n) [17] only needs pairs of historical data: the information in the first pair of entries was shown to have no effect on the modelling results of GMC (1, n). FGMC (1, n) is therefore usually more satisfactory and stable than GMC (1, n), because it extracts the first pair of entries and bases its predictions on the information contained in the original series.

The existing GMC (1, n) model and its variants are linear: their high-precision predictions are predicated upon a similarity in trend between the predicted variables and their influencing factors. At present, most applications meet this premise. For improved predictions, Wu and Chen [13] used grey relational analysis to analyse the similarities between the predicted and associated series curves over differing periods. They then built GMC (1, n) using data from the period with the largest grey relational grade and obtained satisfactory predictions. However, nonlinear interaction between internal and external system factors is ubiquitous in reality, and the linear structure of these GMC (1, n) models is not conducive to high-precision prediction of such dynamic systems. Therefore, the relative factors on the right-hand side of the GMC (1, n) equation are given a nonlinear form here. A power exponent is added to the 1-AGO series of every associated series to reflect the nonlinear interaction of the associated series with the predicted series. With these newly added exponents treated as given, the structure parameters are estimated by the least-squares method. After the time response function of the predicted series is obtained, the optimum values of the exponents are derived by minimising the modelling error. This improved GMC (1, n) model, a nonlinear grey prediction model with convolution integral, is hereafter designated NGMC (1, n).

In recent years, industrial SO2 emissions have become a major source of harmful air pollution in China as its industrial growth continues. To manage these emissions, an enhanced understanding of likely future industrial SO2 emissions is needed. Although China’s industrial output values can be used as a basis to interpret industrial emissions, emissions do not increase in step with industrial development [18]. Data also show that China’s SO2 emissions and industrial output values are governed not by a simple linear relationship but by an uncertain nonlinear one. We present GMC (1, 2) and NGMC (1, 2) models in which the influencing factor is industrial output and the predicted series is China’s SO2 emissions. A comparison of their modelling and prediction accuracy demonstrates the effectiveness of the proposed NGMC (1, n) model.

The remainder of this paper is organized as follows. Section 2 introduces the existing GMC (1, n) model and presents the modelling method of the proposed nonlinear grey model with convolution integral, NGMC (1, n). Section 3 demonstrates the effectiveness of the proposed NGMC (1, n) model by forecasting China’s SO2 emissions. Finally, the paper concludes with some comments in Section 4.

2. Modelling Method

2.1. The Existing GMC (1, n) Model

Assume that pairs of observations of $n-1$ inputs and one output are available at successive time intervals from some dynamic system. The existing GMC (1, n) modelling process [12] is carried out as follows.

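Consider the following original predicted series:
$$X_1^{(0)} = \left\{X_1^{(0)}(1+r_p),\, X_1^{(0)}(2+r_p),\, \ldots,\, X_1^{(0)}(r+r_p)\right\} \tag{1}$$
and the original associated series:
$$X_i^{(0)} = \left\{X_i^{(0)}(1),\, X_i^{(0)}(2),\, \ldots,\, X_i^{(0)}(r),\, \ldots,\, X_i^{(0)}(r+r_f)\right\}, \quad i = 2, 3, \ldots, n; \tag{2}$$
then the first-order accumulated generation (1-AGO) data for $X_1^{(0)}$ and $X_i^{(0)}$ are given by (3) and (4), respectively:
$$X_1^{(1)}(t+r_p) = \sum_{k=1}^{t} X_1^{(0)}(k+r_p), \quad t = 1, 2, \ldots, r, \tag{3}$$
$$X_i^{(1)}(t) = \sum_{k=1}^{t} X_i^{(0)}(k), \quad t = 1, 2, \ldots, r+r_f. \tag{4}$$
The grey prediction model based on the predicted 1-AGO series:
$$X_1^{(1)} = \left\{X_1^{(1)}(1+r_p),\, X_1^{(1)}(2+r_p),\, \ldots,\, X_1^{(1)}(r+r_p)\right\} \tag{5}$$
and the associated 1-AGO series:
$$X_i^{(1)} = \left\{X_i^{(1)}(1),\, X_i^{(1)}(2),\, \ldots,\, X_i^{(1)}(r),\, \ldots,\, X_i^{(1)}(r+r_f)\right\}, \quad i = 2, 3, \ldots, n, \tag{6}$$
is given by the differential equation:
$$\frac{dX_1^{(1)}(t+r_p)}{dt} + b_1 X_1^{(1)}(t+r_p) = b_2 X_2^{(1)}(t) + b_3 X_3^{(1)}(t) + \cdots + b_n X_n^{(1)}(t) + u, \quad t = 1, 2, \ldots, r+r_f, \tag{7}$$
where $b_1, b_2, \ldots, b_n$ and $u$ are parameters to be estimated, $r$ is the number of data used in model building, $r_p$ is the delay period, and $r_f$ is the number of entries to be forecast. Equation (7) is called the $n$-factor grey prediction model with convolution integral and is denoted by GMC (1, n) [12]; the 1 represents the first-order derivative of the 1-AGO series of $X_1^{(0)}$, and $n$ represents the total number of relative series introduced into the grey differential equation.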

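The grey derivative for the first-order grey differential equation with 1-AGO is represented as follows:
$$\frac{dX_1^{(1)}(t+r_p)}{dt} = \lim_{\Delta t \to 0} \frac{X_1^{(1)}(t+r_p+\Delta t) - X_1^{(1)}(t+r_p)}{\Delta t}, \tag{8}$$
and when $\Delta t \to 1$,
$$\frac{dX_1^{(1)}(t+r_p)}{dt} \approx X_1^{(1)}(t+r_p+1) - X_1^{(1)}(t+r_p) = X_1^{(0)}(t+r_p+1). \tag{9}$$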

In the determination of model parameters by GMC (1, n), the background value of the grey derivative is taken as the mean of $X_1^{(1)}(t+r_p)$ and $X_1^{(1)}(t+r_p+1)$, and those of the associated series are likewise taken as the mean of $X_i^{(1)}(t)$ and $X_i^{(1)}(t+1)$ for $i = 2, 3, \ldots, n$.

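The least-squares solution to the model parameters of GMC (1, n) [12] in (7), for $t$ from 1 to $r-1$, is
$$\left[b_1, b_2, \ldots, b_n, u\right]^{T} = \left(B^{T}B\right)^{-1}B^{T}Y_R, \tag{10}$$
where
$$B = \begin{bmatrix}
-\tfrac{1}{2}\left(X_1^{(1)}(r_p+1)+X_1^{(1)}(r_p+2)\right) & \tfrac{1}{2}\left(X_2^{(1)}(1)+X_2^{(1)}(2)\right) & \cdots & \tfrac{1}{2}\left(X_n^{(1)}(1)+X_n^{(1)}(2)\right) & 1 \\
-\tfrac{1}{2}\left(X_1^{(1)}(r_p+2)+X_1^{(1)}(r_p+3)\right) & \tfrac{1}{2}\left(X_2^{(1)}(2)+X_2^{(1)}(3)\right) & \cdots & \tfrac{1}{2}\left(X_n^{(1)}(2)+X_n^{(1)}(3)\right) & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
-\tfrac{1}{2}\left(X_1^{(1)}(r_p+r-1)+X_1^{(1)}(r_p+r)\right) & \tfrac{1}{2}\left(X_2^{(1)}(r-1)+X_2^{(1)}(r)\right) & \cdots & \tfrac{1}{2}\left(X_n^{(1)}(r-1)+X_n^{(1)}(r)\right) & 1
\end{bmatrix}, \tag{11}$$
$$Y_R = \left[X_1^{(0)}(r_p+2),\, X_1^{(0)}(r_p+3),\, \ldots,\, X_1^{(0)}(r_p+r)\right]^{T}. \tag{12}$$
In summary, the right-hand side of (7), the discrete function $f(t)$ [12], can be obtained as
$$f(t) = \sum_{i=2}^{n} b_i X_i^{(1)}(t) + u. \tag{13}$$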
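The 1-AGO modelling values of the predicted series [12] can be derived with the initial condition $\hat{X}_1^{(1)}(1+r_p) = X_1^{(0)}(1+r_p)$ as
$$\hat{X}_1^{(1)}(t+r_p) = X_1^{(0)}(1+r_p)\,e^{-b_1(t-1)} + \int_{1}^{t} e^{-b_1(t-\tau)} f(\tau)\,d\tau. \tag{14}$$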
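The modelling values, $\hat{X}_1^{(1)}(t+r_p)$, in (14) of the 1-AGO data of the predicted series [12] can be evaluated approximately by
$$\hat{X}_1^{(1)}(t+r_p) = X_1^{(0)}(1+r_p)\,e^{-b_1(t-1)} + u(t-2)\sum_{\tau=2}^{t}\left\{e^{-b_1\left(t-\tau+\frac{1}{2}\right)}\cdot\frac{1}{2}\left[f(\tau)+f(\tau-1)\right]\right\}, \tag{16}$$
where $u(t-2)$ is the unit step function.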

Applying 1-IAGO to (16) yields the following modelled values together with the forecasts:

Assume the system parameters in (7) to be constants in the postsampling period and then by using the postsampling data, combined with the given data for the corresponding associated series, as a new input series, the corresponding forecasts or values of indirect measurement for the predicted series can be derived.
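The parameter estimation step for the two-variable case can be illustrated with a minimal Python sketch. It assumes a zero delay period, series of equal length, and background values taken as means of neighbouring 1-AGO entries; the function names `solve` and `gmc12_estimate` and the sample data are hypothetical and are not taken from [12]. The normal equations are solved by plain Gaussian elimination to keep the sketch dependency-free.

```python
from itertools import accumulate


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gmc12_estimate(x1, x2):
    """Least-squares estimates (b1, b2, u) for a GMC(1,2)-style model, zero delay assumed."""
    x1_ago = list(accumulate(x1))
    x2_ago = list(accumulate(x2))
    rows, targets = [], []
    for t in range(len(x1) - 1):
        z1 = 0.5 * (x1_ago[t] + x1_ago[t + 1])  # background value, predicted series
        z2 = 0.5 * (x2_ago[t] + x2_ago[t + 1])  # background value, associated series
        rows.append([-z1, z2, 1.0])
        targets.append(x1[t + 1])  # grey derivative approximated by the next raw entry
    # Normal equations (B^T B) p = B^T Y.
    btb = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    bty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    return tuple(solve(btb, bty))


# Made-up illustrative series, not the data used in Section 3.
b1, b2, u = gmc12_estimate([10.0, 11.0, 11.5, 11.2, 10.4, 9.8],
                           [5.0, 6.0, 7.5, 9.0, 11.0, 13.0])
print(b1, b2, u)
```

With more than one associated series, each additional factor would contribute one more column of background values to the design matrix.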

2.2. Nonlinear Grey Modelling with Convolution Integral: NGMC (1, n)

To adapt GMC (1, n) to the nonlinearity that is ubiquitous in real systems, the improved model NGMC (1, n), which retains the convolution integral, is developed here.

2.2.1. Representation of the New Model: NGMC (1, n)

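Suppose that the grey prediction model based on the predicted 1-AGO series:
$$X_1^{(1)} = \left\{X_1^{(1)}(1+r_p),\, X_1^{(1)}(2+r_p),\, \ldots,\, X_1^{(1)}(r+r_p)\right\} \tag{19}$$
and the associated 1-AGO series:
$$X_i^{(1)} = \left\{X_i^{(1)}(1),\, X_i^{(1)}(2),\, \ldots,\, X_i^{(1)}(r),\, \ldots,\, X_i^{(1)}(r+r_f)\right\}, \quad i = 2, 3, \ldots, n, \tag{20}$$
is given by the differential equation:
$$\frac{dX_1^{(1)}(t+r_p)}{dt} + b_1 X_1^{(1)}(t+r_p) = b_2\left(X_2^{(1)}(t)\right)^{\gamma_2} + b_3\left(X_3^{(1)}(t)\right)^{\gamma_3} + \cdots + b_n\left(X_n^{(1)}(t)\right)^{\gamma_n} + u, \quad t = 1, 2, \ldots, r+r_f, \tag{21}$$
where $b_1$, $b_i$, $\gamma_i$ ($i = 2, 3, \ldots, n$), and $u$ are parameters to be estimated, $r$ is the number of data used in model building, $r_p$ is the delay period, and $r_f$ is the number of entries to be forecast. Equation (21) denotes the $n$-factor nonlinear grey prediction model with convolution integral, NGMC (1, n), where 1 represents the first-order derivative of the 1-AGO series of $X_1^{(0)}$ and $n$ represents the total number of relative series introduced into the grey differential equation.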

Compared with GMC (1, n), $n-1$ unknown parameters $\gamma_i$ ($i = 2, 3, \ldots, n$) are introduced into the NGMC (1, n) model. They are the power exponents of the relevant factors of the predicted variable and reflect the nonlinear effect of these factors on system behaviour. The model has three special cases. When $\gamma_i = 1$ for all $i$, (21) reduces to the traditional GMC (1, n) model [12]. When $\gamma_i = 1$ and $u = 0$, (21) reduces to the traditional GM (1, n) model [2, 4]. When $n = 1$ or $b_i = 0$ ($i = 2, 3, \ldots, n$), (21) reduces to the traditional GM (1, 1) model [2].

2.2.2. The Evaluation of System Parameters $b_1$, $b_i$, and $u$

To evaluate all parameters in the model, the power exponents $\gamma_i$ of $X_i^{(1)}(t)$ are first assumed to be known; the corresponding parameters $b_1$, $b_i$, and $u$ are then estimated by a method similar to that of GMC (1, n). After the time response function of $X_1^{(1)}(t+r_p)$ is obtained, an optimisation algorithm is applied to solve for $\gamma_i$.

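The grey derivative for the first-order grey differential equation with 1-AGO is conventionally represented by
$$\frac{dX_1^{(1)}(t+r_p)}{dt} \approx X_1^{(1)}(t+r_p+1) - X_1^{(1)}(t+r_p) = X_1^{(0)}(t+r_p+1). \tag{22}$$
When determining the model parameters by NGMC (1, n), the background value of the grey derivative is taken as the mean of $X_1^{(1)}(t+r_p)$ and $X_1^{(1)}(t+r_p+1)$, and those of the associated series are likewise taken as the mean of $\left(X_i^{(1)}(t)\right)^{\gamma_i}$ and $\left(X_i^{(1)}(t+1)\right)^{\gamma_i}$ for $i = 2, 3, \ldots, n$.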

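The least-squares solution to the model parameters of NGMC (1, n) in (21), for $t$ from 1 to $r-1$, is
$$\left[b_1, b_2, \ldots, b_n, u\right]^{T} = \left(B^{T}B\right)^{-1}B^{T}Y_R, \tag{23}$$
where
$$B = \begin{bmatrix}
-\tfrac{1}{2}\left(X_1^{(1)}(r_p+1)+X_1^{(1)}(r_p+2)\right) & \tfrac{1}{2}\left(\left(X_2^{(1)}(1)\right)^{\gamma_2}+\left(X_2^{(1)}(2)\right)^{\gamma_2}\right) & \cdots & \tfrac{1}{2}\left(\left(X_n^{(1)}(1)\right)^{\gamma_n}+\left(X_n^{(1)}(2)\right)^{\gamma_n}\right) & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
-\tfrac{1}{2}\left(X_1^{(1)}(r_p+r-1)+X_1^{(1)}(r_p+r)\right) & \tfrac{1}{2}\left(\left(X_2^{(1)}(r-1)\right)^{\gamma_2}+\left(X_2^{(1)}(r)\right)^{\gamma_2}\right) & \cdots & \tfrac{1}{2}\left(\left(X_n^{(1)}(r-1)\right)^{\gamma_n}+\left(X_n^{(1)}(r)\right)^{\gamma_n}\right) & 1
\end{bmatrix}, \tag{24}$$
$$Y_R = \left[X_1^{(0)}(r_p+2),\, X_1^{(0)}(r_p+3),\, \ldots,\, X_1^{(0)}(r_p+r)\right]^{T}. \tag{25}$$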

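In summary, the right-hand side of (21), the discrete function $f(t)$, can be obtained as
$$f(t) = \sum_{i=2}^{n} b_i \left(X_i^{(1)}(t)\right)^{\gamma_i} + u. \tag{26}$$

As seen in (26), $f(t)$ is a nonlinear function of $X_i^{(1)}(t)$, $i = 2, 3, \ldots, n$, but it is, in essence, still linear once the exponents are fixed. If the new variable $Y_i(t) = \left(X_i^{(1)}(t)\right)^{\gamma_i}$, $i = 2, 3, \ldots, n$, is introduced, (26) can be written as
$$f(t) = \sum_{i=2}^{n} b_i Y_i(t) + u; \tag{27}$$
therefore, (27) remains a linear function of $Y_i(t)$, essentially the same as (13).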

2.2.3. The Determination of Unit Impulse Response Function

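The unit impulse response function of the system characterised by (21) can be derived by Laplace transform. From (21):
$$\frac{dh(t)}{dt} + b_1 h(t) = \delta(t), \tag{28}$$
where $\delta(t)$ is the unit impulse function and $h(t)$ is the corresponding unit impulse response function. Applying a Laplace transform to (28) with the initial condition $h(0^-) = 0$ gives
$$sH(s) + b_1 H(s) = 1 \tag{29}$$
or
$$H(s) = \frac{1}{s+b_1}. \tag{30}$$
The inverse transform of $H(s)$ is
$$\mathcal{L}^{-1}\left[H(s)\right] = e^{-b_1 t}. \tag{31}$$
That is, the unit impulse response function of the system is
$$h(t) = e^{-b_1 t}. \tag{32}$$
The process continues by outlining the evaluation of $\hat{X}_1^{(1)}(t+r_p)$.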

2.2.4. The Evaluation of $\hat{X}_1^{(1)}(t+r_p)$

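The 1-AGO modelling values of the predicted series can be derived with the initial condition $\hat{X}_1^{(1)}(1+r_p) = X_1^{(0)}(1+r_p)$ as
$$\hat{X}_1^{(1)}(t+r_p) = X_1^{(0)}(1+r_p)\,e^{-b_1(t-1)} + \int_{1}^{t} e^{-b_1(t-\tau)} f(\tau)\,d\tau. \tag{33}$$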
The second term on the right-hand side of (33) can be evaluated approximately by two-point Gaussian numerical integration. Thus, the modelling values in (33) of the 1-AGO data of the predicted series can be evaluated approximately by
where is the unit step function.

Applying 1-IAGO to (35) produced the following modelling values and their forecasts:
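The evaluation of the convolution term followed by the inverse accumulation can be sketched numerically as below. This is only an illustration under stated assumptions: the ordinary trapezoidal rule on each unit interval is substituted for the two-point Gaussian rule mentioned above, the impulse response is taken as an exponential decay with rate `b1`, and the names `convolution_ago_values` and `iago` are hypothetical.

```python
import math


def convolution_ago_values(first_value, b1, f):
    """Approximate 1-AGO modelling values for t = 1..len(f).

    first_value is the first entry of the predicted series (the initial condition),
    b1 is the decay coefficient, and f[k - 1] holds the forcing function at t = k.
    The convolution integral is approximated by the trapezoidal rule (an assumption).
    """
    values = []
    for t in range(1, len(f) + 1):
        value = first_value * math.exp(-b1 * (t - 1))  # free response
        for tau in range(2, t + 1):
            left = math.exp(-b1 * (t - tau + 1)) * f[tau - 2]   # integrand at tau - 1
            right = math.exp(-b1 * (t - tau)) * f[tau - 1]      # integrand at tau
            value += 0.5 * (left + right)
        values.append(value)
    return values


def iago(accumulated):
    """1-IAGO: first differences, keeping the first entry unchanged."""
    return [accumulated[0]] + [
        later - earlier for earlier, later in zip(accumulated, accumulated[1:])
    ]


# Made-up illustrative inputs: constant forcing and a mild decay rate.
ago_hat = convolution_ago_values(2.0, 0.2, [1.0, 1.0, 1.0, 1.0])
print(iago(ago_hat))
```

For `t = 1` the sum is empty, so the first modelling value equals the initial condition, which mirrors the role of the unit step function in the approximate formula.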

2.2.5. An Overall Measure of Accuracy for Forecasts

To evaluate forecast performance, Tien’s test [12] is adopted, which uses the root mean squared percentage error (RMSPE) for the priori-sample (presample) period (RMSPEPR) and for the postsample period (RMSPEPO). RMSPEPR and RMSPEPO are defined as
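As a minimal illustration of the error measure, the sketch below computes a root mean squared percentage error over whichever period is passed in. The function name `rmspe`, the averaging over all supplied entries, and the sample numbers are assumptions made for illustration; the exact summation ranges used for RMSPEPR and RMSPEPO follow [12].

```python
import math


def rmspe(actual, predicted):
    """Root mean squared percentage error, in percent, over the supplied entries."""
    squared = [((p - a) / a) ** 2 for a, p in zip(actual, predicted)]
    return 100.0 * math.sqrt(sum(squared) / len(squared))


# Made-up numbers: each prediction is off by 10% of the actual value.
print(rmspe([100.0, 200.0], [110.0, 180.0]))
```

Passing the modelling-period entries gives a presample error and passing the forecast-period entries gives a postsample error.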

2.2.6. The Determination of Power Exponents

The discussion above assumes that $\gamma_i$ ($i = 2, 3, \ldots, n$) is known, whereas in practice these values are unknown. This work takes the minimisation of the root mean squared percentage error for the presample period as the objective, and the unknown parameters $\gamma_i$ are solved for by building the optimisation model below.

Consider the following:

The optimisation problem above can be solved by proprietary optimisation software. Once $\gamma_i$ is determined, the structure parameters $b_1$, $b_i$, and $u$ of the model are also determined. The discrete function $f(t)$ can then be derived by back-substituting the derived results into (26), and the final predictions are obtained by applying (35) to (37).
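The two-stage procedure (least squares for fixed exponents, then a search over the exponent) can be sketched for the two-variable case as follows. This is an illustration under several assumptions rather than the procedure used in the paper: a plain grid search replaces the optimisation software, the delay period is zero, the convolution is approximated by the trapezoidal rule, and all function names and data are hypothetical.

```python
import math
from itertools import accumulate


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def presample_rmspe(x1, x2, gamma):
    """In-sample RMSPE (%) of an NGMC(1,2)-style fit for a fixed exponent gamma."""
    x1_ago = list(accumulate(x1))
    y2 = [v ** gamma for v in accumulate(x2)]  # power-transformed 1-AGO factor
    # Least squares on background values (means of neighbouring entries).
    rows = [[-0.5 * (x1_ago[t] + x1_ago[t + 1]), 0.5 * (y2[t] + y2[t + 1]), 1.0]
            for t in range(len(x1) - 1)]
    btb = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    bty = [sum(r[i] * y for r, y in zip(rows, x1[1:])) for i in range(3)]
    b1, b2, u = solve(btb, bty)
    f = [b2 * v + u for v in y2]  # discrete forcing function
    # Convolution with an exponential impulse response, trapezoidal rule (assumption).
    fitted_ago = []
    for t in range(1, len(x1) + 1):
        value = x1[0] * math.exp(-b1 * (t - 1))
        for tau in range(2, t + 1):
            value += 0.5 * (math.exp(-b1 * (t - tau + 1)) * f[tau - 2]
                            + math.exp(-b1 * (t - tau)) * f[tau - 1])
        fitted_ago.append(value)
    fitted = [fitted_ago[0]] + [q - p for p, q in zip(fitted_ago, fitted_ago[1:])]
    mean_sq = sum(((p - a) / a) ** 2 for p, a in zip(fitted, x1)) / len(x1)
    return 100.0 * math.sqrt(mean_sq)


def search_gamma(x1, x2, grid):
    """Return (error, gamma) for the grid value with the smallest in-sample RMSPE."""
    best = (math.inf, None)
    for gamma in grid:
        try:
            err = presample_rmspe(x1, x2, gamma)
        except (ZeroDivisionError, OverflowError):
            continue  # skip exponents that make the fit numerically degenerate
        if err < best[0]:
            best = (err, gamma)
    return best


# Made-up illustrative series, not the data from Table 1.
output_series = [10.0, 11.0, 11.5, 11.2, 10.4, 9.8]
factor_series = [5.0, 6.0, 7.5, 9.0, 11.0, 13.0]
grid = [i / 10 for i in range(1, 31)]  # gamma = 0.1, 0.2, ..., 3.0
best_error, best_gamma = search_gamma(output_series, factor_series, grid)
print(best_gamma, round(best_error, 3))
```

Because the grid contains the exponent 1, the selected fit can never have a larger in-sample error than the linear special case; a continuous optimiser could refine the grid result further.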

3. Forecasting China’s Industrial SO2 Emissions

Industry is the dominant factor in promoting China’s economic development. However, the SO2 discharged during increasing industrialisation poses a significant threat to the ecosystem and biosphere. The effective management of industrial SO2 emissions demands accurate forecasts thereof. This paper takes the level of industrial output as the main factor affecting industrial SO2 emissions. To increase industrial output, local governments and enterprises continue to expand the scale of production and thereby increase SO2 emissions. China’s industrial SO2 emissions and gross industrial output levels from 2003 to 2010 are given in Table 1, and Figure 1 shows the trends in both series. As seen in Figure 1, China’s industrial SO2 emissions and industrial output values are not simply linearly related: with increasing industrial output, SO2 emissions initially increased and then decreased. The reasons for this phenomenon are complex, multifarious, and interrelated, but may be summarised as combinations of the following: structural economic change, technical progress, different modalities of demand, and more effective regulatory regimes.

GMC (1, 2) and NGMC (1, 2) models were used to forecast China’s SO2 emissions as a time series from known industrial output values. By comparing the modelling and prediction accuracy of the two models, the effectiveness of the proposed NGMC (1, 2) model is demonstrated.

3.1. Forecasting by GMC (1, 2)

Applying the GMC (1, 2) model of (7) to (18), the values of the parameters $n$, $r$, $r_p$, and $r_f$ in (7) and the estimates of the model parameters $b_1$, $b_2$, and $u$ in (10) are obtained and listed in Table 2. The GMC (1, 2) model from (7) becomes

Table 2: The values of parameters $n$, $r$, $r_p$, and $r_f$ in (7), the estimates of model parameters $b_1$, $b_2$, and $u$ in (10), and the values of RMSPEPR and RMSPEPO in (38) and (39).

In summary, the right-hand side of (7), that is, the discrete function $f(t)$ in (13), is obtained for the GMC (1, 2) model and listed in Table 3; the values of RMSPEPR and RMSPEPO in (38) and (39), respectively, are also listed in Table 2. The modelling values and forecasts for China’s industrial SO2 emissions by GMC (1, 2) are listed in Table 4.

As seen in Table 2, the root mean squared percentage error for the presample period (RMSPEPR) was 8.02%, but the root mean squared percentage error for the postsample period (RMSPEPO) was as high as 185%. This result was unacceptable. Table 4 shows that the relative errors of modelling and prediction gradually increased over time, especially the relative errors for the period 2008 to 2010, being up to −48%, −128%, and −290%, respectively. This indicated that the traditional GMC (1, 2) model cannot be used to describe the relationship between China’s industrial SO2 emissions and industrial output values nor could it be used to predict future industrial SO2 emissions.

3.2. Forecasting by NGMC (1, 2)

Applying the NGMC (1, 2) model of (19) to (37), the values of the parameters $n$, $r$, $r_p$, and $r_f$ in (21), the estimates of the model parameters $b_1$, $b_2$, and $u$ in (23), and the optimised parameter $\gamma_2$ in (40) are obtained and listed in Table 5. The NGMC (1, 2) model from (21) becomes

Table 5: The values of parameters $n$, $r$, $r_p$, and $r_f$ in (21), the estimates of model parameters $b_1$, $b_2$, and $u$ in (23), the optimised parameter $\gamma_2$ in (40), and the values of RMSPEPR and RMSPEPO in (38) and (39).

In summary, the right-hand side of (21), that is, the discrete function $f(t)$ in (26), is obtained for the NGMC (1, 2) model and listed in Table 6; the values of RMSPEPR and RMSPEPO in (38) and (39), respectively, are also listed in Table 5. The modelling values and forecasts of China’s industrial SO2 emissions by NGMC (1, 2) are listed in Table 7. Among them, the values for 2003 to 2007 are derived by modelling and those for 2008 to 2010 are forecasts.

As seen in Table 5, the RMSPEPR and RMSPEPO of NGMC (1, 2) were 2.44% and 5.48%, respectively, significantly lower than the corresponding values of 8.02% and 185% for the GMC (1, 2) model shown in Table 2. Table 7 also shows that the modelling and prediction values for the period 2003 to 2010 by NGMC (1, 2) are close to the actual values. This indicates that the proposed NGMC (1, 2) model can effectively describe the relationship between China’s industrial SO2 emissions and industrial output values and accurately forecast industrial SO2 emissions.

4. Conclusions

The grey prediction model with convolution integral, GMC (1, n), greatly improved the prediction accuracy of the traditional multivariable grey model GM (1, n) and has been applied successfully in practice. The existing GMC (1, n) model is linear; therefore, high-precision modelling results can be obtained by GMC (1, n) only when the predicted variables and their influencing factors follow similar trends. This research introduced a power exponent into the 1-AGO series of each relative factor of GMC (1, n) to reflect the nonlinear interaction of the associated series with the predicted series. A nonlinear optimisation model was built to solve for these unknown exponents. The prediction example on China’s industrial SO2 emissions showed that the GMC (1, 2) model cannot effectively describe the nonlinear relationship between China’s industrial SO2 emissions and industrial output: the ensuing prediction errors were unacceptably large. In contrast, the NGMC (1, 2) model proposed in this paper described the relationship effectively and achieved satisfactory prediction accuracy.

Compared with the traditional GMC (1, n), power exponents of the relevant factors of the predicted variable are introduced into the NGMC (1, n) model to reflect the nonlinear effect of these factors on system behaviour. The unknown exponents are determined by a computer program that minimises the root mean squared percentage error of the model over the presample period. This strengthens the adaptability of the NGMC (1, n) model to the original data and thereby improves forecast accuracy.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author is grateful to the editors and the anonymous reviewers for their insightful comments and suggestions. The author also thanks the National Natural Science Foundation of China (Grant no. 71101132), the Philosophy and Social Science Foundation of Zhejiang Province, China (Grant no. 13ZJQN029YB), the Academic Climbing Project for Young and Middle-aged Leading Academic in the Universities of Zhejiang Province, China (Grant no. PD2013275), the Postdoctoral Science Foundation of China (Grant no. 2013M540448), and the Postdoctoral Science Foundation of Jiangsu Province, China (Grant no. 1302139C) for financially supporting this study.