This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

More than 20 techniques have been developed to de-noise time-series vegetation index data from different satellite sensors to reconstruct long time-series data sets. Although many studies have compared Normalized Difference Vegetation Index (NDVI) noise-reduction techniques, few have compared these techniques systematically and comprehensively. This study tested eight techniques for smoothing different vegetation types using four types of multi-temporal NDVI data: Advanced Very High Resolution Radiometer (AVHRR) Global Inventory Modeling and Mapping Studies (GIMMS), Pathfinder AVHRR Land (PAL), Satellite Pour l’Observation de la Terre (SPOT) VEGETATION (VGT), and Terra Moderate Resolution Imaging Spectroradiometer (MODIS). The ultimate purpose was to determine the best reconstruction technique for each type of vegetation in each of the four data sets. These techniques include the modified best index slope extraction (M-BISE) technique, the Savitzky-Golay (S-G) technique, the mean value iteration filter (MVI) technique, the asymmetric Gaussian (A-G) technique, the double logistic (D-L) technique, the changing-weight filter (CW) technique, the interpolation for data reconstruction (IDR) technique, and the Whittaker smoother (WS) technique. These techniques were evaluated by calculating the root mean square error (RMSE), the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC). The results indicate that the S-G, CW, and WS techniques perform better than the other tested techniques, while the IDR, M-BISE, and MVI techniques perform worse than the other techniques. The best de-noising technique varies with vegetation type and NDVI data source. The S-G performs best in most situations. In addition, the CW and WS are effective techniques that were exceeded only by the S-G technique. The assessment results are consistent among the three evaluation indexes for the GIMMS, PAL, and SPOT data in the study area, but not for the MODIS data. The study will be helpful for choosing reconstruction techniques for long time-series data sets.

Comparisons of these techniques have shown that each has its own advantages and drawbacks [15,20,24,25,39,44,45]. Viovy et al. [20] compared the BISE and MVC techniques and concluded that BISE is superior to MVC in terms of de-noising. Jönsson and Eklundh [35] demonstrated that A-G is superior to BISE and the Fourier-based technique. Chen et al. [24] demonstrated that S-G, BISE, and the Fourier-based transformation (FT) are effective techniques for constructing high-quality NDVI time-series data sets, and they showed that S-G is the best of those three techniques. However, Jönsson and Eklundh [46] explained that the A-G technique could outperform an S-G filter and an alternative harmonic analysis. Later, Ma and Veroustraete [25] reported that the MVI performed better than the M-BISE and FT. Beck et al. [33] showed that a new version of the D-L technique is better than both the FT and A-G because it can handle outliers effectively and estimate phenological event parameters. Lu et al. [39] developed the WT technique, compared it with the BISE, FT, and S-G techniques, and found that the WT is the best for removing noise. Later, Hird and McDermid [15] demonstrated, through an empirical comparison, the general superiority of the D-L and A-G function-fitting techniques over four alternative filtering techniques (S-G, 4253H, MVI, and ARMD3-ARMA5). Julien and Sobrino [44] presented the IDR technique and showed that it can provide the best profile reconstruction for most land cover classes (compared with HANTS and D-L). More recently, Atkinson et al. [45] found that it was necessary to tune the A-G, D-L, FT, and WS techniques according to the number of annual growing seasons to produce reliable fits. Zhu et al. [26] presented the CW technique and showed that it was more effective than A-G, D-L, and S-G in preserving the curve shape, as well as the timing and amplitude of the local maxima/minima in the time series, for a broad range of phenologies. Jiang et al. [27] developed the PP technique based on CW and showed that it performs much better than the CW filter for different levels of noise.

Although many studies have compared NDVI noise-reduction techniques, several aspects still need to be improved and completed. First, previous work has been restricted to a small selection of the available noise-reduction techniques. The majority of the literature compares two or three techniques; typically, a single novel technique is compared to one or two widely known techniques [15]. Only a few studies compare more than three techniques, such as those by Atkinson et al. [45], Hird and McDermid [15], and Jiang et al. [27]. Hird and McDermid [15] compared six NDVI de-noising techniques, two of which (the ARMD3-ARMA5 filter and the 4253H filter) were rarely used in the literature. Moreover, approximately ten new techniques have been developed in the last decade, and some of them have not been rigorously compared against other techniques; for example, IDR has not been compared to A-G, and WS has not been compared to S-G or CW. Second, previous comparisons have focused on the pixel or regional scale, and few studies have paid attention to vegetation types, which could be important for selecting optimum noise-reduction techniques [33,45]. Third, almost all noise-reduction technique comparisons have been based on one sensor, and few published studies have tested the differences between sensors. Finally, previous evaluations have principally relied on qualitative rather than quantitative assessments, and they generally have not addressed factors that could affect the performance of noise-reduction techniques [15]. Currently, few studies address the effects of different evaluation methods on the assessment outcome.

The main purpose of the present study was to support other basic studies by comparing eight techniques for reconstructing time-series NDVI data. These techniques include the A-G, M-BISE, CW, D-L, IDR, MVI, S-G, and WS techniques. Several studies have indicated that the A-G and D-L are superior to other fitting techniques and filters [15,33,35,46]. The S-G has been widely used in the literature [47–50] due to the advantages cited by Chen et al. [24]. Additionally, the A-G, D-L, and S-G have been made available as software that can be downloaded freely from the website [51]. The M-BISE and the MVI techniques are simple and effective de-noising techniques that can be implemented readily with a few lines of code. The CW, IDR, and WS are innovative techniques that have been used recently for time-series reconstruction using remote sensing data. The potential advantages of these techniques could lead to wide application in the future.

Four widely used long time-series NDVI data sets were used in this study (Table 1 [52–55]). All of these data sets were reconstructed using the eight techniques mentioned above combined with information about the vegetation types in the study area. Three statistical indexes were used to evaluate the reconstruction results. The aims of this research were to identify the optimal de-noising technique for each vegetation type and data set in the study area; then to analyze the differences between the optimal techniques for each vegetation type and the data set; and finally to explore the effects of different evaluation methods on the results.

2.Method

2.1.Study Area

The Heihe River Basin, in the middle of the Hexi Corridor of Gansu Province, is the second largest inland river basin in China. It is located between 97°1′–102°0′E and 37°7′–42°7′N, with an area of approximately 143,000 km². The elevation in the Heihe River Basin ranges from 5500 m in the south to 1000 m in the north [56]. It consists of three regions: the upper mountainous area (the source of the Heihe River), the middle oasis area (including towns such as Zhangye and Jiuquan), and the lower terminal arid area around Ejina [57]. From upstream to downstream, the environment changes from glacier to frozen soil, alpine meadow, forest, irrigated farmland, a riparian ecosystem, and a desert (the Gobi) [58]. Figure 1 shows the location and vegetation types of the study area.

2.2.Data and Processing

Four long time-series remote sensing NDVI data sets were used in this study; detailed information about the data sets is shown in Table 1. All of these NDVI data sets were compiled as multi-day composites using the MVC technique. To ensure data quality, a series of processing steps had been applied to these data sets. The GIMMS data sets had been corrected for calibration, view geometry, and volcanic aerosols and were verified using a stable desert control point [59,60]. The PAL pre-processing steps include navigation, inter-satellite calibration, and partial correction for Rayleigh scattering [61]. The MODIS data (MOD13A2) are derived from atmospherically corrected bidirectional surface reflectance that had been masked for water, clouds, heavy aerosols, and cloud shadows, and their accuracy had been assessed over a widely distributed set of locations and time periods via several ground-truth and validation efforts [62]. The processing of the SPOT VGT NDVI data sets includes atmospheric correction, radiometric correction, and geometric correction [63]. A subset was extracted from the SPOT VGT NDVI data using VGTExtract software (version 2.1.0), a free extraction tool produced by VITO, and the status maps were used to identify useful NDVI values. For MOD13A2, the MODIS Reprojection Tool was used to extract the desired bands and to re-project them to WGS 1984. Quality assurance products from Terra MODIS were used to eliminate obviously erroneous data; only pixels with pixel reliability values of 0 and 1 were accepted as reliable. The continental GIMMS and Pathfinder NDVI data sets were provided by the Cold and Arid Regions Science Data Center [52,53]. For all four NDVI data sets, pixels with obviously erroneous NDVI data, such as values less than 0 or null values, were replaced by the multi-year mean value for the same time period. After the obvious errors were removed, the data were fitted with each of the techniques.
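The gap-filling rule described above (replacing negative or null NDVI values with the multi-year mean for the same compositing period) can be illustrated with a minimal sketch. It is not the code used in the study: the function name, the per-pixel array layout (years by compositing periods), and the use of NaN to represent null values are assumptions made for illustration.

```python
import numpy as np

def fill_invalid_with_multiyear_mean(ndvi):
    """Replace invalid NDVI values for one pixel.

    ndvi: array of shape (n_years, n_periods). Values below 0 or NaN are
    treated as invalid (assumed encoding of "null") and replaced by the
    mean of the valid values for the same period across all years.
    """
    ndvi = np.asarray(ndvi, dtype=float)
    invalid = np.isnan(ndvi) | (ndvi < 0)
    masked = np.where(invalid, np.nan, ndvi)
    # Mean over years for each compositing period, ignoring invalid values.
    # A period that is invalid in every year stays NaN.
    period_mean = np.nanmean(masked, axis=0)
    return np.where(invalid, period_mean[np.newaxis, :], ndvi)

# Three years, three compositing periods, with one negative and one null value.
example = [[0.2, -0.1, 0.5],
           [0.4, 0.3, np.nan],
           [0.6, 0.5, 0.7]]
filled = fill_invalid_with_multiyear_mean(example)
```

In this example the negative value is replaced by the mean of the two valid values in the same column, and the null value likewise.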

The 2001 Vegetation Map of China [64], with a spatial resolution of 1 km, was used to identify the main vegetation types in the study area. To thoroughly analyze the reconstruction results for the main types of vegetation, pixels with 80% to 100% cover of the same vegetation type were extracted as representative homogeneous pixels in the study area. All of the homogeneous pixels for grassland, meadow, crops, desert, and shrub areas were selected for further analysis.

2.3.Candidate Time Series Reconstruction Techniques and Parameters

Eight de-noising techniques were selected for this research, and the main objective of each technique is shown in Table 2. All of these techniques have been used or have the potential to be used in real applications. Detailed descriptions of the techniques can be found in the literature (Table 2). All of the techniques were implemented in Interactive Data Language (IDL) code, except for A-G, D-L, and WS: the A-G and D-L were run in TIMESAT [46], and the WS was run in Matlab.

The parameters used for the techniques greatly affect the reconstruction results. As described by Atkinson et al. [45], for any of the tested techniques, finding a single set of technique parameters that is appropriate for the study area can be challenging for a landscape that is diverse and complex. To obtain the optimum parameters for each of the eight techniques, different fitting criteria were set to reconstruct the NDVI time-series for the four sensors over the five main vegetation types in the study area.

Through comparison, we used 0.1 as the threshold for the M-BISE technique and 20% of the multi-year average as the MVI filter threshold. For S-G, m (the half-width of the smoothing window) was 4 and d (the degree of the polynomial) was 6. The IDR threshold was 0.02. These values are similar to those recommended by the techniques' authors. For WS, λ was 15 because the entire research area contains single-season vegetation types [45]. There were three parameters for the CW technique: the window width was 7 for the MODIS and GIMMS data and 9 for the SPOT and PAL data, and the thresholds for removing a false local maximum or minimum point and for removing a noise point were the same for the four data sets, 0.1 and 0.05, respectively. The D-L and A-G techniques were adjusted interactively in the TIMESAT software to arrive at close fitting results. In TIMESAT, the median filter option with a parameter value of 2 was chosen to remove spikes and noise.
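To make the S-G parameters concrete, the following sketch applies a single pass of a Savitzky-Golay filter with half-width m = 4 (a 9-point window) and polynomial degree d = 6. It is an illustration only: the function name is hypothetical, the edge handling (evaluating the polynomial fitted to the first and last windows) is an assumption, and the iterative fitting toward the upper NDVI envelope used in the S-G approach of Chen et al. [24] is not reproduced.

```python
import numpy as np

def savitzky_golay(y, m=4, d=6):
    """Single-pass Savitzky-Golay smoothing (illustrative sketch).

    m: half-width of the smoothing window (window length is 2*m + 1).
    d: degree of the local polynomial; must be smaller than 2*m + 1.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    width = 2 * m + 1
    if n < width or d >= width:
        raise ValueError("need len(y) >= 2*m + 1 and d < 2*m + 1")
    offsets = np.arange(-m, m + 1, dtype=float)
    # Design matrix of the local polynomial: columns are offsets**0 .. offsets**d.
    design = np.vander(offsets, d + 1, increasing=True)
    # The pseudo-inverse maps window values to least-squares coefficients;
    # its first row yields the fitted value at the window centre (offset 0).
    to_coeffs = np.linalg.pinv(design)
    smoothed = np.empty(n)
    for i in range(m, n - m):
        smoothed[i] = to_coeffs[0] @ y[i - m:i + m + 1]
    # Edges (assumed handling): evaluate the polynomials fitted to the
    # first and last full windows at the edge positions.
    head = to_coeffs @ y[:width]
    tail = to_coeffs @ y[n - width:]
    for i in range(m):
        smoothed[i] = np.polyval(head[::-1], i - m)
        smoothed[n - 1 - i] = np.polyval(tail[::-1], m - i)
    return smoothed

# A quadratic series is reproduced by a degree-6 local fit.
t = np.arange(23, dtype=float)
quadratic = 0.3 + 0.01 * t - 0.0005 * t ** 2
result = savitzky_golay(quadratic, m=4, d=6)
```

With d = 6 on a 9-point window the local polynomial follows the data closely, so a single pass removes little noise; this is consistent with the observation that the S-G result stays close to the input values.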

RMSE is a well-accepted absolute goodness-of-fit indicator for continuous response variables that describes the difference between the observed and predicted values in the appropriate units [65]. Here, the RMSE indicates the difference between the mean NDVI time series obtained from the eight noise-reduction techniques (assumed to be accurate) and the corresponding experimental time series to which noise reduction has been applied [15]. It is calculated using Equation (1).

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{N}\left(VI^{*}(t) - VI(t)\right)^{2}}{N}} \qquad (1)$$

where VI* is the resultant NDVI value, VI is the mean NDVI value obtained from the eight noise-reduction techniques (mean NDVI), and N is the number of time points.

Akaike’s Information Criterion (AIC) [66] and the Bayesian Information Criterion (BIC) [67] are criteria used for selecting a technique from a finite set of techniques. The two criteria are closely related and measure the efficiency of the parameterized technique in terms of predicting the data, but the BIC penalizes the free parameters more severely than does the AIC. The AIC and BIC have been used by Atkinson et al. [45] to evaluate the performance of de-noising techniques. They are calculated using Equations (2) and (3), respectively.
$$\mathrm{AIC} = 2k + n\left[\ln(\mathrm{RSS})\right] \qquad (2)$$

$$\mathrm{BIC} = n\left[\ln(\mathrm{RSS})\right] + k\ln(n) \qquad (3)$$

where k is the number of free parameters in the technique, n is the number of input data points, and RSS is the residual sum of squares between the mean NDVI data and the fitted technique. A lower value of AIC or BIC indicates a preferable technique.

M-BISE, MVI, and IDR each required only one free parameter. For S-G and CW, two and three free parameters were needed, respectively. For the A-G and D-L techniques, seven and six parameters were needed, respectively. For WS, the number of free parameters is 9.37 when λ is 15, according to Atkinson et al. [45].
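As a worked illustration of Equations (1) to (3), the sketch below computes the three indexes for one fitted series against a reference series (the mean NDVI in this study). The function and dictionary names are hypothetical; the parameter counts are those listed above. Note that ln(RSS) is undefined when the fit matches the reference exactly (RSS = 0).

```python
import numpy as np

# Number of free parameters per technique, as stated in the text.
FREE_PARAMETERS = {
    "M-BISE": 1, "MVI": 1, "IDR": 1, "S-G": 2,
    "CW": 3, "A-G": 7, "D-L": 6, "WS": 9.37,
}

def evaluate_fit(fitted, reference, k):
    """Return (RMSE, AIC, BIC) following Equations (1)-(3).

    fitted: NDVI series produced by one technique (VI* in Equation (1)).
    reference: reference series (the mean NDVI, VI in Equation (1)).
    k: number of free parameters of the technique.
    """
    fitted = np.asarray(fitted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    residuals = fitted - reference
    n = residuals.size
    rss = float(np.sum(residuals ** 2))
    rmse = float(np.sqrt(rss / n))
    aic = float(2 * k + n * np.log(rss))
    bic = float(n * np.log(rss) + k * np.log(n))
    return rmse, aic, bic

# Toy series: one residual of 2 over four points gives RSS = 4 and RMSE = 1.
rmse, aic, bic = evaluate_fit([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0],
                              FREE_PARAMETERS["S-G"])
```

Because both criteria share the term n ln(RSS), they differ only in the penalty: 2k for the AIC and k ln(n) for the BIC, which is larger whenever n exceeds about 7.4 (e² ≈ 7.39).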

3.Results

3.1.Visual Assessment of the Fitted Curves

Figures 2 and 5 present the sample pixel results for the five vegetation types in the four NDVI data sets for the study area. The sample positions are shown in Figure 1.

Due to space limitations, only the results for one year (2000) are shown in the figures. Most de-noising techniques were effective for reconstructing high-quality NDVI time-series data sets. The results closely resemble the profiles of vegetation growth, especially for crops, grass, shrubs, and meadows. However, some obvious differences were found among the NDVI time series reconstructed with the eight techniques. For example, the WS technique produced the smoothest fitted curve, but it showed a lower maximum NDVI than the other techniques (Figure 2a–d; Figure 4a,b; and Figure 5a). The S-G technique came closest to the upper envelope of the input NDVI values in almost all cases. It is hard to judge the results for some techniques because their fitted curves approximately match the other curves. Therefore, we quantify the differences between the eight techniques using the RMSE, the AIC, and the BIC in the following section.

3.2.Quantitative Evaluation and Regional Application

The RMSE, the AIC, and the BIC were calculated at the pixel level for the eight techniques for all of the data sets after reconstruction. All of the homogeneous pixel samples for each vegetation type were used to calculate the mean value of the three indexes to further analyze the fit of the techniques. To test the performance of the eight techniques at the regional scale, the RMSE was used as an example to display the differences between the techniques in the study area.

3.2.1.GIMMS NDVI

Table 3 shows the RMSE, the AIC, and the BIC for each of the eight techniques for the five vegetation types of GIMMS NDVI time-series data.

Technique performance differs among the three evaluation indexes. For crop, desert, grassland, and meadow, the optimal techniques are S-G, WS, S-G, and CW, respectively, according to all three evaluation indexes. For shrub, however, the best technique depends on the index: it is S-G for RMSE and AIC, but WS for BIC. The worst technique also depends on the index: for the five vegetation types, both RMSE and BIC indicate that IDR performs worst, whereas AIC indicates D-L. For all eight techniques, the accuracy differs among vegetation types. For example, the WS technique performs better than the D-L technique for shrub and desert areas, but the situation is reversed for the other three vegetation types. Overall, the S-G, CW, and WS techniques are effective for the GIMMS NDVI time-series data. The IDR, M-BISE, and MVI techniques perform worse than the other techniques for most vegetation types. The choice of evaluation index can affect the assessment results for some vegetation types.

Figure 6 presents the RMSE values calculated for all pixels in the GIMMS NDVI data sets using the eight techniques. All of the techniques show good fitting results (i.e., low RMSE values). The performances of the eight techniques in Figure 6 are similar to the performances described in Table 3. Overall, the S-G (Figure 6g) and CW techniques (Figure 6c) performed best; most of the study areas are green in the results (representing lower RMSE values). The WS (Figure 6h) and D-L (Figure 6d) techniques were the next best, showing more yellow areas in Figure 6d,h. The M-BISE (Figure 6b) and IDR (Figure 6e) techniques performed worst, producing more red areas than the other techniques (representing higher RMSE values).

3.2.2.PAL NDVI

There are a few differences between the results for the GIMMS NDVI data and the PAL NDVI data (Table 4). The three indexes show that the S-G technique performs best for crop, grassland, and meadow, while the WS technique performs best for desert. For shrub, however, the best technique depends on the index: it is S-G for RMSE, but WS for AIC and BIC. For all vegetation types, the S-G and WS techniques perform better than the A-G and D-L techniques in terms of the three evaluation indexes. The IDR performs worse than the other techniques for all vegetation types. The three evaluation indexes rank the eight techniques consistently in most situations, but subtle differences exist for shrub and desert.

Figure 7 shows the RMSE calculated for each pixel of the PAL NDVI data sets using the eight techniques. Overall, the S-G (Figure 7g) and WS techniques (Figure 7h) perform best. The next best are the A-G (Figure 7a), the CW (Figure 7c), the D-L (Figure 7d), and the MVI (Figure 7f) techniques, and it is hard to determine which is better because their performances vary among areas. Again, the IDR (Figure 7e) performs worst for this data set, followed by the M-BISE (Figure 7b).

3.2.3.SPOT VGT NDVI

For the SPOT VGT NDVI data, the S-G technique is also the best reconstruction technique for shrub, crop, and meadow, and the WS technique outperforms the other six techniques for desert (Table 5) in terms of the three evaluation indexes. For grassland, however, the best technique depends on the index: it is S-G for RMSE, but WS for AIC and BIC. Overall, the CW and the WS techniques are effective for all the vegetation types, but it is difficult to determine which is better because their performances vary among vegetation types. The three evaluation indexes are highly consistent for all vegetation types, apart from subtle differences for grassland.

Figure 8 displays the RMSE calculated for each pixel of the SPOT VGT NDVI data set using the eight techniques. All of the techniques show good fitting results (i.e., low RMSE values). The S-G technique (Figure 8g) performs best, and the next best are the WS (Figure 8h) and CW techniques (Figure 8c). The performances of the A-G (Figure 8a) and the D-L techniques (Figure 8d) are next to the WS (Figure 8h) and CW techniques (Figure 8c). The performances of the M-BISE (Figure 8b), the IDR (Figure 8e), and the MVI techniques (Figure 8f) are relatively poor in the study area.

3.2.4.MODIS NDVI

As shown in Table 6, the S-G technique performs best for all five vegetation types in terms of RMSE, while the best technique differs among the vegetation types in terms of the AIC and BIC (shrub excepted). The optimal techniques for crop, desert, grassland, and meadow are the WS, S-G, WS, and WS, respectively, in terms of AIC, but the S-G, A-G, S-G, and A-G, respectively, in terms of BIC. Overall, both the S-G and the WS techniques are effective for most vegetation types using the MODIS NDVI time-series data. The IDR technique performs worst for most vegetation types, and the M-BISE and the MVI techniques perform second worst. The three evaluation indexes are inconsistent for most vegetation types.

Figure 9 shows the RMSE for each pixel of the MODIS NDVI data set in the study area using the eight techniques. All of the techniques show good fitting results (i.e., low RMSE values). The S-G (Figure 9g) technique performs best. The IDR (Figure 9e) and the MVI (Figure 9f) techniques perform worst in the study area. It is difficult to distinguish which of the remaining five techniques is better because they complement each other in different areas.

4.Discussion

4.1.Performance of the Reconstruction Techniques

Overall, the S-G, CW, and WS techniques reconstruct the time series better than the other techniques, and the IDR technique shows generally poor performance in terms of RMSE, AIC, and BIC for most vegetation types in the Heihe River Basin (Table 7). These findings are inconsistent with those of Hird and McDermid [15], who stated that the A-G and D-L techniques perform better than S-G in terms of RMSE. In our study, the A-G technique performs better than S-G only for some vegetation types (desert and meadow) of MODIS NDVI in terms of BIC (Table 6). However, our findings are supported by Zhu et al. [26] and Jiang et al. [27], whose results both indicated that the S-G technique performs better than the A-G and D-L techniques in terms of RMSE under different noise levels. Although Julien and Sobrino [44] reported that the IDR technique performs better than the HANTS and the D-L techniques in terms of the distance to the raw data and the proximity to its upper envelope, IDR performed worst among the eight techniques in most of our test cases.

Except for the S-G and the IDR techniques, the performance of the other techniques is relatively unstable, meaning that it changes with vegetation type and data source. For example, for the SPOT VGT NDVI data sets, the CW and the WS are effective techniques for all vegetation types, performing worse than the S-G technique (desert and grassland excepted) and better than the remaining five techniques (Tables 4 and 7). For the GIMMS NDVI data sets, however, the performance of these techniques changed with vegetation type. The WS technique performed better than the D-L for shrub and desert vegetation, while the reverse is true for the other three vegetation types of MODIS NDVI. The D-L and the A-G techniques performed similarly to each other. The D-L technique outperforms the A-G technique for all vegetation types of GIMMS and PAL NDVI, but for MODIS and SPOT VGT NDVI the ranking changes with vegetation type. Thus, it is difficult to judge which is better. This result is supported by Jönsson and Eklundh [35,46,68], who argued that the two techniques complement each other and that they may be suitable in different areas depending on the behavior of the NDVI signal. Beck et al. [33] and Hird and McDermid [15] obtained similar findings. In general, the D-L and the A-G techniques outperform the MVI and the M-BISE techniques. The MVI and the M-BISE techniques show poor performance, outperforming only the IDR. It is hard to determine which of the two is better because their performances vary with vegetation type and data source. However, for the PAL NDVI data sets, the results are consistent with those of Ma and Veroustraete [25], who indicated that the MVI performed better than the M-BISE.

4.2.Effect of Evaluation Index on the Final Reconstruction Results

As ground reference measurements are challenging to obtain due to the medium/coarse resolution of the imagery, the problem of developing a robust, accurate and fast filter is amplified by the difficulty of obtaining reference measurements to use for validation [43]. Thus, choosing an appropriate index is very important for the final evaluation of de-noising techniques. To reduce the uncertainty associated with relying on one evaluation index, we used three different statistical indexes in this study.

Generally, the results of the three indexes are consistent in most cases, especially for the worst-performing technique. For this technique, the assessment results for all vegetation types for all four NDVI data sets are in complete agreement (Tables 3 and 6). However, some differences were observed for the other techniques. For the GIMMS NDVI data, for example, the WS technique ranks third for shrub vegetation according to the RMSE but best according to the AIC and BIC values. A similar situation was found for desert vegetation in the SPOT VGT NDVI data. The choice of evaluation index has a great effect on the evaluated performance of each technique for MODIS NDVI. The performance of the A-G and D-L techniques differs for all vegetation types in terms of RMSE, AIC, and BIC. A similar result was found for the CW and WS techniques. Thus, in certain situations, the evaluation results may be arbitrary if only one index is used, and it is important to consider several indexes together.

4.3.Factors Influencing Performance

According to the above results, performance differs among the eight techniques even for the same vegetation type and data source (Tables 3 and 6). Thus, one of the main factors affecting a technique's performance is its de-noising principle. The S-G applies an iterative weighted moving average filter to the NDVI time series, with the weights defined by a polynomial of a particular degree. This polynomial is designed to preserve higher moments within the data and to reduce the bias introduced by the filter [24]. This approach can replace the noisy data while keeping the fidelity of the time series. The results of this study also confirm that the S-G technique is an effective de-noising method. The CW technique is designed to replace the noisy data using a three-point changing-weight filter while preserving the curve shape, as well as the timing and the amplitude of the local maxima/minima in the NDVI time series, for a broad range of phenologies [26]. The WS is based on penalized least squares: it fits a discrete series to discrete data and penalizes the roughness of the smooth curve, thereby balancing fidelity to the data against the roughness of the fitted series. Both the CW and WS techniques can therefore replace the noisy data while keeping the fidelity of the original data. In this study, the two techniques also proved effective, second only to the S-G. The A-G and D-L techniques use a series of parameters to model the NDVI time series [33,35]. The difference is that the A-G is based on asymmetric Gaussian functions, whereas the D-L is based on a double logistic function. This may be the reason why the two techniques perform similarly and why it is difficult to judge which is better. The M-BISE, MVI, and IDR techniques show poor performance in this study. The M-BISE looks for a spike (i.e., an increase immediately followed by a decrease, or a decrease immediately followed by an increase) [21]. Points for which both the increase and the decrease are greater than a threshold are treated as noise. Therefore, the key to this approach is to set a suitable threshold. However, the noise changes with time and space, and it is difficult to replace all noisy data using a fixed threshold. The same problem exists for the MVI and the IDR techniques. The MVI removes noise by replacing the value at the date with the maximum difference by the average of the values at the dates before and after it; the iteration stops when all differences are less than a threshold. The IDR is somewhat similar to the MVI, with the difference that the threshold of the IDR technique is derived from the data itself and not from a comparison to an average of different years [44].
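The penalized least squares principle behind the WS can be made concrete with a short sketch. It follows the standard Whittaker formulation, in which the smoothed series z minimizes Σ w_i (y_i − z_i)² + λ Σ (Δ^d z_i)², which leads to the linear system (W + λ DᵀD) z = W y. It is not the WS program used in the study: the function name, the second-order difference penalty, and the uniform default weights are assumptions, and only λ = 15 is taken from the text.

```python
import numpy as np

def whittaker_smooth(y, lam=15.0, order=2, weights=None):
    """Whittaker smoother via penalized least squares (illustrative sketch).

    lam: smoothing parameter (larger values give a smoother curve).
    order: order of the difference penalty (assumed to be 2 here).
    weights: optional per-point weights; 0 marks a point to be ignored.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    # Difference matrix D of shape (n - order, n), so that D @ z gives the
    # order-th differences of z.
    diff_matrix = np.diff(np.eye(n), n=order, axis=0)
    system = np.diag(w) + lam * diff_matrix.T @ diff_matrix
    return np.linalg.solve(system, w * y)

# A noisy single-season profile (made-up values for illustration).
noisy = np.array([0.20, 0.25, 0.10, 0.40, 0.50, 0.45, 0.70, 0.60, 0.30, 0.20])
smooth = whittaker_smooth(noisy, lam=15.0)

# A straight line has zero second differences and is left unchanged.
line = 0.1 + 0.02 * np.arange(12, dtype=float)
line_smoothed = whittaker_smooth(line, lam=15.0)
```

The dense solve is adequate for the few dozen composites in one year of NDVI data; for long series a sparse or banded solver would be the more usual choice. The trade-off described above is visible in the two terms of the system: the weights pull the solution toward the data, and λ DᵀD pulls it toward a curve with small differences, which is also why the WS curve tends to sit below sharp NDVI maxima.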

Vegetation type is also a factor affecting the techniques’ performance. For the five vegetation types in our study, the performances of the techniques for each type are unstable for most of the NDVI data. Similar findings were reported by Atkinson et al. [45], who compared the performances of A-G, D-L, FT, and WS for four vegetation types using the MERIS Terrestrial Chlorophyll Index (MTCI); their RMSE results indicated that the best technique changed with vegetation type. Hird and McDermid [15] also showed that six techniques perform differently for six different vegetation regions. Another factor that influenced performance was the temporal and spatial resolution of the data. Because the data sources differ, as do the vegetation types, the characteristics of the noise vary. Therefore, Hird and McDermid [15] recommend that the strength and character of the noise present in an NDVI data set be considered when selecting an approach for time-series noise reduction.

As this study focused on the Heihe River Basin, the vegetation types were relatively simple; most of the study area was covered by desert. Our findings and conclusions may not be as applicable to areas with multiple growing seasons (e.g., subtropical zones). However, our comparison of the performances of eight reconstruction techniques was conducted systematically and comprehensively using four NDVI data sets covering five different vegetation types, and we determined the optimum technique for the GIMMS, PAL, SPOT, and MODIS data for the different vegetation types. These findings provide a useful reference and practical guidance for choosing de-noising techniques and their parameter values.

5.Conclusions

In this study, the performances of eight de-noising techniques were compared for different vegetation types represented in four NDVI data sets in the Heihe River Basin, and the following results were observed. The S-G, CW, and WS techniques perform better than the other techniques for almost all vegetation types according to the RMSE, the AIC, and the BIC. The IDR, M-BISE, and MVI techniques perform worse than the other techniques for most vegetation types using the four sensor data sets. The best technique varies with vegetation type and NDVI data source. However, the S-G performs best in most situations, followed by the CW and WS techniques. The assessment results are consistent among the three evaluation indexes in most situations, but subtle differences exist for some vegetation types, and considering several indexes together is very helpful for deciding the best technique in certain situations.

This work was supported by the National Natural Science Foundation of China (grant number: 91125004), the Knowledge Innovation Program of the Chinese Academy of Sciences (grant number: KZCX2-EW-312), and the Chinese State Key Basic Research Project (grant number: 2009CB421305). We would like to thank Atzberger C. (Joint Research Centre of the European Commission), Julien Y. (University of Valencia), and Zhu W.Q. (Beijing Normal University) for providing the programs of the Whittaker smoother (WS), the iterative interpolation for data reconstruction (IDR) and the changing-weight filter (CW), respectively. We also thank the three anonymous reviewers for their valuable comments and suggestions on how to improve the manuscript.

Table 1.

Long time-series Normalized Difference Vegetation Index (NDVI) data sets used in the study.

| Products | Time Period | Temporal/Spatial Resolution | Sensors | Data Source |
|---|---|---|---|---|
| GIMMS | January 1982–December 2006 | 15 d/8 km | AVHRR | Cold and Arid Regions Science Data Center [52] |
| Pathfinder | July 1981–July 1994; January 1995–December 2000 | 10 d/8 km | AVHRR | Cold and Arid Regions Science Data Center [53] |
| SPOT VEGETATION (S10) | April 1998–April 2013 | 10 d/1 km | SPOT VGT | VITO Earth observation [54] |
| MOD13A2 | February 2000–February 2013 | 16 d/1 km | Terra/MODIS | The Land Processes Distributed Active Archive Center [55] |

Table 2.

Summary of the NDVI time-series reconstruction techniques selected for comparison.

| Candidate Technique (Abbreviation) | Description | References |
|---|---|---|
| Modified-best index slope extraction (M-BISE) | Compares the current term value with the previous and the next term within a predefined sliding window, and replaces these values with the mean value of the previous and the next values if the percentage difference is greater than a predefined threshold. | [20,21] |
| Asymmetric Gaussian function-fitting (A-G) | Fits local, nonlinear functions at intervals around the local maxima and minima, then merges these into a global function describing the full NDVI time series. | [35] |
| Double logistic function fitting (D-L) | Uses six parameters to model the NDVI time series with a double logistic function. These parameters are the winter NDVI (wNDVI), maximum NDVI (mNDVI), two inflection points, one as the curve rises (S) and one as it drops (A), and the rate of increase or decrease (mS and mA) of the curve at the inflection points. | [33] |
| Savitzky-Golay filtering (S-G) | Based on a simplified least-squares-fit convolution for smoothing and computing derivatives of a set of consecutive values (a spectrum). The convolution can be understood as a weighted moving average filter with weighting given as a polynomial of a certain degree. The weight coefficients, when applied to a signal, perform a polynomial least-squares fit within the filter window. This polynomial is designed to preserve higher moments within the data and to reduce the bias introduced by the filter. | [24] |
| Mean value iteration filtering (MVI) | Iteratively compares each date with the average of the dates before and after it, replacing the date with this average if the difference is above a certain threshold. The maximum difference date value will be removed in an iteration process. Iteration will stop when all differences are less than the threshold. | [25] |
| Whittaker smoother (WS) | Based on penalized least squares, fits a discrete series to discrete data and penalizes the roughness of the smooth curve. In this way, it balances the reliability of the data and roughness of the fitted data. | [41,43,45] |
| Iterative interpolation for data reconstruction (IDR) | Creates an alternative NDVI time series by computing the mean of the immediately preceding and following observations and comparing it to that of the original time series, replacing the original data with the alternative time-series data if the maximum difference between the alternative and original time series is above a certain threshold. | [44] |
| Changing-weight filtering (CW) | Filters an NDVI time series with a three-point changing-weight filter and replaces the local maximum/minimum points in a growth cycle. | [26] |
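To make one of the descriptions in Table 2 concrete, the following Python sketch illustrates one possible reading of the mean value iteration (MVI) idea: in each pass, the interior value that differs most from the mean of its two neighbours is replaced by that mean, and iteration stops once no difference exceeds the threshold. This is a simplified interpretation of the table entry, not the implementation from [25]; the function name `mvi_filter`, the `threshold` value, and the `max_iter` safeguard are assumptions made for this example.

```python
def mvi_filter(values, threshold, max_iter=1000):
    """Simplified mean value iteration sketch (an interpretation, not [25]).

    Repeatedly replaces the interior point that deviates most from the
    mean of its two neighbours, until all deviations are <= threshold.
    max_iter is a hypothetical safeguard against endless iteration.
    """
    y = list(values)
    for _ in range(max_iter):
        # Deviation of each interior point from the mean of its neighbours.
        diffs = [abs(y[i] - (y[i - 1] + y[i + 1]) / 2.0)
                 for i in range(1, len(y) - 1)]
        if not diffs:
            break  # fewer than three points: nothing to compare
        worst = max(range(len(diffs)), key=diffs.__getitem__)
        if diffs[worst] <= threshold:
            break  # every deviation is within the threshold
        i = worst + 1  # shift back to an index into y
        y[i] = (y[i - 1] + y[i + 1]) / 2.0
    return y


# Toy NDVI series with one sudden drop (e.g., a cloud-contaminated value).
noisy = [0.2, 0.3, 0.05, 0.5, 0.6, 0.55]
cleaned = mvi_filter(noisy, threshold=0.1)
```

In this toy series only the third value is replaced (by the mean of its neighbours, 0.4); the endpoints are never modified because they have only one neighbour.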

Table 3.

RMSE, AIC, and BIC values for the eight techniques for the Global Inventory Modeling and Map Studies (GIMMS) NDVI time-series data from January 1982 to December 2006 (best-fitting technique shown in bold).

RMSE, AIC, and BIC values for the eight techniques for the Pathfinder AVHRR Land (PAL) NDVI time-series data from July 1981 to December 2000 (except August 1994–December 1994) (best-fitting technique shown in bold).