Abstract

:
This study explores ICESat/GLAS waveform data in Thuringian Forest, a low mountain range located in central Germany. Lidar remote sensing has been proven to directly derive tree height as a key variable of forest structure. The GLAS signal is, however, very sensitive to surface topography because of the large footprint size. This study therefore focuses on forests in a mountainous area to assess the potential of GLAS data to derive terrain elevation and tree height. The work enhances the empirical knowledge about the interaction between GLAS waveform and landscape structure regarding a special temperate forest site with a complex terrain. An algorithm to retrieve tree height directly from GLA01 waveform data is proposed and compared to an approach using GLA14 Gaussian parameters. The results revealed that GLAS height estimates were accurate for areas with a slope up to 10° whereas waveforms of areas above 15° were problematic. Slopes between 10–15° have been found to be a critical crossover. Further, different waveform shape types and landscape structure classes were developed as a new possibility to explore the waveform in its whole structure. Based on the detailed analysis of some waveform examples, it could be demonstrated that the waveform shape can be regarded as a product of the complex interaction between surface and canopy structure. Consequently, there is a great variety of waveform shapes which in turn considerably hampers GLAS tree height extraction in areas with steep slopes and complex forest conditions.

1. Introduction

Forests are an important natural resource and play a major role in the global carbon cycle. Accurate and repeated forest mapping at local, regional and global scales is therefore required for a better understanding of the global climate change and for the development of sustainable forest management strategies. Tree biomass is a key to a forest ecosystem’s function and its role for the carbon fluxes between biosphere, atmosphere and oceans. Plants store the greenhouse gas carbon dioxide in their above and belowground biomass. Forests can, thus, be considered as one of the most significant terrestrial carbon sinks [1,2]. Biomass estimations refer to the horizontal and vertical forest structure. Since direct measurements are difficult, tree biomass is usually predicted from forest structural parameters using allometric equations.

Lidar remote sensing is a powerful tool to directly assess the vertical dimension and thus to derive tree height as one important characteristic of forest structure. Airborne lidar systems already demonstrated their capability to estimate tree height and also biomass [3–5]. Especially small-footprint, discrete-return sensors achieve high accuracies and are already used operationally (e.g., [3,4,6–8]). However, airborne systems are very cost-intensive and they are limited to a small region. The Geoscience Laser Altimeter System (GLAS) onboard NASA’s Ice, Cloud and Land Elevation Satellite (ICESat) was the first spaceborne lidar mission providing lidar data at a global scale [9,10]. ICESat/GLAS was launched in January 2003 and acquired lidar waveform data until October 2009. The GLAS laser beams illuminated elliptical footprints with an average diameter of 65 m and which are separated by 172 m. The vertical resolution of the waveforms is 1 ns, i.e., 15 cm. The system was originally designed for ice-sheet measurements and thus the footprint size is actually not optimal for vegetation applications. However, several studies could also demonstrate the potential of GLAS data to characterize forest structure (e.g., [11–15]). A big research question is related to surface topography, because many forests are located in areas with steep slopes and high surface roughness and the GLAS system is very sensitive to surface topography due to its large footprint. Modeling work illustrates that particularly waveforms from large footprints like the GLAS sensor are stretched due to terrain topography and ground and vegetation return are mixed at a certain slope degree [16–18]. The terrain effect makes it challenging to retrieve vegetation height in a hilly landscape, as empirical research demonstrates [19–21]. The still existing problems occurring for waveforms over areas with a complex surface topography indicate that research should focus much more on the real performance of the GLAS waveforms in mountainous regions with complex vegetation structure. The main objective of the study presented in this paper is therefore to further improve the understanding of the interactions between surface topography, canopy structure and waveform shape. This paper investigates GLAS estimates of terrain elevation and tree height for a temperate forest site located in a low mountain range in central Germany. The main research questions addressed by this paper are:

How precise are ICESat/GLAS measurements of terrain elevation and tree height in mountainous terrain?

How does surface topography affect the accuracy of the GLAS tree height estimates?

How does surface topography influence GLAS waveform structure?

According to the first question, GLAS elevation and height will be compared to different reference data sets, namely a field based forest inventory database and a Digital Terrain Model (DTM) and Canopy Height Model (CHM) derived from airborne laser scanning (ALS) data. Two approaches to define the ground return, i.e., the signal referring to the terrain, directly from the waveform as a basis for tree height estimation will be presented. The performance of the ground return definition methods will be assessed by analyzing the GLAS and ALS based terrain elevations. The second question will be investigated by GLAS metrics for different landscape structure categories which are defined based on several ALS DTM measures characterizing the surface geometry. Finally, some waveform examples and their corresponding footprint structure, that means the corresponding forest characteristics and terrain conditions, will be investigated in more detail.

2. Study Area and Data

2.1. Thuringian Forest

The Thuringian Forest is a low mountain range of elevations up to 983 m ASL in central Germany between 09°44′–50°58′N and 11°38′–50°18′E (Figure 1). The whole area extends over about 150 km length and maximum 20 km width. The ridgeline from northwest to southeast constitutes a natural watershed and border of climate. The climate is in general cool and humid with an annual temperature of on average 5–6°C and 900–1,200 mm precipitation whereas the southern part is more humid than the northern side [22]. The main valleys start from the ridge and go with steep slopes into the lowlands to the north and south. Thus, the whole region is characterized by a hilly landscape with strong topography.

Approximately 80–85% of the area is covered by forest [22]. The main tree species are Norway spruce, European beech and Scots pine. The forests are used by silviculture and the timber industry. Table 1 summarizes some details for the ICESat/GLAS footprint locations of the three main tree species after the filtering procedure explained in Section 3.1. Two thirds of the ICESat/GLAS footprints are situated in a terrain with a slope up to 19°, maximum slope is 40°. The major part of the beech stands is located in steep terrain whereas the two needle-leaf species can be also found in more flat areas.

2.2. ICESat/GLAS Data

ICESat/GLAS release 531 data was ordered and downloaded from the National Snow and Ice Data Center (NSIDC). We used the two data products GLA01 Global Altimetry and GLA14 Global Land Surface Altimetry Data for all available laser operation periods between February 2003 and March 2009. The GLA14 data set was used to geolocate the footprints and to overlay them with the additional data sets. This resulted into 1,577 shots available for the test site. The statistical analyses are based on 442 shots (Table 1) which were retained after the pre-processing and filtering procedure illustrated in Section 3.1. The waveform data of the GLA01 product and the GLA14 Gaussian parameters were used to derive the terrain elevation and tree height metrics. A detailed description of the GLAS sensor and its measurement concept can be found in [9,10].

2.3. Additional Data Sets

Several additional data sets were available to analyze the remote sensing data. The first one is a GIS database of a common forest inventory for the federal state Thuringia provided by the Thuringian forest agency. The inventory includes information about main species, tree height as the basal area weighted mean height of a stand, diameter at breast height, age, stocking volume, basal area, stocking degree, stocking layers and management type for state forest at stand level. A forest stand represents a management entity and is defined as homogenous community of trees which can be distinguished from its surrounding forested area [23]. The data for the forest stands located at the GLAS footprint sites are based on in situ measurements from 1998–2010. The forest agency applied growth models and logging values to achieve January 2010 as common temporal basis before providing the data set. The inventory parameters were related to the GLAS footprint level. If more than one inventory polygon intersected a certain footprint, the maximum, minimum and weighted mean values were calculated: Each inventory polygon was weighted by the percentage of its area within a certain footprint.

The second reference are airborne laser scanning (ALS) provided by the Thuringian land surveying office. They were acquired between 2000–2005 as last pulse ground and non-ground points during different laser flights with slightly different technical specifications by TopScan GmbH. The sensors used were all of the ALTM series with an altitude of 800–1,000 m and an average point density of 1.0–2.3 m. The reported horizontal and vertical accuracies are 30 cm. The laser campaigns were conducted in December and April for most parts of the study site. However, only winter were available for some areas. The ground points were gridded to a Digital Terrain Model (DTM) and the non-ground points were interpolated to a Digital Surface Model (DSM) using kriging technique and with a spatial resolution of 5 m each. The difference of the DTM and DSM resulted in a Canopy Height Model (CHM). The ALS based heights were compared to the forest inventory to assess the impact of the lack of first pulse points and spring data, i.e., data during leaf-on season, for some regions. Two examples should demonstrate how well the ALS heights correspond to the inventory data: The correlation between mean ALS and inventory height is 0.81 and 0.66 for spruce and beech respectively, according to a region covered by laser flights in April 2003. The mean bias is −4.0 m for coniferous sites and −3.5 m for deciduous sites. On the other hand, the correlation is 0.89 and 0.36 and the mean bias is −4.0 m and −18.6 m for spruce and beech respectively, according to a region covered by laser flights in November and December 2003. The negative biases can be explained by the lower heights of the last return ALS data. The correlation for beech can be improved up to 0.59 and the mean bias is only −3.4 m when using the maximum ALS height. However, it could be concluded that the ALS and inventory heights of coniferous forests in general match well, but the heights of the deciduous forests have to be treated carefully. The maximum ALS tree height was used in the later analysis steps if not specified differently. The ALS data set was used as important reference data set in spite of its limitations because of its high spatial resolution which provides more information about the inner-footprint structure than the polygon based inventory data base. Also, some inaccuracies of the inventory data were detected during some own small field visits which raised the need for a second reference data set. Both references, inventory and ALS data, are not optimal, but their complementary use allows for an adequate GLAS data analysis.

Finally, digital orthophotos with a spatial resolution of 20 cm were made available by the Thuringian land surveying office. The data were acquired in April to May 2008 and were used in addition to GoogleEarth imagery to get an overview about the forest conditions for each selected GLAS footprint.

3. Methodology

3.1. ICESat/GLAS Pre-Processing

An overview of the processing chain of GLAS data preparation is illustrated in Figure 2. First, the binary GLA14 were converted into ASCII format using the NSIDC GLAS Altimetry elevation extractor Tool (NGAT) which is available from the NSIDC website. The tool allows reading the data into a well-arranged column format useable for later investigations, but it extracts only a small number of the parameters included in the GLA14 product. Therefore, the IDL based tool was modified and extended. The GLAS footprints were geolocated into a common GIS software package using the coordinates given in the GLA14 data set. The average horizontal geolocation accuracy for the laser periods used in this study ranges from 0.0 m ± 4.29 m for laser period L3H (spring 2007) up to 1.35 m ± 4.42 m for laser period L3F (summer 2006) [24]. The vertical error for the two laser periods is 0.0 cm ± 1.48 cm and 1.0 cm ± 3.08 cm for flat surfaces, respectively [24]. The exact area of the footprints was simulated based on the major axis, eccentricity and azimuth of the transmitted pulse as given in the GLA14 product. The parameters are provided per one second which corresponds to 40 laser shots since GLAS laser pulses are operating at 40 Hz. So it can be accounted for the differences of the footprint shapes also within the single laser periods. The information was not available for three of the laser periods. In these cases, the average major axis and eccentricity of the whole laser period as given in the meta data table of the laser periods [24] were used. Since the table in [24] does not provide information about the azimuth value, the actual area of the affected footprints was only approximated to account for possible geolocation errors. This was done by modeling a circular buffer with an equivalent area as a corresponding ellipse. As next step, the GLAS tracks were intersected with the forest inventory database and the ALS coverage. All layers were referenced to UTM projection before intersection. Only footprints with a forest area of at least 75% (as defined by the inventory polygons) were retained.

The waveforms of the GLA01 product were read using the IDL reader tool available at the NSIDC website and own Matlab scripts. The waveforms were first smoothed with a moving average filter to remove the background noise within the signal. The positions where the waveform exceeds 3.5 times the standard deviation of the noise above the mean noise level were defined as start and end of the signal. The standard deviation of the background noise was taken from the GLA14 product. According to the mean noise level, the intensities of the first and last part of the waveform were summarized in a frequency distribution and the respective maximum was computed. It was estimated separately for the signal begin and end because it was found that the mean noise is mostly larger after the signal than before which agrees with the findings in [14].

The complete GLAS data set was searched for valid shots useable for later analyses. The filtering of bad data is a critical step since the GLAS measurements can be affected by clouds or other atmospheric effects, sensor saturation and waveform truncation. The following approach was applied: (1) All waveforms with an GLA14 terrain elevation significantly higher than the corresponding ALS DTM elevation were removed; (2) All waveforms with a signal extent larger than the sum of the maximum inventory tree height and maximum relief extent within the footprints based on the ALS DTM of the study area were excluded; (3) Noisy waveforms with a signal to noise ratio lower than 10 were removed; (4) All saturated waveforms as indicated by the GLA14 parameter i_satElevCorr were removed. According to the first issue, the GLA14 d_elev parameter was used at this stage of processing because it can be extracted most easily. A threshold of 60 m was used to take into account that GLA14 d_elev does not truly reflect the terrain but is rather defined by the parameter d_ldRangOff [25]. In the case of the GLAS waveforms used in this study the d_elev always refers to the waveform centroid. Before comparing ALS terrain elevation and GLA14 d_elev elevation, the GLAS elevation was referenced to the WGS84 ellipsoid and the geoid height above the ellipsoid was subtracted to remove the effect of geoid undulation. The complete processing and filtering procedure finally resulted in 453 waveforms in total (Table 2) and 442 waveforms for the three main tree species (Table 1). The GLA14 parameter i_beam_coelev revealed that the off-nadir angle is only between 0.3°–0.4° for the selected shots. These values refer to the nadir operations of the ICESat/GLAS mission [26] and thus, off-nadir issues were not considered in this study.

Some analysis steps made it necessary to define an adjusted selection of shots. The waveform type classification (Section 3.2) and the definition of landscape structure categories based on different DTM measures (Section 3.3) were limited to GLAS footprints of laser periods L3A, L3F and L3H. The limitation was done to reduce the two analyses and to exclude possible effects of the different hardware modifications of the GLAS laser emitters. Periods of laser 3 were used because they provided most waveforms, the tracks are all ascending and laser 3 produced almost circular footprints indicating a good performance during the laser period.

3.2. Calculation of Waveform Parameters

Various waveform parameters were derived from the GLA01 and GLA14 data (Figure 3). Tree height estimates were calculated based on the difference between the start of the signal (signal begin), i.e., the vertical location of the uppermost canopy layer, and the peak referring to the terrain (ground return). We applied two approaches to find the ground return. The first one is based on the Gaussian parameters given in the GLA14 product. The second approach uses the GLA01 product to find the ground return based on local maxima within the waveform.

Regarding the first approach using GLA14 data, the waveforms were fitted into a maximum of six Gaussian distributions where the individual Gaussians are assumed to refer to the elevation of vertically distinct surfaces within the footprint [9,25]. The GLA14 ground return was then defined as the maximum of the last two Gaussian peaks of a waveform, as also suggested by [13,19]. The signal begin was also extracted from the GLA14 data set to calculate the tree height (from now GLA14 height).

The Gaussian decomposition, however, has some limitations and might be not the best way to characterize the waveform of large-footprint [27]. It further changes the original waveform and it can be only an approximation. Therefore, the second approach to derive tree height uses the pre-processed GLA01 waveform data. An algorithm was developed which searches for local maxima within the waveform. The first part of the waveform is regarded as the vegetation component whereas the second half includes the signal from the terrain and possibly also from the canopy. Local maxima above the median amplitude in the second half of a certain waveform were selected as candidate ground peaks. The amplitude threshold was set because the waveforms still contained some oscillations including very low peaks obviously not referring to the terrain. All candidate ground peaks were checked if they rather belong to the vegetation return by searching for any local minima below the amplitude threshold between them. Affected candidate peaks were excluded. The method was used because it is assumed that the vegetation and ground signal—or more generally: the signals of distinct surfaces within a footprint—are always separated by a waveform section with significant reduced amplitude. The candidate peak with the greatest amplitude was finally defined as GLA01 ground return. In some cases, the maximum is rather a small plateau or “flat peak”, or the maximum is surrounded by smaller local peaks which were not smoothed during pre-processing but actually still belong to the same return. In these cases the center location was used as the ground return. The GLAS tree height was then directly calculated as the range between the GLA01 ground return and the GLA01 signal begin (from now GLA01 height). Similar to the ground return procedure, the first half of the waveform was used to define the canopy return.

Ground elevation was derived to: (1) evaluate the GLAS altimetry information and to (2) to assess the agreement between the ground return location approaches and the true terrain. The elevation referring to the GLA14 ground return was used as first measure of terrain elevation (from now on GLA14 elevation). The second terrain elevation estimate is based on the location of the GLA01 ground return (from now GLA01 elevation). The difference between GLAS terrain elevation and the footprint based mean ALS DTM elevation was regarded as adequate indicator for the performance of the two ground return definition approaches.

Some additional parameters were derived to further characterize the waveform. Leading and trailing edge were calculated based on the concept by [12] using the GLA01 data. They are indicators of the upper canopy surface variability and the terrain topography, respectively. A modified leading and trailing edge were also computed (Figure 3). These measures are related rather to the center peak of the canopy and ground return and not to the location of the half maximum. The modified versions are expected to better represent canopy surface and terrain roughness, especially when regarding waveforms with high differences between canopy and ground signal amplitude.

3.3. Definition of Waveform Types

Waveform shapes were classified into six main waveform types by visual interpretation (Figure 4). Type 1 has distinct ground and vegetation peaks, ground return is larger than the vegetation return. Type 2 is the same, but with a larger vegetation peak. Types 3 and 4 are slightly fuzzy waveforms where the ground and vegetation peaks are still detectable, but tend to be mixed together. Type 5 has a distinct ground peak and weak canopy peak. Type 6 refers to very fuzzy waveforms with no distinct ground and vegetation returns. Although the classification is also subjective, it can be expected to gain useful information about some basic properties and trends. The waveform type classification was applied only for the selected waveforms of laser periods L3A, L3F and L3H (see also Section 3.1).

3.4. Calculation of ALS DTM Measures

Since the surface topography is expected to be a critical influence factor of the waveform shape, an important objective was to characterize the landscape structure of each footprint. There exist several approaches and concepts which try to quantify the land surface (see [28] for an overview). Well-known measures like slope and aspect are the simplest metrics which are in general easy to interpret. More enhanced concepts combine two or more parameters to one measure, or propose a relief classification that segments the landscape into landform units based on certain topographic attributes [28]. We focused on metrics dealing with slope and roughness and which are already implemented in common software packages or available as ready-to-use scripts. We computed various relief parameters based on the ALS DTM. It was found that some of them correlate with each other and five measures were finally selected for later analyses. First, mean slope and elevation range as the difference between the maximum and minimum terrain elevation within a footprint were used. Second, the enhanced metrics Vector Ruggedness Measure (VRM), Topographic Position Index (TPI) and Slope Position Classification (SPC) were selected. The VRM was proposed by [29] and includes variation in slope and aspect to characterize the heterogeneity of the terrain. The TPI and SPC concepts were first suggested by [30] and implemented into an ArcGIS extension by [31]. The TPI calculates the difference between a cell elevation and its neighboring cells. The SPC as combination of TPI and slope enables each footprint to be assigned to a specific landform, i.e., valley, ridge/hilltop, hill flank/open slope. Different scales of measurement were tested because the measures are highly dependent on the size of the neighborhood chosen. Finally, TPI and SPC were calculated with 30 m radius. VRM was calculated using a 7 × 7 pixel neighborhood which corresponds to 35 × 35 m. These values were chosen to meet the scale of the average footprint size as closely as possible.

Several categories of landscape structure were created based on the slope, the VRM and the TPI/SPC measures (Table 3). The landforms valley and ridge are summarized to the landform called “complex” because of the small number of available shots otherwise. Each GLAS footprint of the reduced selection (see Section 3.1) was assigned to a category and the waveforms of each category were compared to each other.

Aspect was not considered separately because it is assumed that its influence is negligible due to the almost nadir looking of the laser shots used in this study. On the other hand, a high variation of the aspect within one footprint might be an influencing factor. This issue is, however, already considered by the roughness measure VRM.

4. Results and Discussion

4.1. Comparison of ICESat/GLAS and ALS Terrain Elevation

The two GLA01 and GLA14 terrain elevations were compared to the mean ALS DTM elevation of each footprint. Table 4 summarizes the main statistics of the terrain elevation differences for each tree species. The GLAS elevations were generally in line with the ALS data. The GLA14 elevation underestimated the reference terrain elevation on average by 2 m and the mean bias to the GLA01 elevation was even smaller. RMSE was largest for beech according to both GLAS elevations. Beech stands in the study area have the highest average canopy closure which might have complicated the detection of the terrain and thus contributed to the slightly worse results regarding the terrain elevations of the broadleaf footprints. The further investigations will focus on spruce and beech since only very few pine footprints were available.

A detailed comparison according to the surface topography (Figure 5) revealed that the GLAS elevation accuracy decreased with increasing mean slope within the footprint area. The mean elevation bias changed only slightly, especially with respect to the GLA01 elevation of the spruce stands. But the range between the maximum and minimum elevation error clearly increased for both species. Thus, the terrain elevation estimation still worked on average but high slope areas produced more outliers due to difficult waveforms with merged vegetation and ground signal.

We explored the waveforms with a terrain elevation difference larger than ±10 m to figure out possible error sources. 28 spruce and 11 beech outliers for GLA14 elevation and 15 spruce and 5 beech outliers for GLA01 elevation could be detected. We then compared the locations of the GLA01 and GLA14 ground returns and the corresponding mean ALS DTM elevation within the waveforms. It was found that the major part of the outlier shots belonged to waveform type 6 and the three estimates of the terrain elevation often differed from each other. The ground return positions of those outliers were obviously misclassified or the waveforms were too fuzzy to clearly define the terrain. However, three spruce outliers showed waveforms with a distinct ground peak: The peak of the ground return could be clearly detected visually, but the location of the ALS terrain elevation was misleading. The ALS DTM seemed to be wrong here or the complete waveform was shifted vertically due to a wrong GLA14 d_elev elevation which was used as reference to calculate GLA01 and GLA14 ground return elevation.

The observed results of this study largely correspond with [32]. The GLA14 and GLA01 ground returns represented the terrain accurately and could be used as basis to calculate the tree height. The GLA01 elevation achieved better results with a mean bias of 0.19 m (SD = 5.01 m) versus −2.09 m (SD = 6.17 m) for the GLA14 elevation. This indicates that the usage of the original waveform is more flexible and meets the ground return more precisely. The investigation of some outliers showed that the Gaussian decomposition in the GLA14 data set apparently did not characterize the waveform accurately in some cases and missed the exact terrain location more often.

4.2. Comparison of ICESat/GLAS and Reference Tree Heights

We compared the GLAS tree heights with the maximum inventory and maximum ALS CHM height. The summary statistics are given in Table 5. It is noticeable that the mean difference between GLA01 height and reference height is on average 2.5 m lower than according to the GLA14 height estimates. However, the RMSE was still relatively large for the GLA01 height with RMSE between 6–14 m. Moreover, the correlation between the GLAS and reference heights with r = 0.3–0.7 can be assessed as only moderate to low. This shows that the GLAS height estimation worked on average, but was not very robust for a single case. The correlation and RMSE regarding GLA01 height are similar to the results of direct GLAS height estimates in steep slope areas published by [19]. The mean height differences for two coniferous sites in this study, for example, are only between −1.99 and 0.18, but the RMSE are 9–10 m and the correlation coefficients are 0.47 and 0.62. However, the author of this study used a GLAS height similar to our GLA14 height but our GLA14 results are less accurate. The lower mean bias in the study by [19] can be explained by the higher point density and by the availability of up to four returns per pulse of the ALS used as reference.

It can be further detected that the accuracy of the GLAS height estimates decreased with increasing slope class. Figure 6 visualizes the trend according to the ALS height comparison. Waveforms of coniferous forest with low slopes up to 10° demonstrated high potential to retrieve tree height, whereas waveforms above 15° seemed to be problematic. Slopes between 10–15° appeared to be a critical limit corresponding to the usefulness of the waveforms. Previous studies also denoted that GLAS height estimation for slopes above 15° is challenging [16,20]. The scatter plots in Figure 7 further illustrate the agreement between GLA01 and the two reference tree heights summarized to the three particular slope classes. GLAS height estimates of beech stands were also affected by slope. But since the accuracy is lower for deciduous forest also for low slope classes, it can be assumed that there is a strong influence of the complex structure of beech canopies and the irregular shape of beech crowns. Measuring beech trees is difficult even in the field. It also has to be considered, that the number of available beech shots was small for the first two slope classes. Also other studies report lower accuracies of lidar tree height for deciduous forest (e.g., [11,33]).

The comparison between GLAS and inventory heights revealed similar trends as for ALS height (Table 5). Correlation and standard deviations were however better for ALS than inventory because of the higher spatial resolution of the ALS CHM. On the other hand, the average height differences were about 5–7 m larger regarding ALS data because only last pulse data were available in this study. In addition, small footprint ALS data often underestimate tree height (e.g., [34,35]). The findings for the pine stands are not visualized here since they were sometimes difficult to interpret because of the small number of available footprints, but they were usually in the range of the spruce results.

Since the surface topography is very complex and cannot be fully represented only by slope, additional DTM measures were related to the differences between GLAS and reference heights (Table 6). The correlation was usually low for all measures. Pflugmacher et al. [36] also found almost no relationship between median terrain slope and the differences between GLAS and the used reference heights. But apart from the general weak relationship, slope and terrain elevation range were still the most influencing factors regarding spruce forest. The results for the beech sites indicate that the accuracy of the GLAS heights of deciduous forest was affected by a complex interaction of the different footprint characteristics instead of one single driving factor. However, the relation between the height differences to the ALS CHM and DTM measures have to be treated carefully, because the DTM metrics are also based on the ALS set.

4.3. Influence of Terrain Characteristics on the GLAS Waveform Structure

The waveform parameters leading edge, modified leading edge, trailing edge and modified trailing edge were related to the DTM measures. Highest correlation was found for slope range and elevation range, which were calculated as the difference between maximum and minimum slope and terrain elevation, respectively, based on ALS data within a certain footprint. But no significant relation could be identified according to the other DTM measures. Figure 8 only visualizes the results for ALS DTM elevation range because it is comparable to the terrain index introduced by [11] and it has the same units as the leading and trailing edge metrics. The plots reveal similar patterns for both, spruce and beech: The surface relief influenced the size of the trailing edge, but the modified trailing edge represented the terrain slightly better with a correlation coefficient r = 0.61. Interestingly, elevation range affected the modified trailing edge in a same order as the modified leading edge. Also, it was found that the correlation between modified leading edge and ALS DSM elevation range is similar (e.g., r = 0.58 for spruce) to the correlation with ALS DEM elevation range. The modified leading edge therefore is not only a measure of canopy surface variability and structure and density of the plant surfaces within the canopy, as previously expected and also described in [37]. It is also a function of topographic relief: The laser pulse covers a longer way through the canopy and intercepts more plant layers within a high slope footprint. The total canopy depth thus increases resulting into a broad vegetation signal and large modified leading edge although the trees might have all the same height.

However, the relationships between the trailing and leading edge metrics and the terrain were only moderate below r = 0.7 for spruce and below r = 0.5 for beech. Trailing and leading edge are influenced by the complete footprint structure including terrain and the spatial distribution of plant surfaces. Thus, single metrics like leading or trailing edge can characterize the surface relief only partially. We therefore investigated several waveform shape types with respect to different landscape structure categories representing the complex terrain characteristics. The distribution of the corresponding waveform types is plotted based on the landscape categories (Figure 9). First, some general differences between the tree species can be distinguished: Almost 50% of the spruce waveforms were classified as types 1 and 2, the most ideal ones. Beech waveforms tended to be fuzzier as indicated by the higher number of types 3 and 4. Also, the percentage of the very blurry waveforms (type 6) is a bit higher for beech. Possible reasons are the more irregular shape of spherical beech crowns and a variable canopy surface with several emerging branches. Further, it has to be considered that beech stands are located more on hilly areas and spruce stands are grown also on flatter surfaces.

Comparing the first two landscape classes OSF and OSS, it could be detected that the number of fuzzy waveforms (waveform type 6) increased for steep slope landscapes. If the surface was also rough (landscape category ORS), the percentage was even higher. The influence of complex landforms like a valley or a ridge top was difficult to interpret due to the limited number of shots. Also, the complex landscape makes it difficult to isolate single factors contributing to the waveform shape and it leads to a variety of waveform types including also ideal types such as types 1 and 2. But the comparison between the landscape classes ORS and CRS indicates that the valley and ridge location to some extent compensates for the effect of fuzzy waveforms due to steep and rough terrain.

We explored the waveforms which were classified as a waveform type we did not expect for the corresponding landscape category. They are of special interest because they allow discovering the relation between waveform, surface topography and other characteristics of the footprint structure more into detail. Figure 10(a) visualizes an example for a fuzzy waveform of a footprint with relatively flat and smooth terrain (waveform type 6 and landscape category OSF). It was found that the forest is young and quite low. Thus, the ground and vegetation return get easily merged together. This observation could be confirmed when investigating waveforms classified as type 6, but located at flat terrain all together: The mean inventory tree height was considerably lower than for waveforms of the two most ideal types 1 and 2 and with flat and smooth surface. The vertical difference between the lowest canopy layer and the highest position of the terrain within the footprint was also lowest for the type 6 which makes it difficult to clearly separate surface terrain and canopy. Furthermore, we recognized during some field visits that young stands often show dense understory. The trunks of young spruce trees have many little branches below the crown producing a lot of scrub and the crown depth is quite large. Older and higher spruce stands, on the other hand, are more open below the tree crowns, because there is not enough light at the terrain. The effect becomes apparent when checking the tree density: The number of trees per hectare decreases with increasing age for both spruce and beech. Consequently, the canopy of young stands consists of smaller, but more trees with many intersecting branches attenuating the laser return from the terrain. The problem of greater errors in tree height estimation from large-footprint lidar for low stature forest even in low relief areas was also addressed by [38]. Furthermore, the orthophoto shows that the footprint in Figure 10(a) is located at the border of a lower major stand and higher minor stand at the southwest edge. The canopy within the footprint thus consists of several layers which might intensify the merging of ground and vegetation return.

The second example Figure 10(b) shows a waveform with quite clear ground and canopy return although the footprint is located at a steep slope of 23° (waveform type 1 and landscape category OSS). A crosscheck with the digital orthophoto revealed a gap within the forest. Other waveforms with the same waveform type and landscape category also showed some gaps or paths in the forest. This suggests that the horizontal forest structure is another important factor to consider. Patchy or less dense forests can lead to waveforms with discrete canopy and ground return despite steep surface topography. A possible explanation is that the laser pulse more likely reaches the real terrain if there is a gap. In the case of the example presented in Figure 10(b), the forest gap follows the same gradient as the surface slope. The laser pulse thus can scan the whole range of terrain elevations within the footprint. Some other waveforms with distinct ground and vegetation signals within landscape category OSS, on the other hand, corresponded to footprints with a mean slope 10–15°. This class can be regarded as crossover, as already suggested by the results of the tree height comparison in Section 4.2.

The detailed analysis of the waveforms shows that the waveform shape is the product of the interaction between the vertical and horizontal footprint structure. This suggests that the accuracy of the GLAS tree height estimations rather depends on the internal waveform shape and not only by single external factors as surface slope or roughness (cf. Table 6). It is also interesting to note that the GLAS height estimates were not necessarily completely wrong for a fuzzy waveform. Examining all GLAS shots together, the average height difference to the two reference sets was not always lowest for the two ideal waveform shape type 1 and 2. However, the uncertainty of good height estimation was higher for fuzzy waveforms. The waveform in Figure 10(a) reveals another phenomenon according to the GLAS tree height estimation. GLA01 height and ALS maximum height rather coincides with inventory height of the lower major stand instead of the maximum inventory height. This shows four important issues to consider: First, the problem that only last pulse ALS data were available again arises. Second, the stand with the higher trees, the minor stand, is located only at a small part at the border of the footprint, the area with decreasing laser energy. The laser pulse obviously did not reflect enough energy from the emergent conical tree crowns of the major stand. Third, the values as given in the inventory database rather refer to the mean characteristics of the dominant species within a stand. The higher trees of the minor stand are even located more in the center of the particular stand as indicated by the ALS CHM dataset and thus not inside the footprint anymore. Fourth, small geolocation errors can lead to different height estimates which are particularly critical in such border footprints. This is also important for waveforms of type 5 which were found to be usually situated in very sparse vegetated areas mostly at or near a border between forest and non-forest areas. Further, the differences between the three terrain elevation measures GLA01, GLA14 and ALS DTM elevation of the waveform example Figure 10(a) again demonstrate the difficulty of defining the ground return in broadened and fuzzy waveforms. At the same time, it could be detected that the terrain elevation difference might balance the variability of canopy top definition and thus tree height divergence. It also shows that the proper definition of the signal begin is critical and the optimal thresholds might differ according to the waveform types, laser periods and/or footprint structure.

5. Discussion

Our findings regarding the interaction between surface topography, vegetation structure and waveform shape based on real waveform examples confirm the simulation results by Yang et al. [16]. They found that the critical slope angle, where ground and vegetation returns are completely mixed, changes depending on vegetation height, leaf area index, crown shape and vegetation height variation. The fair performance of our tree height estimates regarding beech stands are thus obviously caused by both, the fact that beech stands are usually located in very steep parts of the study area and the characteristic canopy structure of deciduous forest which tends to decrease the critical slope angle due to lower gap probability compared to needle-leaf forest with similar density and vegetation cover fraction [16]. The published critical slope angles of 10.5°–12.5° for 70 m footprints [16] are in range with our critical slope class 10°–15°. The combined influence of surface relief and vegetation characteristics also explains the great variety of waveform shapes we have found even within one slope class. In simple terms, favored conditions of forest structure within a footprint—e.g., a high and not too dense forest with only one vertical canopy layer—can lead to a waveform with distinct ground and vegetation return although the terrain is steep and rough. And waveforms can be fuzzy in almost flat terrain, but with unfavorable forest structure, e.g., low and dense forests.

The methods applied in this study include some potential error sources. The quality of the reference data sets available for the study site is an important issue, whereat the lack of availability of first return ALS data is certainly the most critical point and the main reason for the large negative bias between GLAS and ALS canopy heights. The ALS height used in this study does not refer to the outer canopy surface because the ALS pulse is recorded deeper in the canopy. The effect will be stronger in light forest stands with many gaps, which can be especially expected for broadleaf stands during leaf-off season. A further potential error source which has to be considered is the time difference between the GLAS and reference data sets. The problem of time difference should be minimized as much as possible regarding the inventory data set since it was provided as recalculated information using growth models and yield information. The maximum difference between the GLAS acquisition time and ALS data is 7 years. The growth rates of trees strongly depend on the site conditions. In general, the height of young spruce trees increases about 50 cm per year at maximum and the growth rate clearly drops down at an age of approximately 50 years to 20–25 cm per year on average [39]. Beech trees as a typical shade tree species grow slower, but continue growing also in high ages. The potential height difference due to natural succession can thus be estimated by around 2–4 m. The footprints included in the analysis refer to mainly mature and old spruce and beech forests mitigating the problem of natural tree growth. However, the issue of time difference remains critical since tree stands can also change due to external events, for example storms or snow damage.

Another potential error source is related to the waveform type classification which might be biased by subjective impression since it has been done manually. This part of the analysis can thus certainly not be directly generalized and the classifications approach should be automated in future work. The results of the waveform examples suggest that a reclassification of the landscape classes which will include both the topography and tree height might enhance the understanding of the interaction between terrain and forest structure. In addition, the maximum tree height which was used here is most sensitive to surface topography and some of the variances in the height differences could be compensated by using other height metrics like the mean height. Future work should also enhance the signal begin and end thresholds as suggested by [40,41] since this is particularly important for waveforms with a weak canopy signal and indistinct signal start as for waveform type 5.

6. Conclusions

This paper demonstrated the potential of estimating terrain elevation and tree height using ICESat/GLAS waveforms in Thuringian Forest, a mountainous region in central Germany. The work presented here further enhances the empirical knowledge regarding a special temperate forest site with a complex terrain and vegetation structure. Two approaches to retrieve tree height directly from the waveform based on the ground return location were proposed and assessed by comparing the corresponding GLAS terrain elevations and tree heights with ALS and field based inventory data. The first method uses the GLA14 Gaussians and is thus comparable to the method already applied for example by [13,19]. Due to the complex terrain conditions, a more flexible approach was needed and an algorithm based on the GLA01 data set was developed. This second method tries to overcome the slope issue already in the step of the tree height retrieval. The big advantage of lidar remote sensing—the direct measurement of the vertical dimension without any limitations by for example a statistical model—can thus be preserved. We could also illustrate the high variability of GLAS waveform shapes and how they are affected by landscape structure. The waveform shape classification is a new attempt to explore the waveforms in its whole structure instead of using single metrics. Our investigation of some concrete waveform examples of real landscape conditions could also support modeling results from recent research regarding the interaction between landscape and waveform structure [16]. Our major findings can be summarized as follows:

- The approach based on the GLA01 data product performed better than the approach using the GLA14 Gaussian decomposition parameters regarding both, terrain elevation and tree height.

- The mean tree height difference between field based inventory and GLAS tree height is 3.58 m (SD = 7.84 m) for GLA14 height and 1.13 m (SD = 6.39 m) for GLA01 height regarding all species and slope classes. The accuracy clearly decreases with increasing slope. GLAS waveforms are most useful to retrieve tree height for areas up to a 10° slope. Slopes between 10–15° are a critical limit and tree height retrieval is very challenging for steep areas above 15°.

- Waveform shape and deduced metrics like leading and trailing edge are not only affected by surface topography, but rather by the complex interaction of the three-dimensional footprint structure. The probability of broadened and fuzzy waveforms with mixed ground and vegetation signals increases with steeper and rougher terrain. Further factors influencing waveform shape are tree height, canopy density or gappiness and forest types as they develop species-specific vertical canopy structure and crown shapes.

The results clearly show that it is very difficult to remove surface topography effects from ICESat/GLAS waveforms because the waveform also always includes vegetation effects. Internal waveform metrics like leading and trailing edge are only of limited use, as indicated by the moderate correlation with surface topography. The modified leading and trailing edge metrics proposed in the paper seemed to be more useful since they are less influenced by the relation between vegetation signal intensity and ground signal intensity. Thus, they rather refer to the particular signal part of interest (e.g., the ground signal for the modified trailing edge) and are more independent of the remaining waveform characteristic. However, statistical models solely including leading or trailing edge indices as in [12] to correct the GLAS height estimate by terrain effects might be not sufficient for regions with very strong topography and complex forest conditions. Other attempts to model terrain relief include several waveform metrics [42] but the complexity of the proposed models again show that it is demanding to remove the topography effects without any additional external information about the footprint structure. A promising approach to incorporate surface topography effects to improve the waveform metrics is recently published by Allouis et al. [21].

Finally, the synergy with optical or SAR data will certainly improve the performance of GLAS terrain elevation and tree height retrieval. Texture parameters which characterize the horizontal footprint structure will for example complement the vertical information of the GLAS data [43]. Satellite-based waveform lidar technology combined with additional remote sensing data will thus be an excellent option to map the three-dimensional forest structure.

This research is funded by the German Aerospace Center (DLR). The authors would like to thank the National Snow and Ice Center (NSDIC) for distributing the ICESat/GLAS data. Moreover, we would like to thank the Thuringian Forest Agency for providing the inventory and the kind cooperation and the Thuringian Land Surveying Office for providing the airborne laser and the digital orthophotos. Thanks go also to Anka Nicke and Ralf Götze from the University of Applied Sciences Erfurt for their remarks about forestry related issues.