Diminishing return on investment for biodiversity data in conservation planning

Grantham, Hedley S.; Moilanen, Atte; Wilson, Kerrie A.; Pressey, Robert L.; Rebelo, Tony G.; Possingham, Hugh P.
2008-10-01
Introduction

Conservation planning is a dynamic decision‐making process facilitating the implementation of management actions for conserving biodiversity (Margules & Pressey 2000; Wilson 2007). Protected areas are one of the main tools for achieving conservation outcomes. Their location and management are being increasingly guided by conservation plans, despite a previous lack of systematic assessment (Pressey 1994). Conservation plans require various forms of data including: the distribution and dynamics of biodiversity (Ferrier 2002; Drielsma & Ferrier 2006; Pressey 2007); the impact and severity of threatening processes (Wilson 2005); and the social, economic, political, and human circumstances that shape the context for conservation planning (Knight 2006a). Conservation planning can also consider the estimates of the potential costs and benefits of different actions, and the opportunities and constraints for implementing these actions (Naidoo 2006; Cowling & Wilhelm‐Rechmann 2007; Knight & Cowling 2007; Wilson 2007). Some conservation organizations invest much time and many resources into developing effective conservation plans to improve decision making (Cleary 2006; Oetting 2006). However, there has been little investigation of the amounts and types of data that are most cost‐effective in guiding conservation on the ground (Andelman & Fagan 2000; Cleary 2006; Possingham 2007).

Collecting more data costs time and money. If conservation planning is to be truly efficient and effective, planners must decide if further investment in surveys, mapping, or modeling is likely to improve planning decisions and then weigh that expected improvement against lost opportunities while data are collected. If the return on investment from new surveys is low in terms of better planning, then resources might be better directed to other actions.
It is possible for conservation agencies to overallocate resources to any one action, including data collection and analysis, and all actions should consequently be traded off against each other (Possingham 2007). However, decisions regarding the allocation of resources between different actions are rarely explicit, although calls for improved accountability in conservation are becoming increasingly commonplace (Cleary 2006; Ferraro & Pattanayak 2006).

Balmford & Gaston (1999) were the first to investigate the cost‐effectiveness of new surveys for improving conservation plans. They demonstrated that new biodiversity surveys are invariably cost efficient, due to the improved efficiency of selecting new protected areas. However, their approach did not account for the dynamic nature of the conservation problem (Meir et al. 2004). This dynamic nature includes budgetary constraints, limited capacity to fill data gaps, the potential influence of a changing landscape where valuable areas can be lost while data are collected, and the cost‐effectiveness and surrogacy value of different types of data.

Our aim was to consider several of these factors in determining the return on investment from different initial expenditures in survey data before undertaking a program of implementing new protected areas. The factors we considered included the extent of the initial protected area network and important aspects of a changing landscape: the protection rate and habitat loss rate. We also measured the surrogacy value of an existing habitat map.

Methods

Our study area (Figure 1) was the Fynbos biome of South Africa (Cowling & Heijnis 2001; Olson 2001). The total area of the region is around 81,000 km². We divided this region into 80,773 planning units, each 1 km². We excluded any planning unit that was partly outside the study region. Using data on vegetation cover from Cowling et al. (1999), we categorized planning units as “cleared” (native vegetation removed) if their centers had no native vegetation.
We then excluded these cleared planning units from the analyses. These exclusions left us with 53,385 planning units. Of these, we classified 18,457 as protected if their centers contained a protected area (IUCN types I–III) based on data from Reyers et al. (2007). The remaining 34,928 planning units were available for conservation management but were also potentially vulnerable to habitat loss.

Figure 1. Map of South Africa showing the Fynbos region in gray, with a total area of around 81,000 km².

Proteaceae is a characteristic plant family in the region. To study the effect of different amounts of data on the effectiveness of spatial prioritization, we used the data in the Protea Atlas (Forshaw 1998). Within our entire study area, this database contained over 40,000 plots with 0.22 million occurrence records for 381 taxa of Proteaceae. Some of the proteas were subspecies. The data were collected by volunteers over a 10‐year period. We removed all records for proteas that were cultivated or hybrids.

We used Maxent version 1.8.6 (Phillips 2006) to develop distribution models for those proteas with more than 20 records (Pearson 2007). The models had the same resolution as the planning units and were based on 25 biological, physical, and climatic predictor variables. We derived the presence/absence of proteas from the models using the cumulative probability (Phillips 2006), which is an indicator of habitat suitability. A protea was classified as present in a planning unit if the value was over 20 (S. Phillips, AT&T Labs, personal communication) and the planning unit was not classified as cleared. For proteas with fewer than 20 records, we did not use the distribution models, recording them as present within uncleared planning units containing records. We randomly checked about 25% of the models to ensure their predictive accuracy by measuring the area under the receiver operating characteristic curve (>0.95) (Elith 2006) and compared several modeled distributions to published distribution maps.
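The presence rule described above can be illustrated with a minimal sketch. The function and argument names are hypothetical, and the handling of a protea with exactly 20 records is an assumption, because the text only specifies "more than 20" and "fewer than 20".

```python
THRESHOLD = 20    # cumulative Maxent value above which a protea counts as present
MIN_RECORDS = 20  # proteas with more records than this are modeled


def presence(cumulative, recorded, cleared, n_records):
    """Return one True/False presence flag per planning unit for a single protea.

    cumulative : modeled cumulative value per planning unit
    recorded   : True where the protea was recorded in a survey plot
    cleared    : True where the planning unit is classified as cleared
    n_records  : total number of occurrence records for the protea
    """
    if n_records > MIN_RECORDS:
        # Modeled protea: threshold the model output.
        candidate = [value > THRESHOLD for value in cumulative]
    else:
        # Unmodeled protea (assumption: exactly 20 records falls here too).
        candidate = list(recorded)
    # Presence is only allowed in planning units that are not cleared.
    return [c and not is_cleared for c, is_cleared in zip(candidate, cleared)]
```

For a well-recorded protea the survey records are ignored in favor of the thresholded model, whereas a poorly recorded protea is present only where it was actually observed.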
We used 334 modeled proteas and 47 unmodeled proteas as our evaluation data set, representing the “complete” distributions of Proteaceae in the study region.

To simulate different levels of investment in survey data, we randomly selected and costed subsets of protea plots ranging from 100 plots to all 43,863. These were selected to cover a range of different scenarios from small investments to the entire data set. In the same way as for our evaluation data set, we modeled the distribution of proteas with more than 20 records and listed others as present in uncleared planning units where they had been observed. The cost of each survey plot was estimated to be around US$60 (∼380 South African Rand [ZAR] at the time of investment) using estimates of wages, fuel, overheads, and other costs (described in the Supporting Information). Among our scenarios was one with zero investment in survey data and complete reliance of planning on an existing habitat classification. This classification consisted of 68 habitat classes based on climate, geology, and topography (described in Cowling & Heijnis 2001), and we assumed that it was freely available.

For each level of data investment, we developed a land‐use simulation extending over the 10 years following completion of the different data sets. Each year, new protected areas were implemented with the available data by protecting planning units available for conservation management. A total of 2,000 km² of protected areas was notionally implemented each year based on the average rate of establishment of statutory and nonstatutory protected areas over the last 20 years (Cowling & Pressey 2003), with nonstatutory areas contributing 50% to the rate. Each year, after new protected areas were notionally implemented, we simulated spatially explicit, stochastic clearing of native vegetation based on the rate of clearing in South Africa between 1988 and 1993, estimated to be around 2% per annum (Biggs & Scholes 2002).
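As a rough illustration of how survey subsets can be drawn and costed, the sketch below samples plots at random and multiplies the number of plots by a per-plot cost. The function names are hypothetical, and the rounded figure of US$60 per plot is taken from the text, so the totals are approximate.

```python
import random

COST_PER_PLOT_USD = 60  # rounded per-plot cost quoted in the text


def sample_plots(plot_ids, n_plots, seed=0):
    """Draw a random subset of survey plots without replacement."""
    rng = random.Random(seed)
    return rng.sample(plot_ids, n_plots)


def survey_cost(n_plots, cost_per_plot=COST_PER_PLOT_USD):
    """Approximate investment (US$) needed to survey n_plots plots."""
    return n_plots * cost_per_plot


all_plots = list(range(43_863))          # placeholder identifiers for the plots
subset = sample_plots(all_plots, 100)    # smallest scenario in the text
print(survey_cost(len(subset)))          # cost of the smallest scenario
print(survey_cost(len(all_plots)))       # cost of the full data set
```

With the rounded cost, the smallest scenario comes to US$6,000 and the full data set to about US$2.6 million, in line with the order of magnitude reported in the text.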
More recent and spatially refined data on clearing were not available. We assigned the 2% annual rate to vegetation types (described in Mucina & Rutherford 2006) based on their vulnerability values (Reyers et al. 2007). We normalized vulnerability values and assigned an annual clearing rate for each vegetation type as 0.02 × normalized vulnerability value. We then assigned each planning unit a clearing probability based on the vegetation types it contained. This value became the probability of a planning unit being cleared altogether in any one year. In the simulations that incorporated existing protected areas, planning units that were already protected were assigned a zero clearing probability. We repeated each simulation 20 times after testing that this amount of repetition produced moderately stable solutions with stochastic simulated clearing.

We used a maximum gain algorithm to select protected areas. The algorithm selected planning units that gave, progressively, the highest marginal increases in conservation value relative to existing protected areas and previously selected notional protected areas (described in Moilanen & Cabeza 2007). As a comparison, we applied a minimum loss algorithm that took into account likely habitat clearance (described in Moilanen & Cabeza 2007). Both algorithms used a concave square root benefit function to skew selection toward rare proteas (Arponen 2005).

We evaluated the outcome of each land‐use simulation with two metrics: representation and retention (Pressey et al. 2004). Representation is the proportion of a species distribution protected. We measured it as the proportion of protea distributions at the beginning of the simulation occurring within actual or notional protected areas at the end of the simulation. Retention is the proportion of a species distribution remaining in the landscape.
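The yearly cycle of notional protection followed by stochastic clearing can be sketched as below. This is a simplified illustration, not the authors' implementation. It assumes that vulnerability is normalized by its maximum, that conservation value is the sum over proteas of the square root of the protected share of each distribution, that every protea occurs in at least one planning unit, and that the same occurrence matrix is used for selection and evaluation. All names are hypothetical, and per-unit clearing probabilities are passed in directly.

```python
import math
import random


def clearing_probabilities(vulnerability, base_rate=0.02):
    """Annual clearing rate per vegetation type: base_rate x normalized vulnerability.

    Assumption: vulnerability is normalized by dividing by its maximum.
    """
    top = max(vulnerability.values())
    return {veg: base_rate * v / top for veg, v in vulnerability.items()}


def pick_max_gain(occ, totals, protected, available):
    """Return the available unit giving the largest marginal gain in
    sum over proteas of sqrt(protected share of the distribution)."""
    n_species = len(totals)
    rep = [sum(occ[u][s] for u in protected) / totals[s] for s in range(n_species)]

    def gain(unit):
        return sum(
            math.sqrt(rep[s] + occ[unit][s] / totals[s]) - math.sqrt(rep[s])
            for s in range(n_species)
        )

    # Sorting makes tie-breaking deterministic (lowest unit index wins).
    return max(sorted(available), key=gain)


def simulate(occ, protected, p_clear, per_year, years=10, seed=0):
    """Run one replicate; return per-protea (representation, retention).

    occ       : occ[unit][protea] is 1 if the protea occurs in the unit, else 0
    protected : indices of planning units protected at the start
    p_clear   : annual clearing probability for each planning unit
    per_year  : number of planning units notionally protected each year
    """
    rng = random.Random(seed)
    n_units, n_species = len(occ), len(occ[0])
    totals = [sum(row[s] for row in occ) for s in range(n_species)]
    protected, cleared = set(protected), set()
    for _ in range(years):
        # Step 1: protect the units with the highest marginal gain.
        for _ in range(per_year):
            available = set(range(n_units)) - protected - cleared
            if not available:
                break
            protected.add(pick_max_gain(occ, totals, protected, available))
        # Step 2: stochastic clearing of unprotected, uncleared units.
        for unit in range(n_units):
            if unit not in protected and unit not in cleared:
                if rng.random() < p_clear[unit]:
                    cleared.add(unit)
    representation = [
        sum(occ[u][s] for u in protected) / totals[s] for s in range(n_species)
    ]
    retention = [
        sum(occ[u][s] for u in range(n_units) if u not in cleared) / totals[s]
        for s in range(n_species)
    ]
    return representation, retention
```

Because the square root rises steeply near zero, a planning unit that adds the first protected occurrences of a poorly represented protea scores a larger gain than one that adds the same share to a protea that is already well represented.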
We measured retention as the proportion of protea distributions at the beginning of the simulation still remaining in uncleared planning units at the end of the simulation, regardless of the conservation status of those planning units. To summarize the effectiveness of conservation planning, we measured the mean representation and retention for each protea across the 20 replicate simulations. We then used the lower quartile of the 381 protea mean values to focus our measures of representation and retention on the proteas with least protection or most loss from clearing.

As a sensitivity test, we ran additional simulations for each level of investment in data by doubling and halving the rates of habitat loss and protection. We also ran each simulation without the existing protected areas, reclassifying all formally protected planning units as available for protection and estimating their average clearing rates from vegetation types. These scenarios also involved the allocation of 2,000 km² of new protected areas each year. Finally, we ran simulations for each level of investment in survey by combining both the habitat map and the distributions of proteas.

Results

For representation of proteas, we found rapidly diminishing returns for increments of investment in survey (Figure 2A). There was a similar relationship for retention (Figure 2B), although the range of retention values was smaller than those for representation. For representation, the use of any survey data was more effective than the habitat map alone (Figure 2A). For retention, a small initial investment in survey (0.806) was less effective than the habitat map alone (0.985) (Figure 2B), although the effectiveness increased thereafter with more survey. The curves followed a pattern similar to the accumulation curve of the total number of proteas detected (Figure 3).

Figure 2. Diminishing returns between initial survey investment and the conservation of proteas at the end of 10‐year simulations. We measured both representation of proteas in protected areas (proportion of a species distribution protected) (A) and retention of proteas in the landscape (proportion of a species distribution remaining in the landscape) (B) as lower quartiles of values across all proteas. For scenarios with no investment in survey data, we used a habitat map to guide selection of notional protected areas. Currency is U.S. dollars.

Figure 3. Data on proteas for different levels of investment in survey. Upper line shows the total number of proteas detected in at least one plot. Lower line shows the number of proteas with more than 20 records, the threshold chosen for modeling their distributions. Currency is U.S. dollars.

We also found rapidly diminishing returns measuring both representation and retention when varying the rates of protection and habitat loss (Figure 4A–D). Predictably, halving the rates of protection reduced the effectiveness of investments and doubling them increased it (Figure 4A and B). Also predictably, halving the rates of habitat loss increased the effectiveness and doubling them reduced it (Figure 4C and D).

Figure 4. Diminishing returns between initial survey investment and the conservation of proteas at the end of 10‐year simulations. We measured both representation of proteas in protected areas (proportion of a species distribution protected) (A, C) and retention of proteas in the landscape (proportion of a species distribution remaining in the landscape) (B, D) as lower quartiles of values across all proteas. As sensitivity tests, we doubled and halved the rates of protection (A, B) and habitat loss (C, D). For scenarios with no investment in survey data, we used a habitat map to guide selection of notional protected areas. Currency is U.S. dollars.

The results were similar when we used the habitat map in combination with the survey data, regardless of the measure of effectiveness (Figure 5A and B).
Both measures of effectiveness were reduced when the established protected areas were ignored and regarded as areas available for conservation action but also vulnerable to habitat loss (Figure 5C and D). This highlights the roles of the existing protected areas in both promoting representation of proteas and, to a lesser extent in this region, reducing their loss. For each measure of effectiveness, there was little difference between the results of the maximum gain and minimum loss algorithms (not shown).

Figure 5. Diminishing returns between initial survey investment and the conservation of proteas at the end of 10‐year simulations. We measured both representation of proteas in protected areas (proportion of a species distribution protected) (A, C) and retention of proteas in the landscape (proportion of a species distribution remaining in the landscape) (B, D) as lower quartiles of values across all proteas. Panels A and B show the effect of using a habitat map in combination with survey data. Panels C and D compare the results with and without the existing protected areas incorporated into the analysis. Currency is U.S. dollars.

Discussion

We found that a relatively small investment in protea survey data is enough to design an effective protected area network for conserving proteas. There were strongly diminishing returns from investment in survey data for proteas. An investment of around US$100,000 (∼R637,000) led to almost the same effectiveness of protected areas as an investment of around US$2.5 million (∼R16 million), despite the larger investment substantially increasing knowledge of the distributions of proteas.

There are at least two plausible reasons why there were strongly diminishing returns. First, with relatively small initial investments in survey data, we found strong correlations between the richness of proteas in planning units from partial databases and richness in the complete database (Figure 6).
This correlation indicates that, with only relatively small investments in survey data, protea‐rich planning units are likely to be detected and likely to contribute to efficient and effective selections of protected areas. Second and related to this, we also found that small initial investments in survey data resulted in most proteas being detected during surveying (Figure 3). However, the number of proteas with more than 20 occurrence records only gradually increased with investment in survey. The main benefit of additional survey above US$100,000 (∼R637,000) was not to add taxa but to refine the distributions.

Figure 6. Spearman's coefficient of rank correlation of the richness of proteas in each planning unit between different initial survey investments and the entire data set used for evaluation. All correlations had low type 1 error probabilities (P < 0.001). Note that the investments here range from $6,085 to $2,190,600. Currency is U.S. dollars.

Several studies have previously used subsets of data to test the influence of the amounts of data on conservation planning (Freitag & van Jaarsveld 1998; Gaston & Rodrigues 2003; Gladstone & Davis 2003; Grand 2007). Using the same data set as ours (the Protea Atlas), Grand (2007) tested the effects of different levels of sampling effort and bias on the selection of protected areas. That study found that fewer data generally resulted in larger areas being required to represent at least one record of each protea in notional protected areas. Our results suggest that, while less comprehensive data might reduce the overall efficiency of a conservation network, investing in additional data might not be the most cost‐effective approach to conservation when implementation is gradual and accompanied by ongoing habitat loss. This combination of circumstances is typical of conservation planning in most parts of the world (Pressey et al. 2004).
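The rank correlation reported in Figure 6 compares protea richness per planning unit from a partial data set with richness from the complete data set. A minimal sketch of that calculation follows. It computes Spearman's coefficient as the Pearson correlation of average ranks, which is the standard definition when ties are present. The function names and the toy richness values are hypothetical, and the sketch does not compute the P values.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based mean rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy richness values per planning unit (illustrative only, not real data).
partial_richness = [0, 2, 1, 5, 3]
complete_richness = [1, 4, 3, 9, 6]
print(spearman(partial_richness, complete_richness))
```

A coefficient close to 1 means the partial data set ranks planning units by richness in nearly the same order as the complete data set, even if the absolute richness values differ.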
The sensitivity analyses emphasize the importance of considering land‐use dynamics in conservation planning. Increasing the protection rate increased the representation of proteas, and this was clearly due to more area being protected. An increase in the protection rate also increased retention because more occurrences of proteas were protected from habitat loss. Increasing the habitat loss rate did not influence representation but marginally reduced retention, likely because of increased habitat loss outside protected areas. If planners were starting a conservation plan in a region where there was little protection from habitat loss and few biodiversity data, our results indicate that there are initially large returns on further data investment. However, similar to when there are existing protected areas, investment in survey data involves strongly diminishing returns.

Surprisingly, despite the minimum loss algorithm accounting for habitat loss, there was little difference between its results and those of the maximum gain algorithm. Two likely reasons are the relatively short period of the simulation and moderately low rates of habitat loss.

The poor performance of the habitat map as a surrogate for proteas was probably due to it being developed using broad‐scale environmental variables that did not distinguish the fine‐scale variation in habitat, local endemism, and rapid spatial turnover typical of Proteaceae in this biome (McDonald 1996). Considerable research has been directed at testing the effectiveness of biodiversity surrogates for conservation planning (Rodrigues & Brooks 2007). There has been extensive debate on appropriate methods of testing surrogates (Brooks 2004; Rodrigues & Brooks 2007). Surrogacy testing indicates whether further data investment is necessary to overcome limitations of using surrogates. A limitation of previous methods is that they provide only broad advice on data investment.
For example, using a similar habitat map, test data (the Protea Atlas), and study area, Lombard et al. (2003) found that the habitat map was a better surrogate for more common proteas but worse for rarer ones. These results provide general guidance on data investment. Using our analytical framework and similar data, we were able to address more direct investment‐related questions, with more specific lessons for future conservation practice. If likely future landscape scenarios can be developed, this type of retrospective analysis can be applied at any time during data collection to determine if the effectiveness of decisions is improving or has plateaued.

Are our results general? While we have shown that a tiny fraction of the available survey data yields a relatively good outcome for conservation, this conclusion might not be easily generalized. One set of factors concerns aspects of the distribution of proteas in our study area, which might be different for other taxa and other regions. Another set of factors concerns our land‐use simulations. It was necessary to apply relatively simple simulations of land‐use change, considering parameters that we expected would influence the results. Despite these simplifications, the simulations provided a framework for testing alternative approaches to conservation in a more dynamic and realistic manner than in previous work that has applied complementarity‐based reserve selection algorithms without considering the interplay of conservation action and habitat loss.

There were several assumptions within our model that should be noted. First, it is unlikely that implementation would be possible in all the areas selected because of factors such as unwilling stakeholders, and our planning units did not delineate realistic selection units, such as cadastres (Knight 2006b). Second, we assumed that implementation costs were uniform across the study area, while in reality, these would have likely been highly variable (Naidoo 2006).
Third, the habitat loss rate is likely to be somewhat inaccurate for our study area, but improving upon this information is a complex task (Lambin et al. 2001) beyond the scope of this study. Fourth, we assumed that past protection rates would be similar to future rates. To evaluate the sensitivity of the results to the protection rate, however, we halved and doubled the rates and reevaluated the model outcomes. Fifth, we only compared the return on investment in one type of data (protea survey data) and one type of predictive model (Maxent). Last, we assumed that all required survey data could be collected at the outset of a conservation planning exercise. In reality, collection and interpretation of survey data can be time‐consuming. How resources should be split between data collection, including different types of data, and the implementation of conservation action is an area requiring further research, given the important practical implications of the lessons that are learnt.

We have shown that relatively small and incomplete data sets can be effective at identifying where to act. While biodiversity data will remain a key source of information in making conservation decisions, there is increasing evidence that a lack of biodiversity data may not be the main limitation in the development of effective conservation plans. Limits on the ability of an organization to engage with stakeholders and understand constraints on and opportunities for its actions, particularly in production landscapes, are major impediments (Knight et al. 2006a). This emphasizes the importance of investment in other types of data, including social, human, and economic characteristics of regions to help reduce the research‐implementation gap (Knight et al. 2008).

Editor: Belinda Reyers

Acknowledgments

We are grateful to the volunteers who collected the data, without whom the project would have cost ∼US$3 million (∼ZAR19 million); instead, the Protea Atlas Project cost ∼US$0.35 million (∼ZAR2.2 million).
We thank M. Watts for his help with data processing, and M. Lombard, B. Reyers, and the South African National Biodiversity Institute for data. Hedley S. Grantham was supported by the University of Queensland and an Environmental Futures Network (Australian Research Council) travel grant. Atte Moilanen was supported by the Academy of Finland, project 1206883. We thank B. Reyers, A. Knight, and an anonymous reviewer for their comments.
Conservation Letters, Wiley
http://www.deepdyve.com/lp/wiley/diminishing-return-on-investment-for-biodiversity-data-in-conservation-dibPf8TjcH

Diminishing return on investment for biodiversity data in conservation planning

Abstract

Introduction Conservation planning is a dynamic decision‐making process facilitating the implementation of management actions for conserving biodiversity ( Margules & Pressey 2000 ; Wilson 2007 ). Protected areas are one of the main tools for achieving conservation outcomes. Their location and management are being increasingly guided by conservation plans, despite a previous lack of systematic assessment ( Pressey 1994 ). Conservation plans require various forms of data including: the distribution and dynamics of biodiversity ( Ferrier 2002 ; Drielsma & Ferrier 2006 ; Pressey 2007 ); the impact and severity of threatening processes ( Wilson 2005 ); and the social, economic, political, and human circumstances that shape the context for conservation planning ( Knight 2006a ). Conservation planning can also consider the estimates of the potential costs and benefits of different actions, and the opportunities and constraints for implementing these actions ( Naidoo 2006 ; Cowling & Wilhelm‐Rechmann 2007 ; Knight & Cowling 2007 ; Wilson 2007 ). Some conservation organizations invest much time and many resources into developing effective conservation plans to improve decision making ( Cleary 2006 ; Oetting 2006 ). However, there has been little investigation of the amounts and types of data that are most cost‐effective in guiding conservation on the ground ( Andelman & Fagan 2000 ; Cleary 2006 ; Possingham 2007 ). Collecting more data costs time and money. If conservation planning is to be truly efficient and effective, planners must decide if further investment in surveys, mapping, or modeling is likely to improve planning decisions and then weigh that expected improvement against lost opportunities while data are collected. If the return on investment from new surveys is low in terms of better planning, then resources might be better directed to other actions. 
It is possible for conservation agencies to overallocate resources to any one action, including data collection and analysis, and all actions should consequently be traded off against each other ( Possingham 2007 ). However, decisions regarding the allocation of resources between different actions are rarely explicit, although calls for improved accountability in conservation are becoming increasingly commonplace ( Cleary 2006 ; Ferraro & Pattanayak 2006 ). Balmford & Gaston (1999) were the first to investigate the cost‐effectiveness of new surveys for improving conservation plans. They demonstrated that new biodiversity surveys are invariably cost efficient, due to the improved efficiency of selecting new protected areas. However, their approach did not account for the dynamic nature of the conservation problem ( Meir . 2004 ). This includes budgetary constraints, limited capacity to fill data gaps, the potential influence of a changing landscape where valuable areas can be lost while data are collected, and the cost‐effectiveness and surrogacy value of different types of data. Our aim was to consider several of these factors in determining the return on investment from different initial expenditures in survey data before undertaking a program of implementing new protected areas. This included the extent of the initial protected area network and important aspects of a changing landscape: the protection rate and habitat loss rate. We also measured the surrogacy value of an existing habitat map. Methods Our study area ( Figure 1 ) was the Fynbos biome of South Africa ( Cowling & Heijnis 2001 ; Olson 2001 ). The total area of the region is around 81,000 km 2 . We divided this region into 80,773 planning units, each 1 km 2 . We excluded any planning unit that was partly outside the study region. Using data on vegetation cover from Cowling . (1999) , we categorized planning units as “cleared” (native vegetation removed) if their centers had no native vegetation. 
We then excluded these from the analyses. These exclusions left us with 53,385 planning units. Of these, we classified 18,457 as protected if their centers contained a protected area (IUCN types I–III) based on data from Reyers . (2007) . The remaining 34,928 planning units were available for conservation management but were also potentially vulnerable to habitat loss. 1 Map of South Africa showing the Fynbos region in gray, with a total area of around 81,000 km 2 . Proteaceae is a characteristic plant family in the region. To study the effect of different amounts of data on the effectiveness of spatial prioritization, we used the data in the Protea Atlas ( Forshaw 1998 ). Within our entire study area, this database contained over 40,000 plots with 0.22 million occurrence records for 381 taxa of Proteaceae. Some of the proteas were subspecies. The data were collected by volunteers over a 10‐year period. We removed all records for proteas that were cultivated or hybrids. We used Maxent version 1.8.6 ( Phillips 2006 ) to develop distribution models for those proteas with more than 20 records ( Pearson 2007 ). The models had the same resolution as the planning units and were based on 25 biological, physical, and climatic predictor variables. We derived the presence/absence of proteas from the models using the cumulative probability ( Phillips 2006 ), which is an indicator of habitat suitability. A protea was classified as present if the value was over 20 (S. Phillips, AT&T Labs, personal communication) and the planning unit was not classified as cleared. For proteas with less than 20 records, we did not use the distribution models, recording them as present within uncleared planning units containing records. We randomly checked about 25% of the models to ensure their predictive accuracy by measuring the area under the curve (receiver operating characteristic >0.95) ( Elith 2006 ) and compared several modeled distributions to published distribution maps. 
We used 334 modeled proteas and 47 unmodeled proteas as our evaluation data set, representing the “complete” distributions of Proteaceae in the study region. To simulate different levels of investment in survey data, we randomly selected and costed subsets of protea plots ranging from 100 to all 43,863. These were selected to cover a range of different scenarios from small investments to the entire data set. In the same way as for our evaluation data set, we modeled the distribution of proteas with more than 20 records and listed others as present in uncleared planning units where they had been observed. The cost of each survey plot was estimated to be around US$60 (∼380 South Africa Rand [ZAR] at the time of investment) using estimates of wages, fuel, overheads, and other costs (described in the Supporting Information). Among our scenarios was one with zero investment in survey data and complete reliance of planning on an existing habitat classification. This consisted of 68 habitat classes based on climate, geology, and topography (described in Cowling & Heijnis 2001 ), and we assumed that it was freely available. For each level of data investment, we developed a land‐use simulation extending over the 10 years following completion of the different data sets. Each year, new protected areas were implemented with the available data by protecting planning units available for conservation management. A total of 2,000 km 2 of protected areas was notionally implemented each year based on the average rate of establishment of statutory and nonstatutory protected areas over the last 20 years ( Cowling & Pressey 2003 ), with nonstatutory areas contributing 50% to the rate. Each year, after new protected areas were notionally implemented, we simulated spatially explicit, stochastic clearing with native vegetation loss based on the rate of clearing in South Africa between 1988 and 1993, estimated to be around 2% per annum ( Biggs & Scholes 2002 ). 
More recent and spatially refined data were not available. This rate was assigned to vegetation types (described in Mucina & Rutherford 2006) based on their vulnerability values (Reyers et al. 2007). We normalized vulnerability values and assigned an annual clearing rate for each vegetation type as 0.02 × normalized vulnerability value. We then assigned each planning unit a clearing probability based on the vegetation types it contained. This value became the probability of a planning unit being cleared altogether in any one year. In the simulations that incorporated existing protected areas, planning units that were already protected were assigned a zero clearing probability. We repeated each simulation 20 times after testing that this amount of repetition produced moderately stable solutions with stochastic simulated clearing. We used a maximum gain algorithm to select protected areas. The algorithm selected planning units that gave, progressively, the highest marginal increases in conservation value relative to existing protected areas and previously selected notional protected areas (described in Moilanen & Cabeza 2007). As a comparison, we applied a minimum loss algorithm that took into account likely habitat clearance (described in Moilanen & Cabeza 2007). Both algorithms used a concave square root benefit function, in which each additional protected occurrence of a protea adds less value than the previous one, to skew selection toward rare proteas (Arponen 2005). We evaluated the outcome of each land‐use simulation with two metrics: representation and retention (Pressey et al. 2004). Representation is the proportion of a species distribution protected. We measured it as the proportion of protea distributions at the beginning of the simulation occurring within actual or notional protected areas at the end of the simulation. Retention is the proportion of a species distribution remaining in the landscape.
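The clearing probabilities and the two evaluation metrics can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code. The text does not say how vulnerability was normalized or how vegetation types were combined within a planning unit, so the sketch assumes normalization by the mean and an area‐weighted average. Distributions are treated as sets of equally sized planning units, and all function and field names are hypothetical.

```python
import random

BASE_RATE = 0.02  # annual clearing rate of about 2%, from the text


def annual_clearing_rates(vulnerability):
    """Return vegetation type -> 0.02 x normalized vulnerability.

    Assumption: vulnerability is normalized by its mean across types,
    so the unweighted average rate stays at 2%.
    """
    mean_v = sum(vulnerability.values()) / len(vulnerability)
    return {veg: BASE_RATE * v / mean_v for veg, v in vulnerability.items()}


def unit_clearing_probability(veg_fractions, rates, protected):
    """Annual probability that a whole planning unit is cleared.

    Assumption: area-weighted mean of the rates of the vegetation types
    in the unit. Protected units get zero, as in the text.
    """
    if protected:
        return 0.0
    return sum(frac * rates[veg] for veg, frac in veg_fractions.items())


def clear_one_year(units, rng):
    """Apply one year of stochastic clearing in place.

    units maps unit id -> {"p_clear": float, "cleared": bool}.
    """
    for unit in units.values():
        if not unit["cleared"] and rng.random() < unit["p_clear"]:
            unit["cleared"] = True


def representation(initial_units, protected_ids):
    """Share of the initial distribution inside protected areas at the end."""
    return len(initial_units & protected_ids) / len(initial_units)


def retention(initial_units, cleared_ids):
    """Share of the initial distribution still uncleared at the end."""
    return len(initial_units - cleared_ids) / len(initial_units)
```

For example, a protea that initially occupies four planning units, two of which end up protected and one of which is cleared, would have a representation of 0.5 and a retention of 0.75 under this sketch.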
We measured retention as the proportion of protea distributions at the beginning of the simulation still remaining in uncleared planning units at the end of the simulation, regardless of the conservation status of those planning units. To summarize the effectiveness of conservation planning, we measured the mean representation and retention for each protea across the 20 replicate simulations. We then used the lower quartile of the 381 protea mean values to focus our measures of representation and retention on the proteas with least protection or most loss from clearing. As a sensitivity test, we ran additional simulations for each level of investment in data by doubling and halving the rates of habitat loss and protection. We also ran each simulation without the existing protected areas, reclassifying all formally protected planning units as available for protection and estimating their average clearing rates from vegetation types. These scenarios also involved the allocation of 2,000 km² of new protected areas each year. Finally, we ran simulations for each level of investment in survey by combining both the habitat map and the distributions of proteas.

Results

For representation of proteas, we found rapidly diminishing returns for increments of investment in survey (Figure 2A). There was a similar relationship for retention (Figure 2B), although the range of retention values was smaller than that for representation. For representation, the use of any survey data was more effective than the habitat map alone (Figure 2A). For retention, a small initial investment in survey (0.806) was less effective than the habitat map alone (0.985) (Figure 2B), although the effectiveness increased thereafter with more survey. The curves followed a pattern similar to the accumulation curve of the total number of proteas detected (Figure 3).

Figure 2. Diminishing returns between initial survey investment and the conservation of proteas at the end of 10‐year simulations.
We measured both representation of proteas in protected areas (proportion of a species distribution protected) (A) and retention of proteas in the landscape (proportion of a species distribution remaining in the landscape) (B) as lower quartiles of values across all proteas. For scenarios with no investment in survey data, we used a habitat map to guide selection of notional protected areas. Currency is U.S. dollars.

Figure 3. Data on proteas for different levels of investment in survey. Upper line shows the total number of proteas detected in at least one plot. Lower line shows the number of proteas with more than 20 records, the threshold chosen for modeling their distributions. Currency is U.S. dollars.

We also found rapidly diminishing returns in both representation and retention when we varied the rates of protection and habitat loss (Figure 4A–D). Predictably, halving the rates of protection reduced the effectiveness of investments and doubling them increased it (Figure 4A and B). Also predictably, halving the rates of habitat loss increased the effectiveness and doubling them reduced it (Figure 4C and D).

Figure 4. Diminishing returns between initial survey investment and the conservation of proteas at the end of 10‐year simulations. We measured both representation of proteas in protected areas (proportion of a species distribution protected) (A, C) and retention of proteas in the landscape (proportion of a species distribution remaining in the landscape) (B, D) as lower quartiles of values across all proteas. As sensitivity tests, we doubled and halved the rates of protection (A, B) and habitat loss (C, D). For scenarios with no investment in survey data, we used a habitat map to guide selection of notional protected areas. Currency is U.S. dollars.

The results were similar when we used the habitat map in combination with the survey data, regardless of the measure of effectiveness (Figure 5A and B).
Both measures of effectiveness were reduced when the established protected areas were ignored and regarded as areas available for conservation action but also vulnerable to habitat loss (Figure 5C and D). This highlights the roles of the existing protected areas in both promoting representation of proteas and, to a lesser extent in this region, reducing their loss. For each measure of effectiveness, there was little difference between the results of the maximum gain and minimum loss algorithms (not shown).

Figure 5. Diminishing returns between initial survey investment and the conservation of proteas at the end of 10‐year simulations. We measured both representation of proteas in protected areas (proportion of a species distribution protected) (A, C) and retention of proteas in the landscape (proportion of a species distribution remaining in the landscape) (B, D) as lower quartiles of values across all proteas. Panels A and B show the effect of using a habitat map in combination with survey data. Panels C and D compare the results with and without the existing protected areas incorporated into the analysis. Currency is U.S. dollars.

Discussion

We found that a relatively small investment in protea survey data was sufficient to design an effective protected area network for conserving proteas. There were strongly diminishing returns from investment in survey data for proteas. An investment of around US$100,000 (∼ZAR637,000) led to almost the same effectiveness of protected areas as an investment of around US$2.5 million (∼ZAR16 million), despite the larger investment substantially increasing knowledge of the distributions of proteas. There are at least two plausible reasons why there were strongly diminishing returns. First, with relatively small initial investments in survey data, we found strong correlations between the richness of proteas in planning units from partial databases and richness in the complete database (Figure 6).
This indicates that, with only relatively small investments in survey data, protea‐rich planning units are likely to be detected and likely to contribute to efficient and effective selections of protected areas. Second and related to this, we also found that small initial investments in survey data resulted in most proteas being detected during surveying (Figure 3). However, the number of proteas with more than 20 occurrence records only gradually increased with investment in survey. The main benefit of additional survey above US$100,000 (∼ZAR637,000) was not to add taxa but to refine the distributions.

Figure 6. Spearman's coefficient of rank correlation of the richness of proteas in each planning unit between different initial survey investments and the entire data set used for evaluation. All correlations had low type 1 error probabilities (P < 0.001). Note that the investments here range from $6,085 to $2,190,600. Currency is U.S. dollars.

Several studies have previously used subsets of data to test the influence of the amounts of data on conservation planning (Freitag & van Jaarsveld 1998; Gaston & Rodrigues 2003; Gladstone & Davis 2003; Grand 2007). Using the same data set as ours (the Protea Atlas), Grand (2007) tested the effects of different levels of sampling effort and bias on the selection of protected areas. They found that fewer data generally resulted in larger areas being required to represent at least one record of each protea in notional protected areas. Our results suggest that, while less comprehensive data might reduce the overall efficiency of a conservation network, investing in additional data might not be the most cost‐effective approach to conservation when implementation is gradual and accompanied by ongoing habitat loss. This combination of circumstances is typical of conservation planning in most parts of the world (Pressey et al. 2004).
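The richness comparison summarized in Figure 6 relies on Spearman's rank correlation between the richness of each planning unit in a partial data set and in the complete data set. The sketch below shows one way to compute that coefficient with average ranks for ties. It uses only the Python standard library, the function names and example values are hypothetical, and it is not the authors' analysis code.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical richness per planning unit from a partial and a complete survey.
partial_richness = [0, 1, 3, 2]
complete_richness = [1, 2, 5, 6]
rho = spearman(partial_richness, complete_richness)
```

A coefficient close to 1 would mean that the partial data rank planning units by richness in nearly the same order as the complete data, which is the property the first explanation for diminishing returns depends on.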
The sensitivity analyses emphasize the importance of considering land‐use dynamics in conservation planning. Increasing the protection rate increased the representation of proteas, and this was clearly due to more area being protected. An increase in the protection rate also increased retention because more occurrences of proteas were protected from habitat loss. While increasing the habitat loss rate did not influence representation, it marginally affected retention. This was likely because of increased habitat loss outside protected areas. If planners were starting a conservation plan in a region where there was little protection from habitat loss and few biodiversity data, our results indicate that there are initially large returns on further data investment. However, similar to when there are existing protected areas, investment in survey data involves strongly diminishing returns. Surprisingly, despite the minimum loss algorithm accounting for habitat loss, there was little difference between its results and those of the maximum gain algorithm. Two likely reasons are the relatively short period of the simulation and moderately low rates of habitat loss. The poor performance of the habitat map as a surrogate for proteas was probably due to it being developed using broad‐scale environmental variables that did not distinguish the fine‐scale variation in habitat, local endemism, and rapid spatial turnover typical of Proteaceae in this biome (McDonald 1996). Considerable research has been directed at testing the effectiveness of biodiversity surrogates for conservation planning (Rodrigues & Brooks 2007). There has been extensive debate on appropriate methods of testing surrogates (Brooks 2004; Rodrigues & Brooks 2007). Surrogacy testing indicates whether further data investment is necessary to overcome limitations of using surrogates. A limitation of previous methods is that they provide only broad advice on data investment.
For example, using a similar habitat map, test data (the Protea Atlas), and study area, Lombard et al. (2003) found that the habitat map was a better surrogate for more common proteas but worse for rarer ones. These results provide general guidance on data investment. Using our analytical framework and similar data, we were able to address more direct investment‐related questions, with more specific lessons for future conservation practice. If likely future landscape scenarios can be developed, this type of retrospective analysis can be applied at any time during data collection to determine if the effectiveness of decisions is improving or has plateaued. Are our results general? While we have shown that a tiny fraction of the available survey data yields a relatively good outcome for conservation, this conclusion might not be easily generalized. One set of factors concerns aspects of the distribution of proteas in our study area, which might be different for other taxa and other regions. Another set of factors concerns our land‐use simulations. It was necessary to apply relatively simple simulations of land‐use change, considering parameters that we expected would influence the results. Despite these simplifications, the simulations provided a framework for testing alternative approaches to conservation in a more dynamic and realistic manner than in previous work that has applied complementarity‐based reserve selection algorithms without considering the interplay of conservation action and habitat loss. There were several assumptions within our model that should be noted. First, it is unlikely that implementation would be possible in all the areas selected, because of factors such as unwilling stakeholders and because our planning units did not delineate realistic selection units, such as cadastres (Knight 2006b). Second, we assumed that implementation costs were uniform across the study area, while in reality, these would have likely been highly variable (Naidoo 2006).
Third, the habitat loss rate is likely to be somewhat inaccurate for our study area, but improving upon this information is a complex task (Lambin et al. 2001) beyond the scope of this study. Fourth, we assumed that past protection rates would be similar to future rates. To evaluate the sensitivity of the results to the protection rate, however, we halved and doubled the rates and reevaluated the model outcomes. Fifth, we only compared the return on investment in one type of data (protea survey data) and one type of predictive model (Maxent). Last, we assumed that all required survey data could be collected at the outset of a conservation planning exercise. In reality, collection and interpretation of survey data can be time‐consuming. How resources should be split between data collection, including different types of data, and the implementation of conservation action is an area requiring further research, given the important practical implications of the lessons that are learnt. We have shown that relatively small and incomplete data sets can be effective at identifying where to act. While biodiversity data will remain a key source of information in making conservation decisions, there is increasing evidence that a lack of biodiversity data may not be the main limitation in the development of effective conservation plans. An organization's limited ability to engage with stakeholders and to understand the constraints on and opportunities for its actions, particularly in production landscapes, is a major impediment (Knight et al. 2006a). This emphasizes the importance of investment in other types of data, including social, human, and economic characteristics of regions, to help reduce the research‐implementation gap (Knight et al. 2008).

Editor: Belinda Reyers

Acknowledgments

We are grateful to the volunteers who collected the data, without whom the project would have cost ∼US$3 million (∼ZAR19 million); instead, the Protea Atlas Project cost ∼US$0.35 million (∼ZAR2.2 million).
We thank M. Watts for his help with data processing, and M. Lombard, B. Reyers, and the South African National Biodiversity Institute for data. Hedley S. Grantham was supported by the University of Queensland and an Environmental Futures Network (Australian Research Council) travel grant. Atte Moilanen was supported by the Academy of Finland, project 1206883. We thank B. Reyers, A. Knight, and an anonymous reviewer for their comments.