Comments 0

Document transcript

Comparing simple respiration models for eddy ﬂux anddynamic chamber dataAndrew D.Richardsona,*,Bobby H.Braswella,David Y.Hollingerb,Prabir Burmanc,Eric A.Davidsond,Robert S.Evansb,Lawrence B.Flanagane,J.William Mungerf,Kathleen Savaged,Shawn P.Urbanskig,Steven C.WofsyfaUniversity of New Hampshire,Complex Systems Research Center,Morse Hall,39 College Road,Durham,NH 03824,USAbUSDA Forest Service,Northern Research Station,271 Mast Road,Durham,NH 03824,USAcUC Davis,Department of Statistics,One Shields Avenue,Davis,CA 95616,USAdWoods Hole Research Center,P.O.Box 296,Woods Hole,MA 02543,USAeUniversity of Lethbridge,Department of Biological Sciences,4401 University Drive,Lethbridge,Alberta,Canada T1K 3M4fHarvard University,Division of Engineering and Applied Science,Department of Earth and Planetary Science,Cambridge,MA 02138,USAgUSDA Forest Service,RMRS-Fire Sciences Lab,P.O.Box 8089,Missoula,MT 59808,USAReceived 30 June 2006;received in revised form 2 October 2006;accepted 20 October 2006AbstractSelection of an appropriate model for respiration (R) is important for accurate gap-ﬁlling of CO2ﬂux data,and for partitioningmeasurements of net ecosystem exchange (NEE) to respiration and gross ecosystem exchange (GEE).Using cross-validationmethods and a version of Akaike’s Information Criterion (AIC),we evaluate a wide range of simple respiration models with theobjective of quantifying the implications of selecting a particular model.We ﬁt the models to eddy covariance measurements ofwhole-ecosystemrespiration (Reco) fromthree different ecosystemtypes (a coniferous forest,a deciduous forest,and a grassland),as well as soil respiration data from one of these sites.The well-known Q10model,whether driven by air or soil temperature,performed poorly compared to other models,as did the Lloyd and Taylor model when used with two of the parameters constrainedto previously published values and only the scale parameter being ﬁt.The continued use of these models is discouraged.However,avariant of the Q10model,in which the temperature sensitivity of respiration varied seasonally,performed reasonably well,as did theunconstrained three-parameter Lloyd and Taylor model.Highly parameterized neural network models,using additional covariates,generally provided the best ﬁts to the data,but appeared not to performwell when making predictions outside the domain used forparameterization,and should thus be avoided when large gaps must be ﬁlled.For each data set,the annual sum of modeledrespiration (annual SR) was positively correlated with model goodness-of-ﬁt,implying that poor model selection may inject asystematic bias into gap-ﬁlled estimates of annual SR.#2006 Elsevier B.V.All rights reserved.Keywords:Absolute deviations regression;Akaike’s information criterion (AIC);AmeriFlux;Cross-validation;Eddy covariance;Maximumlikelihood;Model selection criteria;Respiration;Uncertainty1.IntroductionModels are used in conjunction with measurementsof surface-atmosphere CO2ﬂuxes ðFCO2Þ for a varietyof reasons.These include:(1) ﬁlling gaps in the eddywww.elsevier.com/locate/agrformetAgricultural and Forest Meteorology 141 (2006) 219–234* Corresponding author at:USDA Forest Service,271 Mast Road,Durham,NH 03824,USA.Tel.:+1 603 868 7654;fax:+1 603 868 7604.E-mail address:andrew.richardson@unh.edu (A.D.Richardson).0168-1923/$ – see front matter#2006 Elsevier B.V.All rights reserved.doi:10.1016/j.agrformet.2006.10.010covariance ﬂux record (Falge et al.,2001);(2)estimating annual sums of the components of the netﬂux,such as total ecosystem respiration (Reco) or grossecosystem exchange (GEE) (Reichstein et al.,2005;Richardson and Hollinger,2005;Hagen et al.,2006);(3) extracting physiological parameters from the ﬂuxdata (Van Wijk and Bouten,2002;Braswell et al.,2005;Hollinger and Richardson,2005).Typically,thesemodels are relatively simple functions of only a fewindependent variables and several parameters.Here weevaluate a range of simple respiration models usingobjective model selection criteria,and we investigatethe implications of selecting a particular model.The nocturnal ﬂux measured above the canopy byeddy covariance is generally assumed to represent Reco,and thus includes soil respiration (both heterotrophicrespiration and root respiration) as well as varioussources of above ground respiration (e.g.,leaf,branchand stemrespiration) (Davidson et al.,2006b).Recoandits components have been modeled using a variety ofapproaches (e.g.,Lloyd and Taylor,1994;see alsoMorgenstern et al.,2004).In simple respiration models,the respiratory ﬂux generally scales as a function oftemperature (although the functional form of thisrelationship varies among models),representing thedominant role of reaction kinetics,possibly modulatedby secondary environmental factors,such as soil watercontent.Most of these models lack a strict theoreticalbasis (cf.the Farquhar et al.,1980,photosynthesismodel).This can be attributed to the fact that we stillhave a very poor mechanistic understanding of therelationships between environmental factors and Reco,and between carbon allocation and substrate availabilityfor respiration (Davidson et al.,2006a).Acomplicatingfactor is that soil and ecosystem respiration representthe aggregate respiratory ﬂux from a diverse (andchanging) array of organisms,each of which may besubject to somewhat different environmental conditionsor limiting factors.Previous studies have compared a number ofrespiration models using data from individual sites(Janssens et al.,2003;Del Grosso et al.,2005;Richardson and Hollinger,2005) or even multiple sites(Falge et al.,2001),but to date no single synthesis hascompared a wide range of simple models acrossdifferent ecosystemtypes and measurement techniques,as we do here.Our objective is to determine whichmodels give the best ﬁt,and to assess the effects (interms of model predictions) of choosing a particularmodel.Our emphasis is on model selection rather thanhypothesis testing.Cross-validation and an informationtheoretic criterion are used for objective modelselection,and models are ranked accordingly.We usethe modeled annual sum of respiration (‘‘annual SR’’)as a quantitative,but subjective,means by which toevaluate differences in model predictions (e.g.,Hagenet al.,2006),since annual sums of ﬂuxes are ofparticular interest to the community.We investigatewhether rankings for models that simulate Recoareconsistent across three different ecosystems:coniferousforest,deciduous forest,and grassland,using nocturnaldata fromthe Howland,Harvard Forest,and LethbridgeAmeriFlux sites.In addition to data fromthe main eddycovariance tower at Howland (‘‘Howland-Main’’),wealso use data from a below-canopy eddy covariancesystem (‘‘Howland-Subcanopy’’) and an array ofautomated soil respiration chambers (‘‘Howland-Auto-chamber’’) at this site to investigate whether modelrankings for Rsoilare similar to those for Reco.2.Models,data and methods2.1.Respiration modelsThe models we evaluate were selected from theliterature and are listed in Table 1 (note that although theparameters are denoted u1,u2,...,unfor each model,theoptimal parameter values differ among models).Thesemodels are all simple,in that they contain (at most) asingle static carbon pool,have no feedbacks,and aredriven by bulk measurements of the overall ecosystemstate.For example,soil temperature is typically used asa driving variable,although it may not accurately reﬂectthe thermal state of various respiring components withinthe system (e.g.,canopy temperature versus littertemperature versus O- and A-horizon temperature;Hollinger et al.,1994;Van Dijk and Dolman,2004;Reichstein et al.,2005).Respiration is controlled by both biological andphysical factors.Work by Arrhenius and van’t Hoff inthe late-19th century on the temperature dependence ofchemical reactions gave rise to notions of a relationshipbetween temperature and respiration (see review byLloyd and Taylor,1994).Either a linear (model A inTable 1) or higher-order polynomial (model B) modelwould sufﬁce as a simple,if naı¨ve,representation of thisrelationship (at least over a limited range),but theArrhenius equation (model C) more accuratelydescribes many chemical systems.van’t Hoff’s Q10model (model D),which gives an exponential relation-ship between respiration and temperature,has beenwidely used in many branches of biology.However,itassumes ﬁxed temperature sensitivity,and predicts thatrespiration increases at a steady relative rate,andA.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234220without limit,as temperature increases.This character-istic is a common criticism of the Q10model (e.g.,Tjoelker et al.,2001).Bycomparison,boththe LloydandTaylor model (model E) and the logistic model (model F)have a sigmoid shape,which may be more appropriate athigher temperatures,when respiration may be sup-pressed.Simulations (Richardson and Hollinger,2005)suggest that the Lloyd and Taylor model is over-parameterized because different combinations of para-meters can produce versions of the model that ﬁt the dataequally well given inherent carbon ﬂux measurementuncertainties.To avoid this equiﬁnality (e.g.,Schulzet al.,2001),we also use a one-parameter restrictedversion of the model (model Er),with the other twoparameters constrained to the values reported by Lloydand Taylor (1994) for their soil respiration data set.Similar to Kirschbaum (1995),we constructed twovariants of the traditional Q10model.In our temperature-varying Q10model (Model D1,‘‘Q10v-temp’’),the singleparameter controlling the temperature sensitivity ofrespiration is instead speciﬁed as a linear function oftemperatureitself.This allows thetemperaturesensitivityto decrease with increasing temperature (Janssens andPilegaard,2003),as inthe LloydandTaylor model.Inthetime-varying Q10model (Model D2,‘‘Q10v-time’’),thetemperature sensitivity of respiration is speciﬁed as aﬁrst-order Fourier function of the Julian day,and is thusconstrainedtofollowa seasonal course.We are not awareof models like these being applied previously to eddycovariance data (Kirschbaum’s analysis was based on aliterature survey of laboratory incubations),but our ownpreliminary results indicated potentially signiﬁcantimprovements over the original Q10model.A third variant of the Q10model is the Gresp model(model D3),in which the temperature–respirationrelationship is modulated by soil water content.Soilwater content may more tightly control respiration insome ecosystems than others.For example,soilrespiration at the coniferous Howland Forest has beenfound to be less sensitive to soil moisture than that at thedeciduous Harvard Forest (Savage and Davidson,2001).We use soil temperature to drive the abovetemperature–respiration relationships.This decision isbased on the fact that Rsoilaccounts for a largeproportion of Recoin forested ecosystems (Davidsonet al.,2006b).For the Q10model,however,we includeA.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234 221Table 1Respiration models used in the present analysisModel Formula Reference[A] Linear u1+ u2T Wofsy et al.(1993)[B] Polynomial u1+ u2T + u3T2+ u4T3+ u5T4+ u6T5[C] Arrheniusu1expu2283:15R

u4SWCþu4Carlyle and Ba Than (1988)[E] Lloyd and Tayloru1expu2Tþ273:15u3

Lloyd and Taylor (1994)[Er] L&T-Rest:restricted form of L&Tu1exp308:56Tþ46:02

[F] Logisticu11þexpðu2u3TÞBarr et al.(2002)[G] Fourier u1þu2sinðJDpÞ þu3cosðJDpÞ þðhigher order termsÞ Hollinger et al.(2004)[G1] Fourier-1:ﬁrst-order Fourier regression[G2] Fourier-2:second-order Fourier regression[G4] Fourier-4:fourth-order Fourier regression[H] Neural networks Hagen et al.(2006)[H1] NN-S f(Tsoil)[H2] NN-A f(Tair)[H3] NN-SA f(Tsoil,Tair)[H4] NN-SAJ f(Tsoil,Tair,JDp)[H5] NN-SAJW f(Tsoil,Tair,JDp,SWC)Note:Model parameters,u1,...,un,differ among models.‘‘T’’ refers to a measured temperature variable,either Tairor Tsoil.Trefis a ﬁxed referencetemperature,usually 10 8C.JDpis the Julian day expressed in radians (=2p JD/366).SWCis soil water content,in %by volume.Codes in squarebrackets are those used to identify plotted points in Figs.1 and 2.both a version driven by soil temperature (model DS,‘‘Q10-S’’) and by air temperature (model DA,‘‘Q10-A’’).The Fourier model (model G) is not based on apresumed temperature–respiration relationship.Rather,it is simply a representation of the inherent seasonalityof temperature and respiration.The Fourier model isthus appealing because it does not require data forenvironmental drivers (but for this same reason it maynot be very useful for making predictions into thefuture).We evaluate ﬁrst-,second-,and fourth-orderFourier models (‘‘Fourier-1’’,etc.);higher-order mod-els can capture more complex seasonal patterns thanlower-order models,but may be subject to over-ﬁttingand poor performance outside of the domain used forparameterization.Note that for some models,a number of different,butequivalent,formulations are possible.For example,thestandard Q10model formulation (model D) is function-ally identical to both the Type I exponential,u1exp(u2T),and the Type II exponential,exp(u3u4T).The Lloydand Taylor model listed in Table 1 (model E) has analternate formulation expressed in terms of the respira-tionrateat 10 8C(seeEq.(11) inLloydandTaylor,1994).The basic u1sin(x) + u2cos(x) structure of the Fouriermodel (model G) can,of course,also be written asu3sin(x + u4).Results obtained using these alternativeexpressions will not differ,except to the extent thatiterative numerical algorithms for non-linear optimiza-tion may proceed towards convergence more or lessslowly for one formulation than another,depending oninitial parameter guesses and optimization algorithm.In a different class of models from those justdescribed are the neural networks (models H) (Bishop,1996).Neural network regression models have beenapplied previously to eddy covariance data and aredescribed elsewhere (e.g.,Van Wijk and Bouten,1999;Papale and Valentini,2003;Hagen et al.,2006);they arevery ﬂexible and are thus well-suited to non-linearproblems,especially when there is no need to impose aparticular functional formon the respiration response toa particular independent variable.We used ﬁve differentneural network models:‘‘NN-S’’ (soil temperature),‘‘NN-A’’ (air temperature),‘‘NN-SA’’ (soil temperatureand air temperature),‘‘NN-SAJ’’ (NN-SAplus sine andcosine of Julian day),and ‘‘NN-SAJW’’ (NN-SAJ plussoil water content).2.2.Data sources2.2.1.Tower-based eddy covariance measurementsAs our primary data source,we used nocturnal above-canopy CO2ﬂux data from eddy covariance towers inthree contrasting ecosystems.The Howland tower(1997–2004 data;45.258N,68.738W,elev.60 m ASL)is located in a boreal-northern hardwood transition forestabout 50 km north of Bangor,ME,USA.The Harvardtower (1992–2003 data;42.538N,72.178W,elev.340 mASL) is located in a mixed temperate forest,about110 kmwest of Boston,MA,USA.The Lethbridge tower(1999–2004 data;49.438N,112.568W,elev.951 mASL)is located in a mixed grassland,just west of the city limitsof Lethbridge,Alberta,Canada.Quality control,ﬂux corrections,and data editingwere performed by the individual site investigators.Forconsistency across all three tower sites,a standardu*= 0.25 threshold was applied during nocturnal(PPFD <5 mmol m2s1) periods;this thresholdvalue is consistent with cutoff values that havepreviously been applied at these sites.Site-speciﬁcprocedures are described elsewhere (Howland:Hollin-ger et al.,1999,2004;Harvard:Barford et al.,2001;Lethbridge:Flanagan et al.,2002;Flanagan andJohnson,2005).For all data sets,the sign conventionused is that carbon ﬂux into the ecosystemis deﬁned asnegative,i.e.,respiration is a positive ﬂux.2.2.2.Below-canopy eddy covariancemeasurementsAbelow-canopyeddycovariance system,mountedona 3 m tripod in close proximity to the Howland maintower,was operated from 1996 to 2001.The understoryin the vicinity of the tripod is very sparse,and the canopylight transmittanceverylow,andthus themeasuredﬂuxesare dominated by soil respiration (Hollinger et al.,1999).We used both daytime and nighttime data in the presentanalysis.We implemented a more relaxed u*thresholdfor this systemtoensure anadequate sample size for eachyear.With u*= 0.10,there were 4–5 times as manyobservations (mean 1 S.D.= 2400 1000 observa-observations/year) as with u*= 0.20.However,modelrankings were almost identical regardless of whether u*was set to 0.10,0.15,or 0.20.Modeled annual SRdiffered somewhat depending on the u*threshold used,but the effect was not consistent across years.We notethat in this analysis below-canopy data have not beencorrected for high and low frequency losses due tomeasurement system inadequacies.2.2.3.Automated soil respiration chambermeasurementsSix automated soil respiration chambers (see Savageand Davidson,2003,for details on design andinstrumentation) were installed at Howland Forest,inclose proximity to the below-canopy ﬂux system,inA.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234222mid-May 2004 (JD 135).The chambers were sampledsequentially for 5 min each,so that each chamber wassampled every 30 min.Data from chamber 5 are notincluded in the present analysis,due to a broken pistonthat occurred half way through the growing season (JD236).Measurements of the remaining ﬁve chamberscontinued until early November (JD 310).Because the autochamber data cover only half theyear,and we have no wintertime data to constrain themodels,we do not present annual SR estimates for thisdata set.2.3.Maximum likelihood model ﬁttingFor evaluating models and estimating parametersbased on data,a maximum likelihood approach shouldbe used.In the maximum likelihood paradigm,theestimated model parameters are those that would bemost likely to generate the observed data,given themodel and what is known about the measurementuncertainty (Press et al.,1993).The correct use of thisapproach thus requires information about measurementerror in the data.Ordinary least squares optimization yields maximumlikelihood parameter estimates when the assumptions ofnormality and constant variance (homoscedasticity) ofthe error are met.However,recent work indicates thatthe eddy covariance ﬂux measurement error,d,follows adistribution that is better approximated by a double-exponential (Laplace) distribution than a normal(Gaussian) distribution;the variance of the measure-ment error (s2(d)) also generally scales as a function ofthe magnitude of the ﬂux (Hollinger and Richardson,2005;Richardson and Hollinger,2005;Richardsonet al.,2006).According to Press et al.(1993),whenerrors are distributed as a double-exponential with non-constant variance,maximum likelihood parameterestimates are obtained by minimizing the ﬁgure ofmerit,V,equal to the weighted sum of the absolutedeviations between measured (yi) and modeled (ypred)values,i.e.,the cost function in Eq.(1),V ¼Xni¼1jyiypredjsðdiÞ(1)The weighting factor,1/s(di),is the reciprocal of theestimated standard deviation of the measurement errorfor each observation i.As noted by Raupach et al.(2005),this weighting reﬂects our conﬁdence in thedata:observations with lows(di) receive more weight inthe optimization than observations with high s(di),because we have more conﬁdence in observations withsmaller measurement uncertainty.Previous research(Richardson et al.,2006) indicates that s(di) scales withthe magnitude of the ﬂux for FCO20 (i.e.,respiratoryefﬂux from the system),as given in Eqs.(2a and 2b):forested sites:sðdiÞ ¼ 0:62 þ0:63 FCO2(2a)grassland sites:sðdiÞ ¼ 0:38 þ0:30 FCO2(2b)Similar analyses for both the subcanopy ECdata and theautochamber data indicate that the measurement errorfor these time series is also better approximated by adouble-exponential,rather than a normal,distribution.Likewise,s(di) again scales with the magnitude of theﬂux (Eqs.(2c and 2d)):understory EC:sðdiÞ ¼ 0:06 þ1:51 FCO2(2c)autochambers:sðdiÞ ¼ 0:15 þ0:28 FCO2(2d)Note that the scaling of measurement uncertainty withﬂux magnitude depends upon both site characteristicsand methodology.For the purposes of Eqs.(2a–2d),weuse an estimated ﬂux,ˆFCO2,rather than the measuredﬂux FCO2,to estimate the measurement uncertainty.This yields an uncertainty estimate that is statisticallyindependent of the actual error in the measured ﬂux:ifthe measured ﬂux was used for Eqs.(2a–2d),thenobservations with negative measurement errors wouldreceive more weight (because estimated s(di) is lower),and observations with positive measurement errorswould receive less weight,in the optimization scheme.The estimated ﬂux was determined by ﬁtting a second-order Fourier regression to the data using unweightedabsolute deviations optimization:preliminary analysesindicated that this model captured the inherent season-ality of respiration in these temperate systems relativelywell.Most statistical analyses were conducted in SAS 9.1(SAS Institute,Cary,NC,USA) using weighted non-linear regression.Parameters were optimized throughminimization of V (Eq.(1)) using either the Gauss-Newton or the Marquardt method with automaticcomputation of analytical ﬁrst- and second-orderderivatives.Results obtained using these algorithmswere foundtobeessentiallythesame as thosedeterminedusing a simulated annealing algorithm(Metropolis et al.,1953).The neural network model was ﬁt using theBayesian-regularized neural network algorithm ofMacKay (1994) coded in MatLab (The MathWorks,Natick,MA).Separate model parameters were ﬁt foreach year of data,except for Howland-Autochamberdata,where we ﬁt a separate set of model parameters foreach chamber.A.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234 2232.4.Model evaluation2.4.1.k-Fold cross-validationIdeally,models should be validated against a whollyindependent data set.Often,this is impossible;in thecase of statistical modeling,it is rarely done.Moretypically,models are inadequately ‘‘evaluated’’ bycomparing ﬁtted model values,ypred,directly against theoriginal yi.An example of this type of within-samplevalidation would be evaluating models on the basis ofthe mean squared error of the model residual.Analternative out-of-sample approach,cross-validation,ispreferable,because model predictions are not validatedagainst the same data that were used to establish modelparameters.We used a standard k-fold cross-validationapproach (Hastie et al.,2001),in which the data aredivided into k subsets,the model is ﬁt using (k-1)subsets as the training set,and then validation isconducted using the one (kth) omitted subset.Theprocedure is then repeated k times,with a new set ofmodel parameters ﬁt at each step.In this way,every datapoint is included in the validation set exactly once.Models are then evaluated using the maximumlikelihood ﬁgure of merit,V (Eq.(1)),calculatedacross the k validation sets.As noted above,the sorts of models evaluated here areoften used to ﬁll nighttime gaps in the ﬂux data record.Invalidating the model predictions,we wanted to test theability of each model to ﬁll both small and large datagaps.We performed twodifferent cross-validations,eachwith k = 12 (a value of 10 k 20 is typically used).Inthe ﬁrst,each data point was randomly assigned to one ofthe subsets;we refer to this as the ‘‘Random-12’’ (Rand-12) cross-validation.Consequently,this approach teststhe stability of a model with small gaps distributedrandomlythroughout theyear.Inthe second,the year wasdivided into 12 ‘‘months’’ of equal length,and eachmonth was speciﬁed as one of the subsets (Month-12).This approach assesses a model’s performance with largegaps in the ﬂux record,which may involve makingextensive predictions beyond the domain of the data usedfor ﬁtting.Note that an assumption underlying k-fold cross-validation is that the training and validation data sets areindependent and identically distributed.We acknowl-edge that because of the time-series nature of ﬂux data,this assumption may not be entirely met.However,inthe case of the Month-12 validation,the number ofobservations in each subset is probably much larger thanthe effective ‘‘memory’’ of any autocorrelation,mean-ing that for practical purposes the non-independencecan be ignored.2.4.2.Information theoretic approachAkaike (1973) developed a method to choose amongalternative models evaluated via the maximum like-lihood approach.Akaike’s Information Criterion (AIC,Eq.(3)) was designed for model selection when data arenormally distributed with constant variance (Burmanand Nolan,1995;Anderson et al.,2000),AIC ¼ n logðˆs2Þ þ2ð p þ1Þ (3)where n is the number of observations,ˆs2equals theestimated within-sample residual sum of squaresdivided by n,and p is the number of parameters inthe model.The model with the lowest AICis consideredthe best model,given the data at hand,but it is importantto note that AICis not in any sense a formal or statisticalhypothesis test (Anderson et al.,2000).Since n is ﬁxedfor any given data set,AIC essentially balances bettermodel explanatory power (smallerˆs2) against increas-ing complexity (larger p);in this way,AIC selectsagainst models with an excessive number of parameters.Anumber of alternative criteria have been developed(e.g.,Hurvich and Tsai,1990;Burman and Nolan,1995) for cases when OLS assumptions are violated,and models are ﬁt by some formof robust regression.Inthe present analysis,we use the following form(Eq.(4)),which is appropriate when models are ﬁtusing weighted absolute deviations optimization (afterHurvich and Tsai,1990):AICWAD¼ 2n logðVÞ þ2ð p þ1Þ (4)where V is the within-sample ﬁgure of merit fromEq.(1),and n,p are as in Eq.(3).For a neural network model,p equals the number ofnonzero ﬁtted weight parameters (Nweights),whichdepends on the number of hidden nodes (Nhidden),thenumber of input variables (Ninputs),and the number ofoutput variables (Noutputs),as in is Eq.(5):Nweights¼ NhiddenðNinputsþ1Þ þðNoutputsÞðNhiddenþ1Þ (5)2.5.Secondary statistical analysesOptimization was based on minimizing the ﬁgure ofmerit,V,given in Eq.(1),for each model.Then,foreach data set and each year of data,we calculated awithin-sample null model (R ¼¯R,where¯R is a constantfor all yi) value for the ﬁgure of merit,V0.This was usedto scale V and generate,for the purpose of comparingacross data sets,a relative ﬁgure of merit statistic,r0,A.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234224that takes on a value between 0 and 1 (Eq.(6)):r0¼1 VV0(6)Because it is a measure of howmuch of the scatter in theoriginal data is explained by the model,this statisticcould be considered the absolute deviations analogue tothe multiple correlation coefﬁcient,r2,commonlyemployed in least-squares regression analyses of Gaus-sian data.Model ﬁt statistics (the r0statistic for Rand-12 andMonth-12 cross-validation,as well as AICWAD) and themodeled annual sumof respiration (‘‘annual SR’’) wereanalyzedseparatelyfor each site,using two-way analysisof variance with ‘‘model’’ and ‘‘year’’ as categoricalvariables.Visual inspection of ANOVA residualssuggested that the assumptions of normally distributedresiduals and homogeneity of residual variance wererelatively well met,except for annual SR.We thereforeconducted a non-parametric ANOVA on rank-trans-formed annual SR data;for comparative purposes,bothanalyses are reported in Table 4.Differences amongmodels were assessed using a Bonferroni multiplecomparison test to control the overall Type I error rate(a = 0.05).This approachis oftenconsideredexcessivelyconservative,as it increases the probability of a Type IIerror (failure to reject the null hypothesis).However,itshould be kept in mind that our objective is to identifybroad groups of models that perform more or lesssimilarly for a given data set (i.e.,‘‘models ‘A’,‘B’,and‘C’ are roughly comparable’’),rather than explicithypothesis testing (i.e.,‘‘at 95% conﬁdence,model ‘A’is signiﬁcantly better than model ‘B’’’).Overall model rankings for each ﬁt criterion werecompared,both for individual sites (‘‘how similar arethe model rankings by AICWADand Rand-12 r0atTower A?’’) and across multiple sites (‘‘howsimilar arerankings by AICWADat Towers A,B,and C?’’),usingKendall’s W coefﬁcient of concordance (Sokal andRohlf,1995).3.Results3.1.Overall patterns across sitesRegardless of the model used,it is possible to explainmore of the variation in measured nocturnal FCO2forsome data sets than others (Table 2).Across models,Rand-12 r0values range from 0.17 to 0.23 at Harvard,compared to 0.25–0.38 at Howland-Main,and 0.15–0.51 at Lethbridge.Thus,the model with the lowest r0atHowland-Main explains more variation than the modelwith the highest r0at Harvard.At Lethbridge,thedifference between the best-ﬁtting models and theworst-ﬁtting models is larger than at either of the othertwo tower sites.Similarly,the models explain far moreof the variation in the Howland-Autochamber data(r0range 0.23–0.54) than the Howland-Subcanopy data(0.15–0.21).The unexplained variation in measurednocturnal FCO2can be attributed to both the measure-ment system (e.g.,random measurement uncertaintyplus,in the case of the eddy covariance data,temporalvariation in the ﬂux footprint,and systematic butunknown biases associated with nocturnal measure-ments,possibly related to advection and storage;seeAubinet et al.,2001;Richardson et al.,2006;Oren et al.,2006) as well as the shortcomings of the modelsthemselves (e.g.,oversimpliﬁcation/incorrect represen-tation of processes,including omission of importantcovariates),which may not be adequate to describe thecomplex ensemble of processes that together combineto yield Reco.We discuss variation in the explanatorypower of the models later.3.2.Model evaluationThe relative ﬁgure of merit,r0,is always lower forRand-12 cross-validation than the within-sample ﬁts(i.e.,not cross-validated) (results not shown),as wouldbeexpected.On average,this difference is on the order of1%,but for the neural network models the difference iscommonly on the order of 3–5%.This loss of generalitymay be indicative of some degree of over-ﬁtting.Similarly,the relative ﬁgure of merit,r0,is alwayshigher for the Rand-12 than Month-12 cross-validations(Table 2).For Lethbridge,the difference between Rand-12 r0and Month-12 r0is especially pronounced,whereasfor Harvard,this difference is negligible for most models.Although the fourth-order Fourier model ranks well byRand-12r0for most data sets,it consistentlyranks poorlyby Month-12 r0,and thus this model appears particularlyunsuitable for large gaps.Based on analysis of concordance statistics (Table 3),there is strong (and statistically signiﬁcant) agreementbetween the three methods of ranking models for eachtower site.However,for each of the tower data sets,theRand-12 and AICWADrankings are in much betteragreement than either of these with the Month-12rankings.Across sites there is reasonable (andstatistically signiﬁcant) agreement in whole-ecosystemrespiration model rankings based on either the Rand-12or AICWADcriteria but not the Month-12 criteria.Theoverall similarity of model rankings observed acrossthe ﬂux sites is also observed within a forest—i.e.,A.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234 225among Howland-Main,Howland-Subcanopy,andHowland-Autochamber—when models are ranked byAICWAD(Fig.1).Several important and consistent patterns emergefrom Table 2.Perhaps most notable (given thewidespread use of these particular models) is that thebasic Q10model (whether driven by air or soiltemperature),and the restricted form of the Lloyd andTaylor model,tends to appear near the bottom of therankings for all data sets.The three-parameter Lloyd andA.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234226Table 2Respiration model rankings based on three different criteriaRand-12 and Month-12 denote two cross-validation schemes;AICWADis a version of Akaike’s Information Criterion appropriate for weightedabsolute deviations optimization.Models are described in Table 1.Continuous soil moisture data not available for Harvard,and so Q10-Gresp andNN-SAJWmodels not used.Data were analyzed by analysis of variance (ANOVA),with ‘‘Model’’ and ‘‘Year’’ as main effects.Vertical lines denotemodels that are not signiﬁcantly different (95%conﬁdence),based on multiple comparison test.Lethbridge,Harvard,and Howland-Main data reﬂectwhole-ecosystem respiration,whereas Howland-Subcanopy and Howland-Autochamber reﬂect soil respiration.Taylor,logistic,and Q10-vTemp models all have asigmoid temperature response,and are ranked in themiddle of thepackfor all data sets andevaluationmetrics.The modiﬁed Q10models show varying degrees ofimprovement over the basic Q10model.For example,inclusion of a soil moisture term(the Q10-Gresp model)results in greatly improved performance for Lethbridge,but little or no improvement for Howland-Main.Resultsfromthe Q10-vTemp and Q10-vTime models support theidea that the temperature sensitivity of respiration is notﬁxed,but rather decreases with increasing temperatureand is lower in summer than winter.By accounting forvariable temperature sensitivity,these models tend notto exhibit the seasonal biases in model predictions thatfrequently characterize the basic Q10model.There are also clear beneﬁts to using models that gobeyond the basic temperature–respiration relationship:incorporating additional data,such as soil water contentor Julian date,results in a superior model.For example,the more highly parameterized neural network models(NN-SAJ and NN-SAJW) consistently rank at the top ofthe Rand-12 rankings (and,with the exception ofHowland-Subcanopy,the AICWADrankings),althoughsomewhat lower in the Month-12 rankings.The neuralnetwork models are particularly useful for multipleenvironmental drivers because no functional form needbe speciﬁed.3.3.Model predictions3.3.1.Annual sums of respirationThe modeled annual sum of respiration (‘‘annualSR’’) depends on the particular model used,butdifferences among models are not very consistent acrosssites (Table 4;note that annual sums are not calculatedforthe Howland-Autochamber measurements because thissystem was only operated during the growing season).For example,the Q10-A model (which we have alreadyseen is a poor choice,e.g.,Table 2),predicts the highestmean annual SR for Harvard and Howland-Main,butvery low mean annual respiration sums for Lethbridgeand the Howland-Subcanopy.The fourth-order Fouriermodel predicts the highest mean annual SR for all datasets with the exception of Harvard,for which it givesamong the lowest mean annual SR.However,for all datasets,the sigmoidal models (Lloyd and Taylor,logistic,and Q10-vTemp) are ranked near the median while themore highly parameterized neural networks (e.g.,NN-SAJ and NN-SAJW models) give consistently higherpredictions for mean annual SR.There is relatively little difference among models forHarvard (range [maximum–minimum] of mean pre-dicted annual SR = 75 g C m2y1),whereas forHowland-Main (range = 355 g C m2y1) the effectof choosing among different models is more pro-nounced (Table 4).At Lethbridge,the absolute rangeA.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234 227Table 3Concordance of model rankings across sites and methods,based onKendall’s W coefﬁcientComparison WWithin individual sitesHowland,all three rankings 0.95**Harvard,all three rankings 0.89**Lethbridge,all three rankings 0.79**Howland,AICWADvs.Rand-12 0.99**Howland,AICWADvs.Month-12 0.95**Howland,Rand-12 vs.Month-12 0.95**Harvard,AICWADvs.Rand-12 0.98**Harvard,AICWADvs.Month-12 0.88**Harvard,Rand-12 vs.Month-12 0.88**Lethbridge,AICWADvs.Rand-12 0.99**Lethbridge,AICWADvs.Month-12 0.77*Lethbridge,Rand-12 vs.Month-12 0.76*Across all sitesRanking by AICWAD0.67**Ranking by Rand-12 0.68**Ranking by Month-12 0.32 NSSigniﬁcance level of coefﬁcient:**P 0.001;*P 0.01;NS:notsigniﬁcant.Fig.1.Concordance of model rankings (based on AICWAD) ﬁt to three data sets fromthe Howland Forest:Howland-Main,Howland-Subcanopy,andHowland-Autochamber.W is Kendall’s coefﬁcient of concordance.Models are identiﬁed by codes given in Table 1.(129 g C m2y1) is moderate,but the relative magni-tude of this range (40% of the mean annual SR) isvery large (cf.10% at Harvard).3.3.2.Correlations between goodness-of-ﬁt andannual sumsFor each of the four eddy covariance data sets,thereis a signiﬁcant ( p 0.01) positive correlation betweenmean (across all years) within-sample relative ﬁgure ofmerit,r0,and mean annual SR for each data set (Fig.2).Thus,the models that ﬁt better (based on within-sampler0) at a given site tend to predict larger annual sums ofrespiration,compared to models that ﬁt less well.Forthe Lethbridge (r = 0.96) and Howland-Subcanopy(r = 0.84) data sets,this relationship is especiallystrong.For Howland-Main,the correlation is r = 0.65,but the Q10-A model (model DA in Fig.1C) is anobvious exception to the general pattern:exceptingQ10-A,the correlation rises to r = 0.80.3.4.What is the appropriate temperature?Most of the models tested here are based on arelationship between temperature and ecosystemA.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234228Table 4Modeled annual sum of respiration (g C m2y1,mean across all years for a given site) for four eddy covariance data sets and 19 differentrespiration models (model details are contained in Table 1)Continuous soil moisture data not available for Harvard,and so Q10-Gresp and NN-SAJW models not used.Data were analyzed by analysis ofvariance (ANOVA),with ‘‘Model’’ and ‘‘Year’’ as main effects.Vertical lines denote models that are not signiﬁcantly different (95%conﬁdence),based on multiple comparison test.‘‘Annual sum’’ results differ from ‘‘Ranked sum’’ because for ‘‘Ranked sum’’ the ANOVAwas conducted onrank-transformed data (untransformed data appeared not to meet assumptions of normally distributed residuals and homogeneity of variance).Forcomparative purposes,both analyses are presented here,but note that Kendall’s coefﬁcient of concordance indicates very good agreement(W0.97,P 0.001) between the two approaches for all four data sets.respiration (Table 1).It is not clear what the appropriatetemperature metric is;respiration has both above andbelowground components,and to date there is noconsensus on the best measure to use as a ‘‘bulk’’ecosystem temperature (Van Dijk and Dolman,2004;Reichstein et al.,2005).For example,results presentedin Table 2 demonstrate that a Q10model based on soiltemperature ﬁts better than one based on air temperaturefor four of the ﬁve data sets used here (only at Harvard isthis not true;see also Van Dijk and Dolman,2004).Furthermore,the neural network analysis shows thatmore of the variation in ecosystemrespiration at all sitescan be accounted for when both air and soil temperature(NN-SA) are used together than when either is usedalone (i.e.,NN-S or NN-A).We note,however,thatbecause soil temperature varies with soil depth (deeperhorizons exhibit more muted diurnal and seasonalpatterns),model ﬁt and model predictions may varydepending on the depth at which the temperature ismeasured.We used soil temperature proﬁles at How-land-Main (2003–2004 only),Howland-Autochamber,and Lethbridge to investigate these relationships.At Howland-Main,for both 2003 and 2004,thewithin-sample r0steadily decreases (i.e.,model ﬁtbecomes worse) with increasing temperature depth;thedifference between r0for Tsoilat 2 cmand Tsoilat 40 cmwas 15% for the Q10-S model.Similarly,forHowland-Autochamber,the average (across all cham-bers) within-sample r0decreases by over 40%betweenTsoilat 2 cm and Tsoilat 40 cm for the Q10-S model.However,in contrast to these results,for Lethbridge,there is a slight (4%),but consistent,increase in within-sample r0as Tsoilmeasurement depth is increased from2 cm to 16 cm.Just as model ﬁt varies depending on the depth atwhich soil temperature was measured,there are alsoassociated effects on annual SR.For example,at bothHowland-Main (where model ﬁt becomes progres-sively worse as a deeper soil temperature is used) andLethbridge (where model ﬁt becomes progressivelybetter as a deeper soil temperature is used) modeledannual SR decreases monotonically with Tsoilmea-surement depth.At Howland-Main,the difference inmodeled annual SR between Tsoilat 2 cm and TsoilatA.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234 229Fig.2.For each eddy covariance data set,there is a positive correlation between model goodness-of-ﬁt (mean relative ﬁgure of merit,r0;see Eq.(6))and the modeled annual sum of respiration (mean annual SR,g C m2y1).Models are identiﬁed by codes given in Table 1.40 cm is 14%.At Lethbridge,the overall pattern isthe same,but in some years (1999,2000) thedifference in modeled annual SR between Tsoilat2 cm and Tsoilat 16 cm is larger (40%) than otheryears (2003,2004;difference 5%).Note that 1999and 2000 were dry years with low absolute ﬂuxes,whereas 2003 and 2004 had higher soil moisture andhigher absolute ﬂuxes.4.DiscussionMorgenstern et al.(2004) reported that 29 eddycovariance studies published between 1993 and 2001 allused different methods to model nocturnal CO2efﬂux(i.e.,respiration),but they suggested that eddycovariance data were too noisy to permit deﬁnitivestatements about the suitability of different modelingapproaches (see also Falge et al.,2001).We have shownhere that when objective model selection criteria areapplied,and results compared across a range ofecosystem types and data sources,some consistentpatterns emerge.Two of the most widely used models ofecosystemand soil respiration,the basic Q10model andthe ‘‘restricted’’ form of the Lloyd and Taylor model(Eq.(11) in Lloyd and Taylor,1994) do a poor job ofaccounting for observed variation in ecosystemand soilrespiration in comparison with other simple models(Table 2).The drawbacks of the basic Q10model havebeen enumerated previously (Lloyd and Taylor,1994;Janssens and Pilegaard,2003;Davidson et al.,2006a).The ‘‘restricted’’ Lloyd and Taylor model has one freeparameter,a scaling coefﬁcient reﬂecting the baserespiration rate (u1in Table 1,model Er).The twoparameters in the exponential were ﬁt (u2= 308.56,u3= 46.02;these control the temperature response) byLloyd and Taylor (1994) using a soil respiration dataset.We found that freeing the parameters u2and u3substantially improve the ﬁt to the respiration data,aresult previously reported by Falge et al.(2001).However,with all parameters free to vary,the model ispoorly constrained and parameter estimates are highlycorrelated with one another.In a previous study(Richardson and Hollinger,2005),we found that anyone of the three parameters may be ﬁxed with little illeffect,yielding a well-constrained two-parameterequation (see also Reichstein et al.,2005).The analysis here supports the contention that thetemperature sensitivity of respiration declines at highertemperatures (Lloyd and Taylor,1994;Kirschbaum,1995;Tjoelker et al.,2001) and thus a sigmoidal,ratherthan purely exponential (i.e.,basic Q10model) equationis recommended.The Q10-vTemp,Lloyd and Taylor,and logistic models are all essentially comparable three-parameter formulations of a sigmoidal function andprovide better ﬁts to (and predictions of) both whole-ecosystem and soil respiration,compared to the Q10model.Previous studies have noted that the short-termtemperature sensitivity of respiration tends to be lessthan the long-termsensitivity (at least in summer-activeecosystems,Reichstein et al.,2005),because the long-termsensitivity is confounded with seasonal patterns ofphenology and environmental conditions (Drewitt et al.,2002;Janssens and Pilegaard,2003;Curiel Yuste et al.,2004).Some of these authors have suggested that asolution to this problemis to ﬁt the basic Q10model on ashorter time step (e.g.,days or weeks to months;see alsoFalge et al.,2001).One advantage of the Q10-vTimemodel (in which the temperature sensitivity wasconstrained to vary seasonally) compared to thisapproach is that relatively few parameters need to beﬁt:four in total,versus 2t,where t is the number ofseparate model ﬁts during the year (at a minimum,t = 4for a model ﬁt separately to each season).Furthermore,the Q10-vTime model is not subject to problems such asthose reported by Janssens and Pilegaard (2003),whofound that ﬁtting the basic Q10model to shorter (4–7days) time periods could produce erroneous parameterestimates if the temperature range during a given periodwas too small.An additional problemwith ﬁtting Q10ata short time step is that spurious (i.e.,offsetting)seasonal variation in the base rate and temperaturesensitivity parameters may result as an artifact simplybecause of the multiplicative nature of the modelstructure and the resulting inherent correlation ofparameter estimates.This is why we did not consider amodel where both parameters could be simultaneouslytime varying.However,an alternative approach to theone we followed here would be to allowthe base rate tobe time varying but have the temperature sensitivityﬁxed over the course of the year.Our Q10-vTime model ranks near the top (in terms ofAICWAD) for all data sets.With this approach,seasonalbiases (e.g.,at Howland-Main:under-estimation of Recoin spring and fall and over-estimation of Recoduringsummer) associated with the standard Q10model aremore or less eliminated.The temperature sensitivity ishighest in the winter,and lowest in the summer,aswould be expected (Janssens and Pilegaard,2003;butcf.Van Dijk and Dolman,2004,who concluded that itwas the base rate of respiration,rather than thetemperature sensitivity,that varied seasonally).Whilethis is in principle similar to results fromthe Q10-vTempmodel (i.e.,temperature sensitivity decreases withA.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234230increasing soil temperature),it is interesting to note thatthe temperature sensitivity in the Q10-vTime modelexhibits seasonal hysteresis in its relationship with soiltemperature (Fig.3),which suggests that the linearrelationship between temperature sensitivity and soiltemperature that is implicit in the Q10-vTemp modelmay be an over-simpliﬁcation (note that by AICWAD,Q10-vTime is always ranked ahead of Q10-vTemp;seeTable 2).The hysteresis may be a manifestation ofdifferences between Rsoiland Recoin the seasonalpatterns of temperature sensitivity (Davidson et al.,2006a).For example,based on results from Howland-Main and Howland-Subcanopy,it appears that theamplitude of seasonal variation in the temperaturesensitivity is larger for Recothan Rsoil,and the point ofminimum temperature sensitivity is reached about 1month later for Recothan Rsoil.Morgenstern et al.(2004) noted that the choice of aparticular respiration model could be a source of bias inannual sums of NEE and its components,since differentgap ﬁlling approaches would yield different results.Ourresults go one step beyond this:the correlation betweenmodel goodness-of-ﬁt and annual SR (Fig.2) suggeststhat selecting a poorly ﬁtting model may result in asystematic under-estimation of the ‘‘true’’ annualrespiratory ﬂux (although we acknowledge that the‘‘true’’ ﬂux can never be known).We believe that thisrelationship arises from the fact that in temperateecosystems that experience cold winters there is amismatch between the leveraging of different soiltemperatures on the optimized ﬁgure of merit (i.e.,V)and on annual SR.For example,to use Howland-Mainas an example (Fig.4),for roughly half the year (51%ofall observations),Tsoilis below 5 8C.These cold-soilperiods account for about 55%of the total V.However,because of the strong temperature–respiration relation-ship,cold-soil periods account for just 15% of themodeled annual SR.On the other hand,Tsoilis above15 8Cfor only 18%of the year.These warm-soil periodsaccount for only 14% of the total V,but 40% of themodeled annual SR.All models tend to ﬁt more or lessequally well during cold-soil periods,when the ﬂux islow,but poor models (which are not ﬂexible enough tosimultaneously ﬁt well across the entire temperaturerange) tend to under-estimate the ﬂux under warm-soilperiods.The model ﬁt is,therefore,highly leveraged bycold-soil periods (which have minimal inﬂuence onannual SR),whereas annual SR is highly leveraged bywarm-soil periods (which have minimal inﬂuence onmodel ﬁt).A consequence of the systematic bias in Recothatmay be imparted fromselecting a poorly ﬁtting model isthat biased estimates of Recowill affect the partitioningof NEE.Since gross ecosystemexchange (GEE) may beestimated from Reco(i.e.,GEE = NEE Reco,wheredaytime Recois a modeled result),the choice of modelfor Recocan have signiﬁcant additional consequences(for an evaluation of NEE partitioning approaches,seeReichstein et al.,2005;Stoy et al.,2006).For example,lack of consistency in the Recomodel could lead toerroneous conclusions about site-to-site differences inGEE (Falge et al.,2001).A.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234 231Fig.3.Seasonal hysteresis in the relationship between soil tempera-ture and the temperature sensitivity of respiration,Q10,as estimatedfrom a model where the temperature sensitivity is time-varying.Thetemperature sensitivity is larger in winter than in summer,andgenerally declines with increasing soil temperature.However,at agiven soil temperature,the temperature sensitivity is greater in theautumn than the spring.Results are shown for two typical years fromthe Howland-Main data set.Fig.4.Leverage of different temperature classes on the modeledannual sum of respiration (annual SR),and the ﬁgure of merit (costfunction) used for model ﬁtting.Soil temperatures of 0 8Cexert a largeinﬂuence on model ﬁt,but make only a small contribution to theannual SR.By comparison,warmer soil temperatures exert less of aninﬂuence on model ﬁt,but make a far greater contribution to theannual respiratory ﬂux,since respiration is an increasing function oftemperature.The differences between Rand-12 and Month-12 r0rankings illustrates that long gaps are more difﬁcult toﬁll than short gaps,a result which is well-known.Thiscan be attributed to non-stationarity of the underlyingphysiology (factors controlling respiratory processesmay be changing over time,e.g.,acclimation,phenol-ogy of root growth,addition of new litter inputs,springmelting of snow cover;see Tjoelker et al.,2001;Janssens and Pilegaard,2003;Davidson et al.,2006b),coupled with the fact that un-modeled (or poorlyconstrained) and longer-term processes probably alsoinﬂuence R.These problems are especially acute whenlong gaps are located in a portion of the multi-dimensional ‘‘environment space’’ that extends beyondthe domain used for parameterization.High-orderpolynomial and Fourier models,as well as neuralnetworks,may be more subject to over-ﬁtting,and thusprone to poor behavior when extrapolated in thismanner.Seasonal patterns in ecosystem respiration resultfromtemporal changes in both physical factors (e.g.,thetemperature of different respiring components) andbiological factors (e.g.,the base activity of differentrespiring components),as well as interactions betweenthese two factors.Our analysis indicates that model ﬁt,and also model predictions,vary somewhat dependingon the temperature used as a driver for respiration.Forexample,the neural network model driven by both airand soil temperature consistently performs better than amodel driven by just one of these temperatures.On theother hand,our results indicate that use of a deeper soiltemperature improved model ﬁt at a grassland site(Lethbridge),although the reverse was true at a forestedsite (Howland).These results contribute to the ongoingdebate about the best measure of bulk ecosystemtemperature.5.ConclusionSelecting from among a range of candidate modelsrequires the application of objective model selectioncriteria.We used cross-validation methods and aversion of Akaike’s Information Criterion to evaluatedifferent models for both soil and whole-ecosystemrespiratory ﬂuxes of CO2.We found that the basic Q10model is a particularly poor choice,despite itswidespread use in the literature,as it was consistentlyranked poorly by all selection criteria and for the ﬁvedifferent data sets we used.The restricted,one-parameter Lloyd and Taylor model should similarlybe avoided.Neural network models are comparativelylittle-used,but consistently ranked the highest,especially when additional covariates (besides soiltemperature) were included (e.g.,NN-SAJW).Neuralnetworks may be the best tool for ﬁlling short gaps,but a potential problem is the possibility of over-parameterization,which may result in poor predic-tions for large gaps.A revised version of the Q10model,in which the temperature sensitivity ofrespiration was allowed to vary over time,was aclear improvement over the standard Q10model.Thisimproved performance (consistently ranked well byAICWAD) came at the expense of only a few extraﬁtted parameters.Compared to the neural networks,this approach has the advantage that the modelparameters can still be interpreted to obtaininsight into ecosystem processes;the ‘‘black box’’nature of neural networks makes such interpretationimpossible.Selecting a good model is especially importantbecause of the observed correlation between modelgoodness-of-ﬁt and model predictions at the annualtime step.Because the sorts of models we investigatedhere are widely used to ﬁll ﬂux data records,to partitionNEE to Recoand GEE,and as components of ecosystemmodels (such as PnET and Biome-BGC),we suggestthat carbon-cycle scientists and other ecosystemecologists need to pay more careful attention to issuesof model selection.AcknowledgementsThis project was supported by the Ofﬁce of Science(BER),U.S.Department of Energy,through the NationalInstitute for Global Environmental Change and TCPprograms,the USDA Forest Service Northern GlobalChange Program,and the Natural Sciences and Engi-neeringResearchCouncil of Canada(NSERC).Researchat Howland was supported by the Department of Energythrough Agreement Nos.DE-AI02-00ER63028,DE-FC03-90ER61010,DE-FC02-03ER63613 and DE-FG02-00ER63002,with additional support from USDAAward No.04-DG-11242343-016.Flux data fromthe tower sites are available at http://public.ornl.gov/ameriﬂux/subject to AmeriFlux ‘‘Fair-use’’ rules.ReferencesAkaike,H.,1973.Information theory and an extension of the max-imum likelihood principle,in:Petrov,B.N.,Csaki,F.(Eds.),Proceedings of the Second International Symposium on Informa-tion Theory.Akademiai Kiado,Budapest,pp.267–281.(Repro-duced in Kotz,S.,Johnson,N.L.(Eds.),2003.Breakthroughs inStatistics,Vol.I,Foundations and Basic Theory.Springer-Verlag,New York,pp.610–624.)A.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234232Anderson,D.R.,Burnham,K.P.,Thompson,W.L.,2000.Null hypoth-esis testing:problems,prevalance and an alternative.J.WildlifeManage.64,912–923.Aubinet,M.,Chermanne,B.,Vandenhaute,M.,Longdoz,B.,Yernaux,M.,Laitat,E.,2001.Long term carbon dioxide exchange above amixed forest in the Belgian Ardennes.Agric.For.Meteorol.108,293–315.Barford,C.C.,Wofsy,S.C.,Goulden,M.L.,Munger,J.W.,Pyle,E.H.,Urbanski,S.P.,Hutyra,L.,Saleska,S.R.,Fitzjarrald,D.,Moore,K.,2001.Factors controlling long- and short-termsequestration ofatmospheric CO2in a mid-latitude forest.Science 294,1688–1691.Barr,A.G.,Grifﬁs,T.J.,Black,T.A.,Lee,X.,Staebler,R.M.,Fuentes,J.D.,Chen,Z.,Morgenstern,K.,2002.Comparing the carbonbudgets of boreal and temperate deciduous forest stands.Can.J.For.Res.32,813–822.Bishop,C.M.,1996.Neural Networks for Pattern Recognition.OxfordUniversity Press,New York.Black,T.A.,Den Hartog,G.,Neumann,H.H.,Blanken,P.D.,Yang,P.C.,Russell,C.,Nesic,Z.,Lee,X.,Chen,S.G.,Staebler,R.,Novak,M.D.,1996.Annual cycles of water vapour and carbondioxide ﬂuxes in and above a boreal aspen forest.Global ChangeBiol.2,219–229.Braswell,B.H.,Sacks,W.J.,Linder,E.,Schimel,D.S.,2005.Estimat-ing diurnal to annual ecosystem parameters by synthesis of acarbon ﬂux model with eddy covariance net ecosystem exchangeobservations.Global Change Biol.11,335–355.Burman,P.,Nolan,D.,1995.A general Akaike-type criterion formodel selection and robust regression.Biometrika 82,877–886.Carlyle,J.C.,Ba Than,U.,1988.Abiotic controls of soil respirationbeneath an eighteen-year-old Pinus radiata stand in south-easternAustralia.J.Ecol.76,654–662.Curiel Yuste,J.,Janssens,I.A.,Carrara,A.,Ceulemans,R.,2004.Annual Q10of soil respiraiton reﬂects plant phenological pat-terns as well as temperature sensitivity.Global Change Biol.10,161–169.Davidson,E.A.,Janssens,I.A.,Luo,Y.,2006a.On the variability ofrespiration in terrestrial ecosystems:moving beyond Q10.GlobalChange Biol.12,154–164.Davidson,E.A.,Richardson,A.D.,Savage,K.E.,Hollinger,D.Y.,2006b.Adistinct seasonal pattern of the ratio of soil respiration tototal ecosystem respiration in a spruce-dominated forest.GlobalChange Biol.12,230–239.Del Grosso,S.J.,Parton,W.J.,Mosier,A.R.,Holland,E.A.,Pendall,E.,Schimel,D.S.,Ojima,D.S.,2005.Modeling soil CO2emis-sions from ecosystems.Biogeochemistry 73,71–91.Drewitt,G.B.,Black,T.A.,Nesic,Z.,Humphreys,E.R.,Jork,E.M.,Swanson,R.,Ethier,G.J.,Grifﬁs,T.,Morgenstern,K.,2002.Measuring forest ﬂoor CO2ﬂuxes in a Douglas-ﬁr forest.Agric.For.Meteorol.110,299–317.Falge,E.,Baldocchi,D.,Olson,R.,Anthoni,P.,Aubinet,M.,Bern-hofer,C.,Burba,G.,Ceulemans,R.,Clement,R.,Dolman,H.,Granier,A.,Gross,P.,Grunwald,T.,Hollinger,D.,Jensen,N.O.,Katul,G.,Keronen,P.,Kowalski,A.,Lai,C.T.,Law,B.E.,Meyers,T.,Moncrieff,H.,Moors,E.,Munger,J.W.,Pilegaard,K.,Rannik,U.,Rebmann,C.,Suyker,A.,Tenhunen,J.,Tu,K.,Verma,S.,Vesala,T.,Wilson,K.,Wofsy,S.,2001.Gap ﬁlling strategies fordefensible annual sums of net ecosystem exchange.Agric.For.Meteorol.107,43–69.Farquhar,G.D.,von Caemmerer,S.,Berry,J.A.,1980.Abiochemicalmodel of photosynthetic CO2assimilaiton in leaves of C-3 species.Planta 149,78–90.Flanagan,L.B.,Johnson,B.G.,2005.Interacting effects of tempera-ture,soil moisture and plant biomass production on ecosystemrespiration in a northern temperate grassland.Agric.For.Meteorol.130,237–253.Flanagan,L.B.,Wever,L.A.,Carlson,P.J.,2002.Seasonal and inter-annual variation in carbon dioxide exchange and carbon balance ina northern temperate grassland.Global Change Biol.8,599–615.Hagen,S.C.,Braswell,B.H.,Linder,E.,Frolking,S.,Richardson,A.D.,Hollinger,D.Y.,2006.Statistical uncertainty of eddy-ﬂuxbased estimates of gross ecosystem carbon exchange at HowlandForest Maine.J.Geophys.Res.Atmos.111,D08S03.Hastie,T.,Tibshirani,R.,Friedman,J.,2001.The Elements ofStatistical Learning:Data Mining,Inference and Prediction.Springer,New York,p.552.Hollinger,D.Y.,Kelliher,F.M.,Byers,J.N.,Hunt,J.E.,McSeveny,T.M.,Weir,P.L.,1994.Carbon dioxide exchange between anundisturbed old-growth temperate forest and the atmosphere.Ecology 75,134–150.Hollinger,D.Y.,Aber,J.,Dail,B.,Davidson,E.A.,Goltz,S.M.,Hughes,H.,Leclerc,M.,Lee,J.T.,Richardson,A.D.,Rodrigues,C.,Scott,N.A.,Varier,D.,Walsh,J.,2004.Spatial and temporalvariability in forest-atmosphere CO2exchange.Global ChangeBiol.10,1689–1706.Hollinger,D.Y.,Goltz,S.M.,Davidson,E.A.,Lee,J.T.,Tu,K.,Valentine,H.T.,1999.Seasonal patterns and environmental con-trol of carbon dioxide and water vapour exchange in an ecotonalboreal forest.Global Change Biol.5,891–902.Hollinger,D.Y.,Richardson,A.D.,2005.Uncertainty in eddy covar-iance measurements and its application to physiological models.Tree Physiol.25,873–885.Hurvich,C.M.,Tsai,C.L.,1990.Model selection for least absolutedeviations regression in small samples.Statist.Probab.Lett.9,259–265.Janssens,I.A.,Dore,S.,Epron,D.,Lankreijer,H.,Buchmann,N.,Longdoz,B.,Brossaud,J.,Montagnani,L.,2003.Climatic inﬂu-ences on seasonal and spatial differences in soil CO2efﬂux.In:Valentini,R.(Ed.),Fluxes of Carbon,Water and Energy ofEuropean Forests.Springer-Verlag,Berlin,pp.233–253.Janssens,I.A.,Pilegaard,K.,2003.Large seasonal changes inQ10of soil respiration in a beech forest.Global Change Biol.9,911–918.Kirschbaum,M.U.F.,1995.The temperature dependence of soilorganic matter decomposition,and the effect of global warmingon soil organic matter storage.Soil Biol.Biochem.27,753–760.Lloyd,J.,Taylor,J.A.,1994.On the temperature dependence of soilrespiration.Funct.Ecol.8,315–323.MacKay,D.,1994.Bayesian methods for backpropagation network.In:Domany,E.,van Hemmen,J.,Schulten,K.(Eds.),Models ofNeural Networks,vol.III.Springer,New York,pp.211–254.Metropolis,N.,Rosenbluth,A.W.,Rosenbluth,M.N.,Teller,A.H.,Teller,E.,1953.Equations of state calculations by fast computingmachines.J.Chem.Phys.21,1087–1092.Morgenstern,K.,Black,T.A.,Humphreys,E.R.,Grifﬁs,T.J.,Drewitt,G.B.,Cai,T.B.,Nesic,Z.,Spittlehouse,D.L.,Livingstone,N.J.,2004.Sensitivity and uncertainty of the carbon balance of a PaciﬁcNorthwest Douglas-ﬁr forest during an El Nino-La Nina cycle.Agric.For.Meteorol.123,201–219.Oren,R.,Hsieh,C.I.,Stoy,P.,Albertson,J.D.,McCarthy,H.R.,Harrell,P.,Katul,G.G.,2006.Estimating the uncertainty in annualnet ecosystem carbon exchange:spatial variation in turbulentﬂuxes and sampling errors in eddy-covariance measurements.Global Change Biol.12,883–896.A.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234 233Papale,D.,Valentini,A.,2003.Anewassessment of European forestscarbon exchanges by eddy ﬂuxes and artiﬁcial neural networkspatialization.Global Change Biol.9,525–535.Press,W.H.,Teukolsky,S.A.,Vetterling,W.T.,Flannery,B.P.,1993.Numerical Recipes in Fortran 77:The Art of Scientiﬁc Comput-ing.Cambridge,UP,New York,p.992.Raupach,M.R.,Rayner,P.J.,Barrett,D.J.,Defries,R.S.,Heimann,M.,Ojima,D.S.,Quegan,S.,Schmullius,C.C.,2005.Model-datasynthesis in terrestrial carbon observation:methods,data require-ments and data uncertainty speciﬁcations.Global Change Biol.11,378–397.Reichstein,M.,Falge,E.,Baldocchi,D.,Papale,D.,Aubinet,M.,Berbigier,P.,Bernhofer,C.,Buchmann,N.,Gilmanov,T.,Granier,A.,Grunwald,T.,Havrankova,K.,Ilvesniemi,H.,Janous,D.,Knohl,A.,Laurila,T.,Lohila,A.,Loustau,D.,Matteucci,G.,Meyers,T.,Miglietta,F.,Ourcival,J.M.,Pumpanen,J.,Rambal,S.,Rotenberg,E.,Sanz,M.,Tenhunen,J.,Seufert,G.,Vaccari,F.,Vesala,T.,Yakir,D.,Valentini,R.,2005.On the separation of netecosystem exchange into assimilation and ecosystem respiration:review and improved algorithm.Global Change Biol.11,1424–1439.Richardson,A.D.,Hollinger,D.Y.,2005.Statistical modeling ofecosystem respiration using eddy covariance data:maximumlikelihood parameter estimation,and Monte Carlo simulation ofmodel and parameter uncertainty,applied to three simple models.Agric.For.Meteorol.131,191–208.Richardson,A.D.,Hollinger,D.Y.,Burba,G.G.,Davis,K.J.,Flana-gan,L.B.,Katul,G.G.,Munger,J.W.,Ricciuto,D.M.,Stoy,P.C.,Suyker,A.E.,Verma,S.B.,Wofsy,S.C.,2006.A multi-siteanalysis of random error in tower-based measurements of carbonand energy ﬂuxes.Agric.For.Meteorol.136,1–18.Stoy,P.C.,Katul,G.G.,Siqueira,M.B.S.,Juang,J.-Y.,Novick,K.A.,Oren,R.,2006.An evaluation of models for partitioning eddycovariance-measured net ecosystemexchange into photosynthesisand respiration.Agric.For.Meteorol.141,2–18.Savage,K.E.,Davidson,E.A.,2001.Interannual variation of soilrespiration in two New England forests.Global Biogeochem.Cycles 15,337–350.Savage,K.E.,Davidson,E.A.,2003.A comparison of manual andautomated systems for soil CO2ﬂux measurements:tradeoffsbetween spatial and temporal resolution.J.Exp.Botany 54,891–899.Schulz,K.,Jarvis,A.,Beven,K.,Soegaard,H.,2001.The predictiveuncertainty of land surface ﬂuxes in response to increasingambient carbon dioxide.J.Climate 14,2551–2562.Sokal,R.R.,Rohlf,F.J.,1995.Biometry.Freeman,New York,p.887.Tjoelker,M.G.,Oleksyn,J.,Reich,P.B.,2001.Modelling respirationof vegetation:evidence for a general temperature-dependentQ10.Global Change Biol.7,223–230.Van Dijk,A.I.J.M.,Dolman,A.J.,2004.Estimates of CO2uptake andrelease among European forests based on eddy covariance data.Global Change Biol.10,1445–1459.Van Wijk,M.T.,Bouten,W.,1999.Water and carbon ﬂuxes aboveEuropean coniferous forests modelled with artiﬁcial neural net-works.Ecol.Modell.120,181–197.Van Wijk,M.T.,Bouten,W.,2002.Simulating daily and half-hourlyﬂuxes of forest carbon dioxide and water vapor exchangewith a simple model of light and water use.Ecosystems 5,597–610.Wofsy,S.C.,Goulden,M.L.,Munger,J.W.,Fan,S.M.,Bakwin,P.S.,Daube,B.C.,Bassow,S.L.,Bazzaz,F.A.,1993.Net exchange ofCO2in a mid-latitude forest.Science 260,1314–1317.A.D.Richardson et al./Agricultural and Forest Meteorology 141 (2006) 219–234234