Taking off the training wheels : the properties of a dynamic vegetation model without climate envelopes , CLM 4 . 5 ( ED )

We describe an implementation of the Ecosystem Demography (ED) concept in the Community Land Model. The structure of CLM(ED) and the physiological and structural modifications applied to the CLM are presented. A major motivation of this development is to allow the prediction of biome boundaries directly from plant physiological traits via their competitive interactions. Here we investigate the performance of the model for an example biome boundary in eastern North America. We explore the sensitivity of the predicted biome boundaries and ecosystem properties to the variation of leaf properties using the parameter space defined by the GLOPNET global leaf trait database. Furthermore, we investigate the impact of four sequential alterations to the structural assumptions in the model governing the relative carbon economy of deciduous and evergreen plants. The default assumption is that the costs and benefits of deciduous vs. evergreen leaf strategies, in terms of carbon assimilation and expenditure, can reproduce the geographical structure of biome boundaries and ecosystem functioning. We find some support for this assumption, but only under particular combinations of model traits and structural assumptions. Many questions remain regarding the preferred methods for deployment of plant trait information in land surface models. In some cases, plant traits might best be closely linked to each other, but we also find support for direct linkages to environmental conditions. We advocate intensified study of the costs and benefits of plant life history strategies in different environments and the increased use of parametric and structural ensembles in the development and analysis of complex vegetation models.


Introduction
The storage of carbon on the land surface, and how the land surface interacts with the atmosphere, are both determined to some extent by the distribution of plant types, or ecosystem composition, across the globe.Ecosystem composition is, at large scales, determined by past and present climate conditions (Holdridge, 1967;Woodward, 1987).Given projected changes in climate, the composition of ecosystems may well be expected to change in the coming decades and centuries (Cox et al., 2000;Sitch et al., 2003), and thus the carbon stored on the land is potentially subject to large deviations from the current state.Additionally, biome shifts such as woody encroachment in the Arctic with a warmer climate (Levis et al., 2000;Swann et al., 2010) and greening of the Sahara with a wetter climate (Levis et al., 2004) significantly alter climate by changing surface albedo and evapotranspiration (Rogers et al., 2013).Thus, the representation of biome distribution has emerged as a key new feature of Earth system models (ESMs) in recent years (Cox et al., 2000;Levis et al., 2004;Krinner et al., 2005;Sato et al., 2007).
Published by Copernicus Publications on behalf of the European Geosciences Union.
Models that simulate the redistribution of plant types in space and time are collectively referred to as dynamic vegetation models or DGVMs (in that vegetation cover is an emergent or dynamic outcome of the model).Most major climate models now include some functionality to simulate dynamic vegetation (Cox et al., 2000;Levis et al., 2004;Krinner et al., 2005;Friedlingstein et al., 2006;Sato et al., 2007;Arora et al., 2013).Their inclusion in ESMs, however, can give rise to large and uncertain feedbacks.For example, the land surface scheme of the Hadley Centre GCM (MOSES-TRIFFID, latterly known as JULES) originally predicted the rapid collapse of the Amazon rainforest in the mid-21st century (Cox et al., 2000).Later versions of the same model with altered vegetation physiology allowed the simulated forest to persist in the face of increasing temperatures and reducing rainfall (Huntingford et al., 2013), illustrating the strong sensitivity of vegetation distribution to underlying physiological assumptions, which are themselves the subject of debate (Lloyd and Farquhar, 2008;Atkin et al., 2008).In addition to this, Sitch et al. (2008) demonstrated that the underlying assumptions of five alternative DGVMs (all driven with the same climate scenario) generated extremely divergent outcomes.In particular, the five models exhibited a tendency to predict rapid and substantial collapse of forest biomass, but in markedly different places.For example, the LPJ (Lund-Potsdam-Jena) model (Sitch et al., 2003) projects reductions in forest cover for over 50 % of Eurasia, while the TRIFFID and to a lesser extent the HYLAND, Sheffield DGVM, and ORCHIDEE models all project declines in forest carbon over Amazonia.These divergent outcomes may be interpreted as evidence that the processes that control the extant of forest biomes are poorly understood by large-scale models.
Two main classes of dynamic vegetation schemes are in use in the Climate Model Inter-comparison Project (CMIP) models at present (Friedlingstein et al., 2006(Friedlingstein et al., , 2014)).The first class, derived from the BIOME and LPJ class of models (Prentice et al., 1992;Running and Hunt, 1993;Sitch et al., 2003), deploy the logic of "climate envelopes", whereby recruitment and survival are only permitted within the predefined climate tolerances for a given plant functional type.These envelopes represent the physiological tolerances of the vegetation types to cold, heat and drought, but are typically derived using the observed distributions of presentday vegetation and isolated experimental data (Woodward, 1987;Haxeltine and Prentice, 1996;Prentice et al., 2007).These climatic limits on recruitment and survival operate in lieu of physiological understanding of the reasons why different types of plants persist in some environments where others do not.Another class of model is derived from the Lotka-Volterra representation of competitive ecological processes (Cox et al., 1998;Arora and Boer, 2006).The TRIF-FID model (Cox et al., 1998) specifies a "dominance hierarchy" for each pairwise competitive interaction between plant types that represents the expected outcome of competition between any two plant types with similar growth rates.Thus, the distribution of plants is also not a direct function of their physiological performance or dominance over resources, but is to some extent determined by pre-defined rules based on existing vegetation distributions.The CTEM model (Arora and Boer, 2006;Melton and Arora, 2015) uses a dominance hierarchy between trees and grasses, and climate envelope constraints to define the maximum range of its seven natural plant functional types.Dominance hierarchies can be understood as a proxy for the outcome of light competition, and therefore are appropriate where significant differences in vegetation stature mean that the outcome of competition is relatively certain, such as competition for light between trees and grasses.
The science of quantitatively understanding plant biome boundaries is in its infancy (Moorcroft et al., 2001;Givnish, 2002;Wullschleger et al., 2014;Enquist et al., 2015) and the use of climate envelopes or dominance hierarchies as a proxy for understanding plant biome dynamics is, arguably, a pragmatic approach to a problem of extraordinary complexity.Although it remains a potentially valid means of understanding plant distributions under altered climates, there is growing interest in moving towards models that rely on more fundamental principles of plant physiology.At the same time, initiatives to collate information on plant traits and physiological functioning (Wright et al., 2004;Kattge et al., 2011) along with increases in the sophistication of process representation in land surface models (Blyth et al., 2010;Zaehle and Friend, 2010;Best et al., 2011;Goll et al., 2012;Mc-Dowell et al., 2013;Oleson et al., 2013) have provided a basis for advancing plant biome boundary modeling.Many groups have, therefore, proposed and developed vegetation models with greater process fidelity (Hurtt et al., 1998;Moorcroft et al., 2001;Moorcroft, 2006;Medvigy et al., 2009;Scheiter et al., 2013;van Bodegom et al., 2014;Wullschleger et al., 2014;Fyllas et al., 2014;Weng et al., 2015), with an aim of mechanistically predicting plant distribution, from considerations of climate, soil, and fundamental plant physiology and ecology.
One key argument for using this approach is that the vegetation distribution is an emergent property of the system, and thus can be considered independent of observations of the location of biome boundaries.This gives rise to the possibility of hypothesis testing and, in theory, increasing confidence in predictions of future biosphere functionality.Furthermore, while climate envelopes may be diagnosed as the biome assemblages that emerge in response to the long-term ecosystem dynamics of a given climate, they may not be well defined for emerging novel climates, especially given that some environmental drivers (or aspects of the "climate", e.g., CO 2 concentration and nitrogen deposition) are changing simultaneously, and thus all current climates are in a sense novel.Lastly, bioclimatic relationships are diagnosed from long-term quasi-steady-state distributions, and so models that impose these assemblages in response to dynamic changes may not have realistic transient responses, which are likely to be characterized by lags between change in climate and responses of vegetation, given the persistence of trees that have lifespans that are long relative to the timescale of forcing.
Hence, we here introduce and explore a modeling framework for testing hypotheses of vegetation distribution, integrated into the structure of the Community Earth System Model (CESM) (Hurrell et al., 2013).The framework is built around the Ecosystem Demography (ED) concept of Moorcroft et al. (2001).The Ecosystem Demography model is a method for scaling the behavior of forest ecosystems by aggregating individual trees into representative "cohorts" based on their size, plant type and successional status.Here we also integrate into the model changes introduced by Fisher et al. (2010), in particular a modified implementation of the perfect plasticity approximation (Purves et al., 2008) as well as the SPITFIRE fire model of Thonicke et al. (2010), the cold deciduous phenology model of Botta et al. (2000) and the concept of optimal allocation of leaf biomass (Dewar et al., 2009;Thomas and Williams, 2014).Many aspects of plant physiological representation remain poorly constrained in land surface models in general.Thus, this framework is proposed as a template for future generations of the Community Land Model.We present the full technical description of the CLM4.5(ED)(Supplement A).While we do not specifically examine model runs coupled to the rest of the Earth system here, the capacity to do so is inherent in the inclusion of the model within the CLM code that resides inside the software architecture of the Community Earth System Model (Hurrell et al., 2013).
For the purposes of this initial demonstration of the CLM4.5(ED),we concentrate on the main property of the model that differs from most commonly used dynamic global vegetation models, which is the capacity to predict distributions of plants directly from their given physiological traits.This property can be referred to as "trait-filtering", and has been employed in offline land models (Reu et al., 2010;Pavlick et al., 2013;Verheijen et al., 2013;Fyllas et al., 2014;Reichstein et al., 2014) and advocated heavily in the vegetation modeling literature (McGill et al., 2006;Prentice et al., 2007;Purves and Pacala, 2008;Morin and Thuiller, 2009;van Bodegom et al., 2012;Boulangeat et al., 2012;Scheiter et al., 2013;Violle et al., 2014;van Bodegom et al., 2014).To enable trait-filtering, traits must affect plant growth and survival.Growth must then affect the acquisition of limiting resources (in this case via competition for light within the vertical profile) that must feed back onto growth, survival and reproduction.Differences in growth, survival and reproduction rates must then directly control (in the absence of climate envelope constraints) the relative distributions of vegetation types (and hence also the distribution of their traits).This model structure thus implies sensitivity to the specific, quantitative details of how physiological processes are represented, and heightens the imperative to study the relative costs and benefits (or economics) of alternative plant life history strategies (Reich, 2014).
The hypothesis we investigate here is that the distribution of evergreen and deciduous trees can be predicted from the relative carbon economy of their leaf habits, meaning the costs and benefits, in terms of carbon assimilation and expenditure, of the alternative phenological behaviors.This idea is intended as an illustration of how one might use this class of model to test continent-scale hypotheses concerning vegetation distribution, and to raise important discussion points related to the methods used for such studies.Other biome boundaries, such as forest-tundra, forest-grassland and grassland-desert transitions, will be the subject of future investigations.

Model structure and concept
Descriptions of the ED concept exist in the vegetation modeling literature, (Moorcroft et al., 2001;Medvigy et al., 2009;Fisher et al., 2010), but we reiterate the major developments here for clarity.In reality, vegetation cover is heterogeneous in space for many reasons including soil composition, climate, microtopography, land use and disturbance history (Dahlin et al., 2012(Dahlin et al., , 2013)).In land surface models, the variations in exogenous drivers are captured by the representation of gridded soil, land use and climate forcing data.Within a grid cell, some of this exogenous heterogeneity is by definition ignored (although, in the CLM4.5, some exogenous variation is captured by the representation of lake, ice, wetland, urban, and managed vegetation tiles).In addition, much heterogeneity of vegetation composition and structure, is endogenous, in that it is driven by the ongoing processes of recovery and disturbance across a landscape, giving rise to a quasi-random spatial matrix of vegetation at different stages of recovery.The default CLM4.5 (Oleson et al., 2013), and the vast majority of land surface models operating in ESMs, represent variability in natural vegetation via a series of "tiles", each of which is occupied by a single plant functional type (but cf.Watanabe et al. (2011)).The tiles have no physical location within a grid cell, and no concept of whether they are well mixed or well separated.This method of representing vegetation does not allow for competition for light between different plant types, and also does not allow the representation of recovery from disturbance, a critical element of ecosystem organization (Moorcroft et al., 2001;Purves and Pacala, 2008).

Disturbance-partitioned landscapes
The incorporation of the Ecosystem Demography concept significantly alters the representation of the land surface in the CLM.The purpose of the changes is to represent, in a discretized manner, the disturbance-driven biotic heterogeneity.In the CLM(ED), the new tiling structure represents the dis- turbance history of the ecosystem.Thus, some fraction of the land surface is characterized as "recently disturbed", some fraction has not experienced disturbance for a long time, and other areas will have intermediate disturbances.Newly disturbed areas are generated periodically and mechanistically by events such as fire or the falling of large trees.The patchwork of different stages of succession within a given geographical area is discretized into a set of similar "disturbance history class" units.Note that within each of these disturbance history classes may exist a variety of plants of different types, each of which may have different ages themselves.This formulation is described next (Moorcroft et al., 2001;Medvigy et al., 2009;Fisher et al., 2010).

Cohortized representation of tree populations
Representing the heterogeneity of plants is challenging in ecosystem models operating the Earth system scale, considering the variability and myriad physiological attributes, sizes, and spatial positions of real plant populations.One way of addressing this heterogeneity is to simulate a forest of specific individuals, and to monitor their behavior through time.This is the approach taken by "gap" and individualbased models (IBMs), e.g., LPJ-GUESS (Smith et al., 2001), SEIB (Sato et al., 2007) and SORTIE (Uriarte et al., 2009).
Their increased computational requirements mean that these models typically use a daily time step for gas exchange calculations, while the Community Earth System Model, and most other ESMs, require gas exchange to be calculated at 30 or 60 min resolution (Lawrence et al., 2011).For the sake of computational efficiency within this framework, the ED model takes the approach of grouping this hypothetical population of plants into "cohorts".Cohorts are discrete groups of plants, which are essentially clones of each other, and are differentiated from other cohorts primarily by their plant functional type and size.Each cohort is associated with a number of identical trees, n coh (where coh denotes the identification or index number for a given cohort).
In each disturbance history class, the hypothetical population of plants is divided first into discrete plant types consistent with the standard approach to representing plant diversity in large-scale vegetation models.In addition to this, the ED model also groups plants into numerous size classes, thus enabling vertical interactions.Cohorts of the same functional type may co-exist and compete in the same shared space as different sizes.The exact nature of the size classes emerges from the cohort fusion routines, discussed in Supplement A. Importantly, for each plant type/size class combination, the properties of the cohort's representative individual plant are maintained and prognosed (numerically integrated through time).These properties can be thought of as an average for the group of plants represented by the cohort.Note that competition for below-ground resources, namely water, remains affected only by vertical root distribution, and is unaffected by the introduction of the ED concept into the CLM.All plants have access to the same water pool, as described in Supplement A.

The representation of trait diversity
We focus here on the problem of predicting the extent of evergreen and cold deciduous strategies in temperate regions.Deciduous and evergreen trees vary most obviously in their approach to leaf production.Typically, deciduous trees produce thinner leaves with lower leaf carbon mass per unit area (M a , gC m −2 ), or the inverse of specific leaf area, that only allow the plant to photosynthesize for the period of the year when these leaves are viable (Niinemets, 2010), whereas evergreen leaves typically have more expensive construction and persist year round.Leaf nitrogen content per unit area (N area , g m −2 ) and productivity also vary with leaf thickness (Reich et al., 2007), and are thus related to M a and leaf lifespan (L l , years).These three properties, M a , L l and N area , are among the best-quantified leaf traits in existing databases (Kattge et al., 2011), and together can plausibly define alternative leaf construction strategies.Furthermore, at a global scale, trade-offs exist between these three properties, and it has been suggested that the existence of such constraints on parameter space represents a key opportunity to simplify the representation of vegetation within DGVM models (Reich et al., 1997;Westoby et al., 2002;Wright et al., 2004;Reich, 2014).To investigate how parameter choice impacts the outcomes of the model, we use the GLOPNET leaf trait database (Wright et al., 2004) to define M a , L l and N area .Within plant functional types, which are defined here as evergreen vs. deciduous trees and needleleaf vs. broadleaf trees, there are large variations for all parameters within the database (Figs. 1 and 2).Thus, there exists a problem of parameter choice for these three properties.One approach is to simply use either the mean properties of the data for each plant type (Reich et al., 2007), or a single linear fit of the relationship between the different variables.This approach, while compellingly simple, presupposes that the database represents an appropriate sample, either of the mean of the ex-isting plants, or the relationships between the variables.This is, on account of sampling biases (Wright et al., 2004), quite unlikely to be the case; as such we take a different approach that retains the observed spread in the available data.In this study, we construct PFT-specific three-dimensional covariance matrices (Figs. 1 and 2) that represent our knowledge of the direction and fidelity of the trade-offs between the three traits and thus define a set of plausible "proxy species" within each plant functional type, defined in this case by phenological habit (i.e., evergreen or cold deciduous).We consider all parts of the normally distributed covariance matrix to be equally likely (since their likelihoods are derived from the observed data).We then re-sample, from this distribution, a set of 15 parameter combinations for deciduous broadleaf (DBT) and 15 for evergreen needleleaf (ENT) trees, using a multivariate normal distribution sampling routine, the mvnrnd function in MatLab (MATLAB, 2012).
N area values are substantially higher for ENT than for DBT.Kattge et al. (2009) report the relationship between photosynthetic capacity V c,max,25 (µmol m 2 s −1 ) and N area for DBT and ENT, and find that ENTs have much lower instantaneous nitrogen use efficiency than DBTs, using their coefficients.We thus calculate V c,max,25 as for DBT, and for ENT.Without this modification, a naïve approach to scaling from N area to V c,max,25 would give ENTs a photosynthetic capacity 50 % higher than DBTs.This model parameterization approach only modifies a small fraction of the total number of the parameters that are necessary within the CLM(ED) framework (Oleson et al., 2013) (Supplement A).To increase the tractability of the simulations and to constrain the changes in parameters between plant functional types, we kept all of the remaining model parameters constant.We acknowledge, and indeed emphasize, that the outcome of the simulations could be altered by modification of other parts of the model parameter space.Our aim here is not to derive the "best possible" simulation of biome boundaries, but more to investigate the consequences of parameter choice within a relatively small and well-constrained framework.Few other model parameters have the same density of observations (Kattge et al., 2011); thus, the scenario represented by M a , L l and N area is one of the best test cases for deploying trait data to predict biome boundaries.

Model setup
To explore the consequences of parameter choice for the fidelity of the predicted biome boundaries, we ran a series of ensemble simulations, each using one of the 15 parameter combinations resampled from the three-dimensional covariance matrix, as described above and in Table 1.To allow for direct attribution of biome boundary position to our hypothesis (e.g., that the relative carbon economy of deciduous vs. evergreen plants can explain their distributions), we assume here that there are no other differences between the properties of the ENT and DBT plant types.These simulations were run five times, using a control and four alternative structural assumptions described in Sect.4.4.Regional model runs were conducted for the eastern United States.We selected this region on account of the continent-scale biome boundary shifts evident between phenological habits along the north-south axis of this domain.In the eastern United States, there is a clear transition from evergreen vegetation in the north to heavily deciduous-dominated ecosystems in the mid-latitudes, then back to evergreen in the southern and subtropical regions.The problem of parameterization of plant functional type attributes within the context of structural variants is complex, therefore we intentionally focus on this limited-scope regional problem, to allow a more thorough investigation of the properties of the model.We acknowledge that historical land-use impacts affect this study area, but we both screen out heavily impacted areas from our analysis and only focus on forested ecosystems, reducing this impact substantially (see the latter section on observational constraints).Other clear shifts in phenological habit occur globally, most notably at the rainforestsavanna biome boundary (DeFries et al., 2000), but methodologies for simulating drought-deciduous phenology are not as well understood as for cold-deciduous phenology (Baudena et al., 2015), and the issue is complicated by interactions with modeled soil and plant hydrology (Dahlin et al., 2015).Future studies will investigate other biome boundaries and ultimately the properties of global simulations.Wright et al. (2004).Large circles are from the database, and smaller circles are randomly chosen points from the resampled normally distributed covariance matrix.
The model is forced with 6-hourly climate drivers derived from Qian et al. (2006), re-gridded to a 0.9 × 1.25 • resolution grid and run from 1972 to 2003 for the eastern USA (90-65 • W, 25-50 • N).Because of our prioritization of ensemble experiments to illustrate the dependence of modeled plant competition on parameter values and model structural variants, rather than to explore the consequences for the entire (soil, vegetation, atmosphere) carbon cycle, we ran the models until the vegetation distribution appeared stable.Because of the absence of a nitrogen cycle in our simulations, this period was relatively short (i.e., approximately 30 years).The carbon budget of the represented ecosystems was not necessarily in balance at this time, but there did not appear to be a trajectory affecting the ecosystem composition, the output variable of interest.Our other outputs of interest, LAI and GPP, stabilize well before this time.Each ensemble member was initialized from bare ground, seeded with equal numbers of saplings of each plant functional type (ENT and DBT).

Observational constraints
To evaluate the model predictions, we use the AVHRR vegetation continuous fields (VCF) product (DeFries et al., 2000), which assesses global vegetation patterns in terms of leaf type (i.e., needleleaf, broadleaf) and phenological habit (i.e., evergreen, deciduous).The fraction of vegetation in each class is determined for each 5 km cell, and the data were regridded to the same 0.9 × 1.25 • model grid.We generate a metric of average observed evergreen fraction (F eg ) for each grid cell.Furthermore, we also use the MODIS leaf area index (LAI) product to evaluate model performance across the simulated domain.Leaf area index is a property often used to benchmark plant physiology models because it is a critical determinant of both energy and carbon exchange processes, despite our imperfect ability to generate LAI products from canopy greenness indices (Quaife et al., 2004;Pfeifer et al., 2012;Loew et al., 2014).In this instance, our primary objective is to predict spatial variation in LAI at a regional scale.Further studies will be expanded into the use of other metrics of canopy greenness (e.g., fraction of absorbed PAR -photosynthetically active radiation), using CLM4.5(ED)'sincreased fidelity representation of the canopy structure (Supplement A).Areas with heavy (> 50 %) influence of anthropogenic land-use change, as determined by the CLM surface data sets (Lawrence and Chase, 2010), are masked out in model-data comparisons, since the model is only relevant to the prediction of natural vegetation LAI.Since the VCF product only reports values relevant to forest vegetation cover, it is relevant to test the model predictions against areas with land-use change because the herbaceous/crop areas are already screened out.Finally, we also compare model outputs to the Fluxnet GPP product (Jung et al., 2011;Bonan et al., 2012), which scales fluxes observed at eddy covariance measurements sites to a globally gridded product using climate and vegetation drivers.The Fluxnet GPP has previously been used to validate CLM GPP predictions (Bonan et al., 2012), and while it relies on data that are sparse for some regions, errors for this latitude band are relatively low (Beer et al., 2010).

Structural variants
Numerous aspects of carbon cycle process representation are uncertain in land surface models, and, using our mechanistic modeling framework, these uncertainties can propagate into predictions of biome distribution.To address a subset of this uncertainty, we conducted parametric ensembles across a variety of structural assumptions pertaining to the allocation of carbon resources across evergreen and deciduous trees.We investigate the importance of assumptions related to model initialization, which is a notable determinant of final ecosystem state in models with strong positive feedbacks.We also investigate the depiction of leaf and fine root carbon economy, taking advantage of new studies that report better con-straints on these processes than exist in the default model.The new data pertain to the correlation of leaf respiration with leaf nitrogen, the turnover of evergreen leaves, and the turnover rate of fine root matter.The default model setup, described in detail in Supplement A, is denoted as the control (CONT) simulation.The other four structural variants are described below.

Variant 1: allocation
The first structural variant relates to carbon allocation (and is thus denoted as ALLOC).In this variant, we address limitations in the existing CLM(ED) assumptions for leaf carbon allocation.In the default version of the CLM4.5(ED), using the assumption described in Fisher et al. (2010), leaf area index is expressed on a per-tree basis (and ultimately aggregated to calculate average surface LAI).The individual tree leaf area index is the number of leaf layers within the area occupied by the tree crown (l tree m 2 m −2 ).l tree is determined from leaf biomass, (b leaf , g), leaf mass per unit area (M a,ft , g m −2 (where ft denotes plant functional type), and the area occupied by the tree (A crown , m 2 ) as follows: Maximum target leaf mass is an empirical function of stem diameter (dbh), adjusted by the wood density ρ ft (taken from Moorcroft et al., 2001).b leaf,max = 0.0419dbh 1.56 ρ 0.55 ft (4) b leaf,max is a target maximum biomass that can be adjusted downwards by the leaf area optimization routines (Supplement A) that ensure that the net assimilation cost of the bottom leaf layer (taking into account construction) does not fall below zero.
In this form, for a given tree diameter, there is always the same maximum leaf biomass, irrespective of M a .Therefore, initial l tree is inversely proportional to M a .The ENT and DBT plants typically have markedly different M a distributions (Figs. 1 and 2) and therefore there is a correspondingly large difference in their maximum potential (and initial) leaf area index.While the leaf area optimization routines eventually act to ameliorate this initial difference in LAI between plant types, the early advantage in productivity obtained by the deciduous trees can cause them to grow faster to the extent that they close the canopy and out-compete the evergreen trees, reinforcing the difference in initial conditions.Asner et al. (2003) report LAI values for temperate ENTs as at least equivalent to (6.7 ± 6.0) if not higher than temperate DBTs (5.1 ± 1.8).These observations imply that absolute allocated leaf biomass for ENTs must, given their higher M a , be higher than the leaf biomass of DBTs, which is not the case in the control model.
To overcome this intrinsic model bias, we employ a modification to the target leaf biomass such that the initial tree leaf area index remains the same for DBT and ENT regardless of the values of specific leaf area.Specifically, the target leaf biomass is scaled by the quantity S lma as follows: where M a,max is a reference value, currently set at 300 g m −2 .

Variant 2: base rate of respiration
Leaf respiration rates are a critical element of the competitive interaction between ENT and BDT since a major cost of the evergreen habit is the maintenance of photosynthetic apparatus throughout the unproductive winter season.The second variant (RESP) pertains to the baseline rate of respiration.In the control version of the CLM4.5, respiration is a function of the leaf nitrogen content per unit area N area .Using this methodology, the leaf maintenance respiration rate at 25 • C at the top of the canopy lmr top,25 (gC s where b resp is the baseline rate of respiration per unit N area , given by Ryan (1991) as 0.2577 gC gN −1 s −1 .
A recent study by Atkin et al. (2015) provides greater constraints for the relationship between N area and lmr top,25 .In their study, they report different relationships for ENT and BDT functional types, as follows, for BDT, The outcome of these log-log relationships, if expressed in the same base rate units used by Ryan (1991), across the spread of N area values used in our ensemble, is 0.452 gC gN −1 s −1 for NET and 0.536 gC gN −1 s −1 for BDT.We replaced the linear dependence of lmr top,25 on N area with the log-linear functions described above.With this modification, the base rate is approximately double that used in the default model, and the new base rate for ENT is 16 % lower than that for BDT (when they were identical in the original model).We denote this model variant as RESP.

Variant 3: leaf lifespan as a function of temperature
The third structural variant we consider concerns the rate of evergreen leaf turnover.In the default version of the model, leaf lifespan is derived from the covariance matrix that relates it to M a and N area .However, interrogation of the GLOP-NET database reveals almost no correlation between leaf lifespan and M a for NET (R 2 = 0.004).Instead, there is a much stronger correlation with mean annual temperature (R 2 = 0.426, Fig. 3).This relationship was also reported for a subset of boreal needleleaf evergreen trees by Reich et al. (2014).The impact of using our default covariance matrix approach is that "expensive" leaf strategies can be proscribed in both hot and cold regions.In contrast, the observations suggest that, irrespective of leaf cost, leaves last longer in colder environments, and that the short-lived, more expensive leaf habits are confined to hotter areas.In this modification, we directly employ the relationship between (MAT) and L l for evergreen trees.The relationship we extract from the GLOP-NET data for this purpose is L l,ENT = −0.2885MAT+ 7.1069. (9) As temperature appears to have no significant impact on M a or N area (R 2 = 0.046 and 0.02, respectively), and as they are strongly related to each other (R 2 = 0.580), we retain the covariance matrix approach to define those parameters, independent of temperature.We also maintain the same maximum leaf lifespan prediction for the deciduous trees.We denote this variant as LL_TEMP.We discuss the implications of direct prediction of leaf lifespan from climatic drivers further in the discussion.Evergreen broadleaf and deciduous needleleaf tree data are not used in this analysis, but are shown for comparison here.

Variant 4: root lifespan as a function of temperature
The definition of root turnover rates is subject to extreme uncertainty in vegetation models, not least because root turnover rates are intrinsically hard to observe, but also because root longevity appears to be complex, having been statistically related to many factors including root order (Joslin et al., 2006;Guo et al., 2008;McCormack et al., 2012), depth, diameter, specific root length and wood density (Mc-Cormack et al., 2012), nitrogen content (Eissenstat et al., 2000) and temperature (Gill and Jackson, 2000).Arguably, models that predict root traits from correlated plant physiological properties and environmental conditions are needed to properly specify this trait, as described in detail by Warren et al. (2015).However, to illustrate the sensitivity of the biome boundary predictions to basic variability in assumptions of root turnover, we test both the default assumption (the turnover rate of the fine root pool is 1.0 yr −1 ) and a relationship derived from the analysis of Gill and Jackson (2000).The Gill metaanalysis found a log-log relationship between MAT and root tissue turnover (R l , years), with different coefficients for NET and BDT (with a slightly steeper decline in R l with MAT for BDT than for NET).Thus, for NET We denote this model variant as RL_TEMP.

Model simulations
Our four modifications give rise to a set of 2 4 = 16 potential structural combinations.Testing all 16 structural combinations for the 15-member parametric ensemble for the full eastern United States region is computationally prohibitive.Consequently, instead of testing all combinations, we add the structural modifications in one at a time to investigate the impact of each change in isolation.We therefore compute five ensembles of alternative structural variants, by adding the ALLOC, RESP, LL_TEMP and RL_TEMP changes sequentially.For each of the five variants, we run the model for 15 times with parameter values sampled from the space of M a , L l and N area .The structural variants are labelled i, ii, iii, iv, and v, and are described in Table 2.We compare the model output to the observed data using five comparison metrics, maximum and mean annual LAI, maximum and mean annual GPP, and the single set of evergreen fraction data available.We calculate the R 2 and root mean square error (RMSE) of the spatial distribution of each metric.We acknowledge that there exists a choice of metrics (maximum vs. minimum vs. range, and spatial vs. temporal correspondence), but also note that subjectivity in the definition of objective functions is generic to high-dimensional model output (Abramowitz et al., 2008;Randerson et al., 2009;Blyth et al., 2011;Abramowitz, 2012;Kelley et al., 2013;Luo et al., 2012;Schwalm et al., 2013;Anav et al., 2013).
Our analysis is concerned with the costs and benefits, or carbon economy, of the different leaf strategies.The cost of leaves is easily calculated as the investment (in terms of LMA), divided by the lifespan (in terms of LL), giving the cost in KgC per unit area per year of leaf.The benefits (in terms of carbon export), on the other hand, are more difficult to calculate, since they are manifested not only though leaf N area and hence photosynthetic capacity, but also by the nonlinear interactions of photosynthetic capacity with environmental drivers (light, CO 2 , temperature, etc.).Thus, the detailed physiological model is required to generate estimates of benefit in terms of assimilation, and it is not possible to do these calculations as a simple offline analysis.Furthermore, the implementation inside the physiological model includes the impact of self-shading of leaves lower in the canopy, and thus the costs and benefits of these strategies are actually only properly assessed at the canopy scale.To address this point, we conducted additional model runs that use only one PFT at a time, using structural variant v. Using these analyses, we can assess the differences in productivity and leaf area index of the PFTs in isolation.This removes the direct effects of light competition and allows interrogation of how the competition and productivity elements of the model combine to generate the resulting distribution.Figure 4 illustrates that, particularly for F eg and LAI, R 2 varies primarily with structural variation, as illustrated by the horizontal striation.In contrast, variation in RMSE, particularly for GPP, illustrates the dominance of parametric variation, shown by the vertical striation in the GPP and LAI comparisons in Fig. 5.We did not combine the R 2 and RMSE values directly, since calculating their relative weights would serve to reduce the clarity of the output exposed by using them both independently.

Prediction of biome boundaries
In the control simulation (Fig. 7), every parameter combination produced a near-complete dominance by deciduous vegetation, irrespective of the variation in parameters that were extracted from the leaf trait database.The mean R 2 of the predicted vs. observed F eg across the ensemble (0.04) illustrates this lack of predictive skill.Addition of the ALLOC modifications to initial leaf biomass (Fig. 8) returns significant variation in predicted F eg .The model still predicts complete dominance of BDT for some parameter combinations, but also successful dominance of ENT at high and low latitudes for others.Nonetheless, only three of the simulations have evergreen cover over 25 % (where the mean for the observations is 49.2 %).The mean (and max) R 2 is 0.13 (0.34), where "max" is the highest R 2 value for any of the 15 parametric combinations.
The impact of altering the leaf respiration fluxes to match the observed relationship with leaf nitrogen and plant functional type had only a slight impact on the overall RMSE and R 2 statistics for the evergreen fraction predictions (maps not shown on account of their similarity to Fig. 8).Making evergreen leaf lifespan a PFT-specific function of temperature (LL_TEMP) has a more profound impact on the competitive ability of the NET plants at high latitudes (Fig. 9).With this structural modification, seven of the simulations have evergreen cover over 25 %, and the mean (and max) R 2 increases to 0.20 (0.34).
The last modification, directly including the PFT-specific impacts of temperature on fine root turnover, further increases the dominance of evergreen trees in northern latitudes, again slightly increasing the correlation with the observations.Now nine of the simulations have evergreen cover > 25 % and the mean (and max) R 2 is 0.23 (0.35) (Fig. 10).In general, it is clear that all versions of the model considered here display something of a systematic bias towards the prevalence of deciduous trees using this parameter space.
The impact on RMSE of the sequence of structural modifications also showed a tendency towards improvement as the average RMSE of the predicted vs. observed fraction of ev-   ergreen trees dropped from 0.48 (model run i) through 0.41 (ii), 0.41 (iii), 0.37 (iv) and 0.35 (v) (Fig. 5).

Impacts on leaf area index
The alteration of both model structure and parameters also had a major impact on the predicted LAI.This is expected, since all of the modifications and parameters are concerned with carbon economy, and realized leaf area in the model is predicted from the vertical location of the lowest leaf layer in positive annual carbon balance (Supplement A).The increase in model-data spatial coherence (R 2 ) through the structural ensemble (from runs i to v) for F eg (see Sect. 5.2) is not echoed by changes in the R 2 of mean annual LAI, which instead decreases through the ensemble from 0.45 (run i) through 0.31 (ii), 0.30 (iii), 0.14 (iv) and 0.05 (v).This trend was not apparent for the R 2 of maximum annual LAI (which varies through 0.42 (i), 0.15 (ii), 0.32 (iii) 0.39 (iv) to 0.38 (v)) (Fig. 4).The model error (RMSE) was also relatively insensitive to changes in the model structure, aside from the change from run i to run ii, which improved the simulations (Fig. 5).
The direction of change of the R 2 and RMSE statistics was not consistent due to spatial complexities.Specifically, the control simulation (run i) systematically underestimated LAI across the entire domain (Supplement B: Fig. 1) and thus had a high RMSE.The lack of much spatial structure in LAI prediction across the geographical domain, however, meant that it had a relatively good spatial coherence with the LAI data product, which is also relatively homogenous across the domain.Increasing allocation to leaf biomass in simulation ii, and thus increasing LAI overall, intensified the spatial heterogeneity of the predictions (Supplement B: Fig. 2) and thus worsened the R 2 , but reduced the model error.
Changing the respiratory fluxes in run iii improved the R 2 fit to maximum LAI (from 0.15 to 0.32, Fig. 4), potentially on account of the higher respiration rates at low latitudes acting to even out the spatial distribution of LAI (Supplement B: Fig. 3), and in doing so compensated for the decline caused by the previous modification (illustrating the possibilities of model equifinality).
Altering the leaf turnover time caused an increase in the mean LAI (from 2.66 to 3.06) by reducing canopy replacement costs at high latitudes.The model predictions thus now approach and in some cases overshoot the values observed for high-latitude evergreen forests (3.5-4.5 m 2 m −2 ) in the data product (Supplement B: Fig. 4).In the simulations where evergreen trees are dominant, it is notable that their LAI values may be somewhat over-predicted.The final simulation (v, with the RL_TEMP modification) intensifies the reduction in tissue turnover demand at high latitudes, and thus the changes primarily amplify those imposed on LAI by the LL_TEMP modification.The model now illustrates a very wide range of potential LAI predictions, dependent on the parameters chosen to represent the ENT and DBT strategies (Fig. 11).The major systematic bias in the final LAI predictions is the underestimation in the mid-latitudes of the domain.The fact that this feature is persistent across the parameter space sampled (even though there is clearly room for more detailed parameter optimization) indicates a persistent structural bias, particularly in the performance of deciduous broadleaf trees in their higher ranges.This underestimate is not substantially changed by any of the structural modifications we deploy here (all of the simulations indicate the same issue) and does not appear to result from underestimates of productivity (Fig. 12), potentially implying a deficiency in carbon allocation.
It is worth noting that the LAI values predicted by the CLM4.5(ED)algorithm (which assumes leaf area optimized    for net canopy carbon gain) all appear to be in the range bracketed by the observations.Historically, the CLM4.0 and CLM4.5 models have suffered from issues related to the chronic overestimation of LAI (Lawrence et al., 2011;Dahlin et al., 2015).We suggest that limiting the production of leaf layers in negative carbon might ameliorate this issue.

Impacts on GPP
The correlation coefficients for GPP are consistently higher than those for LAI or for biome boundary prediction, illustrating that simulations of GPP appear generally more robust than either those for plant carbon allocation (Kauwe et al., 2014) or for biome boundary prediction (Supplement B: Figs. 5 and 6).The spatial correlations of maximum annual GPP flux are relatively insensitive to the effects of structural variation (R 2 values are 0.48 (i), 0.49 (ii), 0.49 (iii), 0.44 (iv) and 0.44 (v) (Fig. 4)).The R 2 values for mean annual GPP flux are more sensitive to model structure (0.63 (i), 0.58 (ii), 0.58 (iii), 0.44 (iv) and 0.39 (v)) and, in common with the LAI predictions, decline through the ensemble.
Notably, the overall mean and RMSE values for GPP are much more sensitive to variations in parameter values than to changes in model structure (Figs. 5 and 6), reflecting the impact of the parametric variation on the overall productivity, both directly via the impact of N area on V c,max , and indirectly via impacts of L l and M a on leaf area index.
GPP predictions using parameter setting no. 13 have a notably low R 2 for mean and maximum GPP (which is actually negative for runs ii though v, resulting from the residual sum of squares being larger than the total).This simulation has the highest fractions of evergreen vegetation, and generates very high LAI and thus high GPP values in the far north of the domain (Supplement B: Figs. 7 and 8).As a result, in the latter parts of the structural ensemble, no. 13 has a notably poor spatial correspondence to the observations (which show a decline in GPP with latitude).Several of the other high evergreen cover ensemble members (nos.5, 12, 15), all of which have an unrealistically high LAI in the northern areas, also show a degraded correspondence to the GPP data product.Not all parameter combinations show this, suggesting that some of the M a and N area combinations might be inappropriate for use in the far north (see discussion).

Relative performance of individual plant functional types
Figure 13 illustrates the absolute difference between the productivity (annual NPP) of the EBT and the ENT for the third year of the simulation for structural variant v.Each PFT was run in isolation to calculate these differences.Here it is clear that at the mid-latitudes, the EBTs have a significant productivity advantage, which broadly maps onto the eventual distribution of these PFTs in the competitive simulations discussed above.At higher and lower latitudes, the ENT and BDT have approximately equal productivity.Parameter choice affects the distributions of the areas where EBT has an advantage, but the pattern is consistent across the ensemble, excluding parameter combination no. 13.Looking at the performance of larger trees, where the LAI is equilibrated with productivity, and effects of initialization have disappeared (Fig. 14), there are either small differences or considerable productivity advantages of the ENT type (excluding ensemble member no.14).This implies that the EBTs gain dominance early in the competitive interaction, presumably by amassing leaf area at a greater rate than the ENTs.Thus, the representation of light competition is instrumental in producing biome boundaries in this example.

Discussion
We present here a demographic dynamic vegetation model (ED), coupled to the biophysical scientific and software architecture of the Community Land Model v4.5 (Oleson et al., 2013).The CLM4.5(ED) model represents a substantial modification to the representation of land surface heterogeneity in the CLM, and is intended as a template for the investigation of vegetation dynamics and their properties within the context of climate simulations.Particular features of this model structure include (1) the flexible representation of plant functional type parameterization, (2) the representation of plant demography and succession derived from the ED concept, (3) the representation of self-organization of plants into distinct canopy layers derived from the PPA model, (4) the solution of canopy processes at relatively high temporal (i.e., half-hourly) and vertical (i.e., multi-layer calculations at a resolution of 1.0 LAI units) resolutions, and (5) the ability to represent multiple different plant types within the same vertical light profile.These features together enable the model to select vegetation types based on their growth performance, and to thus predict vegetation dominance from the plant traits that affect relative productivity of different vegetation types.

R. A. Fisher et al.: Taking off the training wheels
The prediction of plant distributions from plant traits allows the testing of mechanistic hypotheses of plant biogeography, and reduces the dependence of vegetation models on climate envelopes.Successful prediction of vegetation patterns can act as an independent test of our understanding of the link between plant physiology and geographical spread.Therefore, this feature is often stated as an aspiration for future dynamic vegetation models (Purves and Pacala, 2008;Verheijen et al., 2013;Boulangeat et al., 2012;Scheiter et al., 2013;Fyllas et al., 2014;van Bodegom et al., 2014).Here we test the assumption that biome boundaries can be predicted as the emergent properties of relative carbon economies of evergreen and deciduous leaf habits.Removing empirically derived climatic constraints introduces additional internal model feedbacks, as competitive interactions act to amplify small differences in relative productivity.As we demonstrate here, relatively small structural and parametric changes can therefore have large consequences for predicted vegetation properties and biogeochemical cycling.In this study, we utilize the relationship between three of the traits most commonly featured in trait databases.Our intention is to highlight the sensitivity to how traits are utilized, an approach that demands some parsimony in the number of model components that are allowed to vary simultaneously.
We find that the default model structure universally overpredicted the dominance of broadleaf deciduous trees across the entire domain.Some of this bias could be corrected by increasing the maximum target leaf biomass quantity to be proportional to leaf mass per area, highlighting the issue of initial condition dependence in competitive models.Importantly, some of these simulations capture the properties of biome boundaries in the real world (evergreen trees being more prevalent in the north and south of the domain) and, therefore, indicate that the basic hypothesis -that the carbon economy of evergreen trees is favorable in those environments -has some quantitative support.Where DBTs are dominant, their dominance appears to stem from rapid smallstature growth rates, rather than from higher adult productivity.
Implementation of updated respiration functions had limited impact on the model output.The further implementation of observed interactions between mean annual temperature and leaf lifespan, and then root lifespan, had profound impacts on the success of evergreen vegetation, particularly at higher latitudes.For all structural variants, the choice of parameters for the leaf mass per area, leaf nitrogen and leaf lifespan (in cases where it covaries with M a and N area ) had significant impacts on the predicted biome boundaries.We find that the GLOPNET data as used here do not represent a set of equally productive plant types when the traits are used to drive modeled plant growth.

Potential avenues for structural model development
At least two large biases were indicated by the structural ensemble that were not resolved by any of the tested modifications.First, the under-performance of DBTs at the northern extent of their range, and second, the over-performance of ENTs in the far north in some of the model simulations.To address the latter, Reich et al. (2014) find some correlation between MAT and leaf nitrogen allocation for their set of ENT species.We did not detect a relationship between MAT and N area in the GLOPNET data; thus, this might be a topic of future investigation.It is worth noting, additionally, that the optimality criteria with which CLM(ED) predicts the leaf area index is based on the avoidance of leaves in negative carbon balance.In cases of severe nutrient limitation, this might be only an upper bound on LAI, and alternative metrics that take into account the cost of nitrogen acquisition might be more appropriate (Fisher et al., 2010;Brzostek et al., 2014;Thomas and Williams, 2014).

Trait-filtering models
The CLM4.5(ED) is designed as a trait-filtering model, in that it can predict successful vegetation types from their traits via the "filter" of environmental conditions.One central premise of trait-filtering models (Scheiter et al., 2013;Weng et al., 2015) is that "trade-off" surfaces are necessary inputs, and, implicitly, that moving along the surface means that performance increases by some metrics, but gets worse in others.The use of a proscribed trade-off surface is illustrated in the Jena Diversity (JeDi) model (Reu et al., 2010;Bohn et al., 2011;Pavlick et al., 2013).Potential plants (proxy species) are selected from a seven-dimensional trade-off surface, and the environment acts as a filter on this (large) population, reducing the realized population to those proxy species that are able to reproduce under given environmental conditions.Implicit in this methodology are the assumptions that all tradeoff surfaces are fixed, and that they are independent of climatic drivers.Scheiter et al. (2013) discuss three classes of trade-offs that may be considered in vegetation models -allocation trade-offs (investment decisions in different tissues), mechanical trade-offs (intrinsic structural properties) and empirical trade-offs that must be prescribed, in lieu of understanding of their mechanistic underpinning.In our study, the three-way trait relationship between N area , M a and L l is an empirical trade-off.Contrary to observations across multiple plant functional types (Wright et al., 2004), the within-PFT trait relationships appear weak.Specifically, large variations in leaf lifespan and in N area are possible for the same leaf carbon investment (M a ) (Figs. 1 and 2).If, for example, a higher L l value is chosen for the same M a , the cost of canopy replacement will go down, increasing plant leaf area index, productivity, and growth.There is no downside in this model framework to having longer-lived leaves.Therefore, in this case, the trait data fail to accurately define a trait trade-off.It is possible to empirically define a surface fitted to the data, and to remove the "noise" around the central tendency of the data.This approach would necessarily reduce the tendency to select plants with very high or low relative productivity, but also would, in this case, be an inaccurate reflection of the genuine spread of the data, given the lack of adherence to clear trade-off surfaces.
Higher N area increases both photosynthetic capacity and respiration rates, so should be subject to some degree of trade-off, depending on the climatological conditions (warm nights and long winters increase the costs of high leaf N).Nonetheless, the balance of these processes appears not to produce equivalent performance across the space defined in Figs. 1 and 2. This outcome highlights two potentially problematic issues with the trait filtering approach.The first is that costs and benefits of alternative strategies might not be represented completely by simple and easily observable trade-off surfaces.The true "cost" to plants of long-lived leaves may not be a linear function of M a .Long-lived leaves might well, for example, require investment resources in complex and energetically expensive defensive compounds, and so an alternative axis of investment and return might be functionally more appropriate.The second issue is that tradeoff surfaces might not necessarily be consistent across locations (Moncrieff et al., 2015).For example, differences in the environment (e.g., temperature) might increase the potential lifespan of leaves by reducing herbivory rates and damage from solar radiation.

Environmental drivers of plant traits
Here we find, in common with Reich et al. (2014) and Kikuzawa et al. (2013), that, for evergreen trees, there is a stronger relationship of temperature with leaf lifespan than there is with carbon investment (M a ).In this example, the inclusion of a temperature-dependent leaf lifespan allows for a greater fidelity representation of the real world, and results in an improved prediction of the dominance of evergreen trees at higher latitude.Thus, one might argue for the inclusion of some climatic controls over trait distributions.
The direct prediction of plant traits from climate variables in dynamic vegetation models was adopted by Verheijen et al. (2013) in their study using the JSBACH model, and has been further advocated and augmented by van Bodegom et al. (2014).This approach -directly implementing the observed relationships between plant traits and their climate drivers -has the benefit that it uses the data available at the present time with greater fidelity.In theory, and as we have demonstrated, this approach should improve our ability to allow prediction of current vegetation patterns.We are, for example, telling the model that leaf lifespan decreases with temperature, rather than expecting this property to emerge from a more complex set of dynamics.
Direct prediction of traits from their environmental drivers approach suffers, however, from at least three caveats.The first is that it predicts mean trait values for given environmental conditions and thus does not represent heterogeneity of plant strategies in a single location.Furthermore, it is subject to a similar circularity of logic as the original climate envelope approach, in that the relationships of plant traits and climate may well not hold under future circumstances where atmospheric CO 2 , nitrogen deposition and other metrics of climate are heavily modified.Lastly, under a changing climate, the shift in the mean trait values is considered instantaneous, no genetic limits to plasticity are implied and there is no demographic inertia to the adoption of new, better adapted plant types.
An ideal but data-intensive approach might involve the derivation of trade-off surfaces specific to a given climate; for example, for a given investment in leaf carbon there is a climate-dependent relationship with lifespan.For most traits, except those potentially observable from space (Serbin et al., 2014), the quantity of data required to populate such a matrix will likely remain prohibitive.

Alternative solutions: evolution and optimization
One alternative solution, exemplified by the aDGVM2 model proposed by Scheiter et al. (2013), allows plant traits to evolve in response to selection pressure.This approach would likely "correct" plant traits that performed poorly under given conditions, and let the optimum evergreen and deciduous strategies emerge from the competitive process.This approach is compelling, because it removes many of the subjective elements of other existing strategies; it does not require pre-selection of particular trait combinations (as with our parametric ensemble) and allows the representation of diversity of traits in a single grid cell.One important feature of the model, however, is the assumption of globally consistent trait trade-off surfaces (from which plant types are selected), and thus further modifications might potentially be needed to allow it to function in conditions where these were variable in space.
Yet another alternative method for trait prediction is the use of optimal models of plant function.Optimal models are based on the idea that in theory better performing plants will be favored by natural selection, and therefore plants that are in existence should not display functionality that would be detrimental to their evolutionary fitness (Dewar et al., 2009).Many such approaches are already operational within various types of vegetation model (Williams et al., 1996;Fisher et al., 2007;Dewar et al., 2009;Rastetter, 2011;Medlyn et al., 2011;Franklin et al., 2012;McMurtrie and Dewar, 2013;Thomas and Williams, 2014).In this framework, it is possible to propose explicit hypotheses for how plants avoid sub-optimal performance, and to make predictions that can be tested against observations.The success of the approach depends on the fidelity of the proposed optimality criteria, how closely they align with real evolutionary fitness, and how close ecosystems really are to optimal solutions (given genetic constraints and non-equilibrium processes).
From the perspective of land surface models, these approaches are interesting because of their mechanistic approach, which reduces concerns regarding out-of-sample extrapolation into future climates.For example, predictive models of within-leaf nitrogen allocation can explain environmentally driven variations in V c,max , J max and leaf respiration, thus reducing the dependence on empirical correlations between nitrogen content and photosynthetic capacity (Xu et al., 2012).In this case, trait databases might be used as validation data, rather than as model inputs.
The idea behind optimality models is occasionally undermined by studies using a game theory perspective, which show that the optimal plant strategy in isolation differs somewhat from the optimal strategy that can compete with other plants (Van Wijk and Bouten, 2001;Van Wijk et al., 2003;Anten and During, 2011;McNickle and Dybzinski, 2013;Farrior et al., 2013;Dybzinski et al., 2014;Weng et al., 2015), illustrating the difficulties in choosing an appropriate fitness metric.In common with the direct prediction of traits from their environment, optimal models often assume only a single optimal strategy for a given set of environmental conditions, unlimited genetic plasticity, and ignore demographic inertia that may prevent ecosystems from adapting instantaneously to a changing climate.

Ways forward for trait representation in dynamic vegetation models
At present, many land surface modeling efforts use a variety of approaches to predicting plant traits, inclusive of traitfiltering (Medvigy et al., 2009;Weng et al., 2015), direct prediction of plant traits from their environment (e.g., allocation from Friedlingstein et al., 1999) and ideas from optimization theory (e.g., stomatal conductance, vertical N allocation).Many parallel concepts exist for how to define plant traits within advanced vegetation models (Dewar et al., 2009;Scheiter et al., 2013;van Bodegom et al., 2014;Fyllas et al., 2014), but the circumstances under which it is most appropriate to use which methodology is a topic that has not been discussed widely.To move the science of vegetation modeling forward, we argue that it will become necessary to understand under what conditions empirical "short cuts" to predict traits are acceptable and necessary, and under what circumstances detailed mechanistic prediction is either possible or desirable.In the first instance, it is, of course, imperative to both further advance the collection of data on plant traits and processes where possible, and to continue investigations into plant trait databases that already exist, ideally in a context that is linked to the requirements of predictive models (e.g., Falster et al., 2011;Wang et al., 2012;Reich et al., 2014;van Bodegom et al., 2014;Fyllas et al., 2014).We consider that the analysis of plant trait data to determine how both envi-ronmental conditions and plant strategies (such as the "fastslow" axis, proposed by Reich, 2014) can be used to generate robust predictive models is an extremely high priority.It is worth noting also that while our study does not consider the impact of changing climate on carbon cycle processes, the alternative structural variants imply both different lag times and feedbacks to the impact of climate, via the use, or otherwise, of direct impacts of temperature on turnover processes.

On the use of ensembles in land surface modeling
Another aspect of our study highlights the importance of ensembles for the investigation of model properties.It is the default practice, in land surface modeling and climate science generally, to present results using the name of a particular model to depict an invariant set of default parameter and structural assumptions (e.g., CLM4.5, JULES1.0,ED2) and to assess the merits of only one version of a model from the hyper-dimensional set of potentially viable model predictions.Such "simple" tests of model performance against observations, however, explicitly convolute the structural, parametric and initial condition contributions to model error, and, therefore, interpretation of mismatches with data is difficult.We here argue that increased use of both structural and parametric ensembles is beneficial for the development of understanding of complex land surface modeling schemes.
In Earth system modeling more widely, the use of initial condition ensembles is increasingly considered to be critical for the evaluation of model behavior (Kay et al., 2014;Wettstein and Deser, 2014;Falloon et al., 2014;Lombardozzi et al., 2014;Swart et al., 2015).Model inter-comparison projects, both for Earth system models (Friedlingstein et al., 2014;Arora et al., 2013) and their land surface model components (Sitch et al., 2008;Powell et al., 2013;Kauwe et al., 2013;Zaehle et al., 2014;Christoffersen et al., 2014;Walker et al., 2014), are used as a means of investigating the impact of alternative model structures, although typically the high dimensionality of the inter-model differences renders it difficult to assess the causes of differences between models (but cf.Zaehle et al., 2014).In this study we investigate a variety of model structures within the same framework.This approach, also adopted by Williams et al. (2001), Bonan et al. (2012Bonan et al. ( , 2014)), Joetzjer et al. (2014), Reich et al. (2014), Burakowski et al. (2015), and Dahlin et al. (2015) among others, enables the differences caused by individual modifications to be quantified and understood, and therefore potentially provides a more tractable approach to understanding the processes leading to prediction differences than a standard model inter-comparison experiment.Perturbation of the parameters of land surface models (referred to as "perturbed physics" ensembles) is rarely undertaken at scales larger than one grid cell (but cf.Fischer et al., 2011 andBooth et al., 2012) on account of the high time and energy costs of global model simulations.Perturbed physics ensembles of Earth system models have been conducted but have typically focused on processes unrelated to the land component (Sanderson et al., 2008).While some objective statistical techniques have been used for single sites (Fox et al., 2009;Hou et al., 2012;Medvigy et al., 2009;Sargsyan et al., 2014), inverse model calibration of DGVMs over large regions is not yet considered a computationally tractable problem.More typical is the process of ad hoc parameterization, either using values of observable parameters from the literature that may or may not be representative of globally relevant values, or the use of "tunable" parameters that might be adjusted to bring the overall model behavior closer to observations, as also discussed by Scheiter et al. (2013) and Reich et al. (2014).Thus, model parameters are typically not optimized and therefore the comparison of model performance to benchmarking data (Randerson et al., 2009;Luo et al., 2012) is not necessarily a good test of the structural validity of the model components (Abramowitz et al., 2008).Model structural performance is therefore much more commonly assessed at individual sites, where sensitivity to parameters can be investigated more comprehensively (Bonan et al., 2014).An alternative path forward might be to present models with no default parameter values, and instead with a range of physiologically plausible parameters, thus reducing the correspondence between named model structures and a single deterministic set of outputs.

Conclusions
We introduce a new methodology for the simulation of vegetation dynamics into the Community Land Model (v4.5).The new module is based on the Ecosystem Demography framework of Moorcroft et al. (2001) with numerous modifications.We present an investigation into the properties of the model for the case study of evergreen-deciduous biome boundaries in eastern North America.We find that the model is sensitive to the variation in parameters drawn from existing plant databases, and to variation in the representation of the carbon cycle, in particular, to the initial target leaf biomass, and to the implementation of direct prediction of traits (leaf lifespan, and root lifespan) from environmental variables (mean annual temperature).We also find that the model is capable of predicting leaf area index and GPP within the range of the observations, and that for some trait combinations, prediction of the positioning of biome boundaries is close to the observations.Our study particularly emphasizes three challenges: (1) uncertainty about when it is appropriate to use environmental drivers to modify plant trait trade-offs, (2) remaining structural uncertainty within models, particularly with regard to carbon allocation processes, and (3) uncertainty resulting from "noise" around trait tradeoffs in existing databases.Nonetheless, echoing Reich et al. (2014), the capacity to understand the prediction of biome boundaries from first principles is both interesting and im-portant.We hope that further study of the quantitative nature of biome boundaries will be motivated by this analysis.

Figure 1 .
Figure 1.Relationships between log leaf mass per unit area and log leaf lifespan (upper panel) and nitrogen per unit leaf area (lower panel) for evergreen needleleaf trees, from data reported byWright et al. (2004).Large circles are from the database, and smaller circles are randomly chosen points from the resampled normally distributed covariance matrix.

Figure 2 .
Figure 2. Relationships between log leaf mass per unit area and log leaf lifespan (upper panel) and nitrogen per unit leaf area (lower panel) for cold deciduous broadleaf trees, from data reported byWright et al. (2004).Large circles are from the database, and smaller circles are randomly chosen points from the resampled normally distributed covariance matrix.

Figure 3 .
Figure 3. Relationship between mean annual temperature ( • C) and leaf lifespan (years) derived from the GLOPNET leaf trait database for evergreen broadleaf trees (yellow), evergreen needleleaf trees (blue), broadleaf deciduous trees (red), and deciduous needleleaf trees (green).Evergreen broadleaf and deciduous needleleaf tree data are not used in this analysis, but are shown for comparison here.

Figures 4 ,
Figures 4, 5 and 6 illustrate the R 2 , relative RMSE and summary statistics for each structural variant and parameter combination.Figures7, 8, 9 and 10 show the simulated evergreen

Figure 4 .
Figure 4. R 2 coefficients of the spatial correlation between model output and five different data product metrics.The x axis pertains to variation in the parametric ensemble, and the y axis pertains to variation in the structural ensemble.

Figure 5 .
Figure 5. Root mean square error, relative to the mean of the variable, of the spatial correspondence between model output and five different data product metrics.The x axis pertains to variation in the parametric ensemble, and the y axis pertains to variation in the structural ensemble.

Figure 6 .
Figure 6.Mean values (over the spatial domain) of GPP, LAI and F eg output.The x axis pertains to variation in the parametric ensemble, and the y axis pertains to variation in the structural ensemble.Units are KgC m −2 year −1 for GPP, m 2 m −2 for LAI and fraction cover for F eg .

Figure 7 .
Figure 7. Fraction of evergreen trees projected with structural ensemble member i (the control simulation).Panel (a): VCF product estimates of F eg .Panels (b)-(p) correspond to the 15 different combinations used in the parametric ensemble.

Figure 8 .Figure 9 .
Figure 8. Fraction of evergreen trees projected with structural ensemble member ii (control + ALLOC variant).VCF product data are shown in panel (a).Panels (b)-(p) correspond to the 15 different combinations used in the parametric ensemble.

Figure 10 .
Figure 10.Fraction of evergreen trees projected with structural ensemble member v (control + ALLOC + RESP + LL_TEMP + RL_TEMP variants).VCF product data are shown in panel (a).Panels (b)-(p) correspond to the 15 different combinations used in the parametric ensemble.

Figure 11 .
Figure 11.Mean annual leaf area index (m 2 m −2 ) projected with structural ensemble member v (control + ALLOC + RESP + LL_TEMP + RL_TEMP variants).MODIS LAI product data are shown in panel (a).Panels (b)-(p) correspond to the 15 different combinations used in the parametric ensemble.

Figure 12 .
Figure 12.GPP in KgC m −2 year −1 projected with structural ensemble member v (control + ALLOC + RESP + LL_TEMP + RL_TEMP variants).Flux-derived product data are shown in panel (a).Panels (b)-(p) correspond to the 15 different combinations used in the parametric ensemble.

Figure 13 .
Figure 13.Absolute difference in NPP (KgC m −2 year −1 ) between ENT and DBT (higher ENT productivity is positive) for year 3 of simulation.Panels (b)-(p) correspond to the 15 different combinations used in the parametric ensemble.

Figure 14 .
Figure 14.Absolute difference in NPP (KgC m −2 year −1 ) between ENT and DBT (higher ENT productivity is positive) for year 14 of simulation.Panels (b)-(p) correspond to the 15 different combinations used in the parametric ensemble.

Table 1 .
Parameter combinations for the 15 ensemble members for leaf lifespan (L l ) in years, leaf mass per area (M a ) in gC m −2 and area-based nitrogen content (N area ) in g m −2 .