Competition between plant functional types in the Canadian Terrestrial Ecosystem Model (CTEM) v. 2.0

The Canadian Terrestrial Ecosystem Model (CTEM) is the interactive vegetation component in the Earth system model of the Canadian Centre for Climate Modelling and Analysis. CTEM models land–atmosphere exchange of CO2 through the response of carbon in living vegetation, and dead litter and soil pools, to changes in weather and climate at timescales of days to centuries. Version 1.0 of CTEM uses prescribed fractional coverage of plant functional types (PFTs) although, in reality, vegetation cover continually adapts to changes in climate, atmospheric composition and anthropogenic forcing. Changes in the spatial distribution of vegetation occur on timescales of years to centuries as vegetation distributions inherently have inertia. Here, we present version 2.0 of CTEM, which includes a representation of competition between PFTs based on a modified version of the Lotka–Volterra (L–V) predator–prey equations. Our approach is used to dynamically simulate the fractional coverage of CTEM’s seven natural, non-crop PFTs, which are then compared with available observation-based estimates. Results from CTEM v. 2.0 show the model is able to represent the broad spatial distributions of its seven PFTs at the global scale. However, differences remain between modelled and observation-based fractional coverage of PFTs since representing the multitude of plant species globally, with just seven non-crop PFTs, only captures the large-scale climatic controls on PFT distributions. As expected, PFTs that exist in climate niches are difficult to represent either due to the coarse spatial resolution of the model, and the corresponding driving climate, or the limited number of PFTs used. We also simulate the fractional coverage of PFTs using unmodified L–V equations to illustrate its limitations. The geographic and zonal distributions of primary terrestrial carbon pools and fluxes from the versions of CTEM that use prescribed and dynamically simulated fractional coverage of PFTs compare reasonably well with each other and observation-based estimates. The parametrization of competition between PFTs in CTEM v. 2.0 based on the modified L–V equations behaves in a reasonably realistic manner and yields a tool with which to investigate the changes in spatial distribution of vegetation in response to future changes in climate.

timescales of days to centuries. Version 1.0 of CTEM uses prescribed fractional coverage of plant functional types (PFTs) although, in reality, vegetation cover continually adapts to changes in climate, atmospheric composition, and anthropogenic forcing. Changes in the spatial distribution of vegetation occur on timescales of years to centuries as vegetation distributions inherently have inertia. Here, we present version 2.0 of CTEM which includes 10 a representation of competition between PFTs based on a modified version of the Lotka-Volterra (L-V) predator-prey equations. Our approach is used to dynamically simulate the fractional coverage of CTEM's seven natural, non-crop PFTs which are then compared with available observation-based estimates. Results from CTEM v. 2.0 show the model is able to represent the broad spatial distributions of its seven PFTs at the global scale. However, Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | 1 Introduction Dynamic global vegetation models (DGVMs) are now considered an integral component of Earth system models (ESMs). DGVMs model vegetation as a dynamic component of the earth system allowing simulation of the atmosphere-land flux of CO 2 and other trace gases.
A corresponding ocean carbon cycle component in ESMs simulates the atmosphere-ocean 5 flux of CO 2 . Together they allow modelling of the atmospheric CO 2 concentration as a prognostic variable in ESMs and quantification of the biogeochemical feedbacks that affect the carbon cycle and physical climate system (Arora et al., 2013;Arora and Boer, 2014). Modelling vegetation as a dynamic component of the climate system also permits simulation of biophysical feedbacks. Land surface vegetation in ESMs responds to variations in weather 10 and climate and, in turn, influencse the climate, on hourly to centennial timescales. Over short periods (hourly to daily), the simulated latent heat flux is affected by changes in vegetation's stomatal conductance which controls the rate of transpiration. On seasonal timescales, changes in leaf area index (LAI) influence the land surface albedo and the transpiration and evaporation of intercepted precipitation from canopy leaves. On timescales 15 from inter-annual to centennial, changes in structural attributes of vegetation such as LAI, vegetation height, rooting depth and, in particular, the spatial distribution of vegetation, affect the land surface albedo and various components of the land surface energy and water balance. These biophysical interactions between vegetation and climate are well documented Kim and Wang, 2007;Gobron et al., 2010). 20 DGVMs typically represent vegetation in terms of a modest number of plant functional types (PFTs). This simplification is justified by the ability to categorize plant species on the basis of their form and interaction with the environment (Box, 1996). For example, fir (Abies), spruce (Picea), and pine (Pinus) can all be classified as needleleaf evergreen trees. From a purely practical standpoint, PFTs are used because it is impossible, at present, to obtain 25 model parameters for the thousands of plant species based on the sparse available data.
In its simplest form, the dynamic behaviour of vegetation can be described by two aspects: changes in plant structure and changes in areal coverage. First, as vegetation re-Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | sponds to changes and variability in climate, its structure changes affecting its height, LAI, and rooting depth, typically on seasonal to decadal time scales. Second, vegetation also adapts by changing its areal extent. Areal changes are slower than the structural changes, typically occurring on decadal to centennial time scales for woody vegetation and subdecadal to decadal time scales for herbaceous vegetation. Together, these changes in 5 structure and areal extent capture vegetation's dynamic behaviour in response to changes and variability in climate at different time scales.
DGVMs may incorporate only the structural, or both the structural and areal, aspects of vegetation dynamics. The Canadian Terrestrial Ecosystem Model (CTEM) (Arora, 2003;Arora and Boer, 2005a;Arora et al., 2009;Melton and Arora, 2014), for example, has so 10 far incorporated only structural vegetation dynamics, i.e. it uses prescribed fractional coverages of its nine PFTs (seven natural and two crop PFTs; see Table 1). The fractional coverages of PFTs in CTEM are based on a reconstruction of historical land cover from 1850 to present as described in Wang et al. (2006) and Arora and Boer (2010). Other DGVMs with prescribed fractional coverages of their PFTs include CLM4 (Lawrence et al.,15 2011) and TEM (Tian et al., 1999). Obviously, this approach does not account for changes in plant distribution due to changing climate or atmospheric CO 2 concentration. Examples of DGVMs that account for both structural and areal vegetation dynamics include TRIFFID, which is currently implemented in the Hadley Centre's Earth System Model (HadGEM2-ES) (Jones et al., 2011), and the LPJ DGVM . 20 Removing the constraint of specified fractional coverage of PFTs and simulating the areal vegetation dynamics adds another degree of freedom to DGVMs. Simulating both structural and areal vegetation dynamics in a realistic manner is a more stringent test of a DGVM's abilities. An incorrect simulation of the geographical distribution of PFTs will lead to a similarly flawed distribution of primary terrestrial carbon pools and fluxes, regardless if the 25 model correctly simulates the structural vegetation dynamics.
Modelling areal dynamics of vegetation requires simulating competition for available space and resources between PFTs. In the real world, plants compete for space to acquire both above-ground (light) and below-ground (water and nutrients) resources. Attempts to Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | capture competition between plants to allow these interactions have generally been accomplished using three different kind of models: (i) theoretical models, (ii) "gap" models, and (iii) DGVM-based models.
Theoretical models (e.g. Tilman, 1982 andTilman, 1994) generally relate mortality and colonization to bulk parameters that are not easily related to physiological 5 traits or mechanisms (although Pacala and Tilman, 1994, allow some mechanistic approaches). Gap models, which attempt to explicitly represent plant growth that occurs following the creation of a forest gap, provide a more mechanistic approach that simulates resource competition between individual trees at the spatial scale of few meters (e.g. Bugmann, 2001). There have been attempts to integrate gap model dynamics into DGVMs (e.g. 10 SEIB-DGVM Sato et al., 2007, LPJ-GUESS Smith et al., 2001, and the ED approach Moorcroft et al., 2001). At the large spatial scales at which DGVMs are typically implemented (0.5 to 4 • resolution), gap model dynamics impose a high computational load and thus have generally been applied only at local and regional scales (e.g. Eastern North America in Fisher et al., 2015). An exception is that SEIB-DGVM has been globally applied in the 15 Kyousei2 model albeit with difficulties representing the global vegetation distribution (for e.g. see Fig. 13 in (Sato et al., 2007). DGVM-based competition schemes are either based on, (i) the Lotka-Volterra (L-V) equations (e.g. TRIFFID Cox, 2001, USCM Brentnall et al., 2005and CTEM Arora and Boer, 2006a, (ii) simpler approaches that assume PFTs occupy area in proportion to their net primary productivity (NPP) (e.g. LPJ Sitch et al., 2003, 20 ORCHIDEE Zhu et al., 2015), or (iii) account for competitiveness via both NPP and mortality (e.g. JSBACH Brovkin et al., 2009). These latter DGVM-based approaches generally suffer from an amplified expression of dominance, i.e. the dominant PFT ends up occupying a disproportionately large fraction of a grid cell. As a result, forests are simulated in regions which, in reality, are largely savannas (e.g. see Fig. 2 in Cramer et al., 2001, for HYBRID, 25 TRIFFID, LPJ and IBIS models).
Here, we evaluate CTEM v. 2.0 which explicitly models the competition for space between PFTs using a modified version of the L-V equations (Lotka, 1925;Volterra, 1926) to interactively determine the fractional coverages of seven non-crop PFTs as a function of climate Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | and soils. The competition scheme used here was presented in Arora and Boer (2006a, b), but this study is the first evaluation of the scheme on a global scale. Section 2 summarizes CTEM v. 2.0 and includes a description of its competition parametrization based on a modified version of the L-V equations. Section 3 describes the experimental setup and observation-based datasets used for model evaluation. Results from simulations with pre- 5 scribed and dynamically simulated fractional coverages of CTEM's seven non-crop PFTs, along with discussion, are presented in Sect. 4. We also present results from simulations that use an unmodified version of the L-V equations for comparison. These simulations with unmodified L-V equations are included to demonstrate the necessity of modifying the equations for use in simulating competition between PFTs by highlighting the tendency for the 10 unmodified L-V relations to simulate excessive coverage of dominant PFTs. Finally, conclusions are presented in Sect. 5. A detailed model description of CTEM v. 2.0 is provided in the Appendix.

CTEM structure and processes
Version 1 of the Canadian Terrestrial Ecosystem Model (CTEM) is the terrestrial carbon cy- 15 cle component of the second generation Canadian Earth System Model (CanESM2) (Arora et al., 2011) where it is coupled to version 2.7 of the Canadian Land Surface Scheme (CLASS). CTEM v. 2.0, used here, is currently coupled to the CLASS v. 3.6 (Verseghy, 2012). CTEM models terrestrial ecosystem processes for nine PFTs, two of which are crop PFTs (see Table 1), by tracking the flow of carbon through three living vegetation com-20 ponents (leaves, stem and roots) and two dead carbon pools (litter and soil). The carbon budget equations for the model's five pools are summarized in Sect. A7 of the Appendix. The amount of carbon in these five carbon pools is simulated prognostically. CLASS models the land surface energy and water balance and calculates liquid and frozen soil moisture, and soil temperature for three soil layers (with thicknesses 0.1, 0.25 and 3.75 m). In the 25 coupled mode, CLASS uses structural vegetation attributes (including leaf area index (LAI), vegetation height, canopy mass and rooting depth) simulated by CTEM, and CTEM uses Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | soil moisture, soil temperature and net radiation calculated by CLASS. Combined, CLASS and CTEM simulate the atmosphere-land fluxes of energy, water and CO 2 . Version 1.0 of CTEM is described in a collection of papers detailing parameterization of photosynthesis, autotrophic and heterotrophic respiration (Arora, 2003); phenology, carbon allocation, biomass turnover, and conversion of biomass to structural attributes (Arora and 5 Boer, 2005a); dynamic root distribution (Arora and Boer, 2003); and disturbance (fire) (Arora and Boer, 2005b). These processes are modelled over prescribed fractional coverages of nine PFTs (Wang et al., 2006) and determine the structural vegetation dynamics including vegetation biomass, LAI, vegetation height, fraction of roots in each of the three soil layers, leaf onset and offset times, and primary CO 2 fluxes of gross (GPP) and net (NPP) primary 10 productivities. A full description of CTEM and changes from v. 1.0 to v. 2.0 are included in the Appendix.
A parametrization for competition between PFTs in an earlier version of CTEM (v. 1.1) is described by Arora and Boer (2006a, b) where it was evaluated at select locations. Here we present CTEM v. 2.0 which builds upon the model framework of CTEM v. 1.0 and can be run 15 in two different modes, either (i) using specified fractional coverages of its nine PFTs, or (ii) allowing the fractional coverages of its seven non-crop PFTs to be dynamically determined based on competition between PFTs. The parameterization for simulating competition between PFTs is summarized in Sect. 2.1. The fire parametrization has also been refined in the new model version as described in Appendix A9. The CLASS-CTEM modelling frame- 20 work has the capability of representing the sub-grid scale variability of PFTs using either a composite or a mosaic configuration (Li and Arora, 2012;Melton and Arora, 2014). In the composite (or single tile) configuration, the vegetation attributes for all PFTs present in a grid cell are averaged and used in energy and water balance calculations that determine the physical land surface conditions including soil moisture, soil temperature, and thickness 25 and fractional coverage of snow (if present). In the mosaic (or multi-tile) configuration each PFT is allocated its own tile for which separate energy and water balance calculations are performed. As a result, the simulated carbon balance evolves somewhat differently in the Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | two configurations despite being driven with identical climate forcing (see Melton and Arora, 2014). The results presented in this paper are obtained using the composite configuration.

Representation of competition between PFTs in CTEM
Competition between PFTs in CTEM is based upon modified L-V equations (Arora and Boer, 2006a, b). The L-V equations (Lotka, 1925;Volterra, 1926) have been adapted from 5 their initial application for simulating predator-prey interactions in ecosystem models as described below.

Competition parameterization
The change in fractional coverage (f ) of a PFT α through time, dfα dt , is expressed as the result of mortality and competition and colonization (CC) interactions with the other PFTs 10 present in a grid cell and bare ground, collectively represented as B where α / ∈ B: The CC interactions are represented symbolically by the g(f α , f B ) function. Mortality is assumed to be proportional to the number density of plants and represented by the mor- 15 tality term, m α f α . The PFT-dependent mortality rate (m α ; day −1 ) (described further in Sect. 2.1.3) produces bare ground via a number of processes, and that bare ground is subsequently available for colonization. We consider the fractional coverage for N PFTs plus bare ground (f N +1 = f bare ) where N +1 j=1 f j = 1. For competition between unequal competitors the PFTs are ranked in terms of their dominance. If PFT α is the most domi-20 nant, it will invade the area of other PFTs and the bare ground (f B , α / ∈ B). Woody PFTs are all more dominant than grass PFTs since trees can successfully invade grasses by over-shading them (Siemann and Rogers, 2003) and thus are ranked higher. Within tree or grass PFTs the dominance rank of a PFT is calculated based upon its colonization rate (c α ; day −1 ) with higher colonization rates giving a higher dominance ranking. For the general 25 case of PFT α with a dominance rank of i, we describe the ranking from most dominant 4858 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | to least as 1,2,. . .,i,i + 1,. . .,N . Equation (1) can then be reformulated following a phenomenological approach as: 5 where the exponent b is an empirical parameter which controls the behaviour of the L-V equations. In the original L-V formulation, b is 1, but we modify the L-V relations by using b = 0 following Arora and Boer (2006a, b)(implications of this choice are expanded upon below). The fractional cover of PFT α then changes depending on the gains it makes into the area of less dominant PFTs and the losses it suffers due to mortality and encroachment 10 by more dominant PFTs. The rate of change of the bare fraction, f bare , is expressed as (3) The rate at which PFT α invades another PFT β is given by 15 A PFT invading bare ground has an unimpeded "invasion" rate, c α . The ratio of the invasion rate by PFT α into area covered by another PFT β and its unimpeded invasion rate ( gives the relative efficiency of colonization, termed δ α,β , which is a scalar between 0 and 1. δ is 1 for invasion of any PFT into bare ground and 1 for tree PFT invasion into grass PFTs. 20 If a PFT β has a lower dominance ranking than another PFT α then δ β,α = 0 implying that sub-dominant PFTs do not invade dominant PFTs, but get invaded by them i.e. δ α,β = 1. Equation (2) can then be written more succinctly for each PFT as: The binary nature of the treatment of the b term is related to the manner in which two PFTs interact represented by f b α f β in equations 2-4. When b = 1, as in the unmodified L-V equations, the interaction between PFT α and PFT β occurs over the fraction f α f β . If the fractional coverages f α and f β are taken to indicate the probability of finding a particular PFT in some region of the grid square, then the probability of independently finding both 5 is the product f α f β . Interaction occurs in these common regions in what might be termed the "random interaction" case. The choice of b = 0, following Arora and Boer (2006a) by contrast, implies that the dominant PFT has full access to all subdominant PFTs and invades them in proportion to their coverage in what might be termed the "full interaction" case. This case may be thought of as corresponding to the general availability of the seeds 10 of the dominant PFT that may germinate and invade the coverage of the subdominant PFT provided the climate is favourable.
The fraction of NPP used for spatial expansion, Λ α , is calculated using the leaf area index (LAI α ; m 2 leaf (m 2 ground) −1 ) of a PFT. 25 4860 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | The original formulation of Arora and Boer (2006a) only considered λ 1,α but here we ad- 5 just the parameterization with the addition of λ 2,α which ensures that a small fraction of NPP is used for spatial expansion even at very low LAI values. This additional constraint allows improved fractional coverage of grasses in arid regions. Similar to S sap,α , LAI min,α and LAI max,α are PFT-dependent parameters (see Table 1).
The value of λ max is set to 0.1 so that a maximum of 10 % of daily NPP can be used for 10 spatial expansion. Finally, Λ α is set to zero for tree PFTs when they are in a full leaf out mode and all NPP is being used for leaf expansion (see Appendix A5).

Mortality
The PFT-dependent mortality rate (day −1 ), 15 reflects the net effect of four different processes: (1) intrinsic or age related mortality, m intr , (2) growth or stress related mortality, m ge , (3) mortality associated with bioclimatic criteria, m bioclim and (4) mortality associated with disturbances, m dist . Intrinsic or age related mortality uses a PFT-specific maximum age, A max (Table 1), to 20 calculate an annual mortality rate such that only 1 % of tree PFTs exceed A max,α . Intrinsic mortality accounts for processes whose effect is not explicitly captured in the model including insect damage, hail, wind throw, etc. Grasses and crops have m intr = 0. 25 4861 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | The annual growth related mortality m ge is calculated using growth efficiency of a PFT over the course of the previous year following Prentice et al. (1993) and Sitch et al. (2003): where m ge,max represents the PFT-specific maximum mortality rate when no growth occurs 5 (Table 1), g e is the growth efficiency of the PFT (g C m −2 ) calculated based on the maximum LAI (L α,max ; m 2 m −2 ) and the increment in stem and root mass over the course of the previous year (∆C S and ∆C R ; kg C m −2 , respectively) (Waring, 1983). k m is a parameter set to 0.3 m 2 (g C) −1 . 10 Mortality associated with bioclimatic criteria, m bioclim (0.25 yr −1 ), is applied when climatic conditions in a grid cell become unfavourable for a PFT to exist and ensures that PFTs do not exist outside their bioclimatic envelopes, as explained in the next section.
The annual mortality rates for m intr , m ge , and m bioclim are converted to daily rates and 15 applied at the daily time step of the model, while m dist is calculated by the fire module of the model based on daily area burned for each PFT as summarized in Appendix A9. In practice, the dfα dt = −m dist,α f α term of Eq. (5) is implemented right after area burnt is calculated.

Bioclimatic limits and existence
The mortality associated with bioclimatic criteria, m bioclim , ensures that PFTs do not venture 20 outside their bioclimatic envelopes. The bioclimatic criteria that determine PFT existence are listed in Table 2 for tree PFTs. Bioclimatic limits are not used for the C 3 and C 4 grass PFTs. The bioclimatic limits represent physiological limits to PFT survival that are either not captured in the model or processes that are not sufficiently described by empirical observations to allow their parametrization. Some examples of the latter include a plant's resistance Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | include: the minimum coldest month air temperature (T cold min ), the maximum coldest month air temperature (T cold max ), the maximum warmest month air temperature (T warm max ), the minimum number of annual growing degree days above 5 • C (GDD5 min ), the minimum annual aridity index (ratio of potential evapotranspiration to precipitation; arid min ), and the minimum dry season length in a year (dryseason min ), where the dry season length represents the num-5 ber of consecutive months with precipitation less than potential evaporation. The bioclimatic indices are updated on a 25 year time scale (T = 25) such that the slowly changing value of a bioclimatic index X(t + 1) for time t + 1 is updated using its previous year's value X(t) and its value x(t) for the current year: 10 Equation (15) implies that 63 % of a sudden change in the value of a bioclimatic index ∆x is reflected in 3 Simulations and observation-based data 15

Simulations
We perform three equilibrium simulations for pre-industrial conditions on a Gaussian prescribed (hereafter referred to as the PRES simulation), (ii) based on the CTEM competition parametrization as outlined in Sect. 2.1.1 (hereafter the CTCOMP simulation), and (iii) based on competition between the PFTs using the original L-V formulation (hereafter the LVCOMP simulation). The competition parameterization in the LVCOMP simulation uses a value of b = 1 in Eq. (5) similar to the L-V equations used in the TRIFFID (Cox, 2001) and 25 4863 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | USCM (Brentnall et al., 2005) models. By contrast, the CTCOMP simulation uses a value of b = 0.
The simulations are driven with the CRU-NCEP v. 4 climate data (Viovy, 2012). The climate corresponding to the 1901-1940 period are used repeatedly until the model carbon pools reach equilibrium. The climate data are disaggregated from their original six-hourly 5 temporal resolution to a half-hourly time step. Surface temperature, surface pressure, specific humidity, and wind speed are linearly interpolated. Longwave radiation is uniformly distributed across a six hour period, and shortwave radiation is diurnally distributed over a day based on a grid cell's latitude and day of year with the maximum value occurring at solar noon. Precipitation is treated following Arora (1997), where the total six hour pre-10 cipitation amount is used to determine the number of wet half-hours in a six-hour period. The six hour precipitation amount is then spread randomly, but conservatively, over the wet half-hourly periods. Soil texture information (percent sand, clay, and organic matter) for the three soil layers is adapted from Zobler (1986). The driving climate data and soil and land cover information are interpolated to 3.75 • resolution. All simulations are performed off-line, 15 i.e. feedbacks between the vegetation and climate are not possible. Off-line simulations allow for more straightforward interpretation of model results and are a necessary first step before proceeding with coupled simulations that incorporate two-way interactions between the land surface and atmosphere.
The simulations with prescribed PFT fractions (PRES) use a reconstruction of land cover 20 for the year 1861 based on the approach described in Wang et al. (2006) (hereafter referred to as W2006) and Arora and Boer (2010). Briefly, the snapshot of land cover in the Global Land Cover 2000 (GLC2000) data set for year 2000 is first mapped on to the CTEM PFTs and then extended back in time to account for changing crop area according to the HYDE v. 3.1 dataset . We refer to this as the modified W2006 data set compared 25 to the original W2006 land cover product which was based on the Ramankutty and Foley (1999)

Observation-based data sets
A Moderate-Resolution Imaging Spectroradiometer (MODIS) derived land cover product (Friedl et al., 2013) and the modified W2006 product are used to evaluate the simulated fractional coverages of CTEM's seven non-crop PFTs. The GLC2000 land cover product comprising of 22 land cover types is mapped on to the 10 nine PFTs represented in CTEM by W2006 (their Table 2). W2006 split the broadleaf deciduous PFT into cold deciduous and drought deciduous versions. They made the simple but reasonably realistic assumption that at high latitudes (above 34 • ) deciduousness is caused primarily by seasonality in temperature while at lower latitudes (below 24 • ) deciduousness is caused by seasonality in precipitation, with a linear transition in between. In addition, they 15 separated grass and crops into their C 3 and C 4 sub-types as mentioned above.
The MODIS land cover product contains 17 land cover types and we map it to CTEM PFTs following the mapping scheme of W2006, as closely as possible (see Supplement  Table). The MODIS product is averaged across the years 2001-2011 and interpolated to the Gaussian 96 × 48 resolution. We do not attempt to subdivide the fractional coverage 20 of MODIS broadleaf deciduous trees into their cold and drought deciduous versions, nor do we subdivide fractional coverage of MODIS grasses and crops into their C 3 and C 4 subtypes, since our classification scheme is not ground-truthed to the extent the approach used by W2006 has been. As such, the MODIS based dataset is used only for comparisons of global total coverages while the modified W2006 data set is also used for comparison of 25 geographic distribution of CTEM's seven non-crop PFTs.
Both the MODIS-derived product (17 land cover types) and the modified W2006 land cover (based on GLC2000's 22 land cover types) product are subject to errors in catego-Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | rizing remotely-sensed vegetation into broad scale vegetation types and then their further mapping onto CTEM's nine PFTs. The latter mapping requires assumptions about which of the land cover types contribute to which of the CTEM PFTs and in what proportion, including the bare fraction. For instance, W2006 assign the GLC2000 "tree cover, needle-leaved, deciduous" land cover type to CTEM's needleleaf deciduous tree, grass and bare fraction in 5 proportions of 80, 10 and 10 %, respectively. The subjectiveness inherent in reconstructing observation-based data sets of PFT fractional coverage that can be directly compared to model output therefore implies that these data sets only provide information to constrain simulated results at large continental scales.
We also compare the latitudinal distribution of simulated gross primary productivity

Global values
The performance of the original LV-based competition parameterization (LVCOMP) and its modified version as implemented in CTEM (CTCOMP) is evaluated at the most basic level in Fig. 1a comparing the simulated coverage of trees and grasses, and the bare and vegetated 5 areas, at the global scale with observation-based estimates of these quantities from the modified W2006 and MODIS-derived datasets.
Overall, the modelled global coverage of trees and grasses, and the global bare and vegetated areas in the CTCOMP and LVCOMP simulations compare reasonably with their observation-based estimates. The global tree cover in the CTCOMP simulation is about 6% 10 lower than the W2006 dataset but 20% more than the MODIS-derived estimates, which are for the modern period. Simulated coverages by CTCOMP lie in between the observationbased estimates based on the modified W2006 and the MODIS-derived products, except for the area covered by grasses which is lower than both observation-based estimates (13% and 24%, respectively). LVCOMP simulated global area covered by trees is highest while 15 coverage of grasses and vegetated area is lowest compared to the CTCOMP simulation and observation-based estimates. Total vegetation cover in the CTCOMP (90.7 × 10 6 km 2 ) and LVCOMP simulations (87.6 × 10 6 km 2 ) is slightly too small compared to the modified W2006 data set (97.8 × 10 6 km 2 ) but close to the MODIS-derived estimate (89.9 × 10 6 km 2 ) for the modern period. Consequently, the simulated bare area (CTCOMP, 53.0 × 10 6 20 km 2 , and LVCOMP, 56.1 × 10 6 km 2 ) is overestimated compared to the modified W2006 estimate (45.8 × 10 6 km 2 ) and closer to the MODIS-derived estimate (56.8 × 10 6 km 2 ). dry-deciduous and cold-deciduous sub-types, and grass and crop PFTs into C 3 and C 4 sub-types, as mentioned earlier.
The global areas of individual PFTs in Fig. 1b from CTCOMP and LVCOMP simulations compare reasonably to the two observation-based estimates. The most dominant and least dominant non-crop PFTs are BDL-DCD trees and needleleaf deciduous (NDL-DCD) trees, 5 respectively, for both simulations and observation-based estimates. Crop areas are specified based on the modified W2006 data set in the LVCOMP and CTCOMP simulations and therefore same for all three cases. The total crop area based on the MODIS data set for the modern period is higher than the value from the modified W2006 data set which corresponds to 1860, consistent with the increase in crop area over the historical period. As 10 a result MODIS-derived global area covered by all trees (in Fig. 1a) and individual tree PFTs (in Fig. 1b) is lower than the value based on the modified W2006 product.
As with Fig. 1a, the LVCOMP simulation does somewhat poorly compared to the CT-COMP simulation. The simulated global area for broadleaf deciduous trees is the highest, and C 3 grasses is the lowest, in the LVCOMP simulation compared to CTCOMP simulation 15 and observation-based estimates.  Fig. 5 which plots root mean square difference (RMSD) against the spatial correlation when results from the LVCOMP and CTCOMP simulations are compared to the modified W2006 product.

Geographical distributions
The broad spatial distribution of tree cover is simulated reasonably well in the CTCOMP simulation, including the boreal and tropical forests, although the model tends to simulate less tree cover in arid regions, especially the southwest US. The spatial correlation between the modelled tree fraction in the CTCOMP simulation and the modified W2006 data set is 5 0.79 with a RMSD of 22.8 %. Three main factors make simulating tree cover challenging in our modelling framework. First, the sub-grid scale climatic niches are unresolved at the ∼ 3.75 • grid resolution used in this study. This is especially problematic for grid cells with large variations in climate. For example, this is the case in western Mexico where a strong elevation gradient creates climate niches for trees on the windward side of the Sierra Madre 10 Occidental, while the leeward side of the mountains is very arid. Second, CLASS and CTEM do not presently simulate shrubs as a separate PFT. Shrubs are extensive in both hot (e.g. Australia) and cold (e.g. the Arctic) semi-arid to arid regions. In these regions the model either grows only grasses, or small trees. Finally, the limited number of natural PFTs represented in CTEM (seven) limit our ability to capture the entire spatial range of PFTs, 15 for e.g. needleleaf evergreen trees occur naturally both in the Canadian Yukon territory and the southwest US. The trees found in these two regions have extensive physiological adaptations to their very different habitats. The single needleleaf evergreen PFT in the model, however, has only one set of parameters and as a result the model is unable to represent needleleaf evergreen trees realistically in both regions (as also seen in Fig. 3a). 20 The spatial distribution of trees in the LVCOMP simulation is similar to that in the CT-COMP simulation but generally shows less tree cover in arid regions and higher cover in predominately forested regions compared to both the W2006 dataset and the CTCOMP simulation. The LVCOMP simulation demonstrates similar spatial correlation (0.76) but poorer RMSD (27.5 %) against the modified W2006 data set than the CTCOMP simula-25 tion (Fig. 5). The higher simulated tree cover in the LVCOMP simulation, in areas where trees do well, is a characteristic of the LV formulation, which allows the dominant PFT to cover a large fraction of the grid cell allowing little coexistence (as discussed in detail by Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Arora and Boer, 2006a). The result of this inability to allow adequate PFT coexistence also leads to less gradation from areas with high tree cover to areas of low tree cover in the LVCOMP simulation compared to the CTCOMP simulation. This amplified expression of dominance, which supresses PFT coexistence, is also seen in the Community Land Model-Dynamic Global Vegetation Model (CLM-DGVM) which uses the competition module of the 5 LPJ dynamic vegetation model (Levis et al., 2004). In the CLM-DGVM simulation reported in Zeng et al. (2008) trees tend to occupy more than 90 % of a grid cell in locations where they are able to exist (their Fig. 7a). The Zeng et al. (2008) simulation is also unable to capture the continuous stretch of boreal forest from Canada's west to east coast. 10 The spatial coverage of grass in the CTCOMP simulation shows a fairly reasonable agreement compared to modified W2006 data set, especially on the African continent ( Fig. 2), although the spatial correlation of 0.38 with the modified W2006 data set is lower than that for trees (Fig. 5).

Grass cover
Grass cover in CTEM, regardless of the competition parameterization (CTCOMP or LV-15 COMP), is influenced by three main factors: (i) tree coverage, as trees are considered superior to grasses because of their ability to invade them, (ii) moisture availability, and (iii) the disturbance regime. Higher disturbance rates act to reduce tree cover and increase grass cover because grasses colonize faster than trees after a disturbance. Frequent disturbance thus reduces the superiority of trees over grasses. 20 Grass coverage is overestimated in the northern high latitude regions in the CTCOMP simulation (Fig. 2). This is likely due to CTEM not representing shrub PFTs (as mentioned above), allowing greater grass coverage in areas where shrubs should otherwise be extensive. Additionally, CTEM has no moss or lichen PFTs which are also wide-spread in those regions and generally known to inhibit grasses. Another constraint is that CTEM represents 25 only one C 3 grass PFT which does not allow us to accommodate physiological adaptations over the broad climatic envelope in which grasses exist. As grasses cannot outcompete trees, low grass cover is seen in the CTCOMP simulation in areas where CTEM overes-4870 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | timates tree cover, for example the southern regions of Brazil and Uruguay. In areas with very high tree cover (like the western Amazon) the CTCOMP simulation tends to slightly underestimate tree cover (by < 10 %) which allows some grass cover (< 10 %) while the modified W2006 dataset shows negligible grass cover.
The grassland extent in the parts of the US Plains and Canadian Prairies is also under-5 estimated in the CTCOMP simulation. The modified W2006 data set shows around 60 % grass cover in this region while CTEM estimates about 30 % grass cover. This underestimate could be due to the following three reasons. First, fire is simulated interactively in CTEM and biases in simulated area burned will influence the PFT dynamics as fire tends to reduce tree cover and increase grass cover. The fire parameterization includes both human 10 (as a function of population density) and natural influences but cultural differences in the human influence are not represented (see Appendix A9). While CTEM simulates more fire in the Plains/Prairie region (annual fraction burned is around 1-5 %) than indicated by the remotely-sensed GFED v. 3 dataset (Giglio et al., 2010(Giglio et al., ) (which covers 1996(Giglio et al., -2009, it is difficult to estimate a realistic amount of annual burned area during the pre-industrial period. 15 Second, the plains region was home to large herds of grazing herbivores prior to the 1880s, which also tend to damage tree seedlings giving beneficial conditions for grass expansion. Herbivory is not represented in CTEM. Finally, the simulated soil moisture in these regions could be overly moist allowing trees to cover a larger fraction than if the soil moisture was more limiting. 20 Grass cover in the LVCOMP simulation is relegated to areas where trees are not able to effectively colonize, due to conditions being overly arid, overly cold and/or with high disturbance regimes. In the LVCOMP simulation, even more extensive grass cover is simulated in the high latitudes than in the CTCOMP simulation, approaching 90 % in many regions. The LVCOMP simulation's low grass cover in many other regions reflects the high tree cover 25 estimated in the LVCOMP simulation due to its inability to allow appropriate PFT coexistence (as mentioned earlier) acting to exclude grasses. As a result the grass cover in the LVCOMP simulation is low and patchy and differs greatly from the modified W2006 dataset, as is also indicated by a low spatial correlation and higher RMSD in Fig. 5.
Bare ground in the modified W2006 data set occurs primarily in arid, mountainous and arctic regions (Fig. 2c), as is normally expected. Simulated bare fraction in the CTCOMP simulation compares reasonably well with the modified W2006 estimates (spatial correlation of 0.85), especially in Africa and Eurasia although the model simulates a higher bare fraction 5 in the southwest United States, Mexico and Australia. The model generally also simulates a greater bare fraction in other areas like the Andes mountains of South America and the Kalahari desert of Africa. The overestimation of bare fraction in the CTCOMP simulation in these regions reflects the processes mentioned earlier which affect the simulated distribution of trees and grass. As well, the coarse spatial resolution (∼ 3.75 • ) of our study prevent 10 us from capturing sub-grid scale niches. For example, in mountainous regions, large elevation gradients create climatic niches that allow vegetation to exist whereas using mean climate averaged over a large grid cell may prevent vegetation existence in the model. Other regions, such as the arid and semi-arid interior regions of Australia and Mexico have sub-grid scale features that allows the formation of landscapes with banded (Tongway and 15 Ludwig, 2001) or patchy (Tongway and Ludwig, 1994) vegetation. These sub-grid scale processes are not resolved in our modelling framework which assumes uniform soil and climate conditions within each grid cell. The LVCOMP simulation shows a similar pattern to that in the CTCOMP simulation, but with slightly more bare area especially in the southwestern US and Mexico and poorer RMSD and spatial correlation (Fig. 5).

Individual PFTs
The comparison of geographical distributions of simulated tree, grass cover, and bareground cover with observation-based estimates shows that, while some limitations remain, the competition parameterization of CTEM performs reasonably realistically. A more stringent test of the model performance is the comparison of geographical distributions of in- 25 dividual tree and grass PFTs against observation-based estimates, as shown in Fig. 3 for tree PFTs and Fig. 4 for grass PFTs. Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |

Needleleaf evergreen (NDL-EVG) trees
The modelled global total areal extent of the needleleaf evergreen tree PFT (NDL-EVG) in the CTCOMP simulation is around 5 % less than the modified W2006 estimate and about 21 % more than the modern-day MODIS-derived estimate (Fig. 1b). Spatially, the simulated fractional cover of NDL-EVG trees compares reasonably to W2006 in the boreal regions 5 ( Fig. 3 row a). The modelled fractional coverage of NDL-EVG trees is, however, overly extensive in central and western Europe as well as the southern tip of South America with too little NDL-EVG in Mexico, parts of the boreal forest of Canada, and in a band across Eurasia. A model limitation is that it is difficult to have NDL-EVG trees exist as far south as Mexico, as observations suggest, but not encroach into southern Africa (we incorrectly sim-10 ulate a small fraction of NDL-EVG for a few grid cells in that region). Much of this over-and under-estimation can be attributed to the use of a single PFT for NDL-EVG trees that are in reality physiologically distinct depending on their geographical location (as discussed in Sect. 4.1. The issue of physiological disctinction between needleleaf evegreen trees growing in differnt regions is also illustrated by Reich et al. (2014) who show that ned- 15 dle longevity reduces as mean annual temperature increases and that this is an adaptive, rather than genetic, response. In contrast, the NDL-EVG trees in the model have a constant needlelife span regardless of their geographic location. The limitations of the use of a single NDL-EVG tree PFT in CTEM has also been demonstrated by Peng et al. (2014) who implemented CTEM at a regional scale for the province of British Columbia, Canada. 20 Peng et al. (2014) found while the use of a single NDL-EVG tree PFT yields reasonable LAI and GPP in coastal British Columbia, which experiences mild temperate climate, the same PFT parameters result in lower than observed LAI and GPP in the continental interior of the province which experiences a colder and drier climate.
The simulated distribution of NDL-EVG trees in the LVCOMP simulation differs greatly 25 from the modified W2006 estimate with much lower coverage of NDL-EVG trees except in certain regions where the simulated coverage of NDL-EVG trees is high (> 60 %). The global simulated coverage of NDL-EVG trees in the LVCOMP simulation is about 25 % Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | less than the modified W2006 estimate although it compares well to the modern-day MODIS-derived estimate. Comparing the RMSD and spatial correlation between the modified W2006 data set and the model results shows a reduced RMSD and a higher correlation, both of which imply an improved performance, in the CTCOMP compared to the LVCOMP simulation (Fig. 5).

Needleleaf deciduous (NDL-DCD) trees
The observation-based distribution of needleleaf deciduous (NDL-DCD) trees is primarily limited to areas in Eastern Siberia with some additional areas of low coverage in North America (< 10 %) ( Fig. 3 row b). This pattern predominantly reflects the distribution of larches (Larix spp.), a common NDL-DCD tree species found in much of Canada, Scandi-10 navia, and Russia, with their primary habitat being the boreal forests of Russia. To prevent NDL-DCD trees from occupying expansive regions in both Eurasia and North America, bioclimatic limits are used to limit their range. As a result, NDL-DCD in both CTCOMP and LV-COMP simulations exist in a closely defined area with little gradation from the regions where NDL-DCD exist to regions where they do not, reflecting the effect of the common bioclimatic 15 limits. Within the climatically-favourable regions for NDL-DCD, the simulated fractional cover in the CTCOMP simulation is similar to the modified W2006 data set while the simulated fractional coverage in the LVCOMP simulation is too high. The total area covered by NDL-DCD trees in the CTCOMP simulation is ∼ 28 % lower than in the modified W2006 data set but ∼ 43 % higher than the modern-day MODIS-derived estimate (Fig. 1b). The total area 20 covered by NDL-DCD trees in the LVCOMP simulation is ∼ 16 % lower than the modified W2006 estimate but over twice the MODIS-derived estimate. There is a slight reduction in RMSD and small increase in spatial correlation, when compared to the modified W2006 values, in the CTCOMP compared the LVCOMP simulation (Fig. 5).
While CTEM parameterizes the broadleaved evergreen (BDL-EVG) PFT as tropical trees (through bioclimatic limits, see Section 2.1.4), the W2006 dataset also includes evergreen shrubs when mapping the GLC2000 land cover to the nine CTEM PFTs (see Table 2 of Wang et al., 2006). As a result, the modified W2006 data set shows small areas of BDL-5 EVG outside the tropical regions. Excluding regions outside of the tropics, the modelled distribution of BDL-EVG trees in the CTCOMP simulation compares reasonably well to the modified W2006 data set with expansive coverage along the equator. The model generally simulates higher than observed coverage of BDL-EVG around Indonesia and the western margin of the Amazon and parts of southeast Asia. In the LVCOMP simulation, BDL-EVG 10 trees have a smaller range and almost full coverage in regions where they are simulated to exist with little gradation from areas with high to areas with low coverage, which is characteristic of the unmodified L-V formulation. The global coverage of BDL-EVG trees in the modified W2006 data set is 19.4 million km 2 with 16.7 million km 2 in the tropics (between 30 • S and 30 • N). The simulated global coverage of the BDL-EVG trees in the CTCOMP 15 and LVCOMP simulations is somewhat larger at 18.7 and 18.2 million km 2 , respectively.
The improved spatial distribution of BDL-EVG trees in the CTCOMP compared to the LV-COMP simulation is also seen in Fig. 5.
The western margin of the Amazon River basin shows the limitations of the ∼ 3.75 • spatial resolution used in our study. The observations show a gradation from high coverage 20 in the interior of the basin to lower values near the Pacific coast, reflecting the rise of the Andes mountains and the change in climate across them (Fig. 3 right column of row c). In our modelling framework, however, climate averaged over the large ∼ 3.75 • grid cells yields amenable conditions for BDL-EVG trees to exist across the mountain range. The model also does not take into account the fraction of a grid cell that may be covered by rocks and 25 stony outcrops which would make it impossible for the natural PFTs to expand into those areas. Consideration of the "rocky" fraction of a grid cell will likely improve model simulated tree cover in mountainous regions. Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |

Cold (BDL-DCD-COLD) and drought (BDL-DCD-DRY) deciduous broadleaf trees
In CTEM, broadleaf deciduous (BDL-DCD) trees are divided into cold deciduous (BDL-DCD-COLD) and drought/dry deciduous (BDL-DCD-DRY) sub-types. The bioclimatic indices for BDL-DCD-COLD and BDL-DCD-DRY trees are assigned so that their geographical 5 distributions are broadly limited to regions where their deciduousness is primarily controlled by temperature and soil moisture, respectively. For the observation-based data set, this distinction is more arbitrary. W2006 used latitudinal limits of > 34 and < 24 • for characterizing deciduousness due to temperature and soil moisture, respectively, with a linear gradation in between, as mentioned earlier (also visible in Fig. 3). As a result, in the modified W2006 10 data set, the cold and drought deciduous sub-types of the BDL-DCD PFT can coexist between the latitudes of > 24 and < 34 • . A bioclimatic index that has been evaluated at a few sites to predict the trigger of deciduousness could present a better predictor and will be investigated in future work. The bioclimatic indices and the methodology used in CTEM preclude the two deciduous versions from coexisting. While it is fairly obvious that temper- 15 ature based bioclimatic indices will be needed for determining the geographical distribution of BDL-DCD-COLD trees, we found that temperature based indices alone are not sufficient for determining the geographical distribution of BDL-DCD-DRY trees, as would also be intuitively expected. In addition to minimum coldest month temperature, we therefore also use the annual aridity index and the length of the dry season for determining the geographical 20 distribution of BDL-DCD-DRY trees. The simulated geographical distribution of BDL-DCD-COLD in the CTCOMP simulation is broadly similar to the modified W2006 data set, although some differences remain. Compared to the modified W2006 data set, the model simulates larger BDL-DCD-COLD extent in western Canada (and lower NDL-EVG as a result) and smaller BDL-DCD-COLD cover- 25 age in southeast Europe (with comparably more NDL-EVG). These areas demonstrate that the competition between the NDL-EVG and BDL-DCD-COLD PFTs is not perfectly modelled in these regions. The arbitrary latitudinal limits used by W2006 imply that BDL-DCD-COLD Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | trees exist as far north as Australia's Gold Coast in the eastern state of Queensland which is certainly not realistic given its tropical climate. The model does not simulate any broad scale existence of BDL-DCD-COLD trees between 24 and 34 • S, although it reasonably estimates their coverage in New Zealand. The simulated expanse of BDL-DCD-COLD trees is much smaller in the LVCOMP simulation and most regions, where it is present, generally 5 show high coverage (60-90 %). The global total areal coverage of BDL-DCD-COLD in the CTCOMP simulation is very similar to the modified W2006 data set while it is ∼ 22 % higher in the LVCOMP simulation, despite its restricted spatial extent.
The simulated geographical distribution of BDL-DCD-DRY trees in the CTCOMP simulation is also broadly similar to the modified W2006 data set with some differences over 10 India, and in southern Africa near Botswana and Zambia. CTEM coverage in these regions is lower by approximately 20 % than the modified W2006 estimate. As with other tree PFTs, the simulated areal extent of BDL-DCD-DRY trees in the LVCOMP simulation is more limited compared to the CTCOMP simulation and the modified W2006 data set, but of higher percent cover in the grid cells where BDL-DCD-DRY trees do exist. The global 15 total areal coverage of BDL-DCD-DRY trees in the CTCOMP and LVCOMP simulations are ∼ 12 % lower and ∼ 5 % higher than the modified W2006 estimate, respectively. In Fig. 5, the RMSD between the simulated and modified W2006 values shows a large decrease in the CTCOMP compared to the LVCOMP simulation.
The total BDL-DCD coverage, consisting of both the cold and dry deciduous sub-types, in 20 the CTCOMP simulation is similar to the estimates based on MODIS and modified W2006 products but much higher in the LVCOMP simulation (Fig. 1b).

C 3 and C 4 grasses
Both C 3 and C 4 sub-types of grasses are represented in CTEM. The geographical distributions of C 3 and C 4 grasses are determined solely on competitive advantages provided 25 by their different photosynthetic pathways and parameter values that determine their differing responses to environmental conditions; bioclimatic indices are not used to limit the range of C 3 and C 4 grasses (see Table 2). The fractional coverage of grasses is, of course, 4877 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | influenced to a large extent by the fractional coverage of trees which are considered hierarchically superior to grasses. The C 3 and C 4 grasses therefore compete for the remaining fraction of a grid cell that the trees are unable to occupy. Figure 4 compares the simulated geographical distribution of C 3 and C 4 grasslands with the observation-based modified W2006 data set, but with the caveat that the W2006 data 5 set itself is based on the part-modelled and part-observation-based product of Still et al. (2003) which estimates the C 4 fraction of total vegetation. The model is successfully able to simulate the coexistence of C 3 and C 4 grasses, especially in the CTCOMP simulation. The simulated global areal coverage of C 3 grass in the CTCOMP simulation is ∼ 19 % less than the modified W2006 data set and this is especially noticeable in areas like the US Plains 10 and the Brazilian highlands (Fig. 4). Low grassland estimation reflects either an overly high tree cover (as is the case in the Brazilian highlands, see Fig. 2a) or too extensive areas of bare ground (as is the case in the US Plains, see Fig. 2c).
Reasons for overly extensive bare ground include either the inadequacies of using a single grass PFT globally for each photosynthetic pathway (C 3 and C 4 ) or simulated soil mois- 15 ture that is too dry in those regions to allow grasses to effectively colonize. Appropriate grass cover is, however, simulated for other regions including southern Australia and the Tibetan plateau in the CTCOMP simulation. The model overestimates grass cover in the high latitudes regions. This overestimation possibly reflects: (i) the lack of sensitivity to high latitude climate of the single C 3 grass PFT used in the model, or (ii) the lack of shrubs, 20 mosses, and lichens in CTEM which allows the expansion of grasses into areas that are otherwise shrublands or moss and lichen covered, as mentioned earlier. An additional possibility is that CTEM does not presently take in sub-grid scale soil depth, which may be important in parts of the Arctic and would limit plant cover. The simulated grasslands in the LVCOMP simulation are more isolated with even higher coverage in parts of the Arctic 25 region. In addition, the original L-V formulation does not capture the African savannas and the grasslands of South America.
The simulated distribution of C 4 grass in the CTCOMP simulation compares favourably to the modified W2006 data set with a similar geographical distribution and fractional cover.
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Small differences include an underestimation in the Brazilian highlands (due to an overestimation of the simulated tree cover) and an overestimation over India (due here to an underestimation of simulated tree cover). The total global area covered by C 4 grass also compares well to the modified W2006 estimate (Fig. 1), while the simulated global coverage in the LVCOMP simulation is low and patchy.

Primary terrestrial carbon pools and fluxes
The differences between the simulation with prescribed fractional coverages of PFTs (PRES) and the simulations in which fractional coverages of PFTs are dynamically simulated (CTCOMP and LVCOMP) are also the result of somewhat different parameter values between the two. The simulations with dynamically determined fractional coverages of PFTs 10 provide an additional constraint which the model must meet, i.e. the observation-based estimates of the fractional coverages of PFTs. We found that the default parameter values used for simulations with prescribed fractional coverages of PFTs did not yield optimum comparison with observation-based estimates of fractional coverages of PFTs. Changes in parameter values were required because the simulations with dynamically determined frac- 15 tional coverages of PFTs include additional processes such as mortality generating bare ground (see Sect. 2.1.3) and spatial expansion consuming a fraction of NPP that then reduces vegetation productivity. Bare ground fraction does not need to be generated when fractional coverages of PFTs are prescribed. In the PRES simulation mortality of grasses is implicitly represented through the normal turnover of the leaves and root components 20 ( Table 1). The additional constraint provided by observation-based fractional coverages of PFTs provides the opportunity to improve model parameters. Parameter changes when spatial competition between PFTs is explicitly modelled include a slight reduction in base respiration rates for NDL-EVG and BDL-EVG with an increase for BDL-DCD-DRY (Table  A2). These values were adjusted to optimize the CT-COMP results against observation- 25 based datasets. The maximum cold and drought stress loss rates, BDL-EVG leaf life span, as well as the temperature threshold for determing the cold stress scalar were optimized in a similar manner for several PFTs (Table A5 and A6). The PFT-specific base allocation Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | fractions for leaf, stem, and roots (Table A4) were revised to be in line with values derived from Luyssaert et al. (2007) and Litton et al. (2007) and then further optimized against the observation-based datasets. These changes in parameter values are listed in Appendix tables alongside the values used in PRES simulations (specifically Tables 1, A2, A4, A5, and  A6). While, in the CTCOMP simulation, these changes in parameter values allow simulation 5 of reasonable geographical distribution of modelled PFTs and generally improve aspects of the model, they also adversely affect some aspects as discussed later in Sect. 4.4.2. Table 3 compares the global values of primary terrestrial carbon pools and fluxes from the PRES, CTCOMP and LVCOMP simulations with each other, and other model-and 10 observation-based estimates. Figure 6 shows the simulated zonally-averaged values for vegetation and soil carbon, and GPP from the three simulations together with available observation-based estimates. Generally, the simulated global values for primary terrestrial carbon pools and fluxes in the CTCOMP and LVCOMP simulations are similar to those in the PRES simulation, and values from all simulations compare reasonably to previously 15 reported model and observation-based estimates (Table 3 and Fig. 6).

Global and zonally-averaged values
GPP is slightly higher in the CTCOMP and LVCOMP simulations compared to the PRES simulation. The primary reason for this increase is the changes in PFT specific parameters, rather than the spatial distribution of PFTs itself. Performing the PRES simulation, with the same parameter set as in the CTCOMP and LVCOMP simulations, yields a global GPP 20 value of 140.8 Pg C yr −1 . The zonally-averaged GPP in the CTCOMP and LVCOMP simulations compares well to that simulated in the PRES simulation and the observation-based estimate of Beer et al. (2010) (Fig. 6) with an overestimate below 40 • S, where there is relatively little land.
Similar to GPP, NPP in the CTCOMP and LVCOMP simulations is 9.1 and 6.2 % higher, 25 respectively, compared to the PRES simulation (Table 3). Evapotranspiration is also slightly higher in the CTCOMP and LVCOMP simulations (3.2 and 4.8 %, respectively) than in the Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | PRES simulation, although values from all three simulations compare well with model and observation-based estimates summarized in Trenberth et al. (2011).
The simulated vegetation biomass and soil carbon mass are lowest in the CTCOMP simulation amongst the three simulations. The LVCOMP simulation conversely yields the highest vegetation biomass and soil carbon mass. The PRES simulation has intermedi- these regions compared to the PRES simulation and observation-based estimate of Ruesch and Holly (2008). Similarly the high northern latitude tree cover is somewhat less than that in the PRES simulation and this brings the amount of estimated vegetation biomass lower in the CTCOMP and LVCOMP simulations. We believe, however, that Ruesch and Holly (2008) estimates of vegetation biomass are too low at high latitudes as discussed by Peng Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Annual fire emissions are highest in the CTCOMP simulation even though it has the lowest annual burned area (24.1 % less than in the PRES simulation) while the LVCOMP has the largest burned area (7.8 % more than in the PRES simulation) but the same amount of emissions as the PRES simulation (Table 3). The lowest annual burned area in the CTCOMP simulation, however, compares best with contemporary observation-based es-5 timates (Table 3). Fire in CTEM influences the vegetation biomass differently in the simulations with prescribed (PRES) and dynamically-modelled (CTCOMP and LVCOMP) fractional coverages of PFTs. In the PRES simulation, biomass burning associated with fire reduces vegetation biomass density since the fractional coverage is not allowed to change, essentially causing thinning of the vegetation biomass. In the CTCOMP and LVCOMP sim-10 ulations, fire additionally creates bare ground which reduces PFT fractional cover and is subsequently available for colonization.

Geographical distributions
The geographical distribution of GPP, vegetation biomass, soil carbon mass and burned area from the PRES, CTCOMP and LVCOMP simulations are compared in Fig. 7 and Fig. 8   15 with available observation-based estimates.
The geographical distribution of GPP (Fig. 7a) shows somewhat higher productivity in the tropics for the CTCOMP and LVCOMP simulations compared to PRES simulation, as also seen in the zonally-averaged values. Simulated GPP is also marginally higher in the CTCOMP and LVCOMP simulations in the Arctic region than in the PRES simulation due to 20 the overestimated grass cover. Modelled GPP in all three simulations (LVCOMP, CTCOMP, PRES) is generally lower than observation-based estimates in arid regions including the Atacama desert, the Australian outback, and the deserts of the Central Asian region.
The geographical distribution of vegetation biomass between the three simulations ( Fig. 7b) differs the most in the mixed forests of the northeast US and eastern Canada with 25 a lower biomass in the CTCOMP and LVCOMP simulations compared to the PRES simulation. Parts of Scandinavia and north-western Russia show higher vegetation biomass in the PRES simulation compared to the CTCOMP and LVCOMP simulations. The revised model 4882 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | parameters (see tables in the Appendix) yield lower and more realistic vegetation biomass in some high-latitude regions in the CTCOMP simulation, while in the PRES simulation the vegetation biomass in these high-latitude regions is unrealistically as high as in the tropics. However, the model performance in the CTCOMP and LVCOMP simulations is adversely affected in the tropics where the simulated vegetation biomass is high compared to the PRES 5 simulation and observation-based estimates (Saatchi et al., 2011). As with simulated GPP, modelled vegetation biomass is lower than observation-based estimates in arid regions.
As with the zonally-averaged values, the geographical distribution of soil carbon (Fig. 8a) is broadly similar in the PRES, CTCOMP and LVCOMP simulations. Differences exist in areas where the grassland extent differs between the simulations (e.g. the US 10 Plains/Canadian Prairies and the high Arctic) because in CTEM grasses are characterized by low soil decomposition rates which result in high below-ground carbon amounts. Compared to the geographical distribution of observation-based estimates of soil carbon, modelled values are generally higher over the forested areas including the Amazonian region, Central Africa, and south eastern China. 15 The geographical distribution of annual burned area is broadly similar in the PRES, CT-COMP and LVCOMP simulations. All three simulations show higher area burned in savanna regions of the tropics (Fig. 8b) and lower values in extra tropical regions. CTEM, however, generally underestimates area burned in the savanna regions compared to observationbased estimates (e.g. GFED v. 3 Giglio et al., 2010). The simulations differ as area burned 20 is especially sensitive to grassland extent. Larger fractional coverage of grasses in a grid cell implies a higher fire spread rate (see Appendix A9) and lower grid-averaged root-zone soil moisture content (since grasses are shallow rooted compared to trees), both of which increase area burned.
Overall the impact of dynamically modelling fractional coverages of PFTs in the CT- 25 COMP simulation yields largest differences for simulated vegetation biomass and annual area burned by fire, while other simulated primary carbon pools and fluxes remain similar to those in the PRES simulation and to observation-based estimates.
Modelling vegetation spatial dynamics explicitly in DGVMs overcomes the limitations inherent in prescribing a vegetation cover that is unable to respond to changes in climate, atmospheric CO 2 concentration and other forcings. Current approaches used in global-scale DGVMs include: (i) the use of unmodified L-V equations (e.g. as in TRIFFID Cox, 2001 5 and USCM Brentnall et al., 2005), (ii) simple approaches that assume the best performing PFT can occupy more space (typically ranked by their NPP, e.g. LPJ Sitch et al., 2003), and (iii) approaches that account for competitiveness via both NPP and mortality (e.g. JSBACH Brovkin et al., 2009) The strength of bioclimatic limits imposed within models varies greatly. At one end of the spectrum there are biogeographic models where present-day geographical distributions of PFTs are used to derive climate envelopes within which PFTs can exist. While this 20 approach gives information about a PFT's areal extent under present-day conditions, the information is essentially binary, i.e. within a grid cell the PFTs exist or they do not. This biogeography approach does not provide any information about the fraction of a grid cell that the PFTs occupy nor how the PFTs' competitive success might evolve under changing environmental conditions. At the other end of the spectrum are parameterizations that re-25 quire no bioclimatic information to limit species extent, i.e. each PFT's geographic extent is solely derived from its physiological responses and competitive interactions with other PFTs arising from modelled processes. We are not aware of any competition parameterizations Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | used within a global-scale DGVM that does not require the use of bioclimatic constraints. There are, however, recent attempts at developing models without bioclimatic limitations at the regional-scale. Fisher et al. (2015) recently implemented the Ecosystem Demography (ED) concept into the CLM-DGVM to model competition between needleleaf evergreen and broadleaf cold deciduous trees based on their leaf traits in the Eastern United States. Their 5 attempt at prognostically determining biome boundaries for these two PFTs has some success, but only for particular model traits and structural assumptions. Attempts like these are promising but much work needs to be done to bring this concept to the global scale, across the full suite of PFTs, and with reasonable computational cost. At present, most models fall somewhere in between these two endpoints with the aim of removing imposed bioclimatic 10 constraints and allowing model physiologic processes to determine PFT geographical extents. The inability to globally model realistic geographic distributions of PFTs without the use of bioclimatic constraints is not surprising given our limited understanding of plant physiological processes that control PFT distributions and their competitive advantages.
We believe the use of bioclimatic constraints is reasonably moderate in CTEM. With 15 the exception of the needleleaf deciduous PFT, the model uses fairly relaxed bioclimatic constraints for its tree PFTs. We do not use any bioclimatic limits for the C 3 and C 4 grass PFTs. The grasses' simulated distribution, including the competition between them and their coexistence in the tropics, is the result solely of the physiological processes that are explicitly modelled. The result is that the number of natural, non-crop PFTs that can exist 20 and attempt to colonize in a grid cell varies from two (in extreme climates like the high Arctic or alpine where trees cannot survive) to a maximum of five, out of a possible total of seven (Fig. 9). For these two to five PFTs that can exist in every grid cell, the model uses a modified version of the L-V equations, as introduced in Arora and Boer (2006b), to explicitly model competition between them. 25 Equilibrium off-line simulations corresponding to the year 1861 using the original and modified L-V equations show that the model is able to capture the broad geographical distributions of tree and grass cover, and the bare fraction, when compared to the observationbased modified W2006 data set, especially when using the modified L-V equations. The Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | global areas covered by tree and grass PFTs, the bare fraction, and the individual PFTs also compare reasonably to the observation-based estimates of the modified W2006 data set and those derived from a MODIS product (Friedl et al., 2013). Indeed, Fig. 5 demonstrates that the CTCOMP simulation, that uses modified version of the L-V equations, is better able to reproduce the observation-based PFT distributions of the modified W2006 5 data set for all PFTs, than the LVCOMP simulation, which uses the original L-V equations. The LVCOMP simulation is shown to suffer from an amplified expression of dominance which results in high tree cover in regions where trees can exist and a sparse grass cover except in the Arctic region where grasses do not face competition from trees. This suggests that other models using the unmodified L-V equations would likely also suffer from this 10 deficiency.
Some limitations are evident in our simulations. Overall, the simulated tree and grass cover is low in hot arid regions and the simulated grass cover is overly high in cold arid regions. As a result the model generates more bare fraction in Australia, the Andes mountains of South America and the Kalahari desert of Africa, and too little bare fraction in the high 15 Siberian Arctic. The simulated geographical distribution of individual PFTs appears reasonable although limitations remain here too. The model simulates a small fractional coverage of NDL-EVG trees in warm regions of southern Africa, Australia and South America that is not consistent with the observation-based estimate based on the modified W2006 data set. The gradation in the fractional coverage of PFTs from regions of high to low coverage is 20 simulated much better when the modified version of the L-V equations is used but the simulated fractional coverages are still not as graded as in the modified W2006 data set. These limitations are due primarily to: (i) the coarse resolution used in our study which does not allow resolution of climatic niches, (ii) the absence of shrub and moss/lichen PFTs and (iii) the small number of natural PFTs (seven) that are represented in CTEM. These constraints Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Modelling competition between PFTs in CTEM required adjusting some of the model parameters in order to realistically simulate the fractional coverages of its seven natural, non-crop PFTs. These parameter changes removed the positive bias in simulated vegetation biomass in certain high-latitude regions but yield too high vegetation biomass in the tropics. Overall, simulated global values of GPP, NPP, vegetation biomass, soil carbon and 5 area burned in the CTCOMP simulation compare reasonably with observation-based and other model estimates (Table 3). The zonal distributions of GPP and soil carbon also show reasonable comparison with observation-based estimates although larger differences are seen for vegetation biomass with higher simulated values in the tropics, as mentioned above (Fig. 6). 10 Despite its limitations, the behaviour of the competition module that uses the modified L-V equations, in the CLASS-CTEM modelling framework, is sufficiently realistic to yield a tool with which to study the impact of changes in climate and atmospheric CO 2 concentration on the global distribution of vegetation.

A1 The model structure
The basic model structure of CTEM includes three live vegetation components (leaf (L), stem (S) and root (R)) and two dead carbon pools (litter or detritus (D) and soil carbon (H)).
The amount of carbon in these pools (C L , C S , C R , C D , C H , kg C m −2 ) is tracked prognostically through the fluxes in and out of them. The rate change equations for carbon in these 20 pools are summarized in Sect. A7 after the processes leading to the calculation of fluxes in and out of these pools are introduced in the following sections.
4887 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | A2 Photosynthesis and canopy conductance

A2.1 Net photosynthesis
All biogeochemical processes in CTEM are simulated at a daily time step except gross photosynthetic uptake and associated calculation of canopy conductance which are simulated on a half hour time step with CLASS. The photosynthesis module of CTEM calculates the 5 net canopy photosynthesis rate which, together with atmospheric CO 2 concentration and vapour pressure or relative humidity, is used to calculate canopy conductance. This canopy conductance is then used by CLASS in its energy and water balance calculations.
The photosynthesis parameterization is based upon the approach of Farquhar et al. (1980) and Collatz et al. (1991Collatz et al. ( , 1992 as implemented in SiB2 (Sellers et al., 1996) and 10 MOSES (Cox et al., 1999) with some minor modifications as described in Arora (2003). Arora (2003) outlines four possible configurations for the model based on choice of a "bigleaf" or "two-leaf" (sunlight and shaded leaves) mode and stomatal conductance formulations based on either Ball et al. (1987) or Leuning (1995). The Ball et al. (1987) formulation uses relative humidity while Leuning (1995) uses vapour pressure deficit in calculation of 15 canopy conductance. While the model remains capable of all four possible configurations, in practice, the model is usually run using the "big-leaf" parameterization with the stomatal conductance formulation of Leuning (1995) which is the configuration described here. The original description of the CTEM photosynthesis parameterization in Arora (2003) did not include discussion of all the PFTs simulated by CTEM which we expand upon here and also 20 include changes to the parameterization since version 1.0.
The gross leaf photosynthesis rate, G o , depends upon the maximum assimilation rate allowed by the light (J e ), Rubisco (J c ), and transport capacity (J s ). The limitation placed on G o by the amount of available light is calculated as (mol CO 2 m −2 s −1 ) 25 4888 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | where I is the incident photosynthetically active radiation (PAR; mol photons m −2 s −1 ), ν is the leaf scattering coefficient, with values of 0.15 and 0.17 for C 3 and C 4 plants, respectively, and ε is the quantum efficiency (mol CO 2 (mol photons) −1 ; values of 0.08 and 0.04 are used for C 3 and C 4 plants, respectively). c i is the partial pressure of CO 2 in the leaf interior (Pa) and Γ is the CO 2 compensation point (Pa) (described below). 5 The Rubisco enzyme limited photosynthesis rate, J c , is given by where V m is the maximum catalytic capacity of Rubisco (mol CO 2 m −2 s −1 ), adjusted for temperature and soil moisture, as described below. K o and K c are the Michaelis-Menten 10 constants for O 2 and CO 2 , respectively. O a is the partial pressure (Pa) of oxygen.
The transport capacity (J s ) limitation determines the maximum capacity to transport the products of photosynthesis for C 3 plants, while for C 4 plants it represents CO 2 -limitation 15 where p is surface atmospheric pressure (Pa).
V m is calculated as where T c is the canopy temperature ( • C) and T low and T high are PFT-dependent lower and 20 upper temperature limits for photosynthesis (Table A1) ) and V max is the PFT-dependent maximum rate of carboxylation by the enzyme Rubisco (mol CO 2 m −2 s −1 ; Table A1). The constant 10 −6 converts V max from units of µmol CO 2 m −2 s −1 to mol CO 2 m −2 s −1 .
4889 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | The influence of soil moisture stress is simulated via S root (θ) which represents a soil moisture stress term formulated as where S root (θ) is calculated by weighting S(θ i ) with the fraction of roots, r i , in each soil layer i and is a PFT-specific sensitivity to soil moisture stress (unitless; Table A1). φ i is the degree of soil saturation (soil wetness) given by 10 where θ i is the volumetric soil moisture (m 3 water (m 3 soil) −1 ) of the ith soil layer and θ i,field and θ i,wilt the soil moistures at field capacity and wilting point, respectively.
The CO 2 compensation point (Γ) is the CO 2 partial pressure where photosynthetic uptake 15 equals the leaf respiratory losses (used in Eqs. A1 and A2). Γ is zero for C 4 plants but is sensitive to oxygen partial pressure for C 3 plants as where σ is the selectivity of Rubisco for CO 2 over O 2 (unitless), estimated by σ =  25 4890 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Given the light (J e ), Rubsico (J c ) and transportation capacity (J s ) limiting rates, the leaf-level gross photosynthesis rate, G o (mol CO 2 m −2 s −1 ), is then determined following a minimization based upon smallest roots of the following two quadratic equations: where β 1 is 0.95 and β 2 is 0.99. When soil moisture stress is occurring, both the J s and J c terms are reduced since the V m term (Eq. A4) includes the effect of soil moisture stress through the S(θ) term and this reduces the leaf-level gross photosynthesis rate.
The current version of CTEM does not include nutrient constraints on photosynthesis and, 10 as a result, increasing atmospheric CO 2 concentration leads to unconstrained increase in photosynthesis. In natural ecosystems however, down-regulation of photosynthesis occurs due to constraints imposed by availability of nitrogen, as well as phosphorus. To capture this effect, CTEM uses a nutrient limitation term, based on experimental plant growth studies, to down-regulate the photosynthetic response to elevated CO 2 concentrations (Arora et al., 15 2009). The parameterization, and its rationale, are fully described in Arora et al. (2009) but the basic relations are summarized here. The leaf-level gross photosynthetic rate is scaled by the downregulation term, Ξ N , to yield the nutrient limited leaf level gross photosynthetic rate as where c a is the atmospheric CO 2 concentration in ppm, c 0 is the pre-industrial CO 2 concentration (285.0 ppm), γ g is 0.95 (unitless; see Arora et al., 2009). A value of γ gd lower than γ g ensures that Ξ N gradually decreases from its pre-industrial value of one as c a in- 25 4891 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | creases to constrain the rate of increase of photosynthesis with rising atmospheric CO 2 concentrations. CTEM v. 2.0 uses a γ gd value of 0. 30 (unitless). Finally, the leaf-level gross photosynthesis rate, G o,N-limited is scaled up to the canopylevel, G canopy , by considering the exponential vertical profile of radiation along the depth of the canopy as which yields the gross primary productivity (G canopy , GPP). k n is the extinction coefficient that describes the nitrogen and time-mean photosynthetically absorbed radiation (PAR) pro- 10 file along the depth of the canopy (Table A1) (Ingestad and Lund, 1986;Field and Mooney, 1986), and LAI (m 2 leaf (m 2 ground) −1 ) is the leaf area index. The net canopy photosynthetic rate, G canopy,net (mol CO 2 m −2 s −1 ), is calculated by subtracting canopy leaf maintenance respiration costs (R mL ; see Sect. A3.1) as

A2.2 Coupling of photosynthesis and canopy conductance
When using the Leuning (1995) approach for photosynthesis-canopy conductance coupling, canopy conductance (g c ; mol m −2 s −1 ) is expressed as a function of the net canopy photosynthesis rate, G canopy, net , as 20 g c = m G canopy,net p (c s − Γ) where p is the surface atmospheric pressure (Pa), the parameter m is set to 9.0 for needleleaved trees, 12.0 for other C 3 plants and 6.0 for C 4 plants, parameter b is assigned the values of 0.01 and 0.04 for C 3 and C 4 plants, respectively. V is the vapour pressure deficit 25 4892 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | (Pa) and the parameter V o is set to 2000 Pa for trees and 1500 Pa for crops and grasses. The partial pressure of CO 2 at the leaf surface, c s , is found via Here, c ap is the atmospheric CO 2 partial pressure (Pa) and g b is the aerodynamic conduc-5 tance estimated by CLASS (mol m −2 s −1 ). The intra-cellular CO 2 concentration required in Eqs. (A1)-(A3) is calculated as Since calculations of G canopy,net and c i depend on each other, the photosynthesis-canopy 10 conductance equations need to be solved iteratively. The initial value of c i used in calculation of G canopy,net is the value from the previous time step or, in its absence, c i is assumed to be 0.7c ap .
Canopy (g c ) and aerodynamic (g b ) conductance used in above calculations are expressed in units of mol CO 2 m −2 s −1 but can be converted to the traditional units of m s −1 as follows 15 g c (m s −1 ) = 0.0224 where p 0 is the standard atmospheric pressure (101 325 Pa) and T f is freezing temperature (273.16 K).

20
Version 2.0 of CTEM calculates autotrophic respiratory fluxes (R a ) from the living vegetation components and heterotrophic respiratory fluxes (R h ) from the dead litter and soil carbon pools based on the approach described in Arora (2003) but with some modifications.

4893
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | A3.1 Maintenance and growth respiration Autotrophic respiration (mol CO 2 m −2 s −1 ) is composed of maintenance, R m , and growth respirations, R g .
Maintenance respiration accounts for carbon consumed by processes that keep existing plant tissues alive and is a function of environmental stresses. Maintenance respiration is calculated on a half-hourly time step (with photosynthesis) for the leaves, R mL , and at a daily timestep for the stem, R mS , and root, R mR , components. 10 Maintenance respiration is generally strongly correlated with nitrogen content (Reich et al., 1998;Ryan, 1991). The current version of CTEM does not explicitly track nitrogen in its vegetation components. Therefore, we adopt the approach of Collatz et al. (1991Collatz et al. ( , 1992 in which the close relation between maximum catalytic capacity of Rubisco, V m , and leaf 15 nitrogen content is used as a proxy to estimate leaf maintenance respiration. where ς L is set to 0.015 and 0.025 for C 3 and C 4 plants, respectively, f PAR scales respiration from the leaf to the canopy level, similar to Eq. (A15), and the f 25 (Q 10d,n ) function accounts 20 for different temperature sensitivities of leaf respiration during day (d) and night (n). Pons and Welschen (2003) and Xu and Baldocchi (2003) suggest lower temperature sensitivity for leaf respiration during the day compared to night, and therefore we use values of Q 10d = 1.3 and Q 10n = 2.0 for day and night, respectively.
Maintenance respiration from the stem and root components is estimated based on PFT-25 specific base respiration rates (ς S and ς R specified at 15 • C, kg C (kg C) −1 yr −1 ; Table A2) 4894 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | that are modified to account for temperature response following a Q 10 function. Maintenance respiration from stem and root components, R m{S,R} , is calculated as where l v,i is the live fraction of stem or root component, i.e. the sapwood, and C i is the stem 5 or root carbon mass (kg C m −2 ). The constant 2.64×10 −6 converts units from kg C m −2 yr −1 to mol CO 2 m −2 s −1 . The live sapwood fraction, l v,i , for stem or root component is calculated following the CENTURY model (Parton et al., 1996) as 10 The Q 10 value used in Eq. (A25) is not assumed to be constant but modelled as a function of temperature following Tjoelker et al. (2001) as where T {S,R} is stem or root temperature ( • C). Stem temperature is assumed to be the same 15 as air temperature while root temperature is based on the soil temperature weighted by the fraction of roots present in each soil layer (Arora and Boer, 2003). The calculated Q 10 value is additionally constrained to be between 1.5 and 4.0.
Growth respiration, R g (mol CO 2 m −2 s −1 ), is estimated as a fraction ( g = 0.15) of the positive gross canopy photosynthetic rate after maintenance respiration has been ac- 20 counted for.
Finally, net primary productivity (NPP) is calculated as 25 4895 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | A3.2 Heterotrophic respiration Heterotrophic respiration, R h (mol CO 2 m −2 s −1 ), in CTEM is based on respiration from the litter (which includes contributions from the stem, leaf and root components), R h,D , and soil carbon, R h,H , pools.
Heterotrophic respiration is regulated by soil temperature and moisture and is calculated on a daily time step. The original heterotrophic respiration scheme is described in Arora (2003) while the modified parameterization used in CTEM v. 2.0 is detailed in Melton et al. (2015) and is briefly described here. Respiration from the litter and soil carbon pools takes 10 the following basic form.
The soil carbon and litter respiration depends on the amount of carbon in these components (C H and C D ; kg C m −2 ) and on a PFT-dependent respiration rate specified at 15 • C 15 (ς H and ς D ; kg C (kg C) −1 yr −1 ; Table A3). The constant 2.64 × 10 −6 converts units from kg C m −2 yr −1 to mol CO 2 m −2 s −1 .
The effect of soil moisture is accounted for via dependence on soil matric potential (f (Ψ)), described later. The temperature dependency of microbial soil respiration rates has been estimated by several different formulations, ranging from simple Q 10 (exponential) to 20 Arrhenius-type formulations (see review by Lloyd and Taylor, 1994). In CTEM, soil temperature influences heterotrophic respiration through a temperature-dependent Q 10 function (f 15 (Q 10 )). The value of Q 10 itself is assumed to be a function of temperature following a hyperbolic tan function.  25 where T {D,H} is the temperature of either the litter or soil carbon pool ( • C), respectively. The parameterization is a compromise between the temperature-independent Q 10 commonly found in many terrestrial ecosystem models (e.g. Cox, 2001) and the temperature dependent Q 10 of Kirschbaum (1995). While a constant Q 10 yields an indefinitely increasing respiration rate with increasing temperature, the formulation of Kirschbaum (1995) gives a continuously increasing Q 10 under decreasing temperature, which leads to unreasonably high soil and litter carbon pools at high latitudes in CTEM. The CTEM parameterization avoids these issues with a Q 10 value of about 2.0 for temperatures less than 20 • C, while 5 a decreasing value of Q 10 at temperatures above 20 • C ensures that the respiration rate does not increase indefinitely.
The temperature of the litter pool is a weighted average of the temperature of the top soil layer (T 1 ) and the root temperature (T R ) as litter consists of leaf, stem, and root litter (T D = 0.7T 1 +0.3T R ). The temperature of the soil carbon pool is calculated as the mean soil 10 temperature in the rooting zone based upon the fraction of roots in each soil layer and their temperature. The carbon in each soil layer is not explicitly tracked but assumed to adopt an exponential distribution (similar to roots; e.g. Fig. 4 of Jobbágy and Jackson, 2000).
The response of heterotrophic respiration to soil moisture is formulated through soil matric potential (Ψ; MPa). While soil matric potential values are usually negative, the formula- 15 tion uses absolute values to allow math with logarithms to be performed. Absolute values of soil matric potential are high when soil is dry and low when it is wet. The primary premise of soil moisture control on heterotrophic respiration is that heterotrophic respiration is constrained both when the soils are dry (due to reduced microbial activity) and when they are wet (due to impeded oxygen supply to microbes) with optimum conditions in between. The 20 exception is the respiration from the litter component which is assumed to be continually exposed to air, and thus never oxygen deprived, even when soil moisture content is high (0.04 > |Ψ| ≥ |Ψ sat |, where Ψ sat is the soil matric potential at saturation). The soil moisture dependence thus varies between 0 and 1 with matric potential as follows. Heterotrophic respiration for bare ground is treated separately in CTEM. The carbon con- 10 tributions to the bare ground litter and soil carbon pools come via processes such as creation of bare ground due to fire, competition between PFTs and land use change. The heterotrophic respiration is sensitive to temperature and moisture in the same manner as vegetated areas using Eqs. (A31)-(A36). The base respiration rates of ς D,bare and ς H,bare are set to 0.5605 and 0.02258 kg C (kg C) −1 yr −1 , respectively. 15 The amount of humidified litter which is transferred from the litter to the soil carbon pool (C D→H ) is modelled as a fraction of litter respiration (R h,D ) as where χ (Table A3) is the PFT-dependent humification factor and varies between 0.4 and 20 0.5. For crops, χ is set to 0.1 to account for reduced transfer of humidified litter to the soil carbon pool which leads to loss in soil carbon when natural vegetation is converted to croplands. Over the bare ground fraction χ is set to 0.45.
With heterotrophic respiration known, net ecosystem productivity (NEP) is calculated as Positive NPP is allocated daily to the leaf, stem, and root components which generally causes their respective biomass to increase, although the biomass may also decrease depending on the autotrophic respiration flux of a component. Negative NPP generally causes net carbon loss from the components. While CTEM offers the ability to use both specified 5 constant or dynamically calculated allocation fractions for leaves, stems and roots, in practice the dynamic allocation fractions are primarily used. The formulation used in CTEM v. 2.0 differs from that for CTEM v. 1.0 as described in Arora and Boer (2005a) only in the parameter values.
The dynamic allocation to the live plant tissues is based on the light, water, and leaf phe- 10 nological status of vegetation. The preferential allocation of carbon to the different tissue pools is based on three assumptions: (i) if soil moisture is limiting, carbon should be preferentially allocated to roots for greater access to water, (ii) if LAI is low, carbon should be allocated to leaves for enhanced photosynthesis, and finally (iii) carbon is allocated to the stem to increase vegetation height and lateral spread of vegetation when the increase in 15 LAI results in a decrease in light penetration.
The vegetation water status, W , is determined as a linear scalar quantity that varies between 0 and 1 for each PFT and calculated by weighting the degree of soil saturation (φ i (θ i ), Eqn. A7) with the fraction of roots in each soil layer. 20 The light status, L, is parameterized as a function of LAI and nitrogen extinction coefficient, k n (PFT-dependent see Table A1), as L = exp(−k n LAI), trees and crops max 0, 1 − LAI 4.5 , grasses (A40) 25 4899 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | For PFTs with a stem component (i.e. tree and crop PFTs) the fractions of positive NPP allocated to stem (a fS ), leaf (a fL ), and root (a fR ) components are calculated as The base allocation fractions for each component (leaves -L , stem -S , and roots -R ) are PFT-dependent (Table A4) and sum to 1, i.e., L + S + R = 1. The parameter ω a , which varies by PFT (Table A4), determines the sensitivity of the allocation scheme to changes in W and L. Larger values of ω a yield higher sensitivity to changes in L and W . 10 Grasses do not have a stem component (i.e. a fS = 0) and the allocation fractions for leaf and root components are given by 15 The above equations ensure that the allocation fractions add up to one (a fL + a fR + a fS = 1).
The dynamic allocation fractions are superseded under three conditions. First, during the leaf onset for crops and deciduous trees, all carbon must be allocated to leaves (a fL = 1, a fS = a fR = 0). Second, the proportion of stem plus root biomasses to leaf biomass must satisfy the relationship: where C S , C R , and C L are the carbon in the stem, root, and leaves, respectively. The parameter η is PFT-specific (Table A4) and parameter κ has a value of 1.6 for trees and crops 4900 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | and 1.2 for grasses. Both parameters are based on the Frankfurt Biosphere Model (FBM) (Lüdeke et al., 1994). This constraint (Eq. A46) is based on the physical requirement of sufficient stem and root tissues to support a given leaf biomass. As grasses have no stem component, Eq. (A46) determines their root to shoot ratio (i.e. the ratio of belowground to aboveground biomass). The final condition ensures that a minimum realistic root to shoot 5 ratio is maintained for all PFTs (lr min , listed in Table A4). Root mass is required for nutrient and water uptake and support for the aboveground biomass. If the minimum root to shoot ratio is not being maintained, carbon is allocated preferentially to roots.

A5 Leaf phenology
The leaf phenology parameterization used in CTEM v. 1.0 is described in detail by Arora 10 and Boer (2005a). Changes between version 1.0 and 2.0 are limited to parameter values and the parameterization is briefly described here. There are four different leaf phenological states in which vegetation can be at a given instant: (i) no leaves or dormant, (ii) maximum growth, (iii) normal growth, and (iv) leaf fall or harvest. PFTs may go through only some, or all, of these phenological states depending on their deciduousness. A broadleaf cold de-15 ciduous tree, for example, transitions through all these four states in a year. In winter, the broadleaf cold deciduous trees are in the no leaves/dormant state; favourable climatic conditions in spring trigger leaf growth and the tree enters the maximum leaf growth state when all the NPP is allocated to leaves to accelerate leaf out; when the LAI reaches a threshold (described below) the tree enters the normal leaf growth state and NPP is also allocated to 20 stem and root components; finally the arrival of autumn triggers leaf fall and the trees go into the leaf fall mode where no carbon is allocated to leaves (but it continues for roots and stems). When all the leaves have been shed, the trees go into the no leaves or dormant state again and the cycle is repeated the next year. The evergreen tree PFTs and the grass PFTs do not enter the leaf fall state and maintain a leaf canopy as long as environmental 25 conditions are favourable. Although drought and cold stress cause accelerated leaf loss compared to the normal leaf turnover from these PFTs, they do not explicitly go into the leaf fall mode where the intent is lose all leaves in a specified amount of time.
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | The leaf phenological state transitions are dependent upon environmental conditions. In particular, the transition from no leaves/dormant state to the maximum growth state is based on the carbon-gain approach. CTEM uses "virtual" leaves to assess favourable meteorological conditions for leaf out. The virtual leaves photosynthesize and respire in a manner similar to normal leaves except the carbon gain or loss is not taken into account in vegetation's 5 carbon balance. A positive net leaf photosynthesis rate (G canopy,net , Eq. A17) for the virtual leaves over seven consecutive days indicates the arrival of favourable growth conditions and triggers leaf onset and the associated transition from the no leaves/dormant state to the maximum leaf growth state, when the entire positive NPP is allocated to leaves (a fL = 1, a fS = a fR = 0). When LAI reaches LAI thrs then the vegetation switches to the normal growth 10 mode and positive NPP is allocated to all three vegetation components -leaves, stem, and roots (a fL , a fS , a fR > 0).
The PFT-specific L f term (see Table A5) calculates LAI thrs to be typically between 40-50 % 15 of the maximum LAI that a given amount of stem and root biomass can support (based on the terms in the square brackets and Eq. (A46). SLA is the specific leaf area (Eqn. A56). This rule for transition from maximum to normal growth state is also used for evergreen tree PFTs and grass PFTs. Similar to LAI thrs , the LAI of virtual leaves is 7.5 % of the maximum LAI a given amount of root and stem biomass can support for tree and crop PFTs and 20 2.5 % for grass PFTs. In addition, the LAI of virtual leaves is constrained to be, at least, 0.3 m 2 m −2 for tree PFTs and 0.2 m 2 m −2 for crop and grass PFTs.
The transition from the normal growth state to leaf fall state is triggered by unfavourable environmental conditions and shorter day length. Broadleaf deciduous trees transition to leaf fall state when either: (i) day length is less than 11 h and the rooting zone temperature 25 drops below 11.15 • C or (ii) when the rooting zone temperature drops below 8 • C regardless of the day length. Needleleaf deciduous tress begin leaf fall after seven consecutive days with daily mean air temperature below −5 • C. Leaf fall occurs over a period of 15 days. In the Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | leaf fall state, the vegetation continues carbon allocation to its root and stem components, but not to leaves (a fL = 0, a fS + a fR = 1). Evergreen trees and grasses do not enter the leaf fall state and neither do the broadleaf drought deciduous trees. The implication for the latter PFT is that if the climate changes and the dry season becomes shorter, then the trees will keep their leaves on for a longer period of time since broadleaf drought deciduous trees 5 lose leaves due to soil moisture stress (described below).
The model vegetation is able to transition between the different leaf phenological states in response to changing conditions. For example, a leaf out in spring for broadleaf cold deciduous trees can be interrupted by a cold event when the vegetation goes into a leaf fall state until the return of more favourable conditions. 10 Leaf litter generation is caused by normal turnover of leaves (Ω N , day −1 ) and also due to cold (Ω C , day −1 ) and drought (Ω D , day −1 ) stress, both of which contribute to seasonality of LAI. For example, the leaf loss associated with drought and reduced photosynthesis during the dry season are the principal causes of the seasonality of LAI for the broadleaf drought deciduous tree PFT. 15 The conversion of leaf carbon to leaf litter (D L , kg C m −2 day −1 ) is expressed as where (Ω N,C,D , day −1 ) are the leaf loss rates associated with normal turnover of leaves and the cold and drought stress. The rate of normal turnover of leaves is governed by PFT-20 specific leaf life span (τ L , yr) as Ω N = 1/365τ L (See Table A6 for PFT specific values of τ L .). The leaf loss rate associated with cold stress (Ω C ) is calculated as where Ω C,max (day −1 , Table A5) is the maximum cold stress loss rate. L cold is a scalar that varies between 0 and 1 as where T leaf cold is a PFT-specific temperature threshold below which a PFT experiences dam-5 age to its leaves promoting leaf loss (Table A5) and T a is the daily mean air temperature ( • C). The leaf loss rate due to drought stress is calculated in a similar manner where Ω D,max (day −1 , Table A5) is the maximum drought stress loss rate and φ root (Eq. A39) 10 is the degree of soil saturation in the rooting zone.

A6 Stem and root turnover
The turnover of stem and root components is modelled via their PFT-dependent specified lifetimes. The litter generation (kg C m −2 day −1 ) associated with turnover of stem (D S ) and root (D R ) components is calculated based on the amount of biomass in the respective 15 components (C S , C R ; kg C m −2 ) and their respective turnover timescales (τ S and τ R ; yr; see Table A5) as A7 Rate change equations for carbon pools 20 With gross canopy photosynthesis rate (G canopy , Eq. A15), maintenance and growth respirations (R m and R g , Eqs. A23 and A28), and heterotrophic respiration components (R h,H 4904 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | and R h,D , Eq. A30) determined, it is possible to estimate the change in carbon amount of the model's five pools.
When the daily NPP (G canopy − R m − R g ) is positive, carbon is allocated to the plant's live carbon pools and the rate of change is given by, 5 where a fi is the corresponding allocation fractions for each pool (stem, root, and leaves) and D i is the litter produced from these components as explained in Sect. A5. H i is the loss associated with fire that releases CO 2 and other trace gases to the atmosphere and M i is the mortality associated with fire that contributes to the litter pool as explained in Sect. A9. 10 If the daily NPP is negative (G canopy < R m , R g = 0), the rate of change is given by, Negative NPP causes the plant to lose carbon from its live carbon pools due to respiratory costs in addition to the losses due to litter production (D i ) and disturbance (H i , M i ). 15 The rate change equations for the litter and soil carbon pools are given by where C D→H represents the transfer of humified litter to the soil carbon pool (Eqn. A37) and 20 H D is loss associated with burning of litter associated with fire that releases CO 2 and other trace gases to the atmosphere.

A8 Conversion of biomass to structural vegetation attributes
The time-varying biomass in the leaves (C L ), stem (C S ) and root (C R ) components is used to calculate the structural attributes of vegetation for the energy and water balance calcula-Leaf biomass is converted to LAI using specific leaf area (SLA, m 2 (kg C) −1 ) which itself is assumed to be a function of leaf life span (τ L ; Table A6) where C L,g is the green leaf biomass and C L,b is the brown leaf biomass that is scaled by 10 0.55 to reduce its contribution to the plant height. CTEM explicitly tracks brown leaf mass for grass PFTs. The turnover of green grass leaves, due to normal aging or stress from drought and/or cold, does not contribute to litter pool directly as the leaves first turn brown.
The brown leaves themselves turnover to litter relatively rapidly (τ L,b = 0.1 τ L ).
CTEM dynamically simulates root distribution and depth in soil following Arora and Boer 15 (2003). The root distribution takes an exponential form and roots grow and deepen with increasing root biomass. The cumulative root fraction at depth z is given by Rooting depth (d R ; m), which is defined to be the depth containing 99 % of the root mass, 20 is found by setting z equal to d R and f R = 0.99 which yields The parameter ι which describes the exponential root distribution is calculated as 25 4906 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | where ι represents the PFT-specific mean root distribution profile parameter and C R the average root biomass derived from Jackson et al. (1996) (Table A6). Equation (A60) yields a lower (higher) value of ι than ι when root biomass C R is higher (lower) than the PFTspecific mean root biomass C R , resulting in a deeper (shallower) root profile than the mean root profile. 5 The rooting depth d R is checked to ensure it does not exceed the soil depth. If so, d R is set to the soil depth and ι is recalculated as ι = 4.605/d R . The new value of ι is used to determine the root distribution profile adjusted to the shallower depth. Finally, the root distribution profile is used to calculate fraction of roots in each of the model's soil layers.

10
CTEM v. 2.0 represents disturbance as both natural and human-influenced fires. The original fire parameterization corresponding to CTEM v. 1.0 is described in Arora and Boer (2005b). The parameterization has since been adapted and used in several other DGVMs (Kloster et al., 2010(Kloster et al., , 2012Migliavacca et al., 2013;Li et al., 2012). CTEM v. 2.0 incorporates changes suggested in these studies as well as several new improvements. 15 Fire in CTEM is simulated using a process-based scheme of intermediate complexity that accounts for all elements of the fire triangle: fuel load, combustibility of fuel, and an ignition source. CTEM represents the probability of a fire occurrence (P f ), for a representative area of 500 km 2 (a rep ), as: 20 where the right hand side terms represent the fire probabilities that are conditioned on (i) the availability of biomass as a fuel source (P b ), (ii) the combustibility of the fuel based on its moisture content (P m ), and (iii) the presence of an ignition source (P i ). The probability of fire and the subsequent calculations are performed for each PFT present in a grid cell 25 (but the PFT index α is omitted for clarity in Eq. A61). Since the CTEM parameterization is based on one fire per day per representative area, the representative area has to be sufficiently small that the requirement of only one fire per day is reasonable, yet sufficiently 4907 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | large such that it is not possible to burn the entire representative area in one day. Based on MODIS observed fire counts in Figure 1 of Li et al., 2012, 500 km 2 is an appropriate size to not have more than one fire per day and still be a large enough area to be assumed representative of the gridcell as a whole.
The P b term depends on the above-ground biomass (B ag ) available for sustaining a fire 5 (which includes the green and brown leaf mass, stem mass, and litter mass, B ag = C L + C S + C D ). Below a lower threshold of above-ground biomass (B low ; 0.2 kg C m −2 ; similar to Moorcroft et al., 2001, andKucharik et al., 2000), fire is not sustained and thus has a probability of 0. Above a biomass of 1.0 kg C m −2 (B high ), P b is set to 1 as the amount of fuel available is assumed sufficient for fire. P b is then calculated using the aboveground 10 biomass, B ag (kg C m −2 ) with a linear variation between the upper and lower thresholds as: The linear decrease of P b from B high to B low reflects the fragmentation of fuel that occurs as biomass decreases. Fuel fragmentation impacts upon area burned as it impedes the fire 15 spread rate (Guyette et al., 2002). The probability of fire based on the presence of ignition sources (P i ) is influenced by both natural (lightning) and anthropogenic agents (either intentional or accidental). An initial lightning scalar, ϑ F , that varies between 0 and 1 is found as, 20 where F low and F high represent lower and upper thresholds of cloud-to-ground lightning strikes (F c2g , flashes km −2 month −1 ), respectively. Similar to Eq. (A62), below the lower threshold (F low ; 0.25 flashes km −2 month −1 ), ϑ F is 0 implying lightning strikes are not sufficient to cause fire ignition, above the upper threshold (F high ; 10.0 flashes km −2 month −1 ) ϑ F 25 is 1 as ignition sources now do not pose a constraint on fire. The amount of cloud-to-ground lightning, F c2g , is a fraction of the total lightning based on the relationship derived by Price Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | and Rind (1993) (approximation of their Eqs. 1 and 2): where Φ is the grid cell latitude in degrees and F tot is the total number of lightning flashes km −2 month −1 (both cloud-to-cloud and cloud-to-ground). The probability of fire due 5 to natural ignition, P i,n , depends on the lightning scalar, ϑ F , as: Fire probability due to ignition caused by humans, P i,h , is parameterized following Kloster where p thres is a population threshold (300 people km −2 ) above which P i,h is 1. The probability of fire conditioned on ignition, P i , is then the total contribution from both natural and 15 human ignition sources: The population data used to calculate probability of fire ignition caused by humans and anthropogenic fire suppression (discussed further down in this section) is based on the 20 HYDE 3.1 data set (Goldewijk et al., 2010) The probability of fire due to the combustibility of the fuel, P m , is dependent on the soil moisture in vegetation's root zone and in the litter layer. The root zone soil wetness (φ root , Eq. A39) is used as a surrogate for the vegetation moisture content and the soil wetness 4909 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | of the top soil layer as a surrogate for the litter moisture content. If a grid cell is covered by snow, P m is set to zero. The probability of fire conditioned on soil wetness in vegetation's rooting zone, P m,V , is then: where E V is the extinction soil wetness above which P f,V is reduced to near zero and is set to 0.30.
The probability of fire based on the moisture content in the "duff" layer, P m,D , which includes the brown leaf mass (grasses only) and litter mass (B duff = C L,b + C D ; kg C m −2 ), is calculated in a similar way but using the soil wetness of the first soil layer, (φ 1 , Eq. A7), as 10 a surrogate for the moisture in the duff layer itself as: where the extinction soil wetness for the litter layer, E D , is set to 0.50 which yields a higher probability of fire for the litter layer than for the vegetation for the same soil wetness. P m is 15 then the weighted average of P m,V and P m,D given by: where f duff is the duff fraction of above-ground combustible biomass. 20 The area burned (a) is assumed to be elliptical in shape for fires based upon the wind speed and properties of an ellipse: where l (m) and w (m) are the lengths of major and minor axes of the elliptical area burnt; v d (km h −1 ) and v u (km h −1 ) are the fire spread rates in the downwind and upwind directions, respectively; v p (km h −1 ) is the fire spread rate perpendicular to the wind direction and t is the time (h). The fire spread rate in the downwind direction (v d ) is represented as where v d,max (km h −1 ) is the PFT-specific maximum fire spread rate from Li et al. (2012), which is set to zero for crop PFTs (Table A7). The functions g(u) accounts for the effect of wind speed and h(φ r, d ) accounts for the effect of rooting zone and duff soil wetness on the fire spread rate, as discussed below.
The wind speed (u; km h −1 ) is used to determine the length (l) to breadth (w) ratio, L b , of the elliptical area burned by fire 15 and its head to back ratio, H b , following Li et al. (2012): which help determine the fire spread rate in the direction perpendicular to the wind speed and in the downward direction. Equations (A73) and (A74) are combined to estimate the 20 wind scalar g(u): 4911 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | which varies between between 0.05 and 1. The lower limit is imposed by the g(0) term which has a value of 0.05 and represents the fire spread rate in the absence of wind (u = 0); the upper limit is assigned a maximum value of 1. The fire spread rate in the absence of wind is essentially the spread rate in the direction perpendicular to the wind speed (v p ). The value of the g(0) term is derived by considering the case where the wind speed becomes very 5 large. As u → ∞ then L b → 11 and H b → 482, while g(∞) = 1 due to its definition, which yields g(0) = 0.0455 ≈ 0.05.
The dependence of fire spread rate on rooting zone and duff soil wetness, h(φ r, d ) is represented as Both h(φ root ) and h(φ 1 ) gradually decrease from 1 (when soil wetness is 0 and soil moisture does not constrain fire spread rate) to 0 when soil wetness exceeds the respective extinction 15 wetness thresholds, E V and E D .
With fire spread rate determined, and the geometry of the burned area defined, the area burned in one day, a 1d (km 2 day −1 ), following Li et al. (2012), is calculated as by setting t equal to 24 h. The fire extinguishing probability, q, is used to calculate the duration (τ , days) of the fire which in turn is used to calculated the area burned over the duration of the fire, a τ d . q is which yields a value of q that varies from 0.5 to 0.95 as population density, p d (number of people km −2 ), increases from zero to infinity. Higher population density thus implies a higher 5 probability of fire being extinguished. q represents the probability that a fire will be extinguished on the same day it initiated and the probability that it will continue to the next day is (1 − q). Assuming individual days are independent, the probability that the fire will still be burning on day τ is (1 − q) τ . The probability that a fire will last exactly τ days, P (τ ), is the product of the probability that the fire still exists at day τ and the probability it will be 10 extinguished on that day hence P (τ ) = q(1 − q) τ . This yields an exponential distribution of fire duration whose expected value is Based on this fire duration and the area burned in one day (Eq. A77), the area burned over 15 the duration of the fire (a τ d ) (but still implemented in one day since the model does not track individual fires over their duration, km 2 day −1 ) is calculated as Finally, and reintroducing the PFT index α, the area burned is extrapolated for a PFT α (A b,α , km 2 day −1 ) to the whole grid cell as 4913 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | where A g is area of a grid cell (km 2 ), f α the fractional coverage of PFT α and a rep the representative area of 500 km 2 , as mentioned earlier. Area burned over the whole grid cell (A b , km 2 day −1 ) is then calculated as the sum of area burned for individual PFTs, (A82) 5 Fire emits CO 2 , other trace gases, and aerosols as biomass is burned while plant mortality and damage due to fire contribute to the litter pool. The emissions of a trace gas/aerosol species j from PFT α, E α,j (g species (m −2 grid cell area) day −1 ) are obtained from a vector of carbon densities C α = (C L , C S , C R , C D ) α (kg C m −2 ) for its leaf, stem, root, and litter components, multiplied by a vector of combustion factors α = ( L , S , R , D ) α which determines what fraction of leaf, stem, root and litter components gets burned, multiplied by a vector of emissions factors Υ j = (Υ L , Υ S , Υ R , Υ D ) j (g species (kg C dry organic matter) −1 ), and by the area burned A b,α for that PFT. The dot product of C α , Υ j and α thus yields emissions per unit grid cell area of species j from PFT α, where the constant 1000 converts C α from kg C m −2 to g C m −2 and the constant 450 (g C (kg dry organic matter) −1 ) converts biomass from carbon units to dry organic matter. The corresponding loss of carbon (kg C m −2 day −1 ) from the three live vegetation compo-20 nents (L, S, R) and the litter pool (D) of PFT α is given by The PFT-specific combustion factors for leaf ( L ), stem ( S ), root ( R ) and litter ( D ) components are summarized in Table A7. Emission factors for all species of trace gases and 25 4914 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | aerosols (CO 2 , CO, CH 4 , H 2 , NHMC, NO x , N 2 O, total particulate matter, particulate matter less than 2.5 µm in diameter, and black and organic carbon) are based on an updated set by Andreae and Merlet (2001) listed in Tables 3 and 4 of Li et al. (2012). Litter generated by fire is based on similar mortality factors which reflect a PFT's susceptibility to damage due to fire Θ α = (Θ L , Θ S , Θ R ) α (fraction). The contribution to litter pool of 5 each PFT due to plant mortality associated with fire (kg C m −2 day −1 ) is calculated as: which is the sum of contribution from individual live vegetation pools  Table A7. When CTEM is run with prescribed PFT fractional cover, the area of PFTs does not change and the fire related emissions of CO 2 , other trace gases and aerosols, and generation of litter act to thin the remaining biomass. When competition between PFTs for space is allowed, fire both thins the remaining biomass and through plant mortality creates bare 20 ground which is subsequently available for colonization. The creation of bare ground depends on the susceptibility of each PFT to stand replacing fire (ζ r , fraction) (Table A7) and the PFT area burned. The fire-related mortality rate, m dist (day −1 ), used in Eq. (11), is then: 25 After bare ground generation associated with fire, the thinned biomass is spread uniformly over the remaining fraction of a PFT. However, it is ensured that the carbon density of the remaining biomass does not increase to a value above what it was before the fire occurred. 4915 CTEM explicitly simulates C 3 and C 4 crops using a simple approach. Crops increase their biomass depending on environmental conditions similar to other PFTs, however, unlike other PFTs they are explicitly harvested. Harvesting is initiated when the daily mean air temperature remains below 8 • C for 5 consecutive days, or when the crop LAI reaches 5 a threshold (3.5 m 2 m −2 for C 3 crops and 4.5 m 2 m −2 for C 4 crops) signifying that the crops have matured (Arora and Boer, 2005a). Harvesting occurs over a period of 15 days and the harvested biomass contributes to the litter pool. The above environmental criteria typically lead to one annual crop cycle in mid-to high-latitude regions while multiple crop cycles may be realized in tropical regions depending on the precipitation seasonality. Harvest- 10 ing ensures that vegetation biomass does not keep increasing on croplands and, unlike natural vegetation, this prevents croplands from sequestering carbon as atmospheric CO 2 increases. Some carbon may still be sequestered in the soil carbon pool but crops have a higher soil carbon decomposition rate and lower humification factor compared to other PFTs (ς H and χ, respectively, in Table A3) to account for tillage which prevents soil carbon 15 sequestration in croplands.

A11 Land use change
The land use change (LUC) module of CTEM is based on Arora and Boer (2010) and briefly described here. When the area of crop PFTs changes, CTEM generates LUC emissions. In the simulation where fractional coverages of PFTs are specified, the changes in fractional 20 coverage of crop PFTs are made consistent with changes in the fractional coverage of natural non-crop PFTs. That is, an increase or decrease in the area of crop PFTs is associated with a corresponding decrease or increase in the area of non-crop PFTs. This approach is taken by Wang et al. (2006) which allows to reconstruct historical land cover given a spatial data set of changes in crop area over the historical period and an estimate of potential 25 natural land cover for the pre-industrial period (as described in Sect. 3). When competition between PFTs for space is allowed, only the fractional coverages of crop PFTs are spec-Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | ified. Similar to a simulation with prescribed PFT fractions, when the area of crop PFTs increases, the fractional coverage of non-crop PFTs is decreased in proportion to their existing coverage (i.e. the linear approach of Wang et al., 2006). Alternatively, and in contrast to the simulation with prescribed PFT fractions, when the area of crop PFTs decreases then the generated bare fraction is available for recolonization by non-crop PFTs. 5 A decrease in the area of natural non-crop PFTs, associated with an increase in area of crop PFTs, results in deforested biomass (while the term "deforested" implies clearing of forests, the same processes can occur in grasslands as well and is meant here to imply removal of the biomass). The deforested biomass is divided into three components: (i) the component that is combusted or used for fuel wood immediately after natural vegetated 10 is deforested and which contributes to atmospheric CO 2 , (ii) the component left as slash or used for pulp and paper products, and (iii) the component that is used for long-lasting wood products. The fractions allocated to these three components depend on whether the PFTs are woody or herbaceous and on their above-ground vegetation biomass density (see Table 1 of Arora and Boer, 2010). To account for the timescales involved, the fraction 15 allocated to slash or pulp and paper products is transferred to the model's litter pool and the fraction allocated to long-lasting wood products is allocated to the model's soil carbon pool. Land use change associated with a decrease in the area of natural vegetation thus redistributes carbon from living vegetation to dead litter and soil carbon pools and emits CO 2 to the atmosphere through direct burning of the deforested biomass. The net result is 20 positive LUC carbon emissions from land to the atmosphere.
When croplands are abandoned, the area of natural PFTs increases. In simulations with prescribed fractional coverage of PFTs this results in a decreased carbon density for all model pools as the same amount of carbon is spread over a larger fraction of the grid cell. This reduced density implies that natural vegetation is able to take up carbon as it comes 25 into equilibrium with the driving climate and atmospheric CO 2 concentration. This creates the carbon sink associated with abandonment of croplands as natural vegetation grows in its place. In simulations with competition between PFTs, the abandoned land is treated as bareground which is subsequently available for recolonization, as mentioned above. As Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | natural vegetation expands into bareground it takes up carbon, again creating the carbon sink associated with abandonment of croplands. The net result is negative LUC carbon emissions as carbon is taken from atmosphere to grow vegetation over the area that was previously a cropland.
Code availability 5 Fortan code for CLASS-CTEM modelling framework is available on request and upon agreeing to Environment Canada's licensing agreement available at http://collaboration.cmc.ec.gc.ca/science/rpn.comm/license.html. Please contact the first author (joe.melton@canada.ca) to easily and rapidly obtain the model code. 10 The works published in this journal are distributed under the Creative Commons Attribution 3.0 License. This license does not affect the Crown copyright work, which is re-usable under the Open Government Licence (OGL). The Creative Commons Attribution 3.0 License and the OGL are interoperable and do not conflict with, reduce or limit each other. © Crown copyright 2015 15 The Supplement related to this article is available online at doi:10.5194/gmdd-0-1-2015-supplement.

Copyright statement
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |    Table A4. PFT-specific base allocation fractions for leaf ( L ), stem ( S ) and root ( R ) components (Eqs. A41 to A43), parameter ω a which determines the sensitivity of allocation to light and water status, the parameter η (Eq. A46) which determines proportion of stem plus wood biomass to leaf biomass and the minimum root:shoot ratio lr min used in the allocation module of CTEM v. 2.0. Some parameter values differ between the model versions when PFTs' fractional coverages are specified and when they are dynamically modelled using competition between PFTs. If modified, the parameter values for the dynamically modelled version are shown in parentheses.  Table A5. PFT-specific parameters used in the phenology module of CTEM v. 2.0: maximum cold (Ω C,max , day −1 , Eq. A49) and drought (Ω D,max , day −1 , Eq. A51) stress loss rates, T leaf cold ( • C) temperature threshold for determining cold stress scalar L cold (Eq. A50) and the fraction L f (Eq. A47) used to determine threshold LAI for switching from maximum growth to normal growth leaf phenology mode. Also shown in the table are the turnover time scales for the stem (τ S , yr, Eq. A52) and root (τ R , yr, Eq. A52) components. Some parameter values differ between the model versions when

Plant functional type
PFTs' fractional coverages are specified and when they are dynamically modelled using competition between PFTs. If modified, the parameter values for the dynamically modelled version are shown in parentheses.  Table A6. PFT-specific leaf life spans (τ L , yr, Eq. A56) and mean root distribution profile (ι, dimensionless) and average root biomass (C R , kg C m −2 ) used to determine root distribution profile in