Constraining a land surface model with multiple observations by application of the MPI-Carbon Cycle Data Assimilation System

We describe the Max Planck Institute Carbon Cycle Data Assimilation System (MPI-CCDAS) built around the tangent-linear version of the land surface scheme of the MPI-Earth System Model v1 (JSBACH). The simulated terrestrial biosphere processes (phenology and carbon balance) were constrained by observations of the fraction of absorbed photosynthetically active radiation (TIP-FAPAR product) and by observations of atmospheric CO2 at a global set of monitoring stations for the years 2005-2009. When constrained by TIP-FAPAR alone, the system successfully, and computationally efficiently, improved simulated growing season average FAPAR, as well as its seasonality in the Northern extra-tropics. When constrained by atmospheric CO2 observations, global net and gross carbon fluxes were improved, although the system tended to underestimate tropical productivity. Assimilating both data streams jointly allowed the MPI-CCDAS to match both observations (TIP-FAPAR and atmospheric CO2) as well as the single data stream assimilation cases did, therefore overall increasing the appropriateness of the resultant biosphere dynamics and underlying parameter values. Our study thus demonstrates the value of multiple-data stream assimilation for the simulation of terrestrial biosphere dynamics and highlights the potential role of remote sensing data, here the TIP-FAPAR product, in stabilising the strongly underdetermined atmospheric inversion problem posed by atmospheric transport and CO2 observations alone. The constraint on regional gross and net CO2 flux patterns is limited through the parametrisation of the biosphere model. We expect improvement on that aspect through a refined initialisation strategy and inclusion of further biosphere observations as constraints.


Introduction
Estimates of the net carbon balance of the terrestrial biosphere are highly uncertain, because the net balance cannot be directly observed at large spatial scales (Le Quéré et al., 2015). Studies aiming to quantify the contemporary global carbon cycle therefore either infer the terrestrial carbon budget as a residual of the arguably better constrained other components of the global carbon budget (Le Quéré et al., 2015) or rely on measurements of atmospheric CO2 and the inversion of its atmospheric transport (Gurney et al., 2002). Both approaches have the caveat that they are not able to provide accurate estimates at high spatial resolution, and cannot utilise the broader set of Earth system observations that provide information on terrestrial carbon-cycle dynamics (Luo et al., 2012). Furthermore, they are diagnostic by nature, and therefore lack any prognostic capacity.
Ecosystem models integrate existing knowledge of the underlying processes governing the net terrestrial carbon balance and have such a prognostic capacity. Since they simulate all major aspects of the terrestrial carbon cycle, they can, in principle, benefit from the broader set of Earth system observations. However, studies comparing different land-surface models show a large spread of estimates of the seasonal and annual net land-atmosphere carbon exchange and their trends (Piao et al., 2013; Sitch et al., 2015). This uncertainty is one of the primary causes of discrepancies in future projections of stand-alone terrestrial biosphere models (Sitch et al., 2008) and coupled carbon-cycle climate models (Anav et al., 2013; Friedlingstein et al., 2014) for the 21st century. Next to the uncertainty due to different climate forcing (Jung et al., 2007; Dalmonech et al., 2015) and alternative model formulations (Sitch et al., 2015), the uncertainty about the parameter values of the mathematical representation of key carbon-cycle processes in these models is an important source of the model spread (Knorr and Heimann, 2001; Zaehle et al., 2005; Booth et al., 2012). This parametric uncertainty can be as large as the differences between models. The spread among models limits our ability to provide further constraints on the net terrestrial carbon uptake.
Published by Copernicus Publications on behalf of the European Geosciences Union.
A potential route to reduce parameter and process-formulation related uncertainties in the estimates of the terrestrial carbon cycle is to systematically integrate the increasing wealth of globally distributed carbon-cycle observations into models through data assimilation methods. A broad overview of potential observations and methodological choices is given in Raupach et al. (2005). Since computational run time is an important limiting factor in global carbon-cycle data assimilation, the development of a relatively "fast" but comprehensive system is advantageous. Knorr and Kattge (2005) investigated the use of a Monte Carlo approach for data assimilation with global models. They suggested that the computational burden (i.e. the run time) is too large to allow its application with a comprehensive land-surface model and an appropriate number of parameters in the optimisation. Nevertheless, the method has been successfully applied at global scales for a reduced set of parameters and limited process representations (Ziehn et al., 2012). Gradient-based methods are computationally more efficient. For instance, approximating the gradient with finite differences, Saito et al. (2014) performed data assimilation of several data streams with the VISIT model.
An alternative to the finite difference method is to calculate the gradient precisely by a tangent-linear or adjoint version of the biosphere model. A prototype of such a carbon-cycle data assimilation system (CCDAS) based on an advanced variational data assimilation scheme and a prognostic terrestrial carbon flux model (BETHY; Knorr, 1997, 2000) has demonstrated the potential to effectively constrain the simulated carbon cycle with observations of atmospheric CO2 (BETHY-CCDAS; Rayner et al., 2005; Scholze et al., 2007; Kaminski et al., 2013). Conceptually similar systems have been built for other, more complex, global biosphere models. Applications of these alternative systems include, for example, constraining the phenology of the JULES model with the MODIS collection 5 leaf area index product (Luke, 2011) and carbon fluxes in the ORCHIDEE model using observations from several FLUXNET sites (Kuppel et al., 2012, 2013). Previous studies with these systems focussed on the effect of different (in situ and satellite) FAPAR observations at selected sites on simulated phenology with the ORCHIDEE model (e.g. Bacour et al., 2015) or on the joint use of site-level carbon flux and FAPAR observations (Kato et al., 2013). At the global scale, Forkel et al. (2014) investigated the use of long-term FAPAR data to constrain long-term trends in vegetation greenness simulated by the LPJmL model, whereas Kaminski et al. (2012) focussed on the joint assimilation of FAPAR and atmospheric CO2 observations.
Here, we present the development and first application of a variational data assimilation system (Max Planck Institute Carbon Cycle Data Assimilation System: MPI-CCDAS) built around the tangent-linear representation of the JSBACH land-surface model (Raddatz et al., 2007). JSBACH is a further development of the BETHY model, providing a more detailed treatment of carbon turnover and storage in the terrestrial biosphere, as well as more detailed treatment of land-surface biophysics (Roeckner et al., 2003) and land hydrology (Hagemann and Stacke, 2014). JSBACH serves as a land-surface scheme to the MPI-Earth System Model (MPI-ESM; Giorgetta et al., 2013). Our objective with this development is twofold: (i) to improve the scope of the original BETHY-CCDAS by including a larger set of terrestrial processes affecting the terrestrial carbon cycle; and (ii) to provide a means to constrain the land carbon-cycle projections of JSBACH with several data streams, and thereby potentially also that of the MPI-ESM. Dalmonech et al. (2015) have shown that the simulated phenology, and its seasonal and interannual climate sensitivity, as well as the simulated seasonal net land-atmosphere carbon flux are reasonably robust against climate biases in the MPI-ESM. One can therefore expect that improvements of these aspects made with the MPI-CCDAS driven by observed meteorology will be maintained in the coupled Earth system model.
We first provide a technical description of the MPI-CCDAS system. We then demonstrate the capacity of the MPI-CCDAS system to integrate atmospheric CO2 observations and the fraction of absorbed photosynthetically active radiation (FAPAR) recorded from satellites, which constrains the seasonality of the phenology, and assess the relative effect of the constraint from these two data streams on parameter values and modelled fluxes. Furthermore, the joint assimilation of the two data streams demonstrates their mutual benefit to constrain parameters in JSBACH.
2 Description of the MPI-CCDAS

The CCDAS method
The MPI-CCDAS applies a variational data assimilation approach to estimate a set of model parameters and initial states given a range of observations. The variational data assimilation method is described in detail by Kaminski et al. (2013). In the following, we thus only give a brief overview of the method. The values and uncertainties for model parameter values, observations and the model are detailed in the following sub-sections.
To take account of the uncertainty inherent in the description of observed and simulated variables, the method operates on probability density functions (PDFs) and is conveniently formulated in a Gaussian framework. The MPI-CCDAS uses the combined information provided by the model M(p) and the observations d to update the PDF describing the prior state of information on the model's process-related parameters and initial state variables, combined in the model's control vector p. This prior control vector is described by the mean p_pr and the covariance of its uncertainty C_pr. The CCDAS method seeks to minimise the misfit between observed and modelled quantities by minimising the cost function J:
$$J(p) = \frac{1}{2}\left[ (M(p) - d)^{T} C_d^{-1} (M(p) - d) + (p - p_{pr})^{T} C_{pr}^{-1} (p - p_{pr}) \right] \qquad (1)$$
where C_d is the covariance of combined uncertainty in the observations (with mean d) and model simulation. The minimum of J, denoted as the posterior control vector p_po, corresponds to the maximum likelihood estimate. The posterior p_po thus balances the misfit between modelled quantities and their observational counterparts over the entire assimilation window, while taking independent prior information on the control vector into account. In other words, the vector d contains all observations used in the assimilation procedure, which act simultaneously to constrain the control vector. In contrast to sequential assimilation schemes, the approach applied here determines a model trajectory through the state space, which, in particular, ensures conservation of mass and energy (Kaminski and Mathieu, 2016).
Technically, J is minimised by a quasi-Newton approach with so-called Broyden-Fletcher-Goldfarb-Shanno updates of the Hessian approximation, in the implementation provided by Numerical Recipes (Press et al., 1992, dfpmin routine). The iterative procedure requires the gradient ∂J/∂p, which is evaluated by the so-called tangent-linear version of the model. This tangent-linear model was generated by means of the Transformation of Algorithms in Fortran compiler tool (TAF, Giering and Kaminski, 1998) through automatic differentiation (Griewank, 1989). This procedure regards the model code that evaluates J(p) as the composition of a sequence of (very many) elementary operations (such as "+", or "exp") to which it applies the chain rule of calculus. Being implementations of the chain rule, the derivatives provided by the tangent-linear code are as accurate as possible on a computer, i.e. up to machine precision. This contrasts with the traditional numerical differentiation approach, which derives derivative approximations through a series of perturbed model runs (for example, so-called finite difference or divided difference approximations).
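The difference between tangent-linear derivatives and finite differences can be illustrated with a small sketch. The example below is not the TAF-generated code: it uses a hypothetical two-parameter toy model, m(t) = p0 · exp(p1 · t), with made-up observations and diagonal covariances, and a hand-written dual-number class (`Dual`) that applies the chain rule to "+", "-", "*" and "exp" in the same spirit as forward-mode automatic differentiation. All names and numbers are assumptions introduced for illustration.

```python
import math


class Dual:
    """A value together with its directional derivative (forward mode)."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def dexp(x):
    """exp with its derivative propagated by the chain rule."""
    x = Dual.lift(x)
    e = math.exp(x.val)
    return Dual(e, e * x.dot)


# Hypothetical toy set-up (not taken from the paper)
TIMES = [0.0, 1.0, 2.0]
OBS = [1.0, 1.6, 2.8]      # "observations" d
SIGMA_D = 0.1              # data uncertainty (diagonal C_d)
P_PRIOR = [1.0, 0.4]       # prior mean p_pr
SIGMA_P = [0.5, 0.2]       # prior uncertainty (diagonal C_pr)


def cost(p):
    """J(p) = 1/2 [data misfit + prior misfit] for diagonal covariances."""
    j = 0.0
    for t, d in zip(TIMES, OBS):
        r = p[0] * dexp(p[1] * t) - d
        j = j + r * r * (1.0 / SIGMA_D ** 2)
    for p_i, p_pr, s in zip(p, P_PRIOR, SIGMA_P):
        r = p_i - p_pr
        j = j + r * r * (1.0 / s ** 2)
    return 0.5 * j


def gradient_tangent_linear(p):
    """One forward sweep per parameter, seeding its derivative with 1."""
    grad = []
    for k in range(len(p)):
        seeded = [Dual(v, 1.0 if i == k else 0.0) for i, v in enumerate(p)]
        grad.append(cost(seeded).dot)
    return grad


def gradient_finite_difference(p, h=1e-6):
    """Central differences from perturbed model runs, for comparison."""
    grad = []
    for k in range(len(p)):
        up = list(p)
        dn = list(p)
        up[k] += h
        dn[k] -= h
        grad.append((cost(up).val - cost(dn).val) / (2.0 * h))
    return grad
```

Like a tangent-linear model, this sketch needs one derivative sweep per control variable, whereas the finite-difference variant needs perturbed runs and a step size `h` whose choice trades truncation error against round-off error.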

The forward model
The model that is optimised within the MPI-CCDAS is the JSBACH land-surface model (Raddatz et al., 2007; Brovkin et al., 2009; Reick et al., 2013; Schneck et al., 2013; Dalmonech and Zaehle, 2013). The model considers 10 plant functional types (PFTs: see Table 1). These PFTs are allowed to co-occur within a grid cell on separate tiles, but nonetheless share a common water storage. Compared to the aforementioned JSBACH studies, the MPI-CCDAS does not use land-use change and land-use transition or dynamic vegetation, but uses a multi-layer soil hydrology scheme (Hagemann and Stacke, 2014). Appendix A gives a detailed description of the relevant parts of JSBACH. The model is typically used within the MPI-ESM (Giorgetta et al., 2013) and calculates the terrestrial storage of energy, water and carbon and its half-hourly exchanges between the atmosphere and the land surface. JSBACH is applied here uncoupled from the atmosphere and forced with reconstructed meteorology (see Sect. 2.6).
The application of gradient-based minimisation procedures is facilitated by a differentiable calculation of J(p). According to the chain rule, this ultimately requires all code parts of the forward model that depend on the control variables and impact the cost function to be differentiable. To improve differentiability, the original phenology scheme that describes the timing and amount of foliar area based on logistic growth functions (Lasslop, 2011) was replaced by an alternative scheme developed explicitly for the needs of differentiable codes (Knorr et al., 2010, Appendix A1). Some further minor modifications were necessary to make the code differentiable. These changes included replacing look-up tables with their continuous formulations, avoiding division by zero in the derivative code (e.g. through differentiation of √0 in the forward mode leading to 1/√0 in the differentiated code), and reformulating minimum and maximum calculations to allow a smooth transition at the edge. These modifications alter the calculations. However, they were implemented such that the differences in the modelled results compared to the original code are minimal.
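Two of the modifications named above can be sketched generically. The functions below are illustrative assumptions, not the reformulations actually used in JSBACH: `smooth_max` replaces the kink of max(a, b) with a transition of width `eps`, and `safe_sqrt` shifts the argument by a small `eps` so that the derivative 1/(2√x) stays finite at x = 0.

```python
import math


def smooth_max(a, b, eps=1e-3):
    """Differentiable approximation of max(a, b); eps sets the transition width."""
    return 0.5 * (a + b + math.sqrt((a - b) ** 2 + eps ** 2))


def smooth_max_da(a, b, eps=1e-3):
    """Derivative of smooth_max with respect to a; continuous also at a == b."""
    return 0.5 * (1.0 + (a - b) / math.sqrt((a - b) ** 2 + eps ** 2))


def safe_sqrt(x, eps=1e-12):
    """Square root with a small offset (assumed value) to protect the derivative."""
    return math.sqrt(x + eps)


def safe_sqrt_dx(x, eps=1e-12):
    """Derivative of safe_sqrt; finite at x == 0, unlike 1 / (2 * sqrt(0))."""
    return 0.5 / math.sqrt(x + eps)
```

Both replacements change the forward result slightly, which mirrors the trade-off described in the text: the smaller `eps`, the closer the result is to the original calculation, but the sharper the transition seen by the derivative code.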

The atmospheric transport model
To map the net land-atmosphere CO2 exchange simulated by JSBACH to observations of the atmospheric CO2 mole fraction, the computation of atmospheric transport is required, which is done here by the transport model TM3 (Heimann and Körner, 2003). Specifically, we compute the response of monthly mean CO2 mole fractions c to monthly mean surface fluxes f (extending 2 years back in time). Since the atmospheric transport of CO2 is linear in the fluxes, the transport process can be written as
$$c = \mathbf{M} f$$
where the matrix M represents the TM3 responses as a transport matrix (Rödenbeck et al., 2003). For our analysis, we used the Jacobian representation of the TM3 model, version 3.7.24 (Rödenbeck et al., 2003), with a spatial resolution of about 4° × 5° (the "fine" grid of TM3), driven by interannually varying wind fields of the NCEP reanalysis (Kalnay et al., 1996). The net exchange f is the sum of the terrestrial fluxes computed by JSBACH and those not computed by JSBACH, i.e. prescribed ocean and fossil fuel fluxes (Sect. 2.5). Biomass burning fluxes are not explicitly included (see also the discussion in Sect. 4.5). During the assimilation of atmospheric CO2, any information on these latter fluxes in the observations is consequently mapped to the respiratory fluxes simulated by JSBACH.
In the MPI-CCDAS, the atmospheric CO2 mole fraction at the monitoring stations at the beginning of the simulation is specified as a globally constant offset CO2^offset, one of the parameters to be estimated. The resulting CO2 mole fractions can then be directly compared with observed atmospheric CO2. Limiting the system to one global modifier was motivated by limitations in the computational run time, although an offset depending on the observation location could easily be implemented. With a spin-up of 2 years for the atmospheric transport, we allow the system to build up the latitudinal gradient of CO2. After the second year, there is no visible trend in the difference of observed CO2 at Mauna Loa and the South Pole, leading us to conclude that 2 years are sufficient to spin up the atmosphere.
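Because transport is linear in the fluxes, mapping fluxes to station mole fractions reduces to a matrix-vector product plus the global offset. The following minimal sketch uses a made-up 2 × 3 transport matrix and flux vector; the function name, the units and all numbers are hypothetical and bear no relation to the TM3 Jacobian.

```python
def simulate_mole_fractions(transport, fluxes, offset):
    """Return c = M f + offset for a transport matrix given as a list of rows."""
    return [offset + sum(m * f for m, f in zip(row, fluxes))
            for row in transport]


# Hypothetical example: 2 station-months, 3 flux elements
TRANSPORT = [[0.5, 0.2, 0.1],
             [0.1, 0.3, 0.4]]      # response of mole fraction to unit flux
FLUXES = [1.0, -2.0, 0.5]          # net surface fluxes (arbitrary units)
OFFSET = 380.0                     # globally constant initial mole fraction

mole_fractions = simulate_mole_fractions(TRANSPORT, FLUXES, OFFSET)
```

In this form the offset shifts all simulated mole fractions by the same amount, which is why a single additional control variable suffices to absorb a global mismatch in the initial atmospheric carbon content.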

Model parameters
For this study, JSBACH parameters related to the phenology, photosynthesis and land carbon turnover (including initial carbon stocks) were optimised (see Appendix A for a detailed model description). The default prior value and assumed prior Gaussian uncertainty of each parameter and the posterior values from the assimilation experiments are given in Table 2. The choice of these parameters was based on an extensive parameter sensitivity study on a much larger set of parameters across multiple biomes (Schürmann, unpublished results). We retained those parameters for which we found a significant effect on modelled FAPAR and net CO2 exchange. In principle, it is possible to add more parameters, which are decisive for other modelled quantities such as soil moisture, and which might feed back to our observables. A brief explanation of the parameters involved in this study is given in the following.
The parameters controlling phenology (Λ_max, τ_l, τ_w, T_φ, t_c, and ξ) are allowed to take different values for each plant functional type with the exception of ξ, which is a globally valid parameter. While Λ_max controls the LAI, ξ controls the rate of leaf growth, and τ_l is the timescale of leaf senescence. T_φ and t_c are temperature and day-length thresholds, respectively, controlling the onset and end of vegetation activity. The parameter τ_w controls the shedding of leaves in response to phenology for drought-deciduous PFTs. Soil moisture in JSBACH follows a five-layer scheme (Hagemann and Stacke, 2014) and is coupled to vegetation processes via the phenology and the photosynthesis by influencing actual stomatal conductance and thus evapotranspiration.
The phenological parameter prior values and uncertainties are taken from Knorr et al. (2010), with the following three exceptions. First, the water control parameter τ_w required an adaptation to account for the different soil-water formulations in the MPI-ESM compared to BETHY. Second, τ_l for the coniferous evergreen PFT (CE) has been adapted after preliminary site-scale studies to allow more flexibility in the seasonality of the evergreen phenology (Schürmann, unpublished results). Finally, Λ_max is left at its default JSBACH parameter value for all PFTs, with the exception of the coniferous evergreen PFT. For CE, a value of Λ_max = 1.7 m² m⁻² has been used, because preliminary model tests revealed a large bias in modelled FAPAR in CE-dominated regions, which adversely affected the model results of the carbon cycle.
Calculation of photosynthesis in JSBACH follows Farquhar et al. (1980) for C3 plants and Collatz et al. (1992) for C4 plants, with details as described in Knorr and Heimann (2001) and Knorr (1997). Maximum rates of carboxylation (V_cmax) and electron transport (J_max) for the calculation of gross primary production (GPP; see Appendix A) are allowed to vary per PFT. We assume that the observed tight correlation between V_cmax and J_max is conserved irrespective of the precise value for each PFT (Kattge and Knorr). Thus, we introduce a single scaling coefficient f_photos that scales both V_cmax and J_max by the same factor. Prior parameter ranges for each PFT were derived from the TRY database (Kattge et al., 2011).
Table 2. Model parameters used in the data assimilation procedure with their prior and posterior values for the different assimilation experiments. Parameters marked with * represent scalars that are multiplied with their respective value in the model, given in Table D1. The mapping variants are explained in Appendix C: (1) no lower bound; and (2) a lower bound at 0 for those parameters that are not allowed to take negative values.
Autotrophic respiration (Ra) in JSBACH follows Knorr (2000), who assumed that growth respiration is a fixed fraction (20 %) of the net assimilation. Maintenance respiration scales with dark respiration (with a parameter f_aut_leaf), and thus V_cmax, assuming that it is mainly driven by the amount of available photosynthates. The net primary production (NPP, the difference of GPP and Ra) is allocated to either a green or woody pool. Upon senescence, these pools turn over into three litter pools (above ground green, below ground green and woody) with PFT- and pool-specific turnover times. Heterotrophic respiration (Rh) of these pools responds to temperature according to a Q10 formulation (see Appendix A).
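As a reminder of what a Q10 formulation expresses, the sketch below uses the generic textbook form, in which the rate is multiplied by Q10 for every 10 K of warming relative to a reference temperature. The exact JSBACH expression is given in Appendix A and is not reproduced here; the function name, the reference temperature and the example values are assumptions.

```python
def q10_response(temperature, q10, t_ref=0.0):
    """Generic Q10 temperature modifier: q10 ** ((T - T_ref) / 10)."""
    return q10 ** ((temperature - t_ref) / 10.0)


def heterotrophic_respiration(pool_size, turnover_time, temperature, q10,
                              t_ref=0.0):
    """First-order decay of a carbon pool, scaled by the Q10 modifier.

    pool_size / turnover_time is the decay flux at the reference temperature.
    """
    return pool_size / turnover_time * q10_response(temperature, q10, t_ref)
```

With this form, a larger Q10 increases respiration above the reference temperature and decreases it below, so the parameter mainly shapes the seasonal cycle and the latitudinal contrast of the respiratory flux rather than its overall magnitude.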
Prior sensitivity studies have revealed that the most influential parameters controlling carbon storage on land and the partitioning between autotrophic and heterotrophic respiration were the leaf fraction of maintenance respiration (f_aut_leaf) and temperature response (Q10) of the carbon pools, which were both included as parameters in the optimisation. The uncertainty of these parameters has been estimated based on the works of Mahecha et al. (2010) for Q10 and Knorr (2000) for f_aut_leaf.
To account for non-steady-state conditions of the net carbon flux at the beginning of the assimilation period, we followed the approach of Carvalhais et al. (2008) by estimating a global scaling factor for the size of the initial slow pool, f_slow. The inclusion of f_slow in the optimised parameters allows for the modification of global heterotrophic respiration and thereby adjusts the CO2 growth rate by altering the net carbon flux to the atmosphere. However, the limitation of this approach is that it does not change the spatial distribution of carbon pools, which remains entirely controlled by the prior parameter values.
For this first application of the MPI-CCDAS, the most slowly varying pool has been selected (i.e. the soil carbon pool with a turnover time of 100 years). The initial conditions of other carbon pools were not included in the control vector to avoid the associated increase in the computational burden (e.g. run time). This consequently includes the risk of assigning any misrepresentation of modelled pool sizes to the soil carbon pool, and the changes in the carbon pool sizes after the assimilation should be interpreted with care. The uncertainty of f_slow has been set to 10 %, reflecting a moderate deviation from equilibrium (but see also the discussion in Sect. 4.4). The turnover-time parameters (see Eq. A18) were not included in the control vector, because their impact on land carbon fluxes was small compared to other parameters (Schürmann, unpublished results) at the timescale of the MPI-CCDAS (a couple of years).
To account for minor offsets of the MPI-CCDAS with respect to the initial carbon content of the atmosphere, one single offset value CO2^offset is included in the set of estimated parameters (see Sect. 2.3). CO2^offset was assumed to not deviate more than a few ppm, and its uncertainty was set accordingly.
Uncertainties of all parameters were assumed to be Gaussian and exposed to the assimilation procedure in a form normalised by their prior uncertainty.In order to prevent parameters from attaining physically impossible, negative values, some parameters were constrained at the lower end of the distribution to zero (see Table 2 and Appendix C).

Observational constraints and observation operators

2.5.1 Atmospheric CO2
Observed atmospheric CO2 mole fractions were obtained from the flask data/continuous measurements provided by different institutions (e.g. flask data of NOAA/CMDL's sampling network, update of Conway et al., 1994, Japan Meteorological Agency, JMA, Meteorological Service of Canada, MSC, and many others; see Rödenbeck et al., 2003). Stations were selected in order to cover the global latitudinal gradient (Table B1), focussing on remote locations with little imprint of local fluxes. For cross-evaluation, an independent set of available station data was used (Table B2). The temporal resolution of the original CO2 data at the monitoring stations (hourly to daily/weekly) depends on the specific station. The data were averaged to monthly means.
The MPI-CCDAS compares atmospheric CO2 abundances at a monthly temporal resolution. In order to reduce the representation error, simulated CO2 abundances are only considered at observational sampling times. The treatment of the observations of CO2 and their uncertainties follows Rödenbeck et al. (2003). A floor value of 1 ppm is added to this uncertainty, similarly as in Rayner et al. (2005). Ancillary flux fields at monthly resolution were prescribed to represent the ocean (Jena CarboScope pCO2-based mixed layer scheme oc_v1.0, Rödenbeck et al., 2013) and fossil fuel (Emissions Database for Global Atmospheric Research EDGAR, European Commission, Joint Research Centre, JRC) net CO2 fluxes.

TIP-FAPAR
The observations of FAPAR used in the assimilation process were specifically derived for this study by the Joint Research Centre Two-stream Inversion Package (JRC-TIP, Pinty et al., 2007). JRC-TIP is based on an advanced one-dimensional two-stream scheme, which assures a physically consistent solution of the radiative transfer problem in the coupled canopy-soil system (Pinty et al., 2006). It has been explicitly designed to deliver products suitable for assimilation into climate and numerical weather prediction models. Similar schemes are implemented in most state-of-the-art terrestrial biosphere models (e.g. Loew et al., 2014). The product used here was derived by running JRC-TIP on MODIS broadband visible and near-infrared white sky surface albedo input aggregated to the model grid separately for snow-free and snow-like background conditions in a similar way as described for the native 0.01 degree product (Pinty et al., 2011a, b; Clerici et al., 2010; Voßbeck et al., 2010).
Uncertainties in the FAPAR data are based on rigorous uncertainty propagation from the MODIS input albedos using first and second derivative information (Voßbeck et al., 2010). A space- and time-invariant prior (except for the occurrence of snow) is used, i.e. all spatio-temporal variability in the products is derived from the input products (including the MODIS snow flag). In contrast to alternative algorithms, there is no variability imposed through (possibly implicit) assumptions such as the distribution of land cover types (as in Knyazikhin et al., 1999), which avoids potential inconsistencies with the model's own land cover (for more details see Disney et al., 2016). To reduce biases in the retrieved products through the prior information, the prior is given a deliberately low weight, which is a σ of 5 for the effective LAI (Pinty et al., 2011a).
We applied two filters to the global FAPAR product to ensure that potential model structural errors did not lead to compensating effects in the parameter estimation procedure and thus impede fitting the FAPAR data in other regions. First, because no specific crop phenology is implemented in JSBACH, grid cells with fractional crop coverage of more than 20 % have been filtered out. A consequence of this filter is to mask the deciduous broadleaf PFT in the US and Europe, because in these areas, this PFT is collocated in crop-dominated pixels. Hence, the phenological parameters of the deciduous broadleaf PFT are only constrained by observations from other locations, a fact that should be kept in mind when interpreting the deciduous broadleaf parameters. Second, grid points with correlations between the prior model and the observed FAPAR below 0.2 (i.e. prior phenology exhibits out-of-phase seasonal cycles) have also been filtered out. Together, these filters reduce the overall global coverage of the FAPAR constraint and thus the number of observations to be fitted (Fig. 1) by 57 %.
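The two filters can be sketched as a per-grid-cell test. The thresholds (20 % crop cover, correlation of 0.2) come from the text; the function names, the use of the Pearson correlation over the time series, and the decision to reject cells with constant time series are assumptions made for this illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long series; 0.0 if either is constant."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # assumption: an undefined correlation counts as "no skill"
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def keep_cell(crop_fraction, prior_fapar, obs_fapar,
              max_crop=0.2, min_corr=0.2):
    """Return True if the grid cell passes both filters."""
    if crop_fraction > max_crop:      # filter 1: crop-dominated cell
        return False
    # filter 2: prior phenology out of phase with the observations
    return pearson(prior_fapar, obs_fapar) >= min_corr
```

Applying `keep_cell` to every grid cell before building the observation vector d would remove the rejected cells from the FAPAR part of the cost function, while leaving the atmospheric CO2 constraint untouched.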

Experimental set-up
The MPI-CCDAS was driven by daily meteorological forcing (air temperature, specific air humidity, precipitation, downward short- and long-wave radiation, wind speed) obtained from the WATCH forcing data set (Weedon et al., 2014). Annual CO2 mole fractions of the atmosphere as a forcing for the photosynthesis calculations of JSBACH were prescribed according to Sitch et al. (2015). Vegetation distribution (Fig. E1) and other surface characteristics were derived from Pongratz et al. (2008). Although the MPI-CCDAS can be run at any spatial resolution, for computational efficiency, it was applied at a coarse spatial resolution of about 8° × 10°. Note that, as explained in Sect. 2.3, the atmospheric transport itself was simulated at 4° × 5°. Water and carbon-cycle state variables of JSBACH were initialised as follows: first, an equilibrium in terms of stores and long-term fluxes of water and carbon was achieved through repeated integration over the period 1979-1989 with corresponding meteorological forcing and atmospheric CO2 mole fractions of 1979. Starting from this equilibrium state, an integration followed with transient atmospheric and meteorological forcing from 1979 to 2003 but with constant land cover. The final state of 2003 was then taken as the initial condition for all MPI-CCDAS experiments. This spin-up procedure used the prior parameter values, i.e. it was not part of the assimilation loop for the parameter estimation.
Figure 1. Location of the CO2 observations (for constraining the model and for evaluation) and the temporal median of the TIP-FAPAR uncertainties (given with the colour scale) in each pixel acting as a constraint.
The MPI-CCDAS experiments were run for the years 2003-2011 with transient atmospheric and meteorological forcing, but constant land cover. During this period the parameters were left free to adapt to the observational constraints given the optimisation procedure. To allow for non-equilibrium states of the carbon pools at the beginning of these experiments, the assimilation procedure was allowed to modify the initial soil carbon pool (at the end of the spin-up procedure) by a global scaling factor (see Sect. 2.4). The first 2 years of the simulation (2003 to 2004) were used to build a spatial gradient in the simulated atmospheric CO2 mole fractions in accordance with the simulated net carbon exchange, and no observations for these years were included as an observational constraint. In the following years (2005 to 2009), the observational constraints were active. For the final 2 years (2010 to 2011), the constraints were inactive and the observations were used to evaluate the MPI-CCDAS with prior and posterior parameters in a prognostic manner. We used the correlation, bias, root mean squared error and the Nash-Sutcliffe model efficiency (NSE) as evaluation statistics. NSE is defined as
$$\mathrm{NSE} = 1 - \frac{\sum_i (d_i - m_i)^2}{\sum_i (d_i - \bar{d})^2}$$
where the index i denotes individual pairs of observation (d) and model output (m) and an overbar the arithmetic mean. NSE = 1 indicates a perfect model and for all NSE < 0 the mean of the observations is a better predictor than the model itself.
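The four evaluation statistics can be computed from paired observation and model series as in the following sketch. The NSE uses the standard Nash-Sutcliffe definition (one minus the ratio of the summed squared model error to the summed squared deviation of the observations from their mean); the function name, the sign convention for the bias (model minus observation) and the handling of constant series are assumptions.

```python
import math


def evaluation_statistics(obs, mod):
    """Correlation, bias, RMSE and Nash-Sutcliffe efficiency for paired data."""
    n = len(obs)
    d_mean = sum(obs) / n
    m_mean = sum(mod) / n

    sq_err = sum((d - m) ** 2 for d, m in zip(obs, mod))
    var_d = sum((d - d_mean) ** 2 for d in obs)
    var_m = sum((m - m_mean) ** 2 for m in mod)
    cov = sum((d - d_mean) * (m - m_mean) for d, m in zip(obs, mod))

    # Correlation and NSE are undefined for constant series; return NaN then.
    corr = cov / math.sqrt(var_d * var_m) if var_d > 0 and var_m > 0 else float("nan")
    nse = 1.0 - sq_err / var_d if var_d > 0 else float("nan")

    return {
        "correlation": corr,
        "bias": m_mean - d_mean,          # assumed convention: model minus obs
        "rmse": math.sqrt(sq_err / n),
        "nse": nse,
    }
```

The statistics are complementary: a model with a constant offset can have a perfect correlation but a negative NSE, because NSE penalises bias as well as errors in phase and amplitude.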
Our study follows a factorial design to assess the benefit of each data stream, but also to evaluate the potential of assimilating more than one data stream and its effect on the carbon cycle: two experiments, each using one data stream alone as an observational constraint (CO2alone using only atmospheric CO2 observations, and FAPARalone using only the TIP-FAPAR product), and one experiment using both data streams simultaneously as an observational constraint (JOINT), with each data stream equally weighted in the cost function (Eq. 1).
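If the uncertainties of the two data streams are taken to be uncorrelated with each other (an assumption made here for illustration, not stated in the text), the data term of the cost function separates into one contribution per stream. The stream-specific subscripts below are introduced for this sketch only:

```latex
J_{\mathrm{JOINT}}(p) = \frac{1}{2}\Big[
    (M_{\mathrm{CO_2}}(p) - d_{\mathrm{CO_2}})^{T}\, C_{\mathrm{CO_2}}^{-1}\, (M_{\mathrm{CO_2}}(p) - d_{\mathrm{CO_2}})
  + (M_{\mathrm{FAPAR}}(p) - d_{\mathrm{FAPAR}})^{T}\, C_{\mathrm{FAPAR}}^{-1}\, (M_{\mathrm{FAPAR}}(p) - d_{\mathrm{FAPAR}})
  + (p - p_{pr})^{T}\, C_{pr}^{-1}\, (p - p_{pr})
\Big]
```

In this notation, CO2alone and FAPARalone would retain only the first or only the second data term, respectively, together with the prior term, and equal weighting means that neither data term is multiplied by an additional factor.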

Performance of the assimilation
The application of the MPI-CCDAS was successful within a feasible number (29 to 69) of iterations (with run times of 1 to 2 months), increasing from FAPARalone (using only TIP-FAPAR) to CO2alone (using only atmospheric CO2 observations) and JOINT (using both observations simultaneously; Table 3). For all three assimilation experiments, the value of the cost function was considerably reduced, while the posterior parameter values remained in physically plausible ranges. Nevertheless, some parameter values (e.g. T_φ of the CD phenotype) deviated strongly from the prior values (Table 2). For FAPARalone, the value of the cost function was almost halved between the prior and the posterior run. Even stronger reductions of the cost function were obtained in the other two experiments using CO2 as a constraint (Table 3).
Several statistics comparing the posterior model with observations for FAPAR and CO 2 (Tables 4 and 5) show that the model performance of the JOINT experiment was comparable to the performance of the two single data-stream experiments relative to the assimilated quantity.The single data-stream assimilation experiments either showed no improvement with respect to the other data stream (the fit of the CO2alone experiment to TIP-FAPAR), or even a degradation (the fit of the FAPARalone experiment to atmospheric CO 2 observations).By contrast, the JOINT assimilation captured the main features of both data sources.Overall, these results suggest that both data streams can be successfully assimilated jointly with the MPI-CCDAS.Siberian FAPAR @ 59°, 120° (Lat, Lon) Years FAPAR q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q PRIOR CO2alone FAPARalone JOINT Obs During the assimilation procedure, the norm of the gradient1 ∂J ∂p (see Eq. 1) was considerably reduced by 3-4 orders of magnitude (Table 3).During the first tens of iterations of the assimilation procedure, the cost as well as the norm of the gradient were considerably reduced.In this initial phase of the assimilation, the parameter values also changed most strongly.However, some parameter values continue to change in later iterations without substantial reductions in the cost function or the norm of the gradient.The assimilation procedure finally stopped when the changes to the parameters became too small.

Phenology
The statistics of the comparison to the TIP-FAPAR data sets show an improvement of the model-data fit for all experiments relative to the prior model (Table 4). As expected, the improvement was strongest when using FAPAR (FAPARalone and JOINT) as a constraint. One important reason for the improvement was a general reduction in modelled growing-season average FAPAR simulated by the MPI-CCDAS compared to the prior run. This decrease in FAPAR was mostly driven by a reduction of globally averaged foliar area of 0.41 m² m⁻² for the JOINT experiment (0.34 m² m⁻² for FAPARalone and 0.59 m² m⁻² for CO2alone). Almost all PFTs contributed to the decrease in FAPAR, resulting from a reduction in the maximum leaf area index parameter (Λ_max) for tropical deciduous forests, needleleaf deciduous forests, as well as herbaceous PFTs (crops and grasses). In addition, the water-stress parameter τ_w for drought-responsive PFTs played a secondary role in the leaf area reduction. The concurrent increase in foliar area for extra-tropical deciduous and rain-green shrubs played only a minor role in the model-data agreement, since these PFTs cover only a small fraction of the global land area.

In regions with a strong temperature control of phenology, the assimilation changed not only the average LAI during the growing season; the timing of the onset and end of the growing season was also improved, as demonstrated by the enhanced correlation and model efficiency of the MPI-CCDAS with respect to the TIP-FAPAR data (Table 4). This improvement was mostly the result of adjusting the parameters T_φ and t_c, which are the temperature and day-length criteria determining when the vegetation switches from the dormant to the active phase. In particular, the assimilation reduced the temperature control parameter T_φ, which led to an earlier onset of the growing season in the extra-tropical deciduous broadleaf and deciduous needleleaf PFTs. For the deciduous needleleaf forests, the assimilation procedure also resulted in an earlier end of the growing season, in accordance with the observations (see Fig. 2 for an example). The parameters controlling the phenological timing of other PFTs were not strongly altered by the assimilation, which (at the monthly temporal resolution of the satellite data analysed here) led to no observable modification of the temporal behaviour of FAPAR. Notably, the CO2alone experiment also showed some improvement in the correlation and model efficiency compared to TIP-FAPAR, although this experiment did not use the TIP-FAPAR data as a constraint. This suggests that the seasonal cycle of CO2 bears some constraint on the timing of northern extra-tropical phenology.
The FAPARalone assimilation run performed best compared to TIP-FAPAR (Table 4). However, the JOINT experiment yielded a fairly similar (though not identical) performance with respect to the simulated FAPAR. The temporally averaged LAI demonstrates the overall similarity between the FAPARalone and JOINT experiments (Fig. 3). This similarity is also reflected in the parameter values of the phenology: the parameters of FAPARalone and JOINT were often closer to each other than to CO2alone (Table 2). However, in some cases, similar model performance was obtained with diverging model parameterisation: an example of this is the TrBE PFT, for which the parameters of the JOINT and FAPARalone experiments were different, while the modelled foliar area was very similar. An explanation of this feature, which highlights the potential benefits of multi-data-stream assimilation, is given in Sect. 3.4.1. The most pronounced differences between the JOINT and FAPARalone experiments arose at locations where TIP-FAPAR data were not used as a constraint, such as crop-dominated pixels, in which the ETD PFT also covered a substantial part of the grid cell. These differences contributed strongly to the differences in the globally averaged foliar area.
Larger differences in simulated FAPAR occurred between the CO2alone and JOINT experiments (Table 4 and Fig. 3). The CO2alone experiment showed the smallest LAI, and thus the smallest FAPAR. This feature is especially pronounced in tropical regions, where the decrease was driven by the water-control parameter τ_w and the parameter controlling maximum foliar area, Λ_max. The opposite pattern was obtained for the CD PFT, which showed a larger foliar area for CO2alone driven by an increased parameter Λ_max compared to the other two experiments, in which foliar area and Λ_max decreased. The likely explanation of this behaviour is given in Sect. 3.4.2.

Atmospheric CO2
The assimilation procedure strongly reduced the misfit between the observed and modelled atmospheric mole fractions of CO2 when using CO2 as a constraint (CO2alone; Table 5). This was true for the seasonal cycle, the seasonal cycle's amplitude and the 5-year trend (Figs. 4 and 5). Conversely, the FAPARalone experiment showed a strong deterioration of the simulated atmospheric CO2 metrics compared to the prior model. Notwithstanding an improvement of the seasonal cycle amplitude of atmospheric CO2 (Fig. 5), the 5-year trend of atmospheric CO2 conformed much less to the observations, leading to a much faster increase in CO2 than observed (Table 5 and Fig. 4).
Introducing TIP-FAPAR as an additional constraint in the JOINT experiment allowed the MPI-CCDAS to match both the atmospheric CO2 data and the TIP-FAPAR product: the simulated monthly CO2 mole fractions of the JOINT and CO2alone experiments are almost identical for most sites (Table 5 and Figs. 4 and 5).
The improvement of the simulated atmospheric CO2 for the CO2alone and JOINT assimilation runs persisted for the 2 years following the assimilation period, in which the model was run in a prognostic mode (driven by reconstructed meteorology), with only minor degradation in model performance (Table 5). Both experiments clearly outperform the prior model, which is most obvious in the improvement of the NSE for the prognostic period.
The comparison of the simulated posterior atmospheric CO2 mole fractions at the evaluation stations showed a general improvement in the performance measures, with substantial improvements in the simulated bias, RMSE and NSE relative to the prior model (Table 5). Unlike for the set of calibration sites, there was no difference in the improvement between the assimilation period and the subsequent 2-year period, suggesting that the model improvement is of a general nature. In other words, the short-term (1-2-year) prognostic capabilities of the model were substantially improved after assimilating CO2 observations, also at the evaluation locations. The changes in simulated atmospheric CO2 mole fractions originated from substantial changes of the seasonal amplitude and overall strength of the net carbon fluxes simulated by JSBACH. The application of the CO2 constraint increased the global net biome production (NBP) from 1.0 Pg C yr⁻¹ in the prior model to 3.2 Pg C yr⁻¹ in the CO2alone and JOINT experiments. Conversely, using only TIP-FAPAR as a constraint decreased the NBP to −2.2 Pg C yr⁻¹. In other words, using FAPAR data alone turned the biosphere into a net source (Table 6), inconsistent with current understanding of the global carbon cycle (Le Quéré et al., 2015). Despite the similarity of the global NBP for the experiments with CO2 as a constraint, the spatial patterns of NBP were different between the CO2alone and JOINT experiments (Fig. 6). The net uptake in both experiments originated from boreal and tropical regions. However, the JOINT experiment showed an uptake in the boreal regions of coniferous evergreen and coniferous deciduous dominated pixels, whereas the net CO2 uptake in the CO2alone experiment was more concentrated on the coniferous deciduous regions. These differences will be further investigated in Sect. 3.4.2.
While the atmospheric observations constrained the net land-atmosphere CO2 flux, the MPI-CCDAS model parameters act directly only on the gross carbon fluxes: gross primary production (GPP), autotrophic respiration, and heterotrophic respiration (Ra and Rh, respectively). Thus, the changes in simulated NBP were the indirect consequence of altered gross fluxes and land carbon pools. Although the globally integrated posterior GPP values were somewhat different across the experiments (Table 6), the relative latitudinal patterns were fairly similar to each other (Fig. 7): a reduction of GPP occurred globally, but was most prominent in tropical forests and grass/crop dominated regions in the temperate and boreal zone. The GPP reduction was strongest for the CO2alone experiment and weakest (but still very pronounced) for FAPARalone. The generally reduced foliar area directly led to a reduced GPP of the terrestrial biosphere (in all experiments). The changes in the photosynthetic capacity (f_photos) (Table 2) often further reduced GPP. This was most pronounced for the crop and tropical evergreen PFTs (Table 2). In the JSBACH model, Ra is estimated as a direct function of canopy-integrated carboxylation capacity, which strongly correlates with GPP (Eq. A17). Simulated Ra and net primary production (NPP) thus quickly adjusted to the imposed change of GPP.
Application of the CO2 constraint in the CO2alone and JOINT experiments forced heterotrophic respiration (Rh) to be reduced to match the reduced NPP and the imposed atmospheric growth rate of CO2. The reduction in Rh was mainly driven by a reduction of the initial soil carbon pool (via the modifier f_slow) to about 50 % of the prior value for the JOINT and CO2alone experiments (Table 6). Since the net carbon fluxes in the FAPARalone experiment were not constrained by the atmospheric CO2 observations, the assimilation did not adjust the heterotrophic respiration to balance the reduced net primary productivity induced by the altered FAPAR. As a consequence, the net CO2 flux to the atmosphere in the FAPARalone experiment increased, leading to the overestimation of the growth rate of atmospheric CO2 (Fig. 4).
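The flux bookkeeping behind this argument can be illustrated with a short Python sketch. It assumes the simplified balance NPP = GPP − Ra and NBP = NPP − Rh − other losses, where "other losses" (for example disturbance) is a placeholder and not a quantity reported here; JSBACH's actual accounting may differ. The GPP, Ra and Rh values in the second part are hypothetical and chosen only to show the sign change; the JOINT numbers (NPP of 46 PgC yr⁻¹, NBP of 3.2 Pg C yr⁻¹) are those quoted in the text.

```python
def net_biome_production(gpp, ra, rh, other_losses=0.0):
    """Simplified balance (assumption): NBP = (GPP - Ra) - Rh - other losses."""
    npp = gpp - ra
    return npp - rh - other_losses

def implied_losses(npp, nbp):
    """Carbon returned to the atmosphere (Rh plus other losses) implied by NPP and NBP."""
    return npp - nbp

# JOINT experiment, values quoted in the text (PgC per year)
joint_losses = implied_losses(npp=46.0, nbp=3.2)

# Hypothetical numbers: productivity drops while Rh is not re-adjusted,
# which is the situation described for FAPARalone.
nbp_before = net_biome_production(gpp=100.0, ra=50.0, rh=48.0)
nbp_after = net_biome_production(gpp=92.0, ra=46.0, rh=48.0)
```

With unchanged Rh, the reduced productivity in the hypothetical case flips the sign of NBP, mirroring the net source obtained when only FAPAR is used as a constraint.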

Regional differences among the experiments
In the following, we focus on differences in the spatial patterns of the results obtained for tropical regions and the boreal zone to highlight the interplay between parameters in a global, multi-data-stream application of the MPI-CCDAS, either by compensating effects between different model processes within one PFT as occurring in the tropics (Sect. 3.4.1) or by compensations between different parts of the globe (Sect. 3.4.2).
www.geosci-model-dev.net/9/2999/2016/ Geosci. Model Dev., 9, 2999-3026, 2016

Tropics
The modelled foliar area in the tropics (dominated by the tropical evergreen PFT) was similar for the JOINT and FAPARalone experiments (Fig. 3), but smaller for CO2alone.
The simulated GPP of the JOINT experiment (Fig. 7) was somewhat lower than in the FAPARalone experiment, but still substantially larger than that of the CO2alone experiment. Notwithstanding these differences, the simulated net land-atmosphere CO2 exchange (Fig. 6) of the JOINT experiment was closer to the posterior estimate of CO2alone than to that of FAPARalone in terms of absolute values. This result was caused by compensating effects of the two observational constraints (Fig. 8 and Table 2): the phenological parameters, notably τ_w and Λ_max, were substantially different between the FAPARalone and JOINT experiments, yet their modelled foliar area was very similar (Fig. 3). The reason for this was that the photosynthesis parameter modifier f_photos was reduced strongly in the JOINT experiment. This change caused the smaller GPP in the JOINT relative to the FAPARalone experiment. Through the effect of net photosynthesis on canopy conductance (Eq. A14), the potential transpiration rate (E_pot; Eq. A5) was strongly decreased. Together with the increase in τ_w (Eq. A5) in the JOINT experiment, the decline in E_pot had the same effect on the simulated phenology as the smaller parameter changes in the FAPARalone experiment. The lack of an FAPAR constraint in the CO2alone experiment allowed the assimilation to overly reduce the foliar area by increasing τ_w at the prior rate of photosynthesis and thus E_pot to satisfy the constraint by the atmospheric CO2 observations. As a consequence, due to the water-cycle feedback, the modelled foliar area was clearly different between the JOINT and CO2alone experiments.

Boreal zone
The CO2alone and JOINT experiments showed similar global statistics when compared with atmospheric CO2 observations (Table 5 and Fig. 4). Their global and hemispheric net carbon uptake was similar (Northern Hemisphere: 2.24/2.20 PgC yr⁻¹; Southern Hemisphere: 0.98/0.98 PgC yr⁻¹), but their underlying spatial patterns were different, in particular in the boreal zone (Fig. 6). The entire boreal zone took up a large share of the global carbon sequestration in the JOINT experiment (0.88 PgC yr⁻¹), especially in coniferous deciduous (CD) dominated regions of Eastern Siberia (0.30 PgC yr⁻¹). The CO2alone experiment showed a similar net carbon uptake in the boreal region, but the uptake in the CD dominated region was 0.16 PgC yr⁻¹ stronger than in the JOINT experiment. This difference was mainly driven by larger foliar area and increased leaf-level productivity (parameter f_photos) of the CD PFT in the CO2alone experiment. In the same latitudinal band, coniferous evergreen trees showed reduced foliar area in the CO2alone experiment compared to the JOINT experiment, reducing the net uptake by 0.16 PgC yr⁻¹, such that the differences in these regions cancel each other. These relatively small spatial differences do not prevent the posterior JOINT and CO2alone experiments from capturing the amplitude of the seasonal cycle at individual northern-most stations. This largely increased sink in Eastern Siberia could be an artefact of the set-up used for the data assimilation in this study. No nearby atmospheric stations constrain the net carbon sink in this region adequately, and the CD PFT only occurs dominantly in this region. In consequence, the PFT's parameters cannot be adequately constrained by carbon-cycle observations from other parts of the globe. This relative scarcity of observations and independence from other regions allows the Eastern Siberian net carbon uptake to compensate for other regions' fluxes in order to match the global growth rate. Additional observations would be required to allow for spatially higher resolved estimation of the net fluxes.
[Figure: legend PRIOR, CO2alone, FAPARalone, JOINT, Observations; periods marked: Constraining period, Evaluation]
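The compensation described in this section can be checked with simple arithmetic. The Python sketch below uses only the numbers quoted in the text; the order within "2.24/2.20" is assumed to be CO2alone/JOINT, following the order in which the experiments are named, and the variable names are ours.

```python
# Net carbon uptake in PgC per year, as quoted in the text. The order within
# "2.24/2.20" is assumed to be CO2alone/JOINT.
hemispheric_uptake = {
    "CO2alone": {"NH": 2.24, "SH": 0.98},
    "JOINT": {"NH": 2.20, "SH": 0.98},
}
global_uptake = {name: v["NH"] + v["SH"] for name, v in hemispheric_uptake.items()}

# Differences (CO2alone minus JOINT) within the boreal band
cd_region_difference = +0.16         # coniferous deciduous regions: stronger uptake
evergreen_region_difference = -0.16  # coniferous evergreen regions: weaker uptake
net_boreal_difference = cd_region_difference + evergreen_region_difference
```

Both hemispheric sums round to the global NBP of about 3.2 Pg C yr⁻¹ reported for the two experiments, and the two regional differences cancel, which is why the atmospheric stations cannot distinguish the two spatial patterns.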

Comparison of the simulated carbon cycle with independent estimates
We have demonstrated that the JSBACH model is capable of reproducing the seasonal cycle and 5-year trend of the observed atmospheric CO2 (Figs. 4 and 5, and Table 5). During the assimilation run, we have applied a careful selection of stations to avoid the impact of local sources on modelled atmospheric CO2 mole fractions, which cannot be simulated with the current coarse resolution of the MPI-CCDAS. The evaluation at the cross-validation sites, which are located on land and thus closer to locally varying source patterns, also demonstrated a good skill of the posterior model for these sites. Overall, this suggests that the improvement of the MPI-CCDAS's ability to capture the observed CO2 dynamics at monthly to yearly timescales is reasonably robust. Our results further support the finding of earlier studies (Rayner et al., 1999; Kaminski et al., 1999; Peylin et al., 2013) that the observational network of atmospheric CO2 only constrains a limited number of spatio-temporal flux patterns.
[Figure: legend PRIOR, CO2alone, FAPARalone, JOINT, Observation]
The application of the CCDAS led to significant changes of the modelled carbon cycle in JSBACH. The average global GPP of the JOINT experiment was substantially reduced relative to the prior run and was only slightly lower than independent, data-driven estimates of 119 ± 6 PgC yr⁻¹ (Jung et al., 2011) and 123 ± 8 PgC yr⁻¹ (Beer et al., 2010), as well as estimates of comparable land-surface models (ranging from 111 to 151 PgC yr⁻¹; Piao et al., 2013). Partly driven by the reduction of GPP, the NPP was also significantly reduced to 46 PgC yr⁻¹ in the JOINT experiment. While such a value is lower than the commonly accepted reference value of 60 PgC yr⁻¹, it is still compatible with the range of available estimates for NPP of 44-66 PgC yr⁻¹ (Cramer et al., 1999; Saugier and Roy, 2001). The latitudinal distribution of GPP in comparison to an empirical estimate based on satellite data and field measurements (Jung et al., 2011) shows that the global reduction of GPP led to a better agreement of GPP in the northern extra-tropics between 30 and 60° N, but to a lower GPP in the tropical rain forests (Fig. 7). The reduction of GPP in the northern extra-tropics is likely associated with the overestimation of the seasonal cycle of atmospheric CO2 by the prior model, which was successfully reduced primarily by reducing northern extra-tropical productivity, in particular in temperate and boreal grasslands. Nevertheless, our study supports earlier findings that despite some constraint on northern extra-tropical production, the constraint of observed atmospheric CO2 on global production is small (Koffi et al., 2012).
A detailed comparison of the simulated vegetation and soil carbon stocks is beyond the scope of this paper, partly because the simplifications of the spin-up procedure entail biases in predicted vegetation and soil carbon stocks, as transient land-use changes, forest management, and forest-age structure are ignored. It is nevertheless instructive to compare the simulated vegetation and soil carbon stocks to global totals from independent estimates to provide the context for the global carbon cycle simulated by MPI-CCDAS. The posterior experiments showed only slightly less carbon in vegetation (389-420 PgC) than the prior model (424 PgC; see Table 6). All of these estimates are lower than the 556 PgC vegetation carbon based on updated Olson's major world ecosystem carbon stocks², but are comparable to a more recent estimate of global vegetation carbon storage of 442 ± 146 PgC (Carvalhais et al., 2014). The posterior amount of soil carbon from the assimilation runs using atmospheric CO2 as a constraint compares favourably (within the uncertainty) to the estimate of 1343 PgC based on the Harmonized World Soil Database (HWSD)³. This estimate is more appropriate for the presented comparison than the more recent and higher estimate of soil carbon by Carvalhais et al. (2014) of 1836-3257 PgC (95 % confidence interval), as the latter includes estimates of permafrost carbon, which is not modelled with the current version of the MPI-CCDAS.

Figure 7. Latitudinal distribution of GPP for the prior and posterior models compared to the independent estimates of Jung et al. (2011). [Axes: Latitude, GPP (PgC yr⁻¹); legend: PRIOR, CO2alone, FAPARalone, JOINT, Jung et al. (2011)]
Our estimate of the net land carbon sink using atmospheric CO2 as a constraint is slightly larger than the residual land carbon sink estimate (without inclusion of land-use change fluxes) inferred from atmospheric measurements and auxiliary fluxes by Le Quéré et al. (2015), who derived a net uptake of 2.4 ± 0.8 PgC yr⁻¹ for the period 2000-2009. Correcting this estimate for the pre-industrial lateral carbon fluxes from land to the ocean via rivers would increase the terrestrial net land C uptake seen by the atmosphere (and thus the MPI-CCDAS) to 2.85 PgC yr⁻¹ (see Le Quéré et al., 2015 and Jacobson et al., 2007). Due to the interannual variability of the land sink, the shorter time period of our sink estimate may have contributed to the difference between the estimates. However, it is more likely that the reason for the difference is the prescribed, comparatively small, net ocean carbon uptake of 1.1 PgC yr⁻¹ (Rödenbeck et al., 2013). This net ocean uptake applied in the MPI-CCDAS compares to the estimate of 2.4 ± 0.5 PgC yr⁻¹ of Le Quéré et al. (2015)⁴, which decreases to 1.95 PgC yr⁻¹ when correcting the estimate for the dissolved organic carbon (DOC) transport from land to oceans via river systems. Bearing in mind that the atmospheric CO2 observations more directly constrain the net global carbon fluxes at seasonal and annual scales rather than the gross land fluxes or land carbon pools, assuming a larger ocean net carbon uptake would have reduced the net land uptake simulated by MPI-CCDAS. Explicitly accounting for DOC-based carbon losses from land in the JSBACH model would probably help to close the gap between the estimates and thereby reduce the estimated land carbon storage inferred from the atmospheric data. Adding such a process formulation would thus permit the MPI-CCDAS to generate an estimate which is more compatible with that of Le Quéré et al. (2015).
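The trade-off between the prescribed ocean uptake and the inferred land uptake can be made explicit with the numbers quoted above. The Python sketch below assumes that, for a given atmospheric growth rate and given fossil emissions, the combined land plus ocean sink is fixed, so that any change in the prescribed ocean uptake is mirrored by the land uptake. The last quantity is illustrative arithmetic under that assumption, not a result of the MPI-CCDAS, and the two sets of estimates refer to different periods.

```python
def land_uptake_given_total(total_sink, ocean_uptake):
    """Land uptake left over when the combined land + ocean sink is fixed."""
    return total_sink - ocean_uptake

# Values quoted in the text (PgC per year)
le_quere_land, le_quere_ocean = 2.4, 2.4
river_corrected_land, river_corrected_ocean = 2.85, 1.95
ccdas_land, ccdas_ocean = 3.2, 1.1

# The river correction moves 0.45 PgC/yr from the ocean to the land estimate
# without changing their sum.
total_le_quere = le_quere_land + le_quere_ocean
total_corrected = river_corrected_land + river_corrected_ocean

# Combined sink implied by the MPI-CCDAS numbers, and the land uptake that would
# remain if the river-corrected ocean uptake were prescribed instead
# (illustrative arithmetic only).
total_ccdas = ccdas_land + ccdas_ocean
hypothetical_land = land_uptake_given_total(total_ccdas, river_corrected_ocean)
```

Under the stated assumption, prescribing the larger ocean uptake would lower the land uptake from 3.2 to about 2.35 PgC yr⁻¹, which is the direction of change argued for in the text.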

Comparison to previous studies
Our results are consistent with earlier studies, which showed that JSBACH overestimates the seasonal cycle amplitude of atmospheric CO2 (Dalmonech and Zaehle, 2013). The posterior estimate of this amplitude was considerably reduced, leading to an improved model performance in all three experiments (Fig. 5). This also holds for FAPARalone, for which the comparison with CO2 is an independent evaluation. Note that the prior we reported here already relies on an adjusted Λ_max parameter for the CE PFT (see Sect. 2.6). For the run with the off-the-shelf configuration of JSBACH as applied in Dalmonech and Zaehle (2013, results not shown), the high-latitude mean seasonal cycle amplitude was around 30 ppm, implying an overestimation of about 15 ppm. In the prior experiment including the adjusted Λ_max for the CE PFT, this overestimation was reduced to about 5-10 ppm. Applying only FAPAR as a constraint further reduced the overestimation of the high-latitude mean seasonal cycle amplitude (FAPARalone experiment in Fig. 5). Adding CO2 as a constraint further improved the fit to the seasonal cycle amplitude. In other words, boreal phenology, in particular maximum annual leaf area, has a considerable control on the seasonal cycle of the high-latitude atmospheric CO2 signal. Using TIP-FAPAR helped to improve this metric of the carbon cycle despite the deterioration of the simulated longer-term CO2 trend (Fig. 4).
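One possible way to compute a mean seasonal cycle amplitude from a monthly CO2 record is sketched below in Python. The exact procedure used for Fig. 5 is not described in this section, so the approach (removing a linear trend, averaging by calendar month, and taking the peak-to-trough difference) is an assumption, as are the function name and the synthetic test series.

```python
import numpy as np

def mean_seasonal_amplitude(monthly_co2):
    """Peak-to-trough amplitude of the mean seasonal cycle after linear detrending."""
    x = np.asarray(monthly_co2, dtype=float)
    if x.size % 12 != 0:
        raise ValueError("expected whole years of monthly values")
    t = np.arange(x.size)
    slope, intercept = np.polyfit(t, x, 1)          # linear trend over the record
    detrended = x - (slope * t + intercept)
    mean_cycle = detrended.reshape(-1, 12).mean(axis=0)  # average by calendar month
    return float(mean_cycle.max() - mean_cycle.min())

# Synthetic 5-year series (hypothetical): 2 ppm/yr trend plus a 15 ppm peak-to-trough cycle
months = np.arange(60)
synthetic = 380.0 + (2.0 / 12.0) * months + 7.5 * np.sin(2.0 * np.pi * months / 12.0)
amplitude = mean_seasonal_amplitude(synthetic)
trend_only_amplitude = mean_seasonal_amplitude(380.0 + (2.0 / 12.0) * months)
```

Because a linear fit over a finite record absorbs a small part of the seasonal signal, the recovered amplitude is close to, but not exactly, the prescribed 15 ppm; more careful detrending would be needed for quantitative work.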
This conclusion is also supported by Kaminski et al. (2012), who constrained the BETHY-CCDAS jointly with atmospheric CO2 data and a different FAPAR product (Gobron et al., 2007). They found an improved seasonal cycle amplitude of CO2 for their joint assimilation, which is in line with our findings. Through factorial uncertainty propagation with their assimilation scheme, Kaminski et al. (2012) also found that the inclusion of FAPAR yields only a moderate uncertainty reduction in the simulated carbon fluxes and mainly decreases the water flux uncertainties. Kaminski et al. (2012) therefore suggested that FAPAR only added little information to the modelled carbon cycle in addition to atmospheric CO2. In contrast, we have shown here a considerable impact of the FAPAR data set by altering the spatial net carbon flux patterns between the JOINT and CO2alone experiments.
Our study showed considerable differences in the GPP estimates, which were not reflected in the net carbon fluxes for the CO2alone and JOINT cases, as the net flux is more directly constrained by the atmospheric CO2 observations. Using a variant of the BETHY-CCDAS, Koffi et al. (2012) also found large differences in their posterior GPP estimates ranging from 109 to 164 PgC yr⁻¹ resulting from the use of alternative transport models, atmospheric station densities, and prior uncertainties. As in our study, their large GPP range was not reflected in large differences of the net land carbon flux. Our work thus supports earlier findings (Rayner et al., 2005; Scholze et al., 2007; Koffi et al., 2012) that despite some constraint on northern extra-tropical GPP, the global land GPP cannot be well constrained with atmospheric CO2 alone.
A striking difference to the results of Koffi et al. (2012) occurred in the tropics, where BETHY-CCDAS overestimated GPP compared to data-driven estimates, whereas the MPI-CCDAS underestimated GPP. As will be discussed below (Sect. 4.4), the underestimation of tropical GPP with MPI-CCDAS is likely a compensating effect arising from the respiration part of the model that only can be modified globally. This is not the case for the BETHY-CCDAS, which allows for a spatially more explicit control on heterotrophic respiration. It thus appears likely that a larger posterior GPP in the MPI-CCDAS could be expected with a system allowing for more spatial freedom in the parameterisation of respiration processes, for instance, by making f_aut_leaf and f_slow a function of plant functional type. Additional information to further reduce uncertainty in the spatial distribution of the gross fluxes (GPP and ecosystem respiration), especially in tropical regions, is therefore required. Improvements made on the gross fluxes will likely also propagate to an improved estimate of the net CO2 fluxes.

Discussion of the assimilation procedure
The results clearly show that two data streams can be successfully integrated with the MPI-CCDAS. The posterior parameter values (Table 2) were different between FAPARalone and JOINT, as well as the CO2alone and JOINT experiments. This demonstrates that the joint use of the two data streams added information to the posterior parameter vector by preventing the degradation of the phenology simulation when trying to fit the CO2 observations (Tables 4 and 5). This conclusion is also supported by the fact that the value of the cost function of the JOINT assimilation roughly equals the sum of the single-data-stream experiments, indicating consistency of the model with both data streams.
Hence, although the JSBACH phenology is only weakly influenced by the carbon-cycle component of JSBACH and mainly controlled by other drivers (e.g. soil moisture, temperature), there are strong interactions among carbon and water cycle parameters and simulated FAPAR, a finding supported by Forkel et al. (2014). The combination of the two data streams in the JOINT experiment helped to keep parameters within acceptable bounds. The capability of assimilating multiple data streams simultaneously is a distinct advantage of the MPI-CCDAS over alternative strategies that assimilate multiple data streams following a sequential design of assimilating FAPAR prior to carbon-cycle information. An implementation of such a sequential assimilation likely reduces the number of parameters to be optimised in each step, and therefore allows a quicker solution of the optimisation problem. However, this advantage comes at the cost of breaking the linkage between parameters, because side-effects of parameter variations on other modelled quantities are ignored in the assimilation process. This can lead to simulation results in which the posterior model of a sequential assimilation experiment does not match the observations as well as one obtained by simultaneous assimilation of the data streams. Since our results have demonstrated that a joint assimilation is feasible without impairing the fit to the individual data sources, a joint assimilation approach therefore appears advisable.
The assimilation procedure achieved a strong reduction of the cost function and the norm of the gradient (see Table 3). Although the relative reduction in the norm of the gradient was larger in the CO2 cases than in the FAPARalone case, the norm did not approach zero, contrary to the FAPARalone case. Such a non-zero gradient was also noted by Rayner et al. (2005) in their CO2 assimilation with the BETHY-CCDAS. The fact that the MPI-CCDAS successfully reduces the norm of the gradient for FAPAR suggests that this is not a general failure of the MPI-CCDAS but is specific to the particularities of the CO2 set-up. It is presently unclear what is causing the assimilation to fail to reach the minimum of the cost function, warranting further investigation of the non-linear nature and potential numerical issues regarding the computation of the gradient ∂J/∂p (Eq. 1). Further tests with alternative station network settings, parameter priors, or time periods for data assimilation will provide more insight into potential solutions to tackle this issue. Nevertheless, we believe that our results can still be meaningfully interpreted and used to evaluate the general capacity of the MPI-CCDAS as a comprehensive data assimilation tool.
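A common first diagnostic for potential numerical issues in a gradient is to compare the coded derivative against central finite differences. The Python sketch below shows the idea on a toy two-parameter cost function; the cost function, its hand-derived gradient and the step size are hypothetical stand-ins and have no relation to the JSBACH tangent-linear code.

```python
import numpy as np

def cost(p):
    """Toy non-linear cost standing in for J(p) (hypothetical)."""
    return 0.5 * (p[0] - 1.0) ** 2 + 0.5 * (p[0] * p[1] - 2.0) ** 2

def gradient(p):
    """Hand-derived gradient of the toy cost."""
    r = p[0] * p[1] - 2.0
    return np.array([(p[0] - 1.0) + r * p[1], r * p[0]])

def finite_difference_gradient(f, p, h=1e-6):
    """Central finite-difference approximation of the gradient of f at p."""
    p = np.asarray(p, dtype=float)
    g = np.zeros_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = h
        g[i] = (f(p + step) - f(p - step)) / (2.0 * h)
    return g

p = np.array([0.5, 3.0])
analytic = gradient(p)
numeric = finite_difference_gradient(cost, p)
max_abs_diff = float(np.max(np.abs(analytic - numeric)))
gradient_norm = float(np.linalg.norm(analytic))
```

A large discrepancy between the two gradients at some parameter vector would point to a coding or differentiability problem, whereas close agreement combined with a persistently non-zero norm would rather point to the optimiser or to the shape of the cost function.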

Comments on the parameter set-up
The results presented in Sect. 3.2 show that there is a certain degree of equifinality in the parameter values obtained from the assimilation of TIP-FAPAR. This can happen when (i) certain parameters enter an insensitive regime where parameter differences hardly propagate to differences in the modelled foliar area, (ii) pixels are a composite of different plant functional types that can show compensating effects, and (iii) the atmospheric CO2 constraint imposes an additional weight on changing FAPAR, because of the feedbacks through photosynthesis and stomatal conductance.
A cautionary note about the posterior parameter values is warranted: some of the parameters of the JOINT and CO2alone experiments were altered strongly compared to the assumed prior uncertainty. This is possible within the MPI-CCDAS because the prior contribution to the cost function is weak, owing to the small number of parameters compared to the number of observations. One example is the f_slow parameter, which controls the initial soil carbon pool size and thus the disequilibrium between GPP and respiration (Table 2). Another example is the photosynthesis parameter f_photos for the tropical evergreen PFT in the JOINT experiment, which was reduced by more than 2.5 times the prior uncertainty and to roughly 75 % of its prior value. As a consequence, the assimilation procedure can result in parameter values with small prior probabilities. This points either toward prior uncertainties that are too tight or to model structural problems.
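The size of such a change, and why the prior term does little to resist it, can be made concrete with a short sketch. The normalised change follows the definition used for Fig. 8; the Gaussian form of the prior penalty and all numerical values are assumptions chosen for illustration.

```python
def normalised_change(p_post, p_prior, sigma_prior):
    """Posterior-minus-prior change in multiples of the prior uncertainty."""
    return (p_post - p_prior) / sigma_prior

def prior_penalty(p_post, p_prior, sigma_prior):
    """Contribution of one parameter to an assumed Gaussian prior term."""
    return 0.5 * normalised_change(p_post, p_prior, sigma_prior) ** 2

# Illustrative numbers only: prior value scaled to 1, posterior at 75 % of it,
# prior uncertainty of 10 % (hypothetical, chosen to give a 2.5-sigma change).
z = normalised_change(0.75, 1.0, 0.1)
penalty = prior_penalty(0.75, 1.0, 0.1)
print("change in prior sigmas:", z)
print("prior cost contribution:", penalty)
```

A penalty of a few units per parameter is small when the observational term sums over a much larger number of observations, which is consistent with the weak prior contribution described above.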
One such structural problem is that the current MPI-CCDAS excludes the model spin-up from the assimilation procedure for reasons of computational efficiency. The current version of the MPI-CCDAS manipulates the initial soil carbon pool by one globally valid modifier. This choice was made because controlling the spatial structure of the carbon pools would require several more parameters to be optimised, which would very likely suffer from a strong equifinality problem and would considerably extend the already long run time of the MPI-CCDAS. Our results demonstrate that this spin-up approach allows us to adequately reproduce the space-time structure of the atmospheric CO2 budget at the timescale of several years (Fig. 4 and Table 5). However, this approach likely introduces an imprint of the spatial distribution of the prior productivity on the final model outcome, which may impair the ability of the MPI-CCDAS to accurately capture the spatial distribution of the net land carbon uptake. In turn, this approach will also affect the posterior parameter vector. Allowing for more spatially explicit modifiers for the initial carbon pools (as is done in the BETHY-CCDAS), for instance by linking the initial soil disequilibrium to a particular PFT, would be a first step forward.
Another structural problem of the MPI-CCDAS is the stiffness of the respiration parametrisation in JSBACH (with only a few adjustable parameters). This feature likely contributed strongly to the propagation of low temperate GPP into the tropical zone. Because the overall net CO2 flux is constrained by the atmospheric observations, the reduction in temperate GPP required a corresponding adjustment of the ecosystem respiration to balance the budget. While lowering GPP also reduces autotrophic respiration (Eq. A17), any further reduction in respiration in the temperate zone by adjusting autotrophic (f_aut_leaf) or heterotrophic respiration parameters (Q_10, f_slow) would also affect tropical respiration, because in the current version of the MPI-CCDAS these parameters are assumed to be valid globally. To balance the budget, a reduction in tropical GPP, associated with the strong reduction of f_photos for the tropical evergreen PFT in the JOINT experiment, might have been required. It is unlikely that the reduction of tropical GPP was associated with a phase shift in the dry-wet cycle in the Amazon rain forest, as no phase mismatch in atmospheric CO2 is observed at Mauna Loa (Fig. 4) that would suggest such a problem.

Further development of the MPI-CCDAS
The application of the MPI-CCDAS allows detection of model structural errors and/or deficits in the set-up, which can then lead to a reformulation of the forward model (see e.g. Kaminski et al., 2003; Rayner et al., 2005; Williams et al., 2009; Kaminski et al., 2013). The framework described here can be steadily improved through regular improvements of the JSBACH model structure, by adding missing model parameterisations or correcting flawed ones (e.g. Knauer et al., 2015). The system is also versatile enough to add more constraints from relevant and complementary data sources (Luo et al., 2012) to obtain more robust regional estimates than the current atmospheric inversion allows. Besides the previously discussed limitations related to the spin-up, the representation of initial carbon pools, and ecosystem respiration, we also suggest other analyses and developments to further improve the MPI-CCDAS.
The discrepancies between FAPARalone and JOINT in the foliar area estimates for crop-dominated regions originate from the exclusion of TIP-FAPAR as a constraint for these regions. This exclusion also affected the extra-tropical deciduous PFT, which co-occurred dominantly in the same pixels. Increasing the constraining power of TIP-FAPAR, either by adding more pixels as constraints or by increasing the resolution to finer grids, might further improve the phenology. In this context we note that the per-pixel uncertainty ranges in the TIP-FAPAR product also reflect limitations of the information content that can be derived from sunlight reflected to space in the optical domain (i.e. the input to TIP), in particular over dense canopies. Formal uncertainty propagation can quantify the information content in the FAPAR product on gross fluxes or, conversely, derive accuracy requirements for optical products (Kaminski et al., 2012).
We demonstrated the value of using a CCDAS instead of a pure atmospheric inversion to estimate the land net carbon flux, because the CCDAS can ingest complementary data streams, which may help to further constrain the regional estimates of the net land carbon flux. In this first version of the MPI-CCDAS, we have assumed the net fluxes other than those simulated with JSBACH, i.e. fossil fuel emissions and ocean exchange, as well as the atmospheric drivers of JSBACH, to be perfectly known. We thereby attribute all model-data mismatches to shortcomings of the land-surface model. It would be desirable to also account for the uncertainties in these components of the modelling system to more robustly identify potential model shortcomings. Further assessing the relative importance of different error sources (e.g. in the land cover type parameterisation, model biases, or observational errors) with a system such as the MPI-CCDAS would allow us to highlight priority areas to reduce their uncertainties and further constrain the global carbon-cycle numbers as given in Table 6.
Our results show that applying FAPAR and atmospheric CO2 as constraints for the JSBACH model leads to an improved simulation of phenology and northern extra-tropical GPP. As a consequence of the assimilation procedure, the model also captures the magnitude of the global and hemispheric NBP. This is a major step forward to including better constrained terrestrial models for the estimation of the global carbon budget (Le Quéré et al., 2015). However, we have set up the model such that it attributes the difference between the prior and posterior sinks (i.e. 2.2 PgC yr−1) to changes in the soil carbon storage. It has long been known that the terrestrial net carbon uptake, and thus the CO2 signal seen by the atmospheric observations, is strongly affected by natural disturbances (such as fire) and anthropogenic disturbances (such as land-use change; Houghton et al., 2012). These processes contribute to the disequilibrium of vegetation and soil carbon pools with vegetation production, and thus affect the spatial pattern of terrestrial carbon release and uptake. Without consideration of these processes, one should be careful in analysing the carbon-cycle trends projected by the MPI-CCDAS and in attributing drivers to these trends. The tangent-linear version of the JSBACH model contained in the MPI-CCDAS already has the appropriate modules to simulate disturbance by fire (Lasslop et al., 2014) and land use (Reick et al., 2013). A further development of the MPI-CCDAS could be to activate these processes. To improve on the current situation, it might also be desirable to constrain the post-disturbance dynamics of the carbon pools, or at least to analyse how well these are constrained. This would also allow one to add more data streams to potentially disentangle the tight parameter linkages in the model.
The assimilation of 5 years of remotely sensed FAPAR and atmospheric CO2 observations with the MPI-CCDAS was generally successful, as the fairly substantial model-data mismatch of the prior model was largely reduced. In particular, the assimilation procedure strongly reduced the overly large prior estimate of GPP, and generally led to an improvement of the simulated carbon cycle and its seasonality. The resultant carbon-cycle estimates compared favourably to independent data-driven estimates, although tropical productivity was lower than these estimates. The posterior global net land-atmosphere flux was well constrained and commensurate with independent estimates of the global carbon budget. Our analysis of the prognostic fluxes for a consecutive 2-year period as well as at stations withheld from the assimilation procedure demonstrates that our results are robust.
The factorial inclusion of FAPAR and atmospheric CO2 as constraints clearly demonstrated that the two data streams can be simultaneously integrated with the MPI-CCDAS. We have shown the potential of multiple-data-stream assimilation by adding TIP-FAPAR as a constraint and have shown how this data stream helps in constraining the foliar area without degrading the ability of the model to capture seasonal and yearly dynamics of the atmospheric CO2 mole fractions. However, the multi-data assimilation also pointed to model structural problems in the initialisation, which need to be addressed. Nevertheless, our study highlights the potential of adding new data streams to constrain more processes in a global ecosystem model.
This study provides an important step forward in the development of global atmospheric inversion schemes. Adding a process-based component to these inversion systems allows one to disentangle the drivers of the terrestrial carbon balance. It also gives the opportunity to apply multiple data streams to constrain these drivers. Applying a data-assimilation system to a land component of a coupled carbon-cycle climate model provides a means to continuously improve carbon flux simulations in this coupled model. Improving the assimilation system on the one hand and adding more data streams on the other hand can ultimately lead to regionally constrained estimates of the terrestrial carbon balance for the assessment of current and future trends.

Code availability
The JSBACH model code is available upon request to S. Zaehle (szaehle@bgc-jena.mpg.de).
The TAF-generated derivative code is subject to license restrictions and is not available.

J_c = k C_i and J_i = α_i I, with the quantum efficiency α_i = 0.04 and k given by J_max × 10^3 multiplied by an exponential term with E_K = 50 967 J mol−1. Dark respiration is modelled as depending on V_cmax, with activation energy E_R = 45 000 J mol−1 and f_r,C3|C4 = 0.011|0.031 for C3 and C4 plants, respectively. Dark respiration is reduced to 50 % of its value during light conditions (Brooks and Farquhar, 1985). Photosynthesis and dark respiration are inhibited above 55 °C. Calculations are performed per PFT and for three distinct canopy layers, which vary in depth according to the current leaf area index, assuming that within the canopy, nitrogen, and thus V_cmax, J_max, and R_d, decline proportionally with light levels in the canopy. GPP values per PFT are integrated to grid-cell averages according to the cover fractions of each PFT within each grid cell.

A3 Carbon-water coupling
JSBACH employs a two-step approach to couple the plant carbon and water fluxes (Knauer et al., 2015). Given a photosynthetic-pathway-dependent specific maximal internal leaf CO2 concentration (C_i), a maximal estimate of stomatal conductance (gs_pot) is derived for each canopy layer, which is then reduced by a water-stress factor (w_s) to arrive at the actual stomatal conductance (gs_act) (see Knorr, 1997, 2000, and references therein).
Here, C_a and C_i are the external and internal leaf CO2 concentrations. The water-stress factor w_s is defined as a function of W_root, the actual soil moisture in the root zone, and of W_crit and W_wilt, the soil moisture levels at which stomata begin to close or reach full closure, respectively. Soil moisture and bare soil evaporation are calculated according to the multi-layer soil water scheme of Hagemann and Stacke (2014).
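The second step of this coupling can be sketched as follows. The linear ramp of w_s between W_wilt and W_crit and the multiplicative reduction of gs_pot are assumptions made for illustration, as are the function names and numerical values; see Knauer et al. (2015) for the formulation actually used in JSBACH.

```python
def water_stress(w_root, w_crit, w_wilt):
    """Assumed linear ramp: 0 at or below w_wilt, 1 at or above w_crit."""
    frac = (w_root - w_wilt) / (w_crit - w_wilt)
    return min(1.0, max(0.0, frac))

def actual_conductance(gs_pot, w_root, w_crit, w_wilt):
    """Scale the potential stomatal conductance by the water-stress factor."""
    return water_stress(w_root, w_crit, w_wilt) * gs_pot

# Hypothetical soil moisture levels and conductance (arbitrary units).
for w_root in (0.05, 0.25, 0.50):
    gs_act = actual_conductance(0.2, w_root, w_crit=0.4, w_wilt=0.1)
    print(w_root, gs_act)
```

Below W_wilt the sketch closes the stomata completely, above W_crit it leaves gs_pot unchanged, and in between the conductance scales with the available soil moisture.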
Given the water-stressed stomatal conductance, leaf internal CO2 concentration and carbon assimilation are then recalculated for each canopy layer by solving simultaneously the diffusion equation (Eq. A14) and the photosynthesis equations as outlined above (Sect. A2).

A4 Land carbon pools, respiration and turnover
The vegetation's net primary production (NPP) is related to the net assimilation (A) as NPP = A − R_m − R_g, where R_g is the growth respiration, which is assumed to be a fixed fraction (20 %) of A − R_m. R_m is the maintenance respiration, which is assumed to be coordinated with foliar photosynthetic activity, and thus scaled to leaf dark respiration via f_aut_leaf (Knorr, 2000), with the dark respiration R_d as given in Eq. (A13). As a consequence, an increase in f_aut_leaf leads to an increase in NPP.
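The effect of f_aut_leaf on NPP can be illustrated with a minimal sketch. It assumes NPP = A − R_m − R_g and the scaling R_m = R_d / f_aut_leaf; the latter is an assumption chosen because it reproduces the stated behaviour (NPP rises with f_aut_leaf), and all flux values are hypothetical.

```python
GROWTH_FRACTION = 0.2  # fixed fraction of (A - R_m), as stated in the text

def maintenance_respiration(r_d, f_aut_leaf):
    """Assumed scaling of maintenance respiration: R_m = R_d / f_aut_leaf."""
    return r_d / f_aut_leaf

def npp(a_net, r_d, f_aut_leaf):
    """NPP = A - R_m - R_g, with R_g = GROWTH_FRACTION * (A - R_m)."""
    r_m = maintenance_respiration(r_d, f_aut_leaf)
    r_g = GROWTH_FRACTION * (a_net - r_m)
    return a_net - r_m - r_g

# Hypothetical fluxes in arbitrary but consistent units.
npp_low = npp(a_net=10.0, r_d=2.0, f_aut_leaf=0.4)
npp_high = npp(a_net=10.0, r_d=2.0, f_aut_leaf=0.5)
print(npp_low, npp_high)
```

Under these assumptions a larger f_aut_leaf lowers maintenance respiration and therefore leaves more of the assimilated carbon as NPP.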
NPP is allocated to either a green or woody pool given fixed, PFT-specific allocation constants.The green pool turns to litter according to the leaf phenology, whereas the woody turnover rate is prescribed as a fixed constant.
JSBACH considers three litter pools (above-ground green, below-ground green and woody) with distinct, PFT-specific turnover times, as well as a soil organic matter pool with a longer turnover time. Heterotrophic respiration for each of these pools responds to temperature according to a Q_10 formulation, with a soil-moisture-dependent factor 0 ≤ α_resp ≤ 1. C_pool is either the slow soil carbon pool, the above- or below-ground green litter pool or the wood litter pool; T is temperature, T_ref = 0 °C is the reference temperature and τ_pool is a pool-dependent turnover rate (more details on the carbon balance sub-module can be found in Goll et al., 2012).
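A Q_10 response of this kind can be sketched as below. The functional form (α_resp × Q_10^((T − T_ref)/10) × C_pool / τ_pool), the treatment of τ_pool as a turnover time, and all numbers are assumptions for illustration; the formulation used in JSBACH is documented in Goll et al. (2012).

```python
T_REF = 0.0  # reference temperature in degrees Celsius, as stated in the text

def heterotrophic_respiration(c_pool, tau_pool, temp_c, q10, alpha_resp):
    """Assumed form: alpha_resp * q10**((T - T_ref) / 10) * C_pool / tau_pool."""
    if not 0.0 <= alpha_resp <= 1.0:
        raise ValueError("alpha_resp must lie in [0, 1]")
    return alpha_resp * q10 ** ((temp_c - T_REF) / 10.0) * c_pool / tau_pool

# Hypothetical pool: 100 units of carbon with a turnover time of 10 time units.
r_ref = heterotrophic_respiration(100.0, 10.0, temp_c=0.0, q10=2.0, alpha_resp=1.0)
r_warm = heterotrophic_respiration(100.0, 10.0, temp_c=10.0, q10=2.0, alpha_resp=1.0)
print(r_ref, r_warm)
```

With Q_10 = 2 the respiration flux doubles for a 10 °C warming. Because Q_10 and f_slow are applied globally, such a response acts on temperate and tropical pools alike.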

Figure 2. Example time series of FAPAR for an East Siberian pixel dominated by the CD-PFT to demonstrate the improvement in the timing of the phenology due to the data assimilation. TIP-FAPAR observations are given with their mean (dots) and 1σ uncertainties (vertical lines).

Figure 3. Temporally averaged global LAI of the JOINT experiment and differences of the other experiments to the JOINT case.

Figure 4. Time series of atmospheric CO2 as observed at the high-latitude evaluation site Summit and at two constraining sites, one at high latitudes (Alert) and one representative of the Northern Hemisphere (Mauna Loa), for the different prior and posterior models. The observations are given together with their uncertainty.

Figure 5. Latitudinal distribution of atmospheric CO2 seasonal cycle amplitude, calculated as the difference between the maximum and minimum CO2 mole fractions of the averaged seasonal cycle of the linearly de-trended signal from 2005 to 2009.

Figure 6. Temporally averaged NBP of the JOINT assimilation, and the difference between the CO2alone and JOINT experiments.

Figure 8. Parameter changes of tropical evergreen trees in multiples of the prior uncertainty, calculated as (p_po − p_pr)/σ_pr.

Table 1. Plant functional types (PFTs) in the JSBACH model and the limitations that control the phenological behaviour of the respective PFT.

Table 3. Characteristics of the assimilation experiments. The prior and posterior cost-function values and the contribution of FAPAR, CO2 and the prior (second term in Eq. 1) to the posterior cost-function value are given, as well as the norm of the gradient, the number of observations acting as a constraint, and the number of iterations of the assimilation.

Table 6. Global averages of selected carbon-cycle components for the years 2005 to 2009 in PgC yr−1 for fluxes and PgC for stocks, and comparison with independent estimates. Ra: autotrophic respiration. Rh: heterotrophic respiration. Reco: ecosystem respiration. NBP = GPP − Reco = GPP − Ra − Rh = NPP − Rh. Vegetation carbon includes quickly overturning leaf and fine root carbon, as well as a woody carbon pool.