Constraining DALECv 2 using multiple data streams and ecological constraints : analysis

We use a variational method to assimilate multiple data streams into the terrestrial ecosystem carbon cycle model DALECv2 (Data Assimilation Linked Ecosystem Carbon). Ecological and dynamical constraints have recently been introduced to constrain unresolved components of this otherwise ill-posed problem. Here we recast these constraints as a multivariate Gaussian distribution to incorporate them into the variational framework and we demonstrate their advantage through a linear analysis. Using an adjoint method we study a linear approximation of the inverse problem: firstly we perform a sensitivity analysis of the different outputs under consideration, and secondly we use the concept of resolution matrices to diagnose the nature of the ill-posedness and evaluate regularisation strategies. We then study the non-linear problem with an application to real data. Finally, we propose a modification to the model: introducing a spin-up period provides us with a built-in formulation of some ecological constraints which facilitates the variational approach.


Introduction
Carbon is a fundamental constituent of life and understanding its global cycle is a key challenge for the modelling of the Earth system.Through the processes of photosynthesis and respiration, ecosystems play a major role in the carbon cycle and thus in the dynamics of the global climate system.Our knowledge of the biogeochemical processes of ecosystems and an ever-growing amount of Earth observation systems can be combined using inverse modelling strategies to improve model predictions and uncertainty quantification.
The Data Assimilation Linked Ecosystem Carbon (DALEC) model is a simple box model for terrestrial ecosystems simulating a large range of processes occurring at different timescales from days to millennia.The work of Williams et al. (2005) established the benefit of using DALEC together with net ecosystem exchange (NEE) of CO 2 measurements in a Bayesian framework to estimate initial carbon stocks and model parameters, to improve flux predictions for ecosystem models and to quantify uncertainties.Inter-comparison experiments (Fox et al., 2009;Hill et al., 2012) have then demonstrated the relative merit of various inverse modelling strategies using NEE and MODIS leaf area index observations: most results agreed on the fact that parameters and initial stocks directly related to fast processes were best estimated with narrow confidence intervals, whereas those related to slow processes were poorly estimated with very large uncertainties.Other studies have tried to overcome this difficulty by adding complementary data streams (see Richardson et al. (2010)) or by considering longer observation windows (see Hill et al. (2012)).Recently Bloom and Williams (2015) defined a set of ecological and dynamical constraints (EDCs) to reject unrealistic parameter combinations in the absence of additional data.However, to date very few systematic analysis has been carried out to explain the large differences among results.
As with many inverse problems, assimilating Earth observations into DALEC is an ill-posed problem: the modelobservation operator which relates parameters and initial carbon stocks to the observations is rank deficient and not all variables can be estimated, or the model-observation operator is ill-conditioned and small observational noise may lead Published by Copernicus Publications on behalf of the European Geosciences Union.
to a solution we can have little confidence in.Solving the problem amounts first to transforming it into a tractable problem in order to ensure a robust, meaningful and stable solution.This can be achieved by using regularisation techniques; the most popular one involves combining the observations and prior information, assuming it exists, through Bayesian inference.The choice of regularisation method depends on the nature of the problem and on the inverse modelling approach adopted.
So far, off-the-shelf methods such as ensemble Kalman filter (EnKF) and Monte Carlo Markov Chain (MCMC) were adopted to perform model-data fusion with DALEC.For its ability to accommodate non-linearity and any kind of probability distributions, the MCMC method, in the limit of a large number of samples, may be considered as the gold standard.However, despite being well suited for this type of small-scale problem, the computational complexity of MCMC method makes it intractable for more complex situations.Here we adopt a variational approach (4DVAR) where a cost function measuring the mismatch between the model and observations is minimised using a gradient method based on the adjoint of the model.At AmeriFlux sites (see http: //ameriflux.lbl.gov/),we use MODIS monthly mean leaf area index (LAI) observations over a 12-year time window together with flux tower measurements of NEE and gross primary production (GPP).4DVAR facilitates the diagnosis of the ill-posedness of the inverse problem: using model resolution matrices we can assess the resolution and stability properties of the observation operators and of the regularisation terms.We transcribe the EDCs into a novel variational framework and use some of this additional knowledge to estimate the otherwise undetermined variables.We consider a modification of the DALEC model by adding a spin-up period where carbon stocks are brought to equilibrium.This offers an alternative to including all the EDCs and helps reducing the confidence intervals for the predicted fluxes.
The paper is organised as follows.In Sect. 2 we present DALECv2 and the observation streams used in this study, review the EDCs introduced in Bloom and Williams (2015) and perform a sensitivity analysis of the different outputs of DALECv2 of interest for our experiments.In Sect. 3 we recall basic principles of inverse theory from a Bayesian perspective, we introduce the variational formulation and we show how to incorporate the EDCs into this framework.Section 4 is devoted to a résumé of the linearised problem, using the tangent linear model, where the challenges of ill-posed problems and their regularisation can be explored in detail using simple linear algebra.Using a singular value decomposition we illustrate the effect of observational noise on illconditioned systems, and we investigate solution strategies from the point of view of resolution matrices.In Sect. 5 we conduct a series of non-linear inverse modelling experiments using multiple data streams and EDCs.In Sect.6 we modify DALECv2 to include a spin-up period which offers a built-in formulation of some EDCs, and then we reproduce the non- linear experiments.In Sect.7 we discuss several extension to our manuscript and finally in Sect.8 we draw conclusions.
2 Model, constraints and observations

DALECv2
DALECv2 depicts a terrestrial ecosystem as a set of six carbon pools (labile C lab , foliar C f , wood C w , root C r , litterfall C l and soil organic matter C s ) linked via allocation fluxes.At a monthly time step the gross primary production (GPP) is calculated using the Aggregated Canopy Model (Williams et al., 1997) as a non-linear function of meteorological drivers (temperature, radiation, atmospheric CO 2 concentration), foliar carbon and foliar nitrogen.Following the mass conservation principle, GPP is then allocated to the different carbon pools or released in the atmosphere via respiration.The schematic for DALECv2 is represented in Fig. 1 and a complete description of the model can be found in Bloom and Williams (2015).DALECv2 combines the two previous DALEC-evergreen and DALEC-deciduous models into a single model where the non-differentiable phenology process of DALEC-deciduous has been replaced with a differentiable process.DALECv2 is a non-linear dynamical system and the carbon pools are dynamical variables parametrised by their initial values C 0 and by 17 parameters p, whose range and description can be found in Table 1.
The magnitudes and ranges of the parameters and the initial values vary drastically; therefore, to avoid the computational problems caused by these different scales the variational methods will be formulated and implemented in terms of the log transformed variable x = log([p, C 0 ]) T .However, in order to limit unnecessary notation and definition, in the remainder of this paper p and C 0 will stand for their log transform.
Geosci.Model Dev., 10, 2635-2650, 2017 www.geosci-model-dev.net/10/2635/2017/The meteorological drivers are extracted from 0.125 • × 0.125 • ERA-Interim reanalysis data sets.For the purpose of our inverse modelling experiments we use four different observation streams: LAI, NEE, GPP and RESP (total respiration).LAI monthly mean observations for AmeriFlux sites are extracted from MOD15A2 LAI 8-day version 005 1 km resolution product.These observations together with the meteorological drivers are provided by A. Bloom and J. Exbrayat.Details about their construction can be found in Bloom and Williams (2015).At AmeriFlux sites we use the level 4 data product (available at http://cdiac.ornl.gov/ftp/ameriflux/data/Level4/),which provides monthly means for NEE and GPP.NEE and GPP are then used to define RESP as RESP = NEE + GPP.The meteorological drivers span a period of 12 years from 2001 to 2013.LAI observations are available during the full period but for NEE and GPP, and thus RESP, shorter records are available depending on the AmeriFlux site.In this study we consider the Morgan Monroe State Forest located in Indiana, ).This AmeriFlux site is composed in majority of mixed hardwood broadleaf deciduous trees and classifies as a humid subtropical climate.
In the remainder of the paper the main focus is on the vector x = log([p, C 0 ]) T : in Sect.2.3 first where we investigate the sensitivity of different outputs with respect to x and its components, and then in subsequent sections where x is es-timated using inverse methods.The vector x, denoting fixed quantities as initial conditions and parameters for the dynamical system DALECv2, is seen as the variable from the point of view of sensitivity analysis and inverse modelling and therefore its components will be referred to as state variables, input variables or parameters interchangeably throughout the manuscript.

Ecological constraints
Over the last decade many inverse modelling studies have used NEE measurements from the FLUXNET network, together with other types of observations when available, to provide information about processes controlled by parameters with respect to which NEE is weakly sensitive.Though it contains an ever-increasing amount of information, the flux tower network only provides sparse coverage of terrestrial ecosystems.On the other hand, despite good spatial and temporal coverage, MODIS LAI monthly mean observations only constrain a limited set of DALECv2 state variables, and additional information is required in order to regularise the ill-posed problem and obtain a meaningful solution.
Additional information can be obtained by imposing priors on the variables or by adding other observation streams (biomass, soil organic matter, etc.).As an alternative, Bloom and Williams (2015) introduced a set of constraints, referred to as ecological and dynamical constraints (EDCs).These www.geosci-model-dev.net/10/2635/2017/Geosci.Model Dev., 10, 2635-2650, 2017 constraints, detailed in Bloom and Williams (2015), can be divided into two groups: static and dynamic constraints.The static constraints which directly impose conditions on the parameters are as follows: -Turnover rate constraints which ensure that turnover rates ratios are consistent with knowledge of the carbon pools residence times.
EDC 4 : p 7 > p 9 exp p 10 T , (4) where T denotes the mean temperature within the drivers time window.EDC 4 is a modification to the constraint proposed in Bloom and Williams (2015).It is currently used in the CAR-DAMON framework (http://www.geos.ed.ac.uk/homes/ mwilliam/CARDAMOM.html).
-Root-foliar allocation which allows for a strong correlation between parameters controlling allocation to foliage and roots.
EDC 6 : EDC 7 : where the allocation fractions f fol , f lab and f root are defined as The dynamic constraints, for which a model run is performed to define attractors, limit the application of the model to ecosystems with no major recent disturbance.They are defined as follows: -Root-foliar mean dynamics where C f and C r denote the mean of C f and C r over the simulation period.
-Yearly carbon pools growth rate is limited to 10 %.
where for each pool C i denotes the mean carbon pool size over year i and the growth factor ζ is set to 1.
-Carbon pools are not expected to show rapid exponential decay; therefore, parameter sets are required to satisfy the condition that the half-life period of carbon pools is more than 3 years.
EDC 16−21 : γ < 3 × 365/ log 2 (15) The trajectory of each carbon pool is approximated using an exponential decay curve a + b exp γ t where a, b and γ are the fitted exponential decay parameters and t the time variable, in days in this case.
-Carbon pools are expected to be within an order of magnitude of a steady-state attractor.
where for each of the carbon pools C s , C l , C w and C r , C 0 denotes the initial state and C ∞ denotes the steadystate attractor defined as where G denotes the mean gross primary production and f wood , f som and f lit are given by To the original EDCs, we found it useful to add the three following constraints: EDC 31 : LAI(final day) > 0, (25 where α and β are real constants that need to be adjusted, LAI(summer) denotes the modelled LAI during summer and LAI(final day) denotes the modelled LAI at the end of the model run.These new constraints guarantee that LAI and the mean NEE remain within realistic bounds.Bloom and Williams (2015) demonstrated the efficiency of incorporating EDCs using a Monte Carlo method to improve parameter estimates and NEE predictions.We propose an approach to apply these extra constraints within a variational framework.Parameters are ranked in decreasing order according to their sensitivity, the blue dots represent the mean of the MNS (dimensionless quantity), the intervals represent 1σ error bars and the red dots correspond to null sensitivity.

Sensitivity analysis
Sensitivity analysis studies how the variations of the output h of a model can be attributed to variations of the input variables x i .Such information is crucial for model design, inverse modelling and reduction of complex non-linear models.A global sensitivity analysis for DALEC was recently performed in Safta et al. (2015).Here we consider a local approach where first-order derivatives are used to build sensitivity indices that help us understand the influence of input variables on the output.We denote h t as the function that maps x = log(p, C 0 ) to the value of an output of the model (here LAI, NEE, GPP and RESP) at time t, and we denote the time series of the model output as h = (h t 1 , . . ., h t N ).Following Zhu and Zhuang (2014), we consider the mean normalised sensitivity (MNS) defined as where E(•) denotes the average of the time series.The scalars σ i and σ h denote the parameter variance, set as 40 % of the parameter range, and the variance of the output respectively.The partial derivatives are computed using the adjoint derived using the method described in Giering and Kaminski (1998).The MNS s i is a dimensionless number that allows us to compare among parameters.We consider the Morgan Monroe State Forest over a 12year period.We sample 100 parameter sets satisfying the ecological constraints.For each parameter set, we compute the MNS for DALEC simulated mean fluxes LAI and NEE.In Fig. 2 parameters are ranked with respect to their mean MNS.We see that for LAI only 12 out of the 23 variables are sensitive, namely p 5 , p 17 , p 2 , p 13 , p 11 , p 15 , p 16 , C f , p 12 , p 3 , C lab and p 14 .Therefore, using LAI only in an inverse modelling experiment provides, at best, information about those twelve sensitive variables.For NEE we see that all variables are sensitive.Sensitivity analysis for GPP shows similar characteristics with LAI and so does RESP with NEE.For the four outputs under consideration (LAI, NEE, GPP and RESP) the most sensitive variables are the autotrophic respiration, p 2 , the annual leaf loss fraction, p 5 , the leaf mass per area, p 17 , the fraction of GPP allocated to labile pool, p 13 , the nitrogen use efficiency, p 11 , and the leaf fall day p 15 .
Here our focus is on the mean of the time series of DALEC fluxes (LAI, NEE) over a 12-year period.Finer analysis could be carried out by looking at seasonal aspects of the carbon cycle, identifying what variables are the most sensitive at certain times of the year, for example as studied in Safta et al. (2015).

Data assimilation
In this section we introduce concepts and methods that allow for close mathematical scrutiny of inverse problems and we present the variational method that we will apply in the following sections.

Ill-posed problem
A generic inverse problem consists of finding a n dimensional state vector x such that www.geosci-model-dev.net/10/2635/2017/Geosci.Model Dev., 10, 2635-2650, 2017 for a given N-dimensional observation vector y, including random noise, and a given model h.In the remainder of the paper the terms state vector, state variable, input variable and parameters will be used interchangeably to denote the vector x to be estimated using inverse methods and defined in the previous section as x = log([p, C]) T .The problem is well posed in the sense of Hadamard (1923) if the three following conditions hold: (1) there exists a solution, (2) the solution is unique and (3) the solution depends continuously on the input data.If at least one of these conditions is violated the problem is said to be ill-posed.The inverse problem (Eq.28) is often ill-posed, and a regularisation method is required to replace the original problem with a well-posed problem.Solving Eq. ( 28) amounts to (1) constructing a solution x, (2) assessing the validity of the solution and (3) characterising its uncertainty.Each inverse problem has its own features which need to be understood in order to characterise properly the solution and its uncertainty.

Bayesian inference: 4DVAR
Inverse problems are generally presented in a probabilistic framework where most methods can be expressed through a Bayesian formulation.The Bayesian approach provides a full characterisation of all possible solutions, their relative probabilities and uncertainties.From Bayes' theorem, the probability density function (PDF) of the model state x given the set of observations y, p(x|y), is given by where p(y|x) is the PDF of the observations given x and p(x) is the prior PDF of x.A special case is given when p(y|x) and p(x) are Gaussian PDF given by and where B is the covariance matrix of the prior term x 0 , and R is the covariance matrix of the observation error.When the operator h is linear then the posterior PDF p(x|y) is Gaussian and thus fully characterised by its mean and covariance matrix.The mean is obtained by minimising the modulus of the log of the joint probability distribution, which is the cost function J given by Many methods can be considered to minimise this cost function.A Monte Carlo method is employed in Bloom and Williams (2015).Here we use a variational approach which applies a gradient-based method where the gradient is given by with H T denoting the adjoint operator.The covariance matrix of the solution, C, is given by the inverse of the Hessian of the cost function When the observation operator h is non-linear, the cost function J can have multiple local minima and the posterior PDF may no longer be a Gaussian PDF.However, locally, the PDF N ( x, C), where C is given by Eq. ( 34) evaluated at a minimum x, provides a Gaussian approximation of the posterior PDF p(x|y).
The first term in the cost function (Eq.32) is a regularisation term encoding the Gaussian prior p(x).As we will show in the next sections the problem of assimilating Earth observations (LAI, GPP, NEE, RESP) into DALEC is a highly ill-posed problem and regularisation is required.The sensitivity analysis of Sect.2.3 showed that LAI and GPP are not sensitive to all variables.Moreover, all observations streams show very low sensitivities to some variables.Therefore, as will be illustrated in Sect.4.1, the solution (if any) is likely to be subject to large uncertainties.Apart from a couple of extensively studied sites, our prior knowledge about the variables is so far limited to their upper and lower bounds given in Table 1.As performed in Zhu and Zhuang (2014), it is a common practice to use this information to define a Gaussian prior p(x) ∼ N (x 0 , B), where x 0 is given by the centre of the variables ranges and B is the diagonal matrix whose diagonal elements are the squares of 40 % of the variables ranges.While using this kind of regularisation is necessary to ensure any solution at all when no better source of information is available, this introduces some biases in the solution.The EDCs introduced by Bloom and Williams (2015) provide new prior information about the variables.One of the purposes of this paper is to incorporate the EDCs as a regularisation term within 4DVAR.In the next section we propose a strategy to achieve this goal.

EDCs and 4DVAR
Incorporating the EDCs from an optimisation point of view can be easily performed by considering an inequality constraint optimisation problem where we aim at solving min x J y (x) subject to l < x < u and g(x) < 0, where g is the non-linear operator defining the EDCs described in Sect.2.2, and l and u are the lower and upper bounds defined in Table 1.This approach provides an efficient, robust and quick strategy to find an acceptable solution; however, stability properties are not easily determined (see Roese-Koerner et al., 2012).
Geosci.Model Dev., 10, [2635][2636][2637][2638][2639][2640][2641][2642][2643][2644][2645][2646][2647][2648][2649][2650]2017 www.geosci-model-dev.net/10/2635/2017/ We are seeking a multivariate Gaussian distribution that would encode the EDCs.At a forest site, we start by sampling the parameter space to obtain an ensemble of 1000 parameter sets satisfying the EDCs; each parameter set x is randomly created and required to satisfy g(x) < 0. We denote this ensemble by X EDCs .For most parameters, the sampling gives rise to undetermined PDFs which can certainly not be represented by Gaussian PDFs.However, upon inspecting the distribution g(x), for all x in X EDCs , we see that the distribution log(g(x)) can be fairly accurately approximated by multivariate Gaussian PDFs N (c, ), where c denotes the mean of the distribution log(g(x)) and denotes its covariance matrix.As an example, Fig. 3 shows the marginals log(g 4 (x)) and log(g 6 (x)), corresponding to the EDCs 4 and 6 respectively, together with a Gaussian fit.
Using Bayes' theorem we can then write Finding a Gaussian approximation for p(x|y, c) amounts then to minimising the cost function The gradient of J is given by and the Hessian of the cost function can be approximated by evaluated at the minimiser x.The operator G T denotes the adjoint of the tangent linear model G whose key ingredient is given by the adjoint of DALECv2.The approximation of p(x|y, c) is then given by the Gaussian distribution N ( x, −1 ).In Sect. 5 we will perform experiments using real data to validate this approach.

Linear analysis
Considerable theoretical insights into the nature of the inverse problem, and the ill-posedness, can be obtained by studying a linearisation of the operator h.A first approximation to the inverse problem consists of finding a perturbation z which best satisfies the linear equation where H is the tangent linear operator for h and d is a perturbation of the observations.The linear operator H is commonly referred to as the observability matrix (see Johnson et al., 2005).The least squares formulation of this problem is to solve the optimisation problem The minimisation can be performed using an iterative method such as the conjugate gradient method, where the gradient is given by The inverse Hessian of the cost function, (H T H) −1 , gives the covariance matrix of the least squares solution.In the next section we consider a direct solution method based on the singular value decomposition of the operator H, which allows us to investigate the nature of the ill-posedness of the problem.We illustrate regularisation using a truncated singular value decomposition.

Singular value decomposition
We consider a singular value decomposition of H of the form where U is a N × N unitary matrix, V is a n × n unitary matrix and S is the N × n diagonal matrix whose diagonal elements are the singular values s 1 ≥ . . .≥ s n ≥ 0. Using this decomposition, the solution z LS to Eq. ( 39) can be written as The matrix H † = V S † U T is the pseudo-inverse of H where S † is the diagonal matrix obtained by transposing S and replacing the non-zero elements with their inverse s −1 i .The covariance of the solution is given by  1996) that the relative error in the solution, defined as the left-hand side of the above inequality, is bounded by where κ(H) is the condition number of H defined as κ(H) = s 1 /s n , z 0 denotes the truth (possibly unknown) and represents observational noise.When the condition number is large the matrix is said to be ill-conditioned, the problem is ill-posed and the solution (Eq.42) is unstable: small perturbations to the system can lead to very large perturbations in the solution.

Stability for NEE operator
As an example we consider the problem of assimilating NEE observations into DALECv2 to estimate model parameters and initial conditions at Morgan Monroe State Forest.We linearise Eq. ( 28) about a point x * satisfying the EDCs, form the observability matrix H and compute its singular value decomposition.The singular values, shown in Fig. 4, reveal a condition number of the order of 10 5 .
For a signal-to-noise ratio, namely / d , of magnitude 0.1, inequality (Eq.44) gives an upper bound for the relative error in the solution of the order of 10 4 , which does not give much credit to the least squares solution.How sharp is this bound?Are we overestimating the error?To answer these questions we create a set of noisy observations with noise variance σ = 0.1 and we compute the solution (Eq.42).The relative error for each component of the solution, η i , and the variance ν i , are given in Table 2. Despite a relatively good match between the modelled NEE perturbations and the observations, as shown in Fig. 5, the results of Table 2 show very large relative errors and variances for most variables.Moreover, these results are in agreement with the results of REFLEX: parameters directly linked to foliage and GPP are better estimated than parameters related to allocation to and turnover of fine root/wood.The results of Table 2 reflect the sensitivity analysis shown in Fig. 2. The variables with respect to which NEE is the most (least) sensitive are the less (more) affected by the noise.
To reduce the impact of observational noise on the solution, regularisation is required.The truncated singular value decomposition (TSVD) is a simple and popular method for regularisation.TSVD consists of truncating the pseudoinverse in Eq. ( 42) in order to remove the smallest singular values, the most affected by the noise.The solution z (k) is then given by where k is the truncation rank and where S k , U k and V k are the rectangular matrices formed by the first k columns of S, U and V.The covariance of the solution is given by The truncation rank k can be chosen using the L-curve method.The L curve is a log-log plot of the norm of the solution z (k) against the norm of the residual H z (k) − d parametrised by the regularisation parameter k.The optimal parameter corresponds to the point of maximum curvature of the L curve.Further details on the L-curve method can be found in Hansen and O'Leary (1993).
In our example with NEE we use Hansen's regularisation tools (see Hansen, 2007) to perform the TSVD method.The truncation rank obtained using the L-curve method is k = 7.The last three columns of Table 2, presenting the TSVD solution, the relative error of each components and the variances, can be compared with the unstable solution results.Whereas the relative errors in the unstable solution range from 5.3 × 10 −2 to 3.8 × 10 4 , the relative errors in the regularised solution range from 5.3 × 10 −2 to 5.1.We see that TSVD has the effect of keeping small the variables that cannot be estimated correctly.As previously stated the results of the regularisation can be related to the sensitivity analysis depicted in Fig. 2: TSVD prevents the variables with respect to which NEE is the least sensitive from growing unbounded.
In the next section we consider the concept of a resolution matrix, which allows for finer analysis of the solution of the linear problem.

Resolution matrix
As suggested by Eqs. ( 42) and ( 45), finding a solution z amounts to constructing a generalised inverse H g such that formally The generalised inverse is the operator representing any method, direct or iterative, used to solve the linear inverse problem, with or without any kind of regularisation.In the previous section we considered two examples of generalised inverse, the pseudo-inverse and the truncated inverse obtained using TSVD.The generalised inverse can be used to define operators which directly address the conditions for well-posedness for the linearised problem.Assuming a true state z * exists, possibly unknown, using Eqs.( 38) and ( 47) we can then define an operator N called the model resolution matrix which relates the solution z to the true state This matrix provides a practical tool to analyse the resolution power of an inverse method, that is, its ability to retrieve the true state, with or without using any regularisation method: the closer N is to the identity, the better the resolution.Moreover, the trace of the matrix defines a natural notion of information content (IC).Similarly a data resolution matrix can be defined to study how well data can be reconstructed and its diagonal elements naturally define a notion of data importance.For the two examples of generalised inverse presented in the previous section we obtain the following resolution matrices: for the pseudo-inverse and for the truncated pseudo-inverse.In the first case the IC equals the number of non-zero singular values, in the second case the IC equals the truncation rank k.An in-depth theoretical and practical analysis of these concepts and those introduced in the remainder of this section can be found in Menke (1984).
While the model resolution matrix allows us to see how a solution strategy maps the true state variables to the solution of the inverse problem, and to see how well and how independently the state variables can be recovered, one also needs to assess the uncertainty of the solution.This can be studied using the so-called unit covariance matrix, C, defined using the generalised inverse as By characterising the degree of error amplification that occurs in the mapping from the true state to the solution of the inverse problem, the unit covariance matrix is a crucial object for studying the stability of the solution with respect to observational noise.The unit covariance matrix defined by Eq. ( 51) agrees with the covariance matrices given in the previous section by Eq. ( 43) for the pseudo-inverse, and by Eq. ( 46) when TSVD is applied.

Resolution for LAI operator
We now study the model resolution matrix for the LAI observation operator at Morgan Monroe State Forest.In the first instance we will demonstrate the resolution power of the LAI signal without regularisation using the pseudo-inverse as generalised inverse first, and then apply TSVD to show how using the truncated pseudo-inverse affects resolution.In a second case we will study the added value of the EDCs in terms of resolution.
As previously, we linearise Eq. ( 28) about the point x * given in Table 2.The trace of the resolution matrix obtained using the pseudo-inverse as generalised inverse is 10, and this means that 10 independent variables can be estimated using LAI.These independent variables are not the variables in which the system is expressed, but a linear transformation can be found to express the system in terms of the independent variables.Figure 6 shows the model resolution matrix for LAI.As shown in Sect.2.3 with the sensitivity analysis, 11 out of the 23 variables are not sensitive to LAI, and this can be seen in the resolution matrix by the diagonal terms which are zero, represented in blue.In contrast the diagonal elements corresponding to sensitive variables have positive values, represented by colours ranging from light blue to red. Figure 6 also shows that whereas p 5 , p 11 , p 12 , p 14 , p 15 and p 16 are perfectly resolved (the corresponding elements are coloured brown or dark red), there exist linear combinations between the remaining sensitive variables, which explains why only 10 independent variables can be estimated from the 12 sensitive variables.
For the study of the unit covariance matrix we restrict ourselves to the sensitive variables.This amounts to removing the columns corresponding to the non-sensitive variables, containing only null elements, from the observability matrix.The dependency of the solution on observational noise can be studied by looking at Fig. 7, where the diagonal elements of the unit covariance matrix, corresponding to the variance of each element of the solution obtained using the pseudo-inverse, are represented in log scale.Except for p 5 , p 12 , p 15 and p 16 , all variances are shown to be large.
As previously, we illustrate a simple regularisation strategy, TSVD, and show its effects on both resolution and stability.Figure 8 shows the resolution matrix for LAI with optimal truncation rank k = 6.The IC decreases to 6.We see that whereas p 5 , p 12 , p 15 and p 16 remain almost perfectly resolved, p 13 , p 17 and C lab are only partially resolved and the remaining variables are not resolved properly.Figure 7 shows the corresponding diagonal elements of the unit covariance matrix, from which we see that the variances have been drastically reduced.This example shows how regularisation ensures stability at the price of losing resolution.
We now consider the effect of incorporating the static EDCs into the variational framework in terms of resolution.The static EDCs are given by the first seven EDCs, the linear problem is then given by where with −1/2 the inverse of the symmetric square root of the covariance matrix , defined in Sect.model resolution matrix corresponding to the operator on the left-hand side of Eq. ( 52), obtained using the pseudo-inverse, is depicted in Fig. 9.The trace of the model resolution matrix gives an IC of 16, 13 variables are perfectly resolved and 4 variables show linear dependencies (p 2 , p 3 , p 4 and p 13 ).However, although p 9 and p 10 are sensitive variables, they do not appear to be resolved at all: inspecting the linear operator G shows that the non-zero components corresponding to p 9 and p 10 are several order of magnitude smaller than the other components.This example shows clearly the benefit of introducing the static EDCs to help estimate poorly constrained or otherwise undetermined components.

Experiments at AmeriFlux sites
We now consider a real experiment at Morgan Monroe State Forest.At this AmeriFlux site, 12 years of MODIS LAI monthly mean observations from 2001 to 2013, NEE, GPP and thus RESP observations from 2001 to 2005 are available.Our goal is to study two different aspects.The first one is the impact of using multiple data streams: how does it affect uncertainty of the predicted fluxes and how well do we predict non-observed fluxes?The second one is to use the static EDCs and to assess their utility in constraining poorly sensitive variables.
When including all terms the cost function, J TOT , becomes where subscripts L, N, G and R stand for LAI, NEE, GPP and RESP respectively.The vectors y L , y N , y G and y R represent the observation vectors for LAI, NEE, GPP and RESP respectively.The scalars λ L , λ N , λ G and λ R take the value 0 or 1 depending on whether or not the corresponding data stream is included in the experiment.The scalar λ c takes the value 0 or 1 depending on whether we include the EDCs and λ 0 takes the value 1.We perform six experiments summarised in Table 3.In experiment (Exp.) 1, we use only LAI observations and bounds constraints so that in the cost function J TOT we set λ L = 1 and λ 0 = 1, and the other λs are set to zero.For Exp. 2, we use LAI and NEE observations, that is, we set λ L = 1, λ N = 1 and λ 0 = 1; the other λs are set to zero.We proceed similarly for the remaining experiments.Here we assimilate all data streams simultaneously; it is not our intention to question what method best accommodates multiple data streams.MacBean et al. (2016) addresses this question using a simple C cycle model.Moreover, we choose to assume the same statistical error for all data streams and set their error covariance matrix equal to the identity.To avoid being trapped at meaningless local minima, the experiments are performed multiple times using different initialisation parameter sets and results for the best candidate only are reported.
The results of the experiments are presented in Table 4, where each element of J TOT is given for all experiments, and www.geosci-model-dev.net/10/2635/2017/Geosci.Model Dev., 10, 2635-2650, 2017 in Table 5, where the solution components and their variance are presented for all experiments.Results of Table 4 show that J L is the smallest in Exp. 1 when LAI only is used.In Exp. 2, when adding NEE we see that J N decreases from 109.012 in Exp. 1 to 15.263, and J G slightly decreases as compared to Exp. 1, but J R increases instead.In Exp. 3 we see that all costs drastically decrease compared to their initial values.Going from Exp. 1 to Exp. 3, J 0 slightly increases; adding more data streams constrains more parameters, and the parameters shift from their prior value which may cause J 0 to increase.Similar observations can be made for Exp. 4 to Exp. 6; moreover, we see that including the EDCs only slightly affects the costs.A reason for this might be that EDCs help constrain the less sensitive parameters for which the costs are less sensitive, as suggested by the sensitivity analysis depicted in Fig. 2. To see the effect of the EDCs we need to look at Table 5, which details the solution components together with their relative variance defined by the ratio of the variance by the parameter range.In Exp. 1 we see that the variables with the smallest relative variance are the most sensitive parameters as illustrated in Fig. 2: p 2 , p 5 , p 10 , p 12 , p 14 , p 15 , p 16 and p 17 .We recall that the sensitivity analysis of Sect.2.3 was performed by averaging sensitivities for an ensemble of initial parameter sets; therefore, the ranking shown in Fig. 2 may not be reflected in the relative variances.As we include NEE in Exp. 2 we see that most relative variances decrease, especially for p 8 , p 9 , p 10 , p 13 , C lab , C f and C l .The only variable whose relative variance increases is p 14 , but as shown in Fig. 2, p 14 has very low sensitivity.In Exp. 3 most relative variances decrease.The values are still large though for p 1 , p 3 , p 4 , p 6 , p 9 , C r and C l .Again, similar features can be observed for Exp. 4 to Exp. 6, 2001Exp. 6, 2002Exp. 6, 2003Exp. 6, 2004Exp. 6, 2005Exp. 6, 2006Exp. 6, 2007Exp. 6, 2008  but a clear improvement can be seen for most variables except for C r which is not constrained by the first seven EDCs.Finally, the last column of Table 4 shows the computation time for each experiment.As expected we see that the more observation streams we consider, the longer the experiment takes to run, and incorporating the EDCs increases computation time.However, we stress that these figures are several orders of magnitude less than the time required to perform the same experiments using the current gold standard MCMC approach used in Bloom and Williams (2015).
Figures 10 and 11 show the predicted fluxes for LAI, NEE, GPP and RESP for the result of Exp. 6.We can see good agreement between modelled fluxes and observations.The uncertainty of the predicted fluxes is evaluated by modelling an ensemble of trajectories from a 95 % ellipsoid of the posterior truncated Gaussian distribution.These trajectories are represented as grey curves in Figs. 10 and 11. Figure 12 shows the posterior parameter distribution marginals for p 1 , p 7 , p 16 and C f for Exp.6, illustrating the four different cases where most of the marginal is contained in the parameter range for p 16 ; the marginal is truncated on the left or the right for p 7 and C f and truncated on both sides for p 1 .

DALEC-SP
In the previous section we used EDCs within 4DVAR and showed their advantage in reducing drastically the uncertainty of otherwise undetermined variables.However, we only included the static EDCs which do not require a model run.As including more EDCs often leads to convergence issues, the solution and its uncertainty become subject to caution.
Geosci.Model Dev., 10, [2635][2636][2637][2638][2639][2640][2641][2642][2643][2644][2645][2646][2647][2648][2649][2650]2017 www.geosci-model-dev.net/10/2635/2017/As shown in Chuter et al. (2015) for the previous DALEC evergreen and deciduous models, the evolution of the carbon pools for DALECv2 show a tipping point which depends on the parameters p 1 to p 17 .Given a set of parameters, p, the fast carbon pools C lab , C f , C r and C l grow or decay rapidly to an equilibrium state.This equilibrium is either zero and the forest dies out or a pseudo-periodical seasonal cycle as shown in Fig. 13 for C f .Moreover, there exists a limit value below which any initial condition leads to the zero equilibrium and above which the equilibrium is a strictly positive pseudo-periodical seasonal cycle.
Here we consider ecosystems with no recent major disturbance, where the fast carbon pools are expected to be close to their pseudo-periodical cycle.To model these ecosystems, one can either restrict the parameter space by using the dynamic EDCs, or we can introduce a spin-up period during which the carbon pools reach their attractor.Given parameters p 1 to p 17 and initial values for C w and C s a first run of DALECv2 is performed to obtain a state which is closer to a pseudo-periodical cycle for the fast carbon pools.The steady-state trajectories are then used to initialise the fast carbon pools.For this DALECv2 "spin-up" model, DALEC-SP, the state variable is therefore formed of the 17 parameters, p 1 , . . ., p 17 , and the initial conditions for C w and C s .
DALEC-SP offers several advantages: some of the EDCs such as those controlling the growth and the half-life period of carbon pools are almost automatically satisfied.This greatly reduces the time required to generate the PDF p(c|x).Moreover, as the sensitivity analysis and the resolution matrices showed, the fast carbon pools are variables that are not highly sensitive to the signals that we observe, and therefore reducing the number of variables by removing the fast carbon pools is likely to improve the overall conditioning of the inverse problem.Reproducing experiments 1 to 6 using DALEC-SP shows similar results to those with DALECv2.

Discussion
To our knowledge, this paper presents the first application of variational methods for an inverse modelling experiment using DALEC.Over the last 15 years many studies have validated the use of DALEC together with various types of data streams to infer ecological parameters at the site level.However, first ensemble Kalman filter and then Monte Carlo methods were privileged.At the same time 4DVAR has been successfully used at the global scale to constrain ecosystem parameters in carbon cycle data assimilation system (CC-DAS).In Rayner et al. (2005), the Biosphere Energy Transfer Hydrology model (BETHY) is coupled with the transport model TM2, and satellite observations of photosynthetically active radiation and atmospheric CO 2 concentration observations are used to optimise model parameters.In this context Kemp et al. (2014) investigated how to constrain the 4DVAR problem in CCDAS through a number of different methods: using constrained optimisation, adding a penalty term and applying parameter transformations.They concluded that using parameter transformations give the best results.In our study the three methods were investigated: Gaussian anamorphosis where priors based on the distribution of parameters satisfying the EDCs were considered, constrained optimisa-   tion as stated in Sect.3.3 and adding a penalty term to account for the EDCs.The latter solution which is the main interest of this publication was found to be the most successful in our case.
The complexity of global-scale experiments still limit the application of fully non-linear methods such as MCMC.In Ziehn et al. (2012) a comparison between the MCMC Metropolis-Hastings approach and 4DVAR for the BETHY-TM2 CCDAS framework is performed.This study reports a computation time of less than 1 h for the variational method and about 8 months for the overall MCMC computation.For our setting, DALECv2 site-based experiment, the complexity is relatively small and a MCMC approach is affordable.Used in Bloom and Williams (2015), the MCMC approach for DALEC is studied in detail in Safta et al. (2015), and the resulting parameter distributions suggest that 4DVAR and the inherent Gaussian approximation provide a reasonable posterior distribution.
As with most variational methods, the analysis and application presented in this paper rely heavily on the possibility of deriving the tangent linear model and its adjoint.DALECv2 was designed to take into account this requirement, in particular by replacing the phenology process of the DALEC deciduous model in order to obtain differentiable processes.The model resolution matrix and the gradient of the cost function, including the additional term encoding the EDCs, are computed using adjoint techniques.Despite the increasing capacities offered by automatic differentiation tools, deriving and maintaining an adjoint code can be a complicated task, and, besides its limiting hypothesis, this is certainly one of the main reason for choosing alternatives to 4DVAR.In a forthcoming paper, we use ensem-Geosci.Model Dev., 10, [2635][2636][2637][2638][2639][2640][2641][2642][2643][2644][2645][2646][2647][2648][2649][2650]2017 www.geosci-model-dev.net/10/2635/2017/ble methods to approximate the gradient of the cost function and to derive approximate resolution matrices, and the experiments presented in this paper are reproduced.The approach, which no longer requires the adjoint, shows very promising results: firstly in terms of estimating parameters, and secondly in terms of computation time by using graphic processing units (GPUs) to perform massive parallel computations.Designing a global-scale experiment involving a coupling between DALEC and a transport model has been considered but is still at an early stage.As presented in Bloom and Williams (2015), the EDCs were originally introduced to constrain unresolved parameters at the site level where, in the absence of any other information, only MODIS LAI observations were available.In theory there is no restriction to readily apply the same constraints at a global scale; however, their efficiency highly depends on the nature of the coupling between the ecosystem model and the transport model, and on the observation streams considered.Nonetheless in this context 4DVAR remains the only reasonable method to consider in terms of computer resources, and our study demonstrates that the current research efforts to develop regularisation strategies fit well into the variational framework.

Conclusions
We used DALECv2 and combined multiple data streams -MODIS monthly LAI and monthly NEE, GPP and RESP at an AmeriFlux site -together with ecological constraints to estimate model parameters and initial conditions and to provide uncertainty characterisation for predicted fluxes.DALECv2 is a simple model.It represents the basic processes at the heart of more sophisticated models of the carbon cycle, and, besides its large modelling skills, its simplicity allows for close mathematical scrutiny.Here we adopted a variational approach where the tangent linear model and its adjoint play a major role in (1) facilitating a linear analysis which allows one to understand the nature of the illposed problem and to evaluate strategies to regularise it and (2) finding a posterior distribution for the state variables.
We performed a sensitivity analysis using a direct method that consists of studying the first-order derivatives of the output computed using an adjoint method.A sensitivity analysis is a prerequisite to any work with a model, but there is a paucity of literature on this topic in connection with DALEC.Our analysis reveals generic issues that will be encountered in many inverse modelling strategies.Studying the first-order inverse problem, we discussed how noise affects the stability of the solution and we illustrated a simple regularisation method.We then introduced the notion of a model resolution matrix and showed how this can be used to diagnose the illposedness of an inverse problem and evaluate the result of regularisation strategies.While some of our findings may be anticipated in the framework of a simple model, it is important to describe these tools and their interpretation, as similar analyses can be readily applied to a wide range of more complex models.Bloom and Williams (2015) proved the advantage of the EDCs in constraining poorly resolved components of the carbon cycle and recommended their use for inverse modelling problems.We successfully incorporated the EDCs within the context of variational data assimilation.Our results confirm that the EDCs regularise an otherwise ill-posed problem and efficiently reduce the uncertainty of predicted fluxes.Moreover, our modification to DALECv2, DALEC-SP, which includes a spin-up period, offers an alternative to some EDCs that facilitates the variational approach.
This study did not aim at providing an exhaustive account on the capability of variational tools or exploring all aspects of the EDCs for the inverse problem for DALEC.The objectives were to use 4DVAR and show that it offers a suitable framework to solve efficiently, robustly and quickly the inverse problem for DALEC, and to present a methodology to analyse some issues that affect most methods based on Bayesian inference.

Figure 1 .
Figure 1.DALECv2 links the carbon pools (C) via allocation fluxes (green), litterfall fluxes (red) and decomposition (black).Respiration is represented by the blue arrows.The orange arrow represents the feedback of foliar carbon to gross primary production (GPP).

Figure 3 .
Figure 3. Distribution and Gaussian fit for EDC 4 and EDC 6 .

Figure 4 .
Figure 4. Singular values of the observability matrix for NEE (log scale).

Figure 5 .
Figure5.Solution of the linearised inverse problem for NEE.The red points represent the observations, the red curve is the true trajectory, the green curve is the trajectory obtained using the unstable solution and the blue curve is obtained using the truncated singular value decomposition (TSVD) solution.

Figure 7 .
Figure 7. Diagonal elements (log scale) of the unit covariance matrix for the LAI operator: using the pseudo-inverse shown in green and TSVD shown in yellow.

Figure 8 .Figure 9 .
Figure 8. Model resolution matrix for the LAI operator using TSVD with truncation rank k = 5.

Figure 10 .
Figure10.DALECv2 monthly estimates for LAI and NEE at Morgan Monroe State Forest.The red dots are the observations, the blue trajectories are obtained using the 4DVAR analysis, and the grey trajectories are ensemble runs obtained from a 95 % confidence sample of the posterior PDF.

Figure 11 .
Figure11.DALECv2 monthly estimates for GPP and RESP at Morgan Monroe State Forest.The red dots are the observations, the blue trajectories are obtained using the 4DVAR analysis, the grey trajectories are ensemble runs obtained from a 95 % confidence sample of the posterior PDF.

Figure 12 .
Figure 12.Posterior parameter distributions for parameters p 1 , p 7 , p 16 and C f for Exp. 6.For each plot the limits of the abscissa correspond to the parameter range.The red curve is the Gaussian posterior distribution and the blue bars represent the sample used to produce the grey trajectories in Figs. 10 and 11.

Figure 13 .
Figure 13.Pseudo-periodical seasonal cycle for DALECv2.Using a given set of parameters and initial values for C w and C s , 100 DALECv2 runs are performed using random initial values for C lab , C f , C r , C l .The plot shows the 100 trajectories for C f .

Table 1 .
DALECv2 dynamical variables and parameters with their respective range.The units of the non-dimensionless quantities are given in brackets.

Table 2 .
Results of the linear inverse problem showing (1) the solution components for the least squares solution z LS together with their relative error η i (dimensionless quantity) and variance ν i and (2) the solution components for the TSVD solution z (k) together with their relative error η Figure 6.Model resolution matrix for the LAI operator.

Table 3 .
Experiment set up summary: in Exp. 1 we use LAI and bounds constraints (BDS), in Exp. 2 we use LAI, NEE and BDS and so on.

Table 4 .
Costs for the results of the inverse modelling experiments.The last column reports the computation time in seconds for the experiment.

Table 5 .
Results of the inverse modelling experiments.The solution components together with their relative variance, in brackets, are given for each experiment.The vector x init is the randomly chosen parameter set satisfying the EDCs that initialises the minimisation routine.