Improving international food security under a changing climate and increasing human population will be greatly aided by improving our ability to modify, understand and predict crop growth. What we predominantly have at our disposal are either process-based models of crop physiology or statistical analyses of yield datasets, both of which suffer from various sources of error. In this paper, we present a generic process-based crop model (PeakN-crop v1.0) which we parametrise using a Bayesian model-fitting algorithm to three different sources of data: space-based vegetation indices, eddy covariance productivity measurements and regional crop yields. We show that the model parametrised without data, based on prior knowledge of the parameters, can largely capture the observed behaviour but that the data-constrained model both greatly improves the model fit and reduces prediction uncertainty. We investigate the extent to which each dataset contributes to the model performance and show that while all data improve on the prior model fit, the satellite-based data and crop yield estimates are particularly important for reducing model error and uncertainty. Despite these improvements, we conclude that there are still significant knowledge gaps, in terms of available data for model parametrisation, but our study can help indicate the necessary data collection to improve our predictions of crop yields and crop responses to environmental changes.

Improving food security is one of the greatest challenges currently facing
humanity

Most crop models to date can be put into one of two broad categories:
process-based or statistical. Process-based crop models have some
representation of the mechanisms that determine how plants grow in their
formulation

Statistical crop models aim to capture relationships between various
predictor variables and crop properties without using any information from
biology or ecology on how such factors should be related. For example, studies
have predicted crop yields based on observed simple relationships between
yield data and climate inputs

Both the process-based and statistical approaches have their disadvantages
when it comes to obtaining general insights. Process-based models have often
only been shown to be applicable at the individual field scale, making it
unclear if their predictions might provide information about crop responses
at larger spatial scales. Process-based models can also be sensitive to
chosen parameter values and formulation, which have rarely been identified as
applicable over multiple crop types or locations

An alternative to the extremes of either purely process-based or purely
statistical crop models is to apply statistical methods to process-based
models to data constrain their parameters. This technique, which is
increasingly used in Earth system and vegetation modelling studies

The main problem with data-constraining process-based models is data
availability. Datasets of annual yield, such as those used in statistical
modelling studies, are unlikely to be sufficient when data constraining the
parameters of physiologically explicit models because, to put it simply, they
are unlikely to carry enough information to enable identification of what the
different model parameters should be. However, two other sources of data,
widely used in global vegetation modelling but to a lesser extent in
agricultural modelling, could be of use in data-constraining crop model
parameters. Space-based remote sensing data can provide spatially and
temporally continuous information on vegetation greenness at a variety of
spatial and temporal scales

Sites where intensive data collection has taken place do exist and can be
very useful in exploring certain aspects of crop physiology, for example, in
the context of the agricultural model intercomparison and improvement
project, AgMIP

In this paper, we present a newly developed general, non-crop-specific
process-based model and use parameter inference to estimate the most likely parameters
for 15 locations for winter wheat and maize using a combination of
space-based vegetation indices, eddy covariance flux data and reported
agricultural yields. We aim to answer the following questions:

Does our model with data-constrained parameters predict empirical data better than a model with prior parameters?

Are the data-constrained parameters similar among different sites, and what are the impacts on model predictive accuracy of having site-specific versus site-shared parameters?

To what extent does the inclusion of the different types of data in the model-fitting process influence the uncertainty in the inferred parameters and model predictions?

We expect the qualitative answer to the first question to be that utilising empirical data does enable the model to make better predictions because that is a typical outcome of our parameter estimation approach. However, we are more interested in the quantitative answer: i.e. how much. For example, the generation of a model that could make extremely precise and accurate predictions would suggest that data-constraining general models with the datasets we identify could provide an extremely useful tool for agricultural predictions and forecasts. Alternatively, the generation of a model that makes very imprecise predictions would suggest that more data collection and model improvement are needed for the model to have practical applications.

In addition to our aims above, our goal with this paper is to provide a proof-of-concept data-constrained process-based crop model that could be of use in practical agricultural systems. To this end, we describe the methods in more detail than would otherwise be necessary and include a broader discussion of the applicability of this paper.

While our study is part of a broader scientific objective to enable more accurate field-scale predictions, the lack of availability of field-scale datasets to train and validate our model means that the scale of model evaluation for our study here is a mix of field (flux tower) and regional scales (county and country level for yield estimates and 3 by 3 km scale for photosynthetic activity).

Our analysis focusses on 15 sites for which we can obtain the combination of
eddy covariance data, satellite data and crop yield data for specific crops
(summarised in Table

Study sites are listed; all sites correspond to eddy covariance measurement sites.


We use data on vegetation greenness from the MODIS (Moderate Resolution
Imaging Spectroradiometer) Terra instrument. The MODIS fraction of absorbed
photosynthetically active radiation (fAPAR data) from the MOD15A product was
downloaded (

Using the pixel closest to the flux tower site was infeasible because of data
noise and gaps resulting in an uneven time series. Instead, we aggregated all
pixels within a 3 by 3 km square centred on the tower site in a single
time series. The untested assumption behind this aggregation is that farming
practices are constant across this scale. To distinguish between different
crops, we use a crop phenology approach
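The pixel aggregation described above can be illustrated with a minimal sketch. It assumes that the fAPAR values for the pixels inside the 3 by 3 km window are already available as one list per time step, with gaps marked as NaN. The function name, the data layout, the `min_valid` threshold and the use of an unweighted mean are illustrative assumptions and not necessarily the processing used in this study.

```python
import math

def aggregate_window(fapar_by_time, min_valid=1):
    """Collapse per-pixel fAPAR values into a single time series.

    fapar_by_time: list of lists; each inner list holds the fAPAR values
    of all pixels in the window at one time step (NaN marks a gap).
    min_valid: minimum number of valid pixels needed to report a value
    (hypothetical quality threshold).
    """
    series = []
    for pixels in fapar_by_time:
        # Keep only pixels with a usable retrieval at this time step.
        valid = [v for v in pixels if not math.isnan(v)]
        if valid and len(valid) >= min_valid:
            series.append(sum(valid) / len(valid))
        else:
            # No usable pixel: leave the gap in the aggregated series.
            series.append(math.nan)
    return series
```

Averaging over the window trades spatial specificity for a more even time series, which is why the assumption of constant farming practices across the window matters.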

We use eddy covariance data for 15 sites across Europe and the United States
(Table

To obtain information on crop yield, we use data provided by the US Department
of Agriculture (USDA) yearly, at the county level, available for the entire
study period (

Sowing and harvest dates are required as model inputs and were extracted from
the crop calendar global dataset

Fertiliser input data were obtained from the published site descriptions (see
Table

We use NASA's Modern-Era Retrospective Analysis for Research and Applications
(MERRA) dataset

Our new general model of crop growth is based on the single plant model of

The germination process is described as a degree-day function with a fixed
base temperature of 0

Vegetative growth begins once the accumulated degree days are higher than the
limit parameter, germ
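The degree-day trigger for the end of germination can be sketched as follows. This is a minimal illustration, assuming daily mean temperatures as input and assuming that days colder than the base temperature contribute nothing to the sum. The function and argument names (including `germ_threshold`, which stands in for the limit parameter) are hypothetical.

```python
def germination_day(daily_mean_temp, germ_threshold, base_temp=0.0):
    """Return the index of the first day on which the accumulated degree
    days exceed germ_threshold, or None if the threshold is never reached.

    daily_mean_temp: daily mean air temperatures from sowing onwards (deg C).
    germ_threshold: degree-day limit for the start of vegetative growth
    (hypothetical name and units of deg C day).
    base_temp: base temperature; 0 deg C as stated in the text.
    """
    accumulated = 0.0
    for day, temp in enumerate(daily_mean_temp):
        # Only temperatures above the base contribute to development.
        accumulated += max(temp - base_temp, 0.0)
        if accumulated > germ_threshold:
            return day
    return None
```

For example, with temperatures of -2, 3, 5, 4 and 6 °C and a threshold of 10 degree days, the accumulated sum first exceeds the threshold on the fourth day (index 3).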

During vegetative growth, biomass is allocated to either aboveground or
belowground fractions to achieve an optimal carbon-to-nitrogen (C : N)
ratio at the plant level (

Nitrogen-limited growth is considered to be a function of root mass

Here,

Actual biomass growth is then the minimum between nitrogen- and carbon-limited growth:
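As a sketch in hypothetical notation (the symbols below are illustrative and are not taken from the model description), writing $G_\mathrm{C}$ for carbon-limited growth and $G_\mathrm{N}$ for nitrogen-limited growth, the statement above corresponds to

```latex
G = \min\left(G_\mathrm{C},\, G_\mathrm{N}\right)
```

so that whichever resource is in shorter supply at a given time step sets the realised growth rate.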

Reproductive growth starts at a point where the supply of any of the
resources, carbon or nitrogen, reaches a maximum, which we term “peak
resource”. This is the point in time which will result in the maximum final
reproductive mass as further increases in vegetative fractions would not
result in an overall increase in growth rate and lead to suboptimal growth
(see

The peak nitrogen condition is achieved when an increase in root mass does
not result in an increase in nitrogen uptake. This condition is achieved in
nitrogen-limited environments where the nitrogen available in the soil is
depleted through the period of vegetative growth. This assumption can be
considered valid in agricultural systems where the major nitrogen input into
the system during the growing period comes solely from agricultural
fertilisers. Soil nitrogen decays monotonically through the season in our
model due to the simplicity with which we model nitrogen uptake, and thus
detecting the peak nitrogen condition is straightforward. Similarly, the peak
carbon flowering condition is triggered when the addition of aboveground
biomass would not lead to an increase in net carbon gain due to self-shading
in the canopy. To calculate the peak carbon trigger, we use the environmental
variables averaged over

During the reproductive phase, all new biomass produced is assigned to
reproductive tissues. Nitrogen and carbon are translocated to reproductive
organs at a constant rate,

We use Bayesian parameter inference techniques to infer the parameters for
the model described above. The technique involves solving Bayes' theorem
which, in this context, states
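In general terms, Bayes' theorem says that the posterior probability of the parameters given the data is proportional to the likelihood of the data given the parameters multiplied by the prior probability of the parameters. The sketch below illustrates how such a posterior can be sampled with a random-walk Metropolis algorithm. It is a generic illustration and not the Filzbach implementation; the uniform priors between lower and upper bounds, the proposal width `step_frac` and all function names are assumptions made for this example.

```python
import math
import random

def metropolis(log_likelihood, lower, upper, start, n_steps=5000,
               step_frac=0.05, seed=0):
    """Random-walk Metropolis sampler with independent uniform priors.

    log_likelihood: function mapping a parameter list to log P(data | params).
    lower, upper: prior bounds for each parameter (assumed uniform priors).
    start: initial parameter values, inside the bounds.
    Returns the list of sampled parameter vectors (one per step).
    """
    rng = random.Random(seed)
    current = list(start)
    current_ll = log_likelihood(current)
    samples = []
    for _ in range(n_steps):
        # Gaussian proposal scaled to the prior range of each parameter.
        proposal = [p + rng.gauss(0.0, step_frac * (hi - lo))
                    for p, lo, hi in zip(current, lower, upper)]
        inside = all(lo <= p <= hi
                     for p, lo, hi in zip(proposal, lower, upper))
        if inside:
            proposal_ll = log_likelihood(proposal)
            diff = proposal_ll - current_ll
            # The uniform prior cancels inside the bounds, so acceptance
            # depends only on the likelihood ratio.
            if diff >= 0.0 or rng.random() < math.exp(diff):
                current, current_ll = proposal, proposal_ll
        # Proposals outside the bounds have zero prior and are rejected.
        samples.append(list(current))
    return samples
```

After discarding an initial burn-in, the retained samples approximate the posterior distribution, from which medians and percentile intervals can be read off directly.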

Three different datasets were used in combination to infer our model
parameters – MODIS fAPAR, flux tower GPP and crop yield data. Each dataset
contributes to the assessment of the model likelihood but each one of these
has different temporal resolutions and covers different time periods,
resulting in a variable number of data points. To prevent our inferred
parameters from being overly biased towards explaining the datasets with the
greatest quantity of data points, we down-weighted the contributions to our
likelihood estimates from each data point according to the quantity of data
in each dataset. The likelihood function used in Filzbach is
therefore
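The idea of down-weighting by dataset size can be illustrated with a minimal sketch. It assumes Gaussian errors and a weight of one over the number of points in each dataset, so that every dataset contributes its mean per-point log-likelihood. Both choices, and all names below, are assumptions for illustration and should not be read as the exact likelihood function used with Filzbach.

```python
import math

def gaussian_loglik(obs, pred, sigma):
    """Per-point Gaussian log-likelihoods for paired observations and
    predictions with a common standard deviation sigma."""
    return [-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (o - p) ** 2 / (2.0 * sigma ** 2)
            for o, p in zip(obs, pred)]

def weighted_loglik(datasets):
    """Sum of dataset log-likelihoods, each divided by its number of points.

    datasets: dict mapping a dataset name (e.g. "fAPAR", "GPP", "yield")
    to a tuple (observations, predictions, sigma).
    """
    total = 0.0
    for obs, pred, sigma in datasets.values():
        point_ll = gaussian_loglik(obs, pred, sigma)
        # Dividing by the number of points stops long time series from
        # dominating short ones such as annual yields.
        total += sum(point_ll) / len(point_ll)
    return total
```

Under this weighting, repeating every observation in a dataset leaves its contribution unchanged, which is the property that keeps a long fAPAR or GPP series from overwhelming a handful of annual yield values.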

We adopt different techniques to estimate the standard deviation

In order to assess whether the model with data-constrained parameters predicts empirical data better than a model with prior parameters, we infer the parameters for each site individually using all of the empirical data and compare the model predictive performance to that of a model in which the parameter values are sampled randomly from the prior range.

We compare the inferred parameters and predictive performance of models with parameters inferred using data from individual sites (the one-site model) or from multiple sites together (all-site model), always keeping maize and winter wheat sites separate, to assess the effects of allowing parameters to differ between the sites. Preliminary investigations revealed that similar model parameter distributions were inferred once data from more than three sites were used in combination when inferring the parameters. We therefore also take the opportunity to assess the performance of the models with parameters shared between sites in predicting data that have not been used in parameter inference (evaluation model).

To assess the importance of different types of data constraints, we perform a data knock-out experiment and we infer the model parameters for individual sites using only one or two of the different empirical datasets and assess inferred model parameters and model performance.
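The set of knock-out configurations can be enumerated with a short sketch. The dataset labels and the function name are illustrative; the full three-dataset fit and the prior-only run are treated as separate reference cases and are not generated here.

```python
from itertools import combinations

# Labels for the three empirical datasets (illustrative names).
DATASETS = ("fAPAR", "GPP", "yield")

def knockout_subsets(datasets=DATASETS):
    """Return every subset containing exactly one or two of the datasets."""
    subsets = []
    for size in (1, 2):
        subsets.extend(combinations(datasets, size))
    return subsets
```

With three datasets this yields six configurations: three single-dataset fits and three two-dataset fits.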

In general, we assess model predictive performance by quantifying the root mean squared error (RMSE) between the model predictions and the empirical data to assess model precision and the mean error to assess model bias. We normalise both these metrics by the mean value of the different empirical dataset types to aid in comparison. We calculate parameter uncertainty as the 95th percentile confidence interval from the posterior distribution (Sect. 4).
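The two performance metrics can be sketched as follows. This is a minimal illustration, assuming paired lists of observations and predictions and taking the error as prediction minus observation; the sign convention and the function name are assumptions.

```python
import math

def normalised_rmse_and_bias(obs, pred):
    """Return (RMSE, mean error), both divided by the mean observation.

    obs, pred: equally long sequences of observed and predicted values.
    The error is defined here as prediction minus observation, so a
    positive bias means the model overestimates on average.
    """
    n = len(obs)
    mean_obs = sum(obs) / n
    errors = [p - o for o, p in zip(obs, pred)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    bias = sum(errors) / n
    return rmse / mean_obs, bias / mean_obs
```

For observations of 2, 4 and 6 and predictions of 3, 5 and 7, every error is 1 and the mean observation is 4, so both the normalised RMSE and the normalised bias are 0.25.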

To calculate uncertainty for the model predictions, we sample parameter values
from their respective posterior distribution and compute predictions with
each parameter combination, which results in a corresponding distribution of
model predictions. We report this prediction distribution uncertainty using
95th percentile confidence intervals. This predicted distribution does not
include the prescribed or inferred uncertainty about observations,
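The propagation of parameter uncertainty into prediction uncertainty can be sketched as below. The example assumes that posterior samples are already available as a list of parameter vectors and that `model` is any function returning a predicted time series; for simplicity it runs the model once per stored sample. All names, and the linear-interpolation percentile, are illustrative choices.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    low = int(pos)
    high = min(low + 1, len(sorted_values) - 1)
    frac = pos - low
    return sorted_values[low] * (1.0 - frac) + sorted_values[high] * frac

def prediction_interval(model, posterior_samples, inputs):
    """Return (lower, upper) 95 % bounds of the model predictions.

    model: function (params, inputs) -> list of predictions over time.
    posterior_samples: list of parameter vectors from the posterior.
    inputs: driving data passed unchanged to the model.
    """
    # One model run per posterior sample gives a distribution of
    # predictions at every time step.
    runs = [model(params, inputs) for params in posterior_samples]
    lower, upper = [], []
    for t in range(len(runs[0])):
        values = sorted(run[t] for run in runs)
        lower.append(percentile(values, 2.5))
        upper.append(percentile(values, 97.5))
    return lower, upper
```

Because only the parameters are varied, the resulting band reflects parameter uncertainty alone and does not include observation error.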

Model parameters, upper and lower bounds and initial values used in the model-fitting procedure.

Model RMSE, bias and uncertainty for the one-site and all-site parametrisation as well as the model evaluation run.

In general, and as expected, the predictive accuracy of both the wheat and
maize models is improved by inferring their parameters; the root mean squared
error and bias of the model predictions are reduced for predicting all
empirical datasets compared to the prior model (Table

Comparison of prior model predictions (dark grey, dashed line) and
posterior model predictions (light grey, continuous line) at one wheat
(DK-Ris) site. Panels show

In terms of uncertainty, the posterior models show a large reduction relative to the prior models for aboveground biomass (86 %) and yield (97 %), but a smaller reduction for the belowground variables (67 % for root biomass and 20 % for soil nitrogen), as there are no data in the fitting procedure to directly constrain these. Visual inspection also emphasises the importance of model structural constraints on the model dynamics; e.g. the model predicts a narrow range of dynamics in some properties at certain times of the year (e.g. biomass in leaves, roots and reproductive parts soon after sowing) irrespective of the parameter values.

On average, the RMSEs are very similar between the models with parameters
inferred for individual sites and those with parameters inferred for all sites
together (Table

As expected, the uncertainty in the predicted GPP, fAPAR and
yield is lower for the models with parameters inferred using all sites
because more data are used to infer the parameter values for those models,
leading to lower uncertainty in the inferred parameter distributions
(Fig.

Estimated model parameters for all sites, fitted to individual locations (circles) and all locations combined (black line). Values are posterior medians, and error bars and shaded areas represent 95th percentiles of the posterior parameter distribution for the one-site and all-site parametrisation, respectively.

GPP, fAPAR and yield model predictions at one maize site (US-Ro3) and one wheat site (DE-Gri). The figure shows posterior mean predictions for one site, all sites and the evaluation model fit. Neither site has been included in the evaluation fitting.

Inspection of the inferred parameter distributions (Fig.

Visual inspection of the predicted time series of GPP, fAPAR and yield for
maize and winter wheat predominantly shows very similar predictions between
the models with parameters estimated from one site versus all sites
(Fig.

We evaluate the model transferability by inferring the model parameters using
a subset of the sites and assessing model predictive performance against the
remaining sites (Fig.

Our data-type hold-out experiments show clear differences in the roles played
by different data types in improving model predictive accuracy, but the
effects are similar for both crop types (Fig.

Normalised uncertainty for GPP, fAPAR and yield model predictions at one maize site (US-Ro3) and one wheat site (DE-Gri). Uncertainty is calculated as 95th percentile confidence bounds normalised by the posterior mean for one site, all sites and the evaluation model fit. Neither site has been included in the evaluation fitting.

Model RMSE and bias for all data hold-out experiments averaged over all wheat and maize sites, respectively. Error bars represent variation across sites. All values have been normalised to the mean value of that variable at each site. Black bars indicate models that do not reach flowering.

RMSE, bias and uncertainty values in the data knock-out experiments for wheat and maize.

The greatest improvements in model predictive performance for all response
variables are obtained when all data types are used for parameter inference.
This is not inevitable as an overall more likely model might be achieved by
sacrificing predictive accuracy for one data type in order to improve
predictive accuracy for another. For example, adding fAPAR data alone
slightly improves model RMSE for fAPAR data, but makes it worse for GPP and
yield predictions when compared to the model with prior parameter
distributions. Indeed, the crops do not flower for maize or wheat when only
fAPAR data are used for parameter inference. Comparing knockouts with and
without fAPAR data included implies a trade-off between predicting the fAPAR
data well and predicting GPP well (Fig.

The uncertainty in model predictions (Fig.

Model uncertainty, expressed as the difference between the upper and lower bounds of the 95 % confidence interval, for all model setups averaged across all wheat and maize sites. Error bars represent variation between sites. All values have been normalised. Black bars indicate models that do not reach flowering.

We show that constraining a process-based crop model (PeakN-crop v1.0) using EC data, satellite fAPAR observations and regional yield estimates improves model performance compared to the model run with prior parameter ranges and greatly reduces the uncertainty in model output. However, the resulting uncertainty in both state variables and model parameters is still relatively high.

Model uncertainty is difficult to compare with previous crop modelling studies, as models with fixed parameter values do not often provide uncertainty estimates. In fact, providing uncertainty values for all model variables and parameters is one of the advantages of a data-constrained model. In the current model, uncertainty is highest at the start of the season for all variables but decreases rapidly and final yield uncertainty is much lower. This is due to thresholds: abrupt changes from one growing stage to another when small differences in parameters can lead to large differences in resulting variables. It is, however, important to note that the uncertainty in our yield predictions remains high and the model in its current form is unlikely to provide accurate predictions for practical applications without the addition of new data (Sect. 7.4). We have, however, shown that the use of three different data types does reduce prediction uncertainty – pointing to an avenue for future model improvement.

Our estimates of model parameter uncertainty, and consequently model
prediction uncertainty, are influenced by our assumption that the model is
correct and that any departure of the data from predictions is due to
measurement error. This is undoubtedly false but makes our parameter
estimation method simpler. Overall prediction uncertainty can be decomposed
into initial condition uncertainty, parameter uncertainty and model
uncertainty and methods exist for making these uncertainty estimates and
building them into predictions

In terms of the posterior parameter distributions, resulting parameters show
a similar degree of constraint to that observed in previous model
parametrisation studies for natural ecosystems

In terms of model performance, the model correctly predicts seasonal trajectories of GPP and final yield data. We cannot, however, capture the interannual variability in yields, which is most likely due to the fact that our model does not include a response to water limitation or heat damage. The fact that we use regional yield data can also lead to discrepancies between the yield at each specific flux tower site and the yield data. The model does not capture the fAPAR seasonal cycle well, especially at the maize sites, which is due to the low spatial resolution of the data. However, the predicted model fAPAR is more realistic than the fAPAR data, which is one of the advantages of using a process-based model with a more rigid structure than a statistical one.

One additional complication is the different spatial scales of the three datasets; while the eddy covariance data are at the scale of the flux tower footprint, which can be seen as equivalent to the individual field scale, the fAPAR and yield data correspond to larger scales (county and country level for the yield data and a 3 by 3 km scale for the fAPAR data). The assumption behind our analysis is that the conditions at field scale are representative of the regional scale, so that there would be no discrepancy between model predictions at these different scales. This is obviously a source of error, especially at the wheat sites in Europe, which will be located over a much more heterogeneous landscape. Further sources of data at the field scale would be required to identify the model error caused by the discrepancy in spatial scales.

Eddy covariance data are to date the most widely used dataset for
parametrisation of vegetation models

Space-based vegetation data have the main advantage of a large spatial and temporal coverage, so that they can be used irrespective of the local monitoring infrastructure, providing a general data source. However, the quality of the data is relatively low, especially at the high spatial resolutions needed for crop modelling. This problem is particularly obvious in the case of the maize data, which lack the expected seasonality, as reflected in the very high error in the fAPAR-only fit. However, the model fits without fAPAR (GPP and yield only) show a high error as well, indicating that the information content in vegetation indices is needed for constraining the model but is not sufficient.

Some of these limitations are not general for remotely sensed data but can be
attributed to the spatial and spectral resolution of the MODIS instrument.
The 1 km spatial resolution can be too coarse for agricultural fields,
especially in areas with heterogeneous land cover. Other existing instruments,
specifically the Landsat family, have a better spatial resolution (30 m),
but a much poorer temporal resolution which we have found unsuitable for
fitting a plant growth model where developmental changes can be abrupt. More
recent missions such as Sentinel-2 will have more suitable spatial and
temporal resolutions for use with this type of model

Crop yield data are traditionally used for evaluating agricultural models, and yield is arguably the most important variable to predict correctly, given that the purpose of the model is to predict crop productivity. We have used county- and country-level reported yields rather than field-level measured yield because of both the availability of the data and the generality of the method. The model fitted with yield data only gives a good fit to yields but gives higher errors for the GPP and fAPAR estimates, which raises questions about the correctness of models which only use final yields to assess performance and the ability of such models to predict crop yields under different conditions. Crop yield data provide the final point of crop growth but there is potentially a multitude of model structures and parameter combinations that can result in that yield.

In addition to the three datasets used for parametrisation, the model also
requires input data in the form of sowing and harvest dates and fertiliser
inputs. Additional uncertainty is associated with these datasets, which is
neither available nor accounted for in our analyses. For example, the crop calendar

Here, we have chosen a given model structure and extensively tested the way in which constraining the parameters with different datasets results in different configurations. The question that arises is to what extent the chosen model itself affects the present results. We have chosen a novel physiology-based model which includes plant optimality concepts, which on the one hand has the advantage that it is more general than some of the older models and lacks artificially set thresholds between growth stages, but on the other hand has the disadvantage of being less thoroughly tested against field observations. An ideal companion paper to this study would be a comparison of different model structures with a constant data-constraining framework, providing greater insights into which parts of the model led to high errors or uncertainty. However, such a comparison is beyond the scope of the current study; we acknowledge this limitation and report most error metrics as relative to prior model runs in an attempt to isolate errors created by the data and model fitting from those caused by the model itself.

The fact that our model shows a relatively good fit when constrained at
multiple sites indicates that it would be possible to obtain a single
parameter set for one cultivar given the same agricultural practices, so that
the model can be fitted at a small number of locations and then applied more
widely. However, the parameters are poorly constrained and some of the data we
have used are not sufficiently accurate to allow the use of the model at a
wider variety of locations and climate conditions. Accurate yield data are
essential but not sufficient and must be accompanied by a growth time series.
Our results indicate that additional EC data are not necessary, especially
given the cost of installing and maintaining a flux tower. Instead, either
biomass or LAI (or fAPAR or other vegetation indices) data could be easier to
obtain at multiple locations. The belowground part of the model, describing
root nitrogen uptake, is only indirectly constrained by the existing data,
and any observation of root mass and function would have the capacity to add
extra information, especially time series information

The model in the version presented in this paper does not include any water limitation to growth due mainly to a lack of data constraint on any water-related parameters, as we found that latent heat data from EC towers are not sufficient. Belowground measurements of not only root growth but also soil water properties would again provide some of the necessary information. Such belowground data, especially if supplemented by nutrient concentrations, can also help constrain a more complex version of the nitrogen uptake scheme, which could be improved to include more explicit soil–plant interactions and additional processes such as biological nitrogen fixation for legumes.

If this model, or any other similar process-based data-constrained crop model, is to be used for scientific purposes to understand the response of crops to climate across the globe, the ideal data would be a global dataset, such as space-based vegetation observations, combined with high-quality field-level data that would ideally include growth time series, final grain yield and information about agricultural practices. However, if the model is to be used for agricultural purposes, to aid decision making at the local level, high-quality field-level data would be sufficient. A valuable evaluation in such studies, not conducted here for brevity and due to a lack of location-specific data, would be to compare the predictive accuracy of the model against the predictive accuracy of a statistical average over the data. Such an analysis would reveal whether and how much benefit is gained by using a data-constrained model for predictions.

In this paper, we present a method for data constraining a process-based agricultural model to three sources of data: eddy covariance flux measurements, space-based fAPAR and regional yield estimates. We show that the data-constrained model performs better than the model with prior parameter estimates, especially in terms of uncertainty, and even though the data used are in some cases not sufficient to fully constrain posterior parameters, they carry sufficient information to be used for model parametrisation. We apply the model to both maize and wheat sites and show that the model performs equally well for both species. Parameters can be shared between sites of the same species with a similar performance to local parameters and reduced uncertainty. We have also investigated the impact of the different datasets on constraining the model, and we show that all three types of data contribute to the model performance, but that if, in a data-limited world, one of the data types were not available, the model could be constrained reasonably well with fAPAR and yield data only. There are still gaps in the data available for model parametrisation, which are also a limitation to the models that can be parametrised, in particular in relation to water limitation on crops, and we believe that a model parametrisation framework such as that presented here can help identify those gaps and the data needed to further our capacity to model crops.

All model code used in this paper is available from the authors upon request. All data used in this paper are freely available and have been fully referenced in the text.

Figures

Gross primary production predictions for 1 year for all sites fitted using all available data at each individual site and at all sites together. Grey shaded areas represent 95 % confidence intervals drawn from the posterior distribution.

fAPAR predictions for 1 year for all sites fitted using all available data at each individual site and at all sites together. Grey shaded areas represent 95 % confidence intervals drawn from the posterior distribution.

Yield predictions for all years for all sites fitted using all available data at each individual site and at all sites together. Grey shaded areas represent 95 % confidence intervals drawn from the posterior distribution.

Gross primary production predictions for 1 year for all sites fitted using all available data at a subset of sites for model evaluation. Sites with black boxes have been used in the model fitting. Grey shaded areas represent 95 % confidence intervals drawn from the posterior distribution.

fAPAR predictions for 1 year for all sites fitted using all available data at a subset of sites for model evaluation. Sites with black boxes have been used in the model fitting. Grey shaded areas represent 95 % confidence intervals drawn from the posterior distribution.

Yield predictions for all years for all sites fitted using all available data at a subset of sites for model evaluation. Sites with black boxes have been used in the model fitting. Grey shaded areas represent 95 % confidence intervals drawn from the posterior distribution.

In the current study, we use the standard biochemical model of

Photosynthesis model constants according to

The internal CO

In the case of C4 photosynthesis, the standard biochemical model includes a third
limitation, the PEP-carboxylation rate

We calculate the PAR absorbed by the canopy as a sum of absorbed direct and diffuse radiation:

All authors contributed to model development and analysis.

The authors declare that they have no conflict of interest.

We would like to acknowledge all data providers for the eddy covariance flux site data. Funding for AmeriFlux data resources was provided by the US Department of Energy's Office of Science. We would also like to thank the developers of the MODIS fAPAR product used in this study. We thank Christoph Müller, Daniel Wallach and an anonymous reviewer for their constructive comments that greatly improved our manuscript.

Edited by: C. Müller

Reviewed by: D. Wallach and one anonymous referee