Use of agricultural statistics to verify the interannual variability in land surface models: a case study over France with ISBA-A-gs

. In order to verify the interannual variability of the above-ground biomass of herbaceous vegetation simulated by the ISBA-A-gs land surface model, within the SURFEX modelling platform, French agricultural statistics for C3 crops and grasslands were compared with the simulations for the 1994–2008 period. While excellent correlations are obtained for grasslands, representing the interannual variability of crops is more difﬁcult. It is shown that, the Maximum Available soil Water Capacity (MaxAWC) has a large inﬂuence on the correlation between the model and the agricultural statistics. In particular, high values of MaxAWC tend to reduce the impact of the climate interannual variability on the simulated biomass. Also, high values of MaxAWC

The added value of ISBA-A-gs is the possibility to simulate the CO 2 fluxes, in conjunction with the water and energy fluxes and state variables simulated by the model. In particular, the vegetation transpiration calculated by ISBA-A-gs is related to a photosynthesis model able to describe the impact of drought (Calvet, 2000;Calvet et al., 2004). The simulated CO 2 fluxes can be validated along with the evapotranspiration using the extensive in situ flux observations of the FLUXNET initiative, gathering more than 500 sites worldwide (www.fluxdata.org). This was illustrated by Gibelin et al. (2008) for mid-latitudes. Moreover, an option of ISBA-A-gs permits the simulation of the vegetation biomass and Leaf Area Index (LAI). This option is useful for climate change impact studies , and allows the sequential assimilation of satellite LAI estimates. The latter was demonstrated at the local scale, for an unmanaged grassland site (Sabater et al., 2008;Albergel et al., 2010;Barbu et al., 2011). These studies show that representing LAI observation errors is not easy. Barbu et al. (2011) conclude that for LAI values higher than 2 m 2 m −2 , the LAI observation error is proportional to the LAI value. Indeed, the satellitederived LAI values are affected by a saturation phenomenon at high LAI values, inducing a high uncertainty on yearly maximum LAI values (Garrigues et al., 2008). This uncertainty can be handled by assimilation systems able to sequentially analyse soil moisture and LAI from a daily to a 10day basis, and LAI data are useful in all conditions. However, the interannual variability of the above-ground biomass (B ag ) and of the LAI simulated by ISBA-A-gs is not easy to  (Brut et al., 2009). Independent biomass estimates are needed to verify the model parameter mapping and its impact on the interannual variability of the simulated vegetation biomass. The vegetation biomass is not directly observed so far by Earth observation satellites, and in situ observations related to the vegetation biomass are needed. For crops, the agricultural statistics can be used at the country level, as shown by Smith et al. (2010a,b) for Europe, using the OR-CHIDEE (Krinner et al., 2005) LSM and the STICS (Brisson et al., 2002) crop model. They worked over the period 1972-2003, marked by a very strong increase in crop yields all over Europe caused by more and more intensive crop management practices (use of fertilizers, pesticides, more productive cultivars). In order to extract the interannual variability signal from the yield time series, they detrended the crop yields using a linear trend curve, and they analyzed the standard deviation of the de-trended yield anomalies, only.
In this study, an attempt is made to use the detailed agricultural dry matter yield statistics available in France (Agreste, 2011) for relatively small administrative units ("départements") ranging from 2000 to 10 000 km 2 . In order to analyze the year-to-year variability of the modelled biomass production, we focus on the 1994-2008 period. This 15-yr period is characterized in France , as in many European countries, by crop yields presenting little or no trend.
The objective is to assess to what extent agricultural statistics (crop yields and fodder production) can be used to assess the capability of a generic LSM to reproduce the interannual variability of the dry matter yield. Indeed, ISBA-A-gs is able to represent the climate impact on the main biophysical processes using a limited number of equations and parameters, but ISBA-A-gs does not include any crop-specific parameterisation and is not able to simulate the crop grain yield formation per se. Moreover, the processes that govern variability in the yield of crops are not exactly the same as those that govern variability in unmanaged vegetation biomass accumulation, and while ISBA-A-gs simulates the climatic impacts on photosynthesis and on the vegetation growth, specific factors impacting the agricultural production are not accounted for. The latter include changes in the intensity of the crop management (in relation to technical advances or public policies), pests, diseases, migration of a given crop type from productive to poorer lands, or (in the case of cereals) the grain formation. Also, crop cultivars have been bred to produce high yields of grain, at optimum quality, with optimum use of resources. Therefore, variability in crop yield may not be perfectly correlated with the annual maximum biomass of more "natural" vegetation types. An important aspect of the validation is the verification of the default choice of parameters in SURFEX on the value of the Maximum Available soil Water Content (MaxAWC), on photosynthesis parameters, and on specific plant responses (avoiding or tolerant) to drought. Three contrasting categories of agricultural products described by Agreste were considered: cereals, forage pea, and grass. Maize was not considered as a large proportion (more than 40 %) of the maize grain production in France comes from irrigated fields, and irrigation tends to reduce the impact of the climate interannual variability on the vegetation biomass and on the grain yield (Debaeke and Bertrand, 2008). This triggers a rather poor correlation between the maize yield statistics and the simulated vegetation biomass . Moreover, the maize yield figures given by Agreste do not separate irrigated from rainfed maize over the whole period considered in this study.
The ISBA-A-gs parameters and the available atmospheric and agricultural data over specific regions in France are presented in Sect. 2, for C3 crops (cereals and forage pea in this study) and grasslands. The impact of the ISBA-A-gs parameters on the interannual variability of the simulated B ag is presented in Sect. 3, together with the parameter values optimizing the correlation with agricultural statistics. It is shown to what extent the B ag simulated by the model is consistent with the agricultural statistics. Finally, the results are discussed in Sect. 4, and the main conclusions are summarized in Sect. 5.

Parameters of ISBA-A-gs and studied sites
ISBA-A-gs uses a CO 2 responsive parameterization of photosynthesis based on the model of Goudriaan et al. (1985) modified by Jacobs (1994) and Jacobs et al. (1996). This parameterization is derived from the set of equations commonly used in other land surface models (Farquhar et al., 1980 for C3 plants andCollatz et al., 1992 for C4 plants), and it has the same formulation for C4 plants as for C3 plants, differing only by the input parameters. Moreover, the slope of the response curve of the light-saturated net rate of CO 2 assimilation to the internal CO 2 concentration is represented by the mesophyll conductance (g m ). Therefore, the value of the g m parameter is related to the activity of the Rubisco enzyme (Jacobs et al., 1996), while in the Farquhar model, this quantity is represented by a maximum carboxylation rate parameter (V C,max ). The model also includes a detailed representation of the soil moisture stress. Two different types of drought responses are distinguished for both herbaceous vegetation (Calvet, 2000) and forests (Calvet et al., 2004), depending on the evolution of the water use efficiency (WUE) under moderate stress: WUE increases in the early soil water stress stages in the case of the drought-avoiding response, whereas WUE decreases or remains stable in the case of the drought-tolerant response. Table 1 presents the standard values of ISBA-A-gs parameters (Gibelin et al., 2006) used in the SURFEX modelling platform, for C3 crops and for C3 grasslands. The photosynthesis model is governed by four key parameters: the mesophyll conductance in well-watered conditions, g m , Geosci. Model Dev., 5, 37-54, 2012 www.geosci-model-dev.net/5/37/2012/ Table 1. Standard values of ISBA-A-gs parameters (Gibelin et al., 2006) for 2 vegetation types (C3 crops and C3 grasslands). The mesophyll conductance at a leaf temperature of 25 • C, in well-watered conditions, g m , is in units of mm s −1 , g c is the cuticular conductance, in mm s −1 , θ C is the critical extractable soil moisture content, dimensionless, τ M is the maximum leaf span time, in days, LAI min is the minimum leaf area index, in m 2 m −2 , N L is the leaf nitrogen concentration in % of dry mass, e is the SLA (specific leaf area) sensitivity to N L , in m 2 kg −1 % −1 , f is SLA at N L = 0 %, in m 2 kg −1 .  Calvet, 2000): drought-avoiding and drought-tolerant (red and blue arrows, respectively). For moderate soil water stress (i.e. AWC > θ C × MaxAWC), the deviation of D max from its unstressed value towards its minimum (0.03 kg kg −1 ) or maximum (0.30 kg kg −1 ) value (drought-avoiding and drought-tolerant, respectively), is proportional to AWC, scaled between MaxAWC and θ C × MaxAWC. The value of g m is driven by D max through a logarithmic equation (solid line): ln(g m ) = 2.381 − 0.6103 × ln(D max ), with g m and D max in units of mm s −1 and g kg −1 , respectively. For more pronounced soil water stress (i.e. AWC < θ C × MaxAWC), either g m or D max (drought-avoiding and drought-tolerant, respectively), decrease from its value at AWC = θ C × MaxAWC to its minimum value, proportional to AWC/(θ C × MaxAWC). As an example, the θ C and unstressed g m values of Table 1 are used. the cuticular conductance, g c , the critical extractable soil moisture content, θ C , and the response to drought (droughtavoiding or drought-tolerant). As shown by Fig. 1 for C3 herbaceous plants (Calvet, 2000), drought-avoiding and drought-tolerant responses to soil moisture stress are represented through empirical relationships between g m and the maximum leaf-to-air saturation deficit D max . The latter pa-rameter is related to the sensitivity of stomatal aperture to air humidity (high D max values correspond to a low sensitivity and vice versa). In Table 1, the response to drought is the only parameter distinguishing the standard photosynthesis parameters for C3 crops and C3 grasslands. Plant growth is characterized by five parameters: the maximum leaf span time, τ M , the minimum leaf area index LAI min , the leaf nitrogen concentration N L , the SLA (specific leaf area) sensitivity to N L , e, and SLA at N L = 0 %, f . The latter two differ from C3 crops to C3 grasslands ( Table 1). The value of MaxAWC may change from one location to another, depending on soil and plant characteristics: soil moisture at field capacity, soil moisture at wilting point, and rooting depth. These parameters, together with the fraction of vegetation types, are provided by the ECOCLIMAP global database (Masson et al., 2003), at a spatial resolution of 1 km. ECOCLIMAP is a database of key surface parameters (soil texture, albedo, emissivity, roughness length, LAI, vegetation fraction, and physiological parameters) for land surface modelling. Over France, more often than not, the ECOCLIMAP classes correspond to a combination of 6 main patches (bare soils, coniferous trees, deciduous broadleaf trees, C3 crops, C4 crops, C3 grasslands). An updated version of ECOCLIMAP (ECOCLIMAP-II) is now available over Africa (Kaptué Tchuenté et al., 2010, 2011, and over Europe . It is based on more recent input satellite data (several years of SPOT/VEGETATION NDVI) and distinguishes, also, Spring crops (e.g. wheat) from Summer crops (e.g. maize, sunflower).
In this study, C3 Spring crops are considered, as they are generally rainfed and as such, their yield interannual variability is more markedly related to climatic conditions. Also, permanent grasslands below 1000 m a.s.l. are considered, only, as high altitude grasslands are represented with difficulty by ISBA-A-gs (Brut et al., 2009). Also, the ISOP (Information et Suivi Objectif des Prairies) model-based grassland production index considered in this study (Sect. 2.3.2) is not available above 1000 m a.s.l. (Ruget et al., 2006). Figure 2 presents the location of the studied sites, for both C3 crops and grasslands. For a given "département" administrative unit, they correspond to the 8 km × 8 km SAFRAN atmospheric grid cell (see Sect ECOCLIMAP-II presents the highest average fraction of either C3 crops or grasslands. Each site is representative of one administrative unit and at these sites, the C3 crop or grassland patches represent at least 45 % of the ECOCLIMAP-II grid cell. This procedure permits to ensure that the meteorological variables used to drive the model are consistent with the main location of the croplands (grasslands) within a département. The meteorological data of the SAFRAN sites are used to drive the model. In general, one site (corresponding to either C3 crops or grasslands) per département is studied, but for some départements (e.g. 63-Puy-de-Dôme), two sites are described, one for grasslands, and one for C3 crops.

Forcing atmospheric data
A high-resolution (8 km) atmospheric forcing data set is available for simulations over France. It is provided by the atmospheric analysis system "Système d'Analyse Fournissant des Renseignements A la Neige" (SAFRAN) (Durand et al., 1993(Durand et al., , 1999. SAFRAN is a mesoscale atmospheric analysis system for surface variables. It produces an analysis of air temperature, air humidity, wind speed, incoming shortwave and longwave radiations at the hourly time step, and an analysis of precipitation at the daily time step, using atmospheric simulations and ground data observations. SAFRAN is based on climatically homogeneous zones and is able to take topography effects into account. Originally intended for mountainous areas, it was later extended to cover France. A detailed validation of the SAFRAN analysis over France (Quintana-Seguì et al., 2008) showed that SAFRAN provides accurate meteorological values to force LSM. In particular, SAFRAN uses a large number of rain gauges and can be considered as a reference for the verification over France of global precipitation analyses (Szczypta et al., 2011). Over the studied sites ( Fig. 2), and for the 1994-2008 period, SAFRAN presents a marked positive trend of the average maximum air temperature for April-May-June, i.e. for the start of the growing period: -For C3 crops sites, the trend is systematically positive (ranging from 0.015 Kyr −1 to 0.183 Kyr −1 ), and the average value is 0.126 Kyr −1 .
-For grassland sites, the trend ranges from -0.001 to 0.186 Kyr −1 , and the average value is 0.118 Kyr −1 .
This trend is more acute in Northern France.

Crops
The French agricultural annual statistics are freely available on the web, at the département administrative level (Agreste, 2011). They are based on extensive local to national observations of harvested grain quantities. In this study, the Geosci. Model Dev., 5, 37-54, 2012 www.geosci-model-dev.net/5/37/2012/ Agreste data for the 1994-2008 15-yr period were considered, only. The considered C3 crops were 6 types of cereals (winter wheat, rye, winter barley, spring barley, oat, triticale) and forage pea. Each crop considered alone covers a significant fraction of each of 45 départements (Fig. 2). Figure 3 shows the dry matter yield time series provided by Agreste for cereals and forage pea, from 1994 to 2008. The data are shown for each département, together with the average curve. No significant trend of the average yield is observed, except for forage pea, with a significant (at the 1 % level) negative trend of -6.15 g m −2 yr −1 .

Grasslands
Agreste provides dry matter yield annual values for both permanent and temporary grasslands. In this study, low altitude permanent grasslands were studied for 48 départements (Fig. 2). In Agreste, permanent grasslands are defined as natural grasslands or as planted grasslands older than 6 yr. Also, since 2000, Météo-France has issued the ISOP index (Ruget et al., 2006). This index is derived from an integrated system providing a real-time assessment of the forage production variability over France. The system is based on simulations of the STICS model of Institut National de la Recherche Agronomique (INRA), driven by daily atmospheric variables derived from interpolated ground observations of meteorological variables. In the ISOP-STICS simulations, the grass is regularly cut, from January to October, and the cut biomass is cumulated throughout the year in order to calculate the annual dry matter yield. The harvest dates depend on climatic conditions and are derived from temperature sums. Management practices, such as the frequency of mowing, the thermal time between mowings or the amount of nitrogen supply, were estimated through a national survey carried out in 1998 by the French Ministry of Agriculture. The fodder production was assessed for more than 6000 combinations of soils, climates, and management practices and then aggregated on about 200 forage regions previously defined by the French Ministry of Agriculture. The ISOP index was calibrated for delivering a good representation of the inter-annual variability. In this study, both Agreste and ISOP were used to assess the ISBA-A-gs simulations for grasslands. The advantage of the Agreste data is that they are produced by local experts, and Ruget et al. (2006) used this independent bottom-up information to validate the ISOP product for the 1982-1998 period (for more recent years, the two products are not independent as the local experts contributing to Agreste could use ISOP). Ruget et al. (2006) found that the consistency between the two fodder production estimates varies a lot from one region to another (R 2 varies from 0 to 0.6). The two products present shortcomings: (1) although the STICS model used to produce ISOP was calibrated and validated by Ruget et al. (2006) using five INRA grassland test sites, mapping the numerous STICS parameters is not easy, (2) at a regional scale, the Agreste fodder pro-duction is less accurate than the crop yield estimates. Indeed, most of the French fodder production is used on-site, and the limited commercial exchange of fodder is detrimental to the quantitative monitoring of the grassland productivity. Since the two products present advantages and disadvantages, both were used in this study. The ISOP index used in this study is the ratio of the annual grass production simulated by STICS, for permanent grasslands, to the average value simulated for the period 1982-2006, at a given location. In contrast to Agreste, ISOP is not provided at the département level, but for specific forage regions. The 48 grassland sites presented in Fig. 2 were derived from the département limits and from the ECOCLIMAP-II grassland fraction. For each grassland site, the data from the nearest ISOP forage region were used. Figure 4 shows the dry matter yield time series provided by Agreste for grasslands, together with the ISOP index (at the corresponding forage regions), from 1994 to 2008. The data are shown for each département, together with the average curve. No significant trend of the average yield is observed. ISOP presents a more pronounced interannual variability than the Agreste statistics, especially before 2000. After 2000, the ISOP index information was incorporated into the Agreste statistics, and the correlation between the two estimates increased sharply.

From ISBA-A-gs to the agricultural statistics
The ISBA-A-gs simulations were driven by SAFRAN hourly atmospheric variables. C3 crops and grasslands were simulated using the standard parameters of Table 1. Continuous simulations were performed from 1994 to 2008, for all the sites presented in Fig. 2. As a preliminary sensitivity study (see Sect. 3.1) showed that the interannual variability of the simulated B ag was very sensitive to the g m photosynthesis parameter and to MaxAWC, the 15-yr simulations were repeated 48 (8 × 6) times for each site:  The default response to drought is in bold characters. Note that for cereals, significant negative correlations are found for 6 sites in Northeastern France (02-Aisne, 18-Cher, 39-Jura, 51-Marne, 55-Meuse, 60-Oise), and only 1 site (02-Aisne) for forage pea. * Only the sites where optimal g m and MaxAWC give significant correlations at 1 % level are used. The 1 and 0.1 % levels correspond to R 2 > 0.41 and 0.57, respectively.
Standard ISBA-A-gs simulations do not include a description of agricultural management practices, and vegetation growth is driven by photosynthesis and by the climatic factors (including drought) acting on photosynthesis. Plant growth corresponds to the net assimilation of CO 2 by photosynthesis, and plant mortality is induced by a deficit of photo-synthesis (with respect to an optimal photosynthesis level depending on model parameters). However, a simple irrigation model  may be activated, together with the possibility to prescribe an emergence date (by artificially maintaining LAI at its minimum value, LAI min , presented in Table 1). In this study, these options were not activated. For grasslands, cuts can be prescribed at given dates (Calvet and Soussana, 2001) or when LAI has reached a predefined threshold. In this study, both unmanaged and managed grasslands were simulated. In the latter simulations, cuts were simulated when LAI reached a value of 2 m 2 m −2 .
The variables compared with the agricultural statistics were: (1) for the C3 crops and the unmanaged grasslands, the annual maximum B ag , (2) for managed grasslands, the cumulated cut biomass throughout the annual cycle.

Sensitivity study
The sensitivity of the squared correlation coefficient (R 2 ) of the annual C3 crop maximum B ag simulated by ISBA-A-gs vs. the Agreste yield statistics was investigated over contrasting sites, presenting markedly different optimum MaxAWC and g m values, for various values of key parameters governing the soil moisture stress: MaxAWC, g m , θ C and g c . A large range of parameter values, different from their reference standard values in Table 1, was explored. The parameters were tested one by one (i.e. the other parameters kept their standard value). Also, the chosen sites presented optimal g m or MawAWC values ranging outside the 48 parameter combinations of Sect. 2.4. This permitted assessing the impact of using sub-optimal g m or MaxAWC parameter values.

Key ISBA-A-gs parameters impact the biomass interannual variability
Assuming that the agricultural statistics may help constraining average values of MaxAWC and g m , the values of these parameters were explored in detail as described in Sect. 2.4. For each administrative unit, and for each crop type (cereals, forage pea, and grasslands), optimal values of MaxAWC and g m were obtained, i.e. values providing the best correlation between the agricultural yield statistics and the simulated biomass production. Table 2 presents the median values of the optimum MaxAWC and g m for cereals, forage pea, and grasslands.
Examples of ISBA-A-gs simulations of B ag and AWC, are presented in Figs. 5 and 6, for C3 crops and grasslands, respectively, based on the median MaxAWC and g m values presented in Table 2. The model parameters presented in Table 1 are used, except for g m values derived from Table 2 (g m = 0.75 mm s −1 for unmanaged grasslands, and g m = 1.25 mm s −1 for managed grasslands). The C3 crop and grassland simulations are performed for the SAFRAN grid cells located in the 63-Puy-de-Dôme administrative unit (45.94 • N, 3.21 • E, and 46.23 • N, 2.91 • E, respectively), for MaxAWC values of 175 and 100 mm, respectively (Table 2). For this département, both crops and grasslands are present, and highly significant correlations are obtained. Despite the enhanced photosynthesis and plant transpiration triggered by the higher g m value, the managed grasslands tend to evaporate less than the unmanaged grasslands because LAI does not exceed 2 m 2 m −2 (against annual maximum LAI values ranging from 4.3 to 6.0 m 2 m −2 for the unmanaged grassland).
The sensitivity study described in Sect 2.5 was conducted over the 31-Haute-Garonne and 91-Essonne administrative units (SAFRAN grid cells at 43.57 • N-1.79 • E, and 48.32 • N-2.28 • E, respectively). Over 31-Haute-Garonne, good correlations are found for rye and the Agreste rye yield time series was used to calculate the R 2 values. For 91-Essonne, wheat yields were used. These two sites present contrasting optimum MaxAWC (100 and 250 kg m −2 , respectively) and g m (2 and 1 mm s −1 , respectively) values. The sensitivity study was performed using these parameter values as a reference, and was repeated using sub-optimal parameter values: g m = 1 mm s −1 for 31-Haute-Garonne, and MaxAWC = 200 kg m −2 for 91-Essonne. Figure 7 shows the result of the sensitivity study for the two configurations (optimal and sub-optimal). The values of MaxAWC and g m parameters markedly impact R 2 . The response to the MaxAWC parameter is particularly marked for 31-Haute-Garonne, for both optimal and sub-optimal g m values. For extreme values of these two parameters, low R 2 values between 0 and 0.  θ C , do not impact R 2 much. The θ C parameter is nondimensional and corresponds to a scaled AWC value (i.e. the ratio AWC/MaxAWC). As such, θ C does not depend much on the prescribed MaxAWC value. It must be noted that for the two sites, in all the configurations, θ C = 0.2 presents slightly better results than the standard value of θ C = 0.3 of Table 1. Figure 7 shows that using sub-optimal values of either g m or MaxAWC has a limited impact of the optimum value of its counterpart parameter (i.e. MaxAWC and g m , repectively): the maximum R 2 of the solid and dashed curves of the Max-AWC and g m subfigures of Fig. 7 are obtained at similar MaxAWC and g m values, respectively. Using sub-optimal g m or MaxAWC values tends to reduce the sensitivity of R 2 to θ C and g c .
A noticeable property of MaxAWC is its influence on the amplitude of the interannual variability of the annual maximum biomass simulated by ISBA-A-gs. For example, the optimal MaxAWC value obtained for managed grasslands using ISOP (Table 2) is lower than the value obtained using Agreste (100 and 125 mm, respectively), consistent with the more pronounced interannual variability of the ISOP index. This phenomenon is observed for C3 crops, also. Figures 8 and 9 present the simulated annual maximum biomass of cereals and forage pea, respectively. While the median retrieved values of g m , 1 and 1.5 mm s −1 , respectively, are used, several MaxAWC values are explored. It is shown that low values of this parameter tend to increase the interannual variability. For high values, a negative trend appears, for both cereals and forage pea, as the impact of the climatic trend is no longer masked by a strong interannual variability. From this point of view, the high optimal MaxAWC value obtained for forage pea (200 mm) is consistent with the significant negative trend observed by Agreste for this crop (Sect. 2.3.1). Dev., 5, 37-54, 2012 www.geosci-model-dev.net/5/37/2012/ Fig. 8. Impact of MaxAWC on the interannual variability of the annual maximum B ag simulated by ISBA-A-gs, using the median optimal g m value g m = 1 mm s −1 found for cereals (Table 2).

The interannual variability is more accurately simulated for grasslands than for C3 crops
The R 2 score used in Sect. 3.1 was optimized for all the studied sites by tuning the MaxAWC and g m parameters, with the other model parameters remaining constant at values indicated in Table 1. Figures 10 and 11 present maps of three R 2 levels (non-significant, significant at the 1 % level, significant at the 0.1 % level) for C3 crops (cereals and forage pea) and grasslands (unmanaged and managed), respectively, and the results are summarized in Table 2. For cereals, 6 crops are considered (i.e. winter wheat, rye, winter barley, spring barley, oat, triticale, in this study), and the highest R 2 at a given location is used in Fig. 10 and in Table 2. In Fig. 11, the ISOP index is shown together with Agreste, as slightly better correlations are obtained with ISOP. A striking result is the excellent scores obtained for managed grasslands, with 44 of 48 sites presenting highly significant correlations (at the 0.1 % level) with ISOP, and the rather poor performance obtained for C3 crops, with 5 forage pea sites (over 45) presenting highly significant correlations with Agreste. In the latter case, however, 20 sites present significant correlations (at the 1 % level) with Agreste. Generally, better results are obtained for forage pea than for cereals, and for managed grasslands than for unmanaged grasslands. In Figs. 10 and 11, there is no specific region presenting systematically poor or high R 2 values. This is a positive result as it shows that there is no regional specificity in the quality of the agricultural statistics, nor in the model simulations. However, it must be noted that for cereals, significant negative correlations are found for 6 sites mainly located in Northeastern France (02-Aisne, 18-Cher, 39-Jura, 51-Marne, 55-Meuse, 60-Oise), and only 1 site (02-Aisne) for forage pea. Another interesting result is that the default drought responses (Table 1) present better results than the alternative options, for both C3 crops and grasslands (Table 2). Figure 12 presents the simulated B ag vs. the Agreste grain yield of cereals and forage pea. The ratio of crop yield to the maximum B ag is called the harvest index. For cereals and for forage pea, the harvest indices derived from Fig. 12 range between 20 and 50 %, and between 20 and 40 %, respectively. Overall, this is consistent with Bondeau et al. (2007), giving typical harvest index values for temperate cereals ranging from 20 to 40 %. Figure 13 presents the simulated harvested grass vs. the Agreste and ISOP-STICS yield estimates. The results are shown for the two options of the ISBA-A-gs model available for grasslands: unmanaged and managed. Table 3 displays the corresponding statistical scores. The best correlation is obtained for the unmanaged option of the model vs. Agreste (R 2 = 0.70). In this case, however, the simulated annual maximum B ag is markedly overestimated by the model, by 0.17 kg m −2 , on average. Less scattering is obtained for the managed option of the model vs. ISOP-STICS, with a standard deviation of differences (SDD) of 0.17 kg m −2 . However, the model tends to produce lower yields than ISOP-STICS, with a mean bias of -0.23 kg m −2 .

Sensitivity to g m and to MaxAWC
The results of Sect. 3.1 show that two key parameters of the ISBA-A-gs model have a large impact on the simulated interannual variability of B ag : g m and MaxAWC. While Max-AWC may vary from one site to another, in relation to soil characteristics, large changes in g m are not expected for intensively cultivated crops, as this parameter governs the intrinsic photosynthesis properties, at a given level of nutrient (e.g. nitrogen) availability. In order to assess the relative Fig. 11. As in Fig. 10, except for unmanaged and managed grasslands, and Agreste and ISOP data.  (from top to bottom) the Agreste dry matter yields, the ISOP STICS dry matter yield, and the ISOP index, for the sites where a significant correlation (1 % level) is achieved. The drought-tolerant option is used in the ISBA-A-gs simulations. One regression line is plotted by site, and the dots correspond to the yearly values for all the sites.
impact of the two optimized parameters, Table 2 indicates the number of sites presenting significant R 2 score with either g m or MaxAWC, or both, assumed to be constant in space and equal to their median optimal value. The change in the number of sites with significant R 2 values can be used as a metric to judge the sensitivity of model fit to optimizing vs. fixing the two model parameters MaxAWC and g m . It is found that the detrimental impact on the simulated interannual variability of B ag of prescribing a constant parameter value is systematically higher with a constant MaxAWC than with a constant g m . Also, the model sensitivity varies from one vegetation type to another. Cereals are particularly sensitive to the use of local MaxAWC, as only 4 sites are correlated at the 1 % level with Agreste with a constant MaxAWC (against 13 sites with local MaxAWC values). Forage pea is less sensitive than cereals but the impact of using a constant MaxAWC is still marked (11 against 20 sites). On the other hand, the number of managed grasslands correlating at the 1 % level with ISOP presents little sensitivity to the two parameters. More sensitivity is found for the unmanaged grasslands, especially vs. Agreste. It can be seen that the improvements in R 2 among the significantly correlated sites going from fixed g m and fixed MaxAWC to fixed g m with optimal MaxAWC is similar for cereals and unmanaged grasslands (about 0.1 improvement in R 2 ). Thus, while the model describes some amount of the interannual variance in the grassland yields regardless of whether MaxAWC is fixed or varied, a seemingly significant additional amount of variance is described with the optimal MaxAWC values, at least for unmanaged grasslands. The lower sensitivity observed for managed grasslands may be related to the lower evapotranspiration caused by the LAI limitation imposed by the vegetation cuts. As the use of the soil moisture reservoir is reduced (Fig. 6), prescribing an accurate MaxAWC value is less critical. Indeed, it can be shown that the simulated annual maximum B ag , at the end of the growing season, is correlated with the simulated AWC value at a given stage of the growing season. In the case of the 63-Puy-de-Dôme simulations of unmanaged and managed grasslands (Fig. 6), the best correlations are obtained with the average monthly AWC values simulated in June and July, respectively, with R 2 values of 0.91 and 0.64, respectively.
For cereals (Fig. 5), the best correlations are obtained with the average monthly AWC values simulated in May, with a R 2 value of 0.64. The ISBA-A-gs model is not a crop model and as such, does not simulate the agricultural practices in detail, nor the intensity of the crop management, pest control, crop rotation, or (in the case of cereals) the grain formation. Therefore, the main factor governing the annual maximum B ag , at the end of the growing season, is the soil moisture stress caused by low AWC values. The latter can be caused by low MaxAWC values, and/or by high evaporation rates through stomatal or non-stomatal (cuticular) leaf transpiration, governed by the g m and g c parameters, respectively.

Grasslands vs. C3 crops
This study shows that the yield temporal variability of grasslands is better captured than the variability of croplands. This can be explained by (1) the lack of a specific crop representation in ISBA-A-gs, (2) the use of a model (STICS) to produce the ISOP fodder production index. Indeed, while the Agreste crop yield data are based on harvest observations, estimating the fodder production or the productivity of pasturelands is more challenging. This is why the ISOP-STICS data are used in this study, together with Agreste (Sect. 2.3.2). Although the STICS formulation is quite different from ISBA-A-gs, the comparison of two models generally produces less scatter than comparing a model with measurements performed in the real world. Indeed, any model is based on assumptions and cannot fully represent the temporal variability of the modelled variables, especially at the regional scale addressed in this study. This is also true for the spatial variability. While significant spatial correlations of either crops or grasslands simulations with the Agreste data are found for very few years, the ISOP-STICS spatial variability of the grassland productivity is represented well by ISBA-A-gs. Significant (at the 1 % level) spatial correlations of unmanaged and managed grasslands are found for 7 and 9 years, respectively, with an average R 2 of 0.33. However, even better results are obtained using fixed median values of g m and MaxAWC (10 and 14 yr, respectively), and this shows that a local multi-annual optimisation of the model parameters does not necessarily improve their spatial distribution.  Fig. 10 show that the R 2 scores obtained for C3 crops are extremely heterogeneous. While a few sites present highly significant correlations (e.g. 63-Puy-de-Dôme for both cereals and forage pea), the majority present no significant correlations. These contrasting results may be related to the heterogeneity of the agricultural practices and of the soil types in a particular administrative unit. From this point of view, the 63-Puy-de-Dôme presents less heterogeneity, with most C3 crops concentrated in the "Limagne" plain, surrounded by hilly areas. Also, cereals may be irrigated in some regions, while rainfed crops are represented by the model simulations. Apart from the geographic variability within a département, it is also likely that features of crop production not explicitly represented by the model (see Sect. 1), are changing over time and this contributes to the poor R 2 values for crop sites. Also, the impact of soil type variability is probably more acute for crops than for grasslands, especially managed grasslands (see Sect. 4.1).

Are trends in forage pea production due to climate or to management intensity trends?
As shown in Sect. 2.2, the selected agricultural regions are affected by a marked warming trend of the growing season, during the 1994-2008 period, especially in Northern France. In Sect. 3.1, it was shown that MaxAWC impacts the simulated trend in B ag , as higher values of this parameter tend to limit the impact of the interannual variability and to favour the impact of the climatic trend. Therefore, the observed negative trend in the Agreste forage pea yields can be explained by high values of MaxAWC (with a median value of 200 mm in Table 2, against 175 mm for cereals), associated to the observed climatic trend. Other explanations can be proposed ("Pois", Wikipedia, http://fr.wikipedia.org/w/ index.php?title=Pois&oldid=66239238, last access December 2011). In particular, the amount of agricultural lands devoted to forage pea in France has been decreasing from 6669 km 2 in 1994 to 1002 km 2 in 2008, in relation to less favourable public incentives to the cultivation of forage pea, and to the rapid extension in France of a specific disease caused by a fungus (Aphanomyces euteiches). These factors may have triggered changes in the distribution of MaxAWC values related to forage pea, and a less intensive cultivation of forage pea (e.g. use of poorer lands, and/or less fertilizers and pesticides). Since these factors are not accounted for by the model, and as the model is able to account for climatic factors only, the resulting MaxAWC obtained in this study for forage pea may be overestimated. The reverse is true as the stable yields observed for cereals during the 1994-2008 period may be due to a progression of the intensification able to compensate for the climatic trend . In this case, our optimized model would tend to underestimate the optimum MaxAWC.
This shows that MaxAWC "retrieved" values from agricultural statistics have to be evaluated.

Is the retrieved MaxAWC realistic?
The MaxAWC values obtained in this study can be compared with independent estimates: -values for the rooting zone of the three-layer forcerestore soil model (Boone et al., 1999) currently used in SURFEX, -values derived from a high resolution map of the soil characteristics developed by INRA, and aggregated within the SAFRAN grid cells in three subgrid categories: "minimum", "average", "maximum". Specific values of MaxAWC are derived for the soil types present in the SAFRAN grid cell. The average MaxAWC corresponds to a linear mixing of the specific MaxAWC values, weighted by the fractional cover of each soil type. Table 4 presents the various MaxAWC estimates, for cereals, forage pea, and managed grasslands sites for which a significant correlation (at the 1 % level) with agricultural statistics is achieved. The SURFEX median MaxAWC values do not vary from one vegetation type to another and are all equal to 129 mm. The standard deviations are small and do not exceed 10 mm. Indeed, except for sandy soils, the pedotransfer functions currently used in SURFEX (Noilhan and Mahfouf, 1996), tend to produce little variation of the difference between the field capacity soil moisture and the wilting point (FC-WP), which ranges between 0.085 and 0.090 m 3 m −3 . As the prescribed rooting depth of the 3-layer soil model is the same for C3 crops and grasslands (1.5 m), the resulting MaxAWC varies little. It must be noted that more recent pedotransfer functions (e.g., Wösten et al., 1999;or Saxton and Rawls, 2006) allow much more variability of FC-WP.
The INRA MaxAWC estimates for grassland sites are lower than for C3 crop sites, especially for the average and maximum categories (about 30 mm less). All the MaxAWC estimates obtained in this study are in the range of INRA categories: -from minimum to average for grasslands, -from average to maximum for cereals, -close to maximum for forage pea.
The below average grassland MaxAWC can be explained by the fact that the more productive soils are generally used for crops, and the less productive soils for forests or for grazing and hay production. The 8 km × 8 km sites, although presenting a large fraction of either C3 crops or grasslands (at least 45 % of the ECOCLIMAP-II grid-cells) are not homogeneous, and the three INRA MaxAWC categories may correspond to any kind of vegetation type. However, Table 4 shows that the optimized model estimate of MaxAWC falls at the lower end of the INRA range for grasslands, in conjunction with lower grassland site MaxAWC within the INRA data, for the three categories (minimum, average, maximum). The high value obtained for forage pea is not out of range, but this result has to be considered with caution (see Sect. 4.3).

Conclusions
French annual agricultural statistics were used to assess to what extent the ISBA-A-gs land surface model is able to reproduce the interannual variability of the dry matter yield, over the 1994-2008 period. It was shown that, even if ISBA-A-gs does not simulate specific processes related to agricultural practices, the agricultural statistics have potential to evaluate the impact of key model parameters, in particular those related to the plant response to drought. Two parameters impact more markedly the simulations: g m and MaxAWC. The latter has more influence than g m and impacts both the amplitude of the interannual variability and the biomass production trend in response to the warming trend observed during the growing period (April-June). It is confirmed that the drought-avoiding and drought-tolerant responses used in SURFEX for C3 crops and grasslands, respectively, provide the best correlations of the simulated above-ground biomass with the agricultural statistics. A simple method based on a LAI threshold was used for the first time in ISBA-A-gs to represent managed grasslands. The excellent scores obtained with this new option of the model shows that this parameterization could be used in future studies or applications, together with a better mapping of MaxAWC. Currently, MaxAWC does not vary much in Geosci. Model Dev., 5, 37-54, 2012 www.geosci-model-dev.net/5/37/2012/ SURFEX and this study shows that MaxAWC tends to be underestimated for crops, and overestimated for grasslands. Finally, these results show the potential of using agricultural statistics for model benchmarking.