Representing winter wheat in the Community Land Model (version 4.5)

. Winter wheat is a staple crop for global food security, and is the dominant vegetation cover for a signiﬁcant fraction of Earth’s croplands. As such, it plays an important role in carbon cycling and land–atmosphere interactions in these key regions. Accurate simulation of winter wheat growth is not only crucial for future yield prediction under a changing climate, but also for accurately predicting the energy and water cycles for winter wheat dominated regions. We modiﬁed the winter wheat model in the Community Land Model (CLM) to better simulate winter wheat leaf area index, latent heat ﬂux, net ecosystem exchange of CO 2 , and grain yield. These included schemes to represent vernalization as well as frost tolerance and damage. We calibrated three key parameters (minimum planting temperature, maximum crop growth days, and initial value of leaf carbon allocation coefﬁcient) and modiﬁed the grain carbon allocation algorithm for simulations at the US Southern Great Plains ARM site (US-ARM), and validated the model performance at eight additional sites across North America. We found that the new winter wheat model improved the prediction of monthly variation in leaf area index, reduced latent heat ﬂux, and

Abstract. Winter wheat is a staple crop for global food security, and is the dominant vegetation cover for a significant fraction of Earth's croplands. As such, it plays an important role in carbon cycling and land-atmosphere interactions in these key regions. Accurate simulation of winter wheat growth is not only crucial for future yield prediction under a changing climate, but also for accurately predicting the energy and water cycles for winter wheat dominated regions. We modified the winter wheat model in the Community Land Model (CLM) to better simulate winter wheat leaf area index, latent heat flux, net ecosystem exchange of CO 2 , and grain yield. These included schemes to represent vernalization as well as frost tolerance and damage. We calibrated three key parameters (minimum planting temperature, maximum crop growth days, and initial value of leaf carbon allocation coefficient) and modified the grain carbon allocation algorithm for simulations at the US Southern Great Plains ARM site (US-ARM), and validated the model performance at eight additional sites across North America. We found that the new winter wheat model improved the prediction of monthly variation in leaf area index, reduced latent heat flux, and net ecosystem exchange root mean square error (RMSE) by 41 and 35 % during the spring growing season. The model accurately simulated the interannual variation in yield at the US-ARM site, but underestimated yield at sites and in regions (northwestern and southeastern US) with historically greater yields by 35 %.

Introduction
Wheat is a widely grown temperate cereal (Shewry, 2009), ranked fourth among commodity crops with a global production of 711 million tonnes, and encompasses 13.3 % of global permanent cropland as of 2013 (http://faostat3.fao.org/home/ E). Wheat provides one-fifth of the total caloric input of the world's population (Curtis et al., 2002), and therefore plays an important role in global food security (Chakraborty and Newton, 2011;Vermeulen et al., 2012). In many regions, such as the United States, winter wheat (Triticum aestivum) is the dominant wheat cultivar, accounting for 74 % of the total US wheat production, based on data from the National Agricultural Statistics Service of the U.S. Department of Agriculture in 2013 (http://www.nass.usda.gov).
Winter wheat, which is planted in fall and harvested in early summer, has a different growth cycle and responds to environmental stresses differently from summer crops. Winter wheat may suffer less from summer drought, but is subject to winter damage due to exposure to low temperatures and frequent freeze-thaw cycles (Vico et al., 2014). There are two important over-winter survival mechanisms for winter wheat: vernalization and cold tolerance. Vernalization is the process whereby winter wheat is exposed to a period of non-lethal low temperature required to fully enter the flowering stage and to produce grain in spring (Chouard, 1960). Additionally, winter wheat acclimates to low temperature, giving it the ability to survive cold temperatures. Both of these processes -vernalization and cold tolerance -are cumulative processes and have similar optimum temperature ranges. When the temperature is outside of the optimum range, the processes can be stopped, reversed, and restarted (Fowler et Published by Copernicus Publications on behalf of the European Geosciences Union. al., 1999). Damage can occur when temperatures are lower than the accumulated cold tolerance (reviewed by Barlow et al., 2015). Cold-induced damage has been observed to persist through the remainder of the growing season, and its impact on yield is greater than on growth. Effectively representing these processes in crop models could improve understanding of the uncertainty in the future crop yield projections.
Winter wheat also plays an important role in landatmosphere interactions through effects on energy, water, and carbon fluxes. Winter wheat cropland has much less soil carbon loss compared to maize cropland averaged across several sites (Ceschia et al., 2010), and could either be a carbon sink (Waldo et al., 2016) or source (Anthoni et al., 2004), depending on the year and the location. The earlier growing season can influence surface fluxes of water, energy, and momentum, and hence regional climate (Riley et al., 2009). This land-surface influence is particularly strong in the US Southern Great Plains, where winter wheat is a dominant land-cover type. For example, statistical analyses indicated cooler and moister near-surface air over Oklahoma's winter wheat belt from November to April compared to adjacent grassland, due to the influence of winter wheat (McPherson et al., 2004). This influence highlights the importance of adequately representing winter wheat in land-surface models used for climate projections, in order to assess both the impact of climate change on agriculture and agriculture's influence on regional climate.
The agricultural research community developed several winter wheat models during the 1980s, such as the Agricultural Research Council winter wheat model (ARCWHEAT) (Porter, 1984;Weir et al., 1984) and the Crop Estimation through Resource and Environment Synthesis winter wheat model (CERES-wheat) (Ritchie and Otter, 1985). These models were designed to simulate winter wheat growth at the farm level and have well-defined winter wheat growth phenology, which is a function of thermal time and day length that is adjusted by vernalization and a photoperiod factor. Photosynthesis and respiration processes determine the dry matter for partitioning among roots, shoots, leaves, and grain. Some models (e.g., CERES-wheat) considered winter wheat loss due to extreme low temperature in winter. In contrast to their strength in representing crop growth processes, these models have simplified treatment of important upstream processes that affect crop growth. For example, the photosynthesis scheme is a linear function of intercepted photosynthetically active radiation (PAR), PAR itself is simplified as a constant fraction of incoming solar radiation, and radiation is not separated into direct and diffuse fractions. Further, these crop models were originally developed to simulate individual, as opposed to multiple, crops, making multi-crop simulations at regional and global scales difficult.
To incorporate more physical processes and to simulate crop growth at regional or global scales, some agronomic crop growth models were incorporated into agro-ecosystem models. For example, CERES maize and wheat growth were added to the Decision Support System for Agrotechnology Transfer (DSSAT) model (Jones et al., 2003). A substantial modified version of CERES Wheat (Keating et al., 2001) has also been added to the Agricultural Production Systems Simulator (APSIM) model (Keating et al., 2003). As the effects of vegetation on the atmospheric boundary layer have been increasingly appreciated, some land-surface models started to also incorporate crop growth models to not only simulate crop yield, but also to simulate crop growth effects on surface carbon, water, and energy fluxes. For example, the SU-CROS crop growth model was incorporated into JULES (Van den Hoof et al., 2011) and the STIC crop growth model was incorporated into ORCHIDEE (Wu et al., 2016). In the recent Agricultural Model Intercomparison and Improvement Project (AgMIP), these agro-ecosystem models and landsurface models were categorized as global gridded crop models (GGCMs).
The Community Land Model (CLM) (Oleson et al., 2013) is one of the GGCMs included in AgMIP. It is also a state-ofthe-art land-surface model used in the Community Earth System Model (Hurrell et al., 2013) that simulates biogeophysical and biogeochemical processes on a spatial grid. CLM can be run online, coupled with the atmosphere model, or offline at multiple spatial scales (site, regional, and global) and resolutions. One grid cell in CLM is divided into different land units (urban, glacier, lake, wetland, vegetation), and the vegetation unit can consist of up to 14 natural vegetation types and 64 crop types in the most recent version (a developer version of CLM4.5). CLM is a community effort that incorporates scientific advances through time, such as two-leaf stomatal conductance and photosynthesis, transient land use, multilayer canopy models (Bonan et al., 2012), methane models (Riley et al., 2011), and carbon isotope models .
In order to better represent agricultural ecosystems, Levis et al. (2012) introduced crop growth modules into CLM based on the AgroIBIS model (Kucharik, 2003). Since their introduction, the crop modules in CLM have been updated to represent more crop types (maize, soybean, cotton, wheat, rice, sugarcane, tropical maize, tropical soybean) and processes, such as soybean nitrogen fixation  and ozone impacts on yields (Lombardozzi et al., 2015). In CLM, crop growth depends on photosynthetic processes, which are limited by light, water, and nutrient availability. At each time step, photosynthesis estimations provide the potential available carbon for plant growth, which is adjusted by nitrogen supply and demand. The actual available carbon is distributed to leaf, stem, root, and grain by carbon allocation coefficients that vary based on crop growth stages. While the initial focus for incorporating crop growth into CLM was as a lower boundary condition to the atmosphere, the model also predicts crop yields and is participating in the AgMIP GGCM Intercomparison project (Elliott et al., 2015).
Our new winter wheat model adopted the same phenology phases as the original winter wheat model in CLM, but replaced the vernalization process, added frost tolerance and damage processes, slightly modified the carbon allocation algorithm, and calibrated several key parameters that affect winter wheat growth. Our work focused on improving the representation of the key growth processes for winter wheat in order to (1) better simulate the land-surface influence on surface CO 2 , water, and energy exchanges in winter wheatdominated regions and (2) accurately simulate crop growth and yield so the model can be used for winter wheat yield projections.

Calibration data
We calibrated the simulated leaf area index and yield using observations from the Atmospheric Radiation Measurement Southern Great Plains Central Facility site (US-ARM) in northern Oklahoma, USA. The site has well-documented crop growth and management information, including crop types and planting and harvest dates. The site conducts biweekly leaf area index (LAI) measurements with a light wand (Licor LAI-2000) during the active growing season. Using a combination of in situ LAI and site reflectance spectrum measurements, Williams and Torn (2015) generated a daily LAI product, used here to develop and calibrate the winter wheat model. Six winter wheat seasons are used from the US-ARM site : 2003, 2004, 2006, 2007, 2009, and 2010 (winter wheat was not grown at the US-ARM site during 2005 and 2008).

Validation data
We validated the simulated leaf area index, and leaf, stem, and grain dry weight at five winter wheat field sites (TXLU, KSMA, NESA, NDMA, and ABLE) in North America. The experiments were originally designed to understand winter wheat response to nitrogen fertilization and water treatments (four nitrogen levels and three irrigation regimes) in the Great Plains Major et al., 1988;Reginato et al., 1988), and have been used as part of the AgMIP Wheat project. For our validations, we only validated to seven site-year rainfed plots, which are TXLU-1985&1986, KSMA-1985, NESA-1985&1986, NDMA-1986, and ABLE-1986 We validated the simulated energy, water, and CO 2 flux at three additional eddy flux tower sites: (1) Ponca City (US-PON), (2) Curtice Walter-Berger Cropland (US-CRT), and (3) the Washington State University Cook Agronomy Farm conventional tillage site (CAF-CT) (Fig. 1). These three sites do not have detailed crop growth measurements of tissue biomass, but have surface flux measurements that are crucial to understanding the role of winter wheat in altering landatmosphere interactions. One caveat of using the eddy flux observation is the energy balance closure problem (Foken, 2008;Wilson et al., 2002) due to the eddy covariance technique limitation or the errors in calculating energy flux terms. The energy closure ratios at the four eddy flux sites are 87 % at US-ARM, 91 % at US-PON, 70 % at US-CRT, and 83 % at CAF-CT during the period used in the comparison. We used these imbalanced energy fluxes as is and discussed their impacts on our results.
We also validated the simulated US winter wheat yield with the USDA NASS county-level non-irrigated winter wheat yield data. For the sites that did not have site-level yield observations, we also validated site-level simulations with the nearest county non-irrigated yield.

Model development
Similar to other crops in CLM, winter wheat has four phenological phases, including planting, leaf emergence, grain fill, and harvest. The criteria and thresholds for entering different phenology phases are listed in Table 2. Growing degree days is the key variable controlling phenology, and is measured as heat accumulation during the whole growing season or over a certain period. It was calculated by accumulating the difference (no accumulation if less than 0) between the target temperature (e.g., mean air temperature) and base temperature, and normally has a maximum daily increment. We used three different growing degree day algorithms to determine winter wheat phenology, all using the same base temperature (0 • C) and maximum daily increment (26 • ) (Levis et al., 2012). The 20-year running average of growing degree days (GDD 020 ) uses 2 m air temperature (T 2 m ) from September to June in the Northern Hemisphere (from April to September in the Southern Hemisphere), and is updated each year by averaging the previous 19 years. The growing degree days for soil temperature since planting (GDD tsoi ) uses averaged soil temperature from the top two model soil layers (0.71 and 2.79 cm). Growing degree days since planting (GDD plant ) uses T 2 m , and is reduced by a vernalization factor (see below) after leaf emergence.
To better represent winter wheat phenology, we added two additional processes: vernalization and frost damage. We adopted a generalized winter wheat vernalization model (Eqs. 1-3 were directly adopted from Streck et al., 2003). Similar to other winter crops, winter wheat must be exposed to low and nonfreezing temperature to enter the reproductive stage. Additionally, the vernalization process affects cold tolerance, as discussed below. If plants are not fully vernalized, the potential size of the flower head will be reduced. Vernalization starts after leaf emergence and ends before flowering. To model this process, daily vernalization rate (fvn, Eq. 1) is calculated based on the difference between the crown tem-perature (T crown ) and the optimum vernalization temperature (T opt ). In the CLM crop model, the crown temperature is the crown depth soil temperature (Aase and Siddoway, 1979), calculated as the function of 2 m air temperature and snow depth. The crown temperature is typically warmer than the 2 m air temperature in winter, if the plant is covered by snow, and the same as the 2 m air temperature without snow cover. If the crown temperature is equal to the optimum temperature for a whole day, then fvn is equal to 1. Otherwise, fvn is less than 1 as calculated in Eq. (1).
Next, the sum of fvn over sequential days is the effective vernalization days (VD, Eq. 2).
This is used to calculate the vernalization factor (VF, Eq. 3). VF varies from 0 to 1 (fully vernalized) to represent the ver- (3) Finally, VF was used in adjusting the growing degree days since planting (GDD plant = GDD plant,unadjusted × VF) and the grain carbon allocation coefficient (a grain = a grain, unadjusted × VF). When winter wheat is not fully vernalized (VF < 1), then GDD plant and a grain are reduced, resulting in slowed growth and reduced yield. We quantify the impacts of low temperature damage, including from frost, using three variables: (1) temperature at which 50 % of winter wheat was damaged (LT 50 ), (2) survival probability (f surv ), and (3) winter killing degree days (WDD). Here, Eqs. (4)-(8) were from Bergjord et al. (2008) and Eqs. (9)-(10) were from Vico et al. (2014), without any modifications. The calculations for the three variables are briefly summarized, and more detailed descriptions of the calculations can be found in Bergjord et al. (2008) and Vico et al. (2014). LT 50 (Eq. 4) depends on LT 50 from the previous time step (LT 50t−1 ), low temperature acclimation (i.e., hardening; RATEH), loss of hardening due to exposure to high temperatures (i.e., dehardening; RATED), stress due to respiration under snow (RATER), and exposure to low temperature (RATES). Lower LT 50 results in greater frost tolerance for winter wheat while higher LT 50 indicates lower frost tolerance.
The contribution of hardening to LT 50 was calculated as RATEH (Eq. 5), which was mainly a function of crown temperature (T crown ) and adjusted by a hardening parameter (H param = 0.0093), maximum frost tolerance (LT 50c = −23 • C). RATEH increased rapidly when crown temperature (T crown ) fell below 10 • C. When T crown fell below 0 • C, the slope of RATEH was same as T crown at 0 • C. RATEH is also determined by the difference between the current level of frost tolerance and the maximum level of frost tolerance (LT 50t−1 −LT 50c ). At the beginning of cold acclimation, when LT 50t−1 is much higher than LT 50c , RAHEH increases quickly.
where LT 50i = −0.6+0.142LT 50c represents LT50 for an unacclimated plant. RATED accounts for the dehardening contribution (Eq. 6), which is a function of crown temperature and is adjusted by a dehardening parameter (D param = 2.7×10 −5 ) and LT 50 for a plant that is not acclimated to cold (LT 50i ). Cold acclimation is a cumulative process and can reverse (dehardening) when plants are exposed to high temperature or restart (hardening) when temperature is below 10 • C. The high temperature threshold depends on the vernalization stage. Dehardening occurs when T crown ≥ 10 • for plants that are not fully vernalized (VF < 1), and when T crown ≥ −4 • for plants that are fully vernalized (VF = 1).
Stress due to respiration under snow also increases LT 50 and was calculated as RATER (Eq. 7), which is a function of snow depth and a respiration factor (RE). RE is a regression function fitted to respiration measurements (Sunde, 1996). f (snowdepth) ranges from 0 to 1 for snow depth up to 12.5 cm, and is equal to 1 when snow depth is greater than 12.5 cm.

RATES
where S param = 1.9. Long-term exposure to near-lethal temperature will also increase LT 50 and was calculated as RATES (Eq. 8), which is based on the winter survival model developed by Fowler et al. (1999).
The probability of survival (f surv , Eq. 9) is a function of LT 50 and crown temperature. The probability of survival reaches a median value when T crown equals LT 50 , and increases when T crown is warmer than LT50 and decreases when T crown colder than LT 50 .
Finally, we calculate winter killing degree days (WDD, Eq. 10) as a function of T crown and f surv . WDD not only accounts for the cumulative degree days when the crop was exposed to freezing temperatures but also accounts for the probability of death at the temperature of exposure. High WDD occurs with low temperature and low survival probability.
where T base = 0 • . Although Bergjord et al. (2008) and Vico et al. (2014) defined the frost tolerance and damage indicators described above, they did not propose a model for the growth response to crop damage from low temperatures. Here we developed our own hypothetical two-stage frost damage parameterization (Eqs. 11-12) that includes both instant damage and accumulated damage during the leaf emergence phase of winter wheat growth. In CLM, plant tissues are represented as Table 3. Carbon allocation algorithms for the leaf emergence to grain fill stage, and the grain fill to harvest stage.

Phase
Allocation algorithm Leaf emergence to grain fill a grain = 0 the mass of carbon and nitrogen per m 2 ground. We simulated leaf carbon and nitrogen reduction for each of the two types of frost damage. We assumed that instant damage occurs at the beginning of the growing season (VF < 0.9) when plants are not fully vernalized and have low survival probability when exposed to subzero temperatures. In this case, the growth of leaves most vulnerable to cold (e.g., new leaves or small seedlings) would slow or cease. After many sensitivity tests, we found the best fit to observations by removing an amount of leaf carbon (leafc damage_i = 5 g C m −2 ) to the soil carbon litter pool, scaled by a factor of 1-f surv (Eq. 11) at each time step (half-hourly). The leaf carbon was reduced whenever f surv was less than 1 until leaf carbon reached a minimum value (10 g C m −2 ).
for WDD > 0, f surv < 1, and leafc t > 10 (11) In addition to this instantaneous damage, we introduced an accumulated damage parameterization for when winter wheat is close to or has completed vernalization (VF > 0.9) in spring. We assumed that plants would not be likely to suffer as much from instantaneous frost damage as in the early winter season due to less subzero temperature, but that an extended period of subzero temperatures (large WDD) would lead to severe crop damage. To simulate this, we let WDD accumulate up to a set value (set to 1 • days), when it triggers the accumulated damage function and we track the average f surv for this time period. When WDD > 1 • days, all leaf carbon from the previous time step (leafc t−1 , representing the damage to the whole plant), scaled by a factor of (1-averaged f surv ), was removed from the leaf carbon to the soil carbon litter pool. After leaf carbon was reduced, WDD was reset to 0, and the accumulation and tracking of the averaged f surv was restarted. For both frost damage types, leaf nitrogen was removed to the nitrogen litter pool. The nitrogen was scaled to the reduction of leaf carbon by the fixed C : N ratio (25 for winter wheat). The results show that the simulation of LAI (Fig. S1 in the Supplement) can be improved by including a representation of frost damage in winter wheat models. However, the approach here is based on empirical indicators of frost damage. This suggests the potential for further improvement by incorporating process-level representation of frost damage in future model versions.
leafc t = leafc t−1 × averaged f surv , VF ≥ 0.9 and WDD > 1 CLM leaf (a leaf ) and stem (a livestem ) carbon allocation coefficients for winter wheat were also adjusted during the grain fill to harvest phase. The original a leaf and a livestem changed in time as a function of growing degree days. This approach resulted in a rapid decline in the stem carbon allocation, and led to a grain carbon allocation coefficient that was too large (Fig. S2), producing unrealistically high yields at the US-ARM site. We modified the leaf and stem carbon allocation coefficients to be functions of carbon allocation at the initial time of grain fill (a i,3 leaf and a i,3 livestem ), and therefore a livestem gradually declines and a grain gradually increases during the grain fill phase (Table 3, Fig. S2b).
After the above modification of carbon allocation and addition of the new vernalization and frost damage processes, we calibrated three parameter values (indicated with * in Table 4) in the US-ARM simulation. We adjusted minimum planting temperature and maximum days for growing to better match the US-ARM site planting and harvest dates, and adjusted the initial leaf carbon allocation coefficient to better match the US-ARM LAI and yield.

Experiment design
We set up paired CLM4.5 site simulations using Levis et al.'s (2012) original winter wheat model (CLMBASE) and our modified winter wheat model (CLMWHE) at the winter wheat sites in Table 1. We forced the site simulations with half-hourly observed temperature, relative humidity, precipitation, wind, and incoming solar radiation. Incoming longwave radiation was available at the US-ARM and US-CRT sites and was also input to the simulations at those sites. Each paired simulation ran with the same initial conditions, which were generated using a spin-up of several hundred years at each site (described below). The simulated differences between the original winter wheat and the modified winter wheat are therefore due to the modified parameters and updated processes described above.
Land-surface models, especially those including biogeochemical components, require long-term (thousands of simulation years) spin-up for their carbon and nitrogen pools to reach equilibrium (Shi et al., 2013). Therefore, generating initial conditions with steady-state carbon and nitrogen pools is computationally time-consuming and expensive if the simulation starts with no carbon and nitrogen. To accelerate the spin-up process, we generated site-level initial conditions by interpolating a global simulation that had reached carbon and nitrogen equilibrium, and then further spun up the site-level simulations for 200 years using recycled site observed meteorology for years listed in Table 1. When CLM reaches equilibrium, the averaged land-surface variables during each atmospheric forcing cycle should not change or vary within a threshold (Table S1 in the Supplement). We found latent heat flux, sensible heat flux, leaf area index, and wheat yield reached equilibrium fairly quickly (< 40 years), but the total ecosystem carbon, total soil organic carbon, and total vegetation carbon took a longer time to reach the equilibrium state.
We also set up a regional simulation (50 km resolution, 1979-2010) over the continental US to compare spatial patterns in yield predictions to the USDA NASS county-level winter wheat yield. To get the winter wheat land-cover percentage, we first estimated the winter wheat fraction using the USDA NASS county-level acres harvested data, and then split the wheat land-cover percentage in the default CLM surface file into winter wheat and spring wheat. Since the goal of the regional simulation was to validate the spatial yield and not the carbon pools, we ran a partial spin-up and allowed the crop yield to reach equilibrium while the total ecosystem (i.e., soil) carbon was not at equilibrium.
We applied the nitrogen fertilization in all the simulations. CLM4.5 considered the nitrogen limitation through the down-regulation of the potential photosynthesis based on the nitrogen demand and supply deficit, which was calculated by considering the complex belowground biogeochemical processes (e.g., nitrification, denitrification, leaching, soil organic matter decomposition). When nitrogen supply is less than the nitrogen demand, the potential photosynthesis will be reduced by the deficit factor. For the TXLU, KSMA, NESA, NDMA, and ABLE site simulations, we applied the observed nitrogen fertilization amount (10-20 gN m −2 ) at the same days as the observation, while for the other sites and the US simulations, we applied the default nitrogen fertilization during leaf emergence every year for an amount of 8 gN m −2 . With this nitrogen fertilization, there is no nitrogen limitation at all in our simulations.

Statistical analysis of yield at US-ARM site
To determine the factors that contributed most strongly to yield in observations and the model, we performed statistical regressions for US-ARM observations and CLMWHE outputs separately. We had 11 observed and simulated variables, including growing degree days, nitrogen fertilization, peak leaf area index, precipitation, days of grain fill, days of leaf emergence, day of peak leaf area index, 10 cm soil moisture, 20 cm soil moisture, planting date, and harvest date. We performed simple linear regressions with each of these variables and compared the R 2 values between observational data and simulation outputs.

Leaf area index and dry weight
The modified winter wheat model (CLMWHE) showed a better seasonal growth cycle than the original model (CLM-BASE) (Fig. 2). In the CLMBASE simulation, the vernalization factor is too high even at the beginning of the growing season (Fig. S3). Without the reduction on the growing degree days from the vernalization function, winter wheat LAI and leaf weight reached peaks in December and then declined due to the onset of the grain fill stage. The long grain fill stage (December-May) in CLMBASE did not produce a sufficiently high grain mass because of the low rate of photosynthesis with the low LAI. In the CLMWHE simulation, LAI and leaf weight reached peaks in April, and stem and grain weight reached peaks in May, which are similar to the site observations. The improvements in the seasonal variation are mainly due to the updated vernalization, which produced a reasonable vernalization period of about 2-3 months, reduced the growing degree days and extended the leaf emergence stage. The cold damage scheme also played a role in reasonable simulation of winter LAI and leaf weight. For example, at KSMA-1985, cold damage reduced the LAI and leaf weight in fall yielding a better match to the winter measurement (at DOY = 320). Besides these improvements, we also observed an overestimation of LAI during the later growing season, which is due to the low leaf senescence rate during the grain fill period.
The updated winter wheat model captured the grain weight temporal and spatial variations, and root mean square error (RMSE) and the index of agreement are better in CLMWHE www.geosci-model-dev.net/10/1873/2017/ Geosci. Model Dev., 10, 1873-1888, 2017

ABLE-1986
(a7)  Figure 2. The daily leaf area index (m 2 m −2 ), leaf dry weight (t ha −1 ), stem dry weight (t ha −1 ), and grain dry weight (t ha −1 ) simulations in CLMWHE (the updated winter wheat model) and CLMBASE (the original winter wheat model), and in site observations for 7 site years.
than CLMBASE for 7 site years. RMSE was reduced by 19 % and the index of agreement was increased by 45 %. CLMWHE showed higher grain weight in 1986 than 1985 at TXLU and NESA, as did the observations, because 1986 was a wetter year for both TXLU (8 % higher annual precipitation than 1985) and NESA (84 % higher). In 1986, CLMWHE showed more grain weight in NESA and NDMA than TXLU and ABLE, as in the observations. For the four flux tower sites, CLMWHE also improved LAI and crop growth seasonal variations (Fig. 3a-d). Both sites exhibited reduced RMSE compared to CLMBASE (Table S3) , and (q-t) sensible heat flux (W m −2 ) for observations, CLMWHE, and CLMBASE across four sites. The US-ARM site data were averaged over 6 winter wheat years (2003,2004,2006,2007,2009,2010), US-PON data were averaged over 1997 and 1998, US-CRT data are from 2013, and CAF-CT data are from 2014. The error bars indicate the standard error for the month across years, and there are no error bars for US-CRT and CAF-CT because the values are for 1 year. mated peak LAI but captured the seasonal LAI variation (peak in April and then decline). At the US-PON site, CLMWHE overestimated LAI throughout the growing season but showed similar seasonal variation. Although the US-CRT and CAF-CT sites have no LAI observations, CLMWHE generally increased LAI and had a more reasonable seasonal variation compared to CLMBASE.

Surface carbon, water and energy fluxes
The improved simulation of LAI seasonal variation led to better monthly patterns of net ecosystem exchange of CO 2 (NEE) (Fig. 3e-h). In Fig. 3, negative values indicate a carbon sink, where the crop gains more carbon through photosynthesis than is lost due to respiration. During the winter wheat growing season, the observed NEE is most negative coincident with peak LAI. CLMWHE captured these seasonal patterns at US-ARM and US-CRT sites, although it did underestimate the NEE magnitudes at their peak. The underestimation of peak LAI may have contributed to this bias. CLMBASE has much smaller NEE relative to CLMWHE, consistent with the lower LAI. We also observed a discrepancy after harvest, where CLMWHE (and CLMBASE, to a lesser extent) simulated a strong carbon source for the site, but observations exhibited either neutral NEE at US-ARM or a smaller NEE at US-CRT site. This discrepancy is due to the model treating the land cover as bare ground after harvest, when in reality weeds (identified by visual inspection of daily site photographs) quickly exert influence on surface fluxes of carbon. The annual net radiation (Rn) simulations (Fig. 3i-l) at the four sites were slightly improved in CLMWHE. Averaged across the four sites, Rn RMSE was reduced from 16.6 W m −2 in CLMBASE to 12.9 W m −2 in CLMWHE. The latent heat flux (LE) simulation was improved during March-May ( Fig. 3m-p). The spring LE RMSE was reduced by 10-70 % across the four sites in CLMWHE due to the better LAI simulation in spring. However, the annual LE RMSE was only slightly reduced (up to 23 % RMSE reduction in CLMWHE) at US-ARM, US-PON, and US-CRT, and showed no improvement at CAF-CT. The sensible heat flux (H ) showed no obvious improvement (Fig. 3q-t).
At the US-ARM and US-PON sites, the LE monthly variation patterns were improved by better representing the leaf area index, but this improvement was limited by surface energy partitioning problems in the model. The model partitioned more energy to LE than was observed during the period when LAI declines in the late growing season (May-July). The observed LE is 45 and 53 % of net radiation at the US-ARM and US-PON sites, while LE simulated in CLMWHE is 53 and 67 % of net radiation at the US-ARM and US-PON sites. This energy partitioning problem is reversed at the US-CRT and CAF-CT sites, where the model partitioned less energy to LE than observations. The observed LE is 68 and 66 % of net radiation at the US-CRT and CAF-CT sites, while simulated LE in CLMWHE is 52 and 30 % of net radiation at the US-CRT and CAF-CT sites. Both sites are rainfed with no irrigation applied. In addition, the month of peak LE does not coincide with the month of peak LAI in the observations at US-ARM and US-PON. In observations, LE reaches a peak at the same time when LAI is at its peak, but in CLMWHE, LE reaches a peak 1 month later than the LAI peak. The lack of energy balance closure for the eddy flux measurements could affect the energy fluxes' RMSE estimations, but will not change the major conclusions here: CLMWHE showed improved spring LE simulations than CLMBASE, and the simulated LE peak was 1 month later than LAI peaks. Finally, we note that the winter wheat model did not improve surface energy partitioning in summer after winter wheat harvest.
We found that the overestimation of LE in summer and fall can be reduced using a new soil evaporation scheme (Swenson and Lawrence, 2014) that will be available in CLM5.
In CLM, vegetation affects LE through leaf transpiration, and LE in vegetated grid cells has three components: soil evaporation, wet leaf evaporation, and dry leaf transpiration (Lawrence et al., 2007). The excessive spring soil evaporation in CLM has been reported in earlier versions of CLM (Lu and Kueppers, 2012;Stockli et al., 2008) and some effort has been made to reduce soil evaporation. For example, Sakaguchi and Zeng (2009) added a litter resistance to soil evaporation in CLM3.5 that reduced the annual averaged soil evaporation. Recent work by Swenson and Lawrence (2014) added a dry surface layer that increased the soil resistance and reduced soil evaporation. We tested the new dry surface layer scheme at the US-ARM site, and found that soil evaporation was reduced by 21 % and the LE simulation was improved in May-December (Fig. 4c). However, the spring LE was still underestimated and the LE peak was still 1 month later than the LAI peak, which is due to the leaf transpiration reaching its peak 1 month later than the LAI peak (Fig. 4c).

Yield
The accuracy of the simulated yield depended on whether the region has a similar climate to the site where the model was calibrated (Fig. 5). US-ARM had the smallest RMSE (0.80 t ha −1 ) due to calibration, and the US-PON site had only a slightly higher RMSE (1.11 t ha −1 ) than US-ARM because the two sites have similar climates (both are located in northern Oklahoma). The yield was overestimated by 0.59 and 1.00 t ha −1 for US-ARM and US-PON. However, at US-CRT and CAF-CT, which are far from US-ARM, the yield RMSE values were much higher (2.46 and 3.68 t ha −1 ) and yields were underestimated by 2.45 and 3.68 t ha −1 . In terms of the interannual variation in yield, CLMWHE accurately simulated the yield decline at the US-ARM site from 2003 to 2006 and captured the interannual variation from 2007 to 2010, but failed to simulate the lowest yield in 2007. We also note that CAF-CT is the only site where yield simulations with CLMWHE were worse than CLMBASE. Here the yield RMSE increased from 0.90 t ha −1 in CLMBASE to 3.86 t ha −1 in CLMWHE (discussed further below). CLMWHE (Fig. 6b) showed a better US yield estimation (RMSE reduced by 24 %) than CLMBASE (Fig. 6c) but still underestimated the US winter wheat yield by 35 % compared to USDA county-level non-irrigated winter wheat yield data averaged across 1979-2010 (Fig. 6a), which is largely due to the underestimation of the northwestern US winter wheat yield. In the simulation, winter wheat growth in the northwest was limited by soil water availability. Figure 7 shows that the plant wetness factor (btran, averaged across growing season) was < 0.5 in much of the region. In CLM, btran varies between 0 and 1 and represents the available soil water to the plant (1 means no water stress at all). The low btran in this region limited photosynthesis and reduced crop yield in the model. We applied irrigation to a single point in the northwest, and the yield increased from 1.98 to 5.42 t ha −1 with irrigation, which is consistent with yields in subregions of the northwest. For the southeastern US, CLMWHE simulated a similar yield to the Southern Great Plains, but the simulated yield was lower than USDA yield for the region, which may be due to model deficiencies in the representation of fertilization, lack of regional varieties, or other forms of crop management not well captured in the model.
We quantified frost damage impacts on LAI and yield in the US domain through CLMWHE simulations with and without the frost damage function. Frost damage resulted in lower LAI and yield, with spatial variation across the US (Fig. 8). For the domain average, frost damage reduced LAI by 27 % (or 1.69 m 2 m −2 ) and reduced yield by 28 % (or 0.5 t ha −1 ). The greatest reduction (> 45 %) in LAI occurred in Texas and the southeastern US, which was due to insufficient hardening, producing a high LT 50 and low survival rate. LAI in the cold northern US regions had less impact (< 15 %) from frost damage. The cold damage indirectly affects yield through reduced photosynthesis with lower LAI, but photosynthesis and yield changes were not always geographically consistent with the LAI damage. For example, the northern Great Plains and Midwest had greater percentage reductions (> 45 %) in yield than reductions in LAI (< 15 %).
A simple, single-variable, statistical yield regression indicated that variables important in predicting CLMWHE yield may be irrelevant for predicting observed yield. The simulated yields depend most on the growing degree days (R 2 = 0.94), which only explained 24 % of observed yield variation (Fig. 9). Although there are many other variables that contribute to variation in the CLMWHE yield, such as peak LAI, length of leaf emergence period, harvest date, and day of LAI peak, these variables have strong correlations with growing degree days, which suggests that crop yields in CLM depend too much on growing degree days. Soil moisture, especially the lower layer soil moisture at 20 cm, is the only variable that explained a large amount of yield variation in both observations (R 2 = 0.80) and CLMWHE (R 2 = 0.86). So improved representation of soil hydrology, especially the interannual variability of soil moisture, may improve the simulations of yield variation.

Discussion and conclusions
We improved the winter wheat model in CLM with new vernalization, frost tolerance, and frost damage processes. We modified the grain carbon allocation algorithm and performed a calibration on three key parameters (minimum planting temperature, maximum crop growth days, and initial value of leaf carbon allocation coefficient) at the US-ARM site, and then validated the model performance at multiple other sites in North America. These model alterations led to large improvements for crop phenology (indicated by LAI), net ecosystem exchange, and spring latent heat flux. Additionally, the modeled yield RMSE is comparable to literature values (Palosuo et al., 2011). However, there are several remaining limitations of the model that need to be resolved in a future version. CLM needs to better represent the land cover after harvest to include the influence of weeds and litter on the carbon balance. Although CLM properly simulated the seasonal evolution of NEE, the NEE RMSE at US-ARM and US-CRT (2-3 µmol m −2 s −1 ) is higher than the Lund-Potsdam-Jena managed Land model (LPJ-ml) simulation (Bondeau et al., 2007) at the US-PON site (1.09 µmol m −2 s −1 ), which is largely due to incorrect simulation of NEE after harvest. When win-ter wheat is not alive, CLM represents the land cover as bare ground so GPP is zero but heterotrophic respiration from litter and soil organic matter is still large, which resulted in a carbon source after harvest (positive NEE). This is not true for the US-ARM site, where we observed weed growth after harvest and positive NEE (Raz-Yaseef et al., 2015). This vegetation cover after harvest resulted in a near-zero NEE at US-ARM or negative NEE at US-CRT site. Appropriate simulation of the post-harvest land cover is critical for better representing the role of agriculture in global carbon fluxes.
CLM needs to further increase the influence of crops and vegetation on the surface energy balance and latent heat flux (LE) in particular. The LE simulation in CLM has a R 2 range from 0.62 to 0.97 across the four sites, which is better than other model simulations at the same sites. For example, Arora et al. (2003) simulated LE RMSE 22.0 W m −2 at US-PON from March to May in 1997 using their coupled landsurface and terrestrial ecosystem model (CLASS-Twoleaf model), and we simulated LE RMSE 10.55 W m −2 at the same site from March to May averaged for 1998-1999. But our LE response to the improved LAI was not as strong as we expected. Williams and Torn (2015) showed that vegetation has stronger controls on surface heat flux partitioning than soil moisture at the US-ARM site, where LAI explained 53 % of the variation in evaporative fraction (EF = LE/(LE+H)), while soil moisture only explained 11 % of EF variation. For our 6 winter wheat years, Williams and Torn (2015) used 8 years that included other cover types, and we found similar patterns in the US-ARM observations. LAI explained 40 % of EF variation, while soil moisture only explained 7 % (not shown). However, EF in CLMWHE and CLMBASE was not as well predicted by LAI, which only explained 5 and 1 %, respectively, of variation in EF. In CLM, vegetation affects LE through leaf transpiration, and LE in vegetated grid cells has three components: soil evaporation, wet leaf evaporation, and dry leaf transpiration (Lawrence et al., 2007). The wet leaf evaporation is the smallest and overall LE depends on the tradeoff between soil evaporation and leaf transpiration. Soil evaporation is dominant when LAI is small, and leaf transpiration is dominant when LAI is higher. Using the US-ARM Geosci. Model Dev., 10, 1873-1888, 2017 www.geosci-model-dev.net/10/1873/2017/ site as an example, in CLMBASE, the leaf transpiration is very small due to low LAI but soil evaporation is very large, which is opposite in CLMWHE ( Fig. 4a and b). Such a tradeoff is why the large increase in LAI in CLMWHE only increased overall LE a small amount compared to CLMBASE. We found that although the new soil evaporation parameterization (Swenson and Lawrence, 2014) in a later version of CLM reduced soil evaporation and improved the summer and fall LE simulation (Fig. 4), it also reduced spring soil evaporation (Fig. 4b) and induced an even lower spring LE. If we assume this reduction in soil evaporation is reasonable, then further improvement of the LE simulation needs to be focused on increasing the leaf transpiration and correcting the inconsistent peak time between leaf transpiration and LAI. CLMWHE tends to underestimate the winter wheat yield, but the yield RMSE is comparable to other literature values.
The averaged yield RMSE across the four sites is 1.96 t ha −1 , which was within the range of other winter wheat models' yield RMSE (1.41-2.15 t ha −1 ) reported by Palosuo et al. (2011), although the simulation sites and years are different. The low simulated yield may be due to the insufficient calibrations. Table 4 listed the key crop growth parameters used in CLMWHE. We calibrated these parameters at the US-ARM site, and applied the same values everywhere, which is a common approach in land-surface model development. However, the US-ARM site represents a relatively low yield compared to the US national average. This likely contributed to underestimated yields at sites or in regions with historically greater yields, such as at US-CRT and CAF-CT, and in the southeastern and northwestern US. The current modeling framework of CLM does not facilitate the substantial calibration required to more accurately capture the full range of observed winter wheat yields. As a gridded global crop model, gridded parameters (e.g., maximum maturity days, leaf emerge and grain fill threshold, and background litter fall factor) that allow for spatial variation in the key parameters should be considered in future versions of the model. Alternately, for parameters with spatial structure linked to environmental variation, parameters could vary with climate or soil conditions. We investigated the causes of the low yield in 2007 at the US-ARM site. The observational yield data in Fig. 4 are from the county-level USDA yield estimate, which is very similar (RMSE = 0.11 t ha −1 ) to the US-ARM site observed yield. Both the site observed yield and USDA county-level yield showed the lowest values in 2007 (1.35 t ha −1 ), so the low yield in 2007 is not specific to the field represented by the US-ARM site. The field notes indicate that only part of the wheat field was harvested in early July of 2007, while the remainder of the field was not harvested due to wheat sprouting in the head. Pre-harvest sprouting reduces the quality (and price) of the grain, and can occur when the crop is exposed to prolonged heavy rain. We examined the precipitation, temperature, and wind speed during May and June across the 8 years and found that in 2007 there was double the mean precipitation in June (108.2 % higher than the 8year June average). Such large amounts of precipitation may have caused the low observed yield. Assuming that the low yield was strongly linked to the high rainfall, the implication is that the winter wheat crop model needs to include more types of environmental damage to fully simulate interannual variation in yields.
Our new winter wheat model improved the LAI and yield simulation compared to the original winter wheat model except at the CAF-CT site due to (1) drier soil conditions during the grain fill phase and (2) the adjusted grain carbon allocation coefficient in CLMWHE. CLMWHE started the grain fill phase during the end of May, while CLMBASE started the grain fill phase in the beginning of May. In mid-May, the higher LAI in CLMWHE resulted in 30 % more LE than CLMBASE and dried the soil. The plant wetness factor dropped from 0.98 on 15 May to 0.19 on 28 May in CLMWHE, but remained greater than 0.89 through May in CLMBASE. The grain carbon allocation in CLMWHE is strongly limited by soil water available to the plant, so grain carbon was much smaller in CLMWHE than in CLMBASE. The larger LAI also increased LE at the other three sites relative to the baseline simulations, but did not result in longterm water stress due to sufficient precipitation during the rainy season. The CAF-CT site has 10 times less precipitation than the other three sites in May. The observed LE at CAF-CT site is much higher than the simulation given the same precipitation, suggesting the plant wetness factor in the model is too sensitive to low precipitation. Some of our modeling approaches need further improvements to the processes supported by new observations. We developed hypothetical (empirically based) frost damage functions that account for both small and frequent damage early in the growing season, and severe damage in winter and spring. Such a hypothetical approach is not uncommon in crop modeling when lacking observations at a process level. For example, CERES-Wheat (Ritchie and Otter, 1985) developed a hypothetical leaf senescence scheme during cold temperature that monitored a cold hardening index (http://nowlin.css.msu.edu/wheat_book/CHAPTER3. html). We tested the CERES-Wheat leaf senescence scheme in CLM and found it produced too much reduction in LAI. This finding motivated our approach based on recently developed frost tolerance indicators. The magnitude of the leaf carbon reductions and how such reductions are linked to frost damage requires more observations, such as high-frequency aboveground and belowground biomass measurements. Furthermore, the linear yield regressions showed that the yields in CLM depend too much on growing degree days, a sensitivity that is not reflected in observations. In CLM, growing degree days not only determine crop phenology, but are also involved in calculation of the carbon allocation coefficients (Table 3). Exploring other possible factors that control phenology and carbon allocation may improve crop simulation in CLM. Meanwhile, soil moisture, especially the deeper soil moisture, explains a large amount of the yield variation in both observations and the simulations. Fixing the current biases in soil hydrology and reducing interannual variability in the simulated soil moisture will benefit the yield simulation.
In summary, we found that our new winter wheat model in CLM better captured the monthly variation of leaf area index and improved the latent heat flux and net ecosystem exchange simulation in spring. Our model correctly simulated the interannual variation in yield at the US-ARM site, but the crop growth calibration at the US-ARM site introduced a low-yield bias that produced underestimates of the yield in high-yield sites (US-CRT and CAF-CT) and regions (northwestern and southeastern US). Our analysis indicates that while this model of winter wheat represents a substantial step forward in simulating the processes that influence winter wheat growth and yield, further refinements would be helpful to capture the impacts of environmental stress on energy partitioning, carbon fluxes, and yield, and would improve simulations of regional variation.
Code availability. The winter wheat code in CLM4.5 can be requested from Yaqiong Lu (yaqiong@ucar.edu). It will be available in the next released version of the Community Land Model (version 5) for public access.
The Supplement related to this article is available online at doi:10.5194/gmd-10-1873-2017-supplement.