As part of the terrestrial branch of the Japan-funded Arctic Climate Change Research Project (GRENE-TEA), which aims to clarify the role and function of the terrestrial Arctic in the climate system and assess the influence of its changes on a global scale, this model intercomparison project (GTMIP) is designed to (1) enhance communication and understanding between the modelling and field scientists and (2) assess the uncertainty and variations stemming from variability in model implementation/design and in model outputs using climatic and historical conditions in the Arctic terrestrial regions. This paper provides an overview of all GTMIP activity, and the experiment protocol of Stage 1, which is site simulations driven by statistically fitted data created using the GRENE-TEA site observations for the last 3 decades. The target metrics for the model evaluation cover key processes in both physics and biogeochemistry, including energy budgets, snow, permafrost, phenology, and carbon budgets. Exemplary results for distributions of four metrics (annual mean latent heat flux, annual maximum snow depth, gross primary production, and net ecosystem production) and for seasonal transitions are provided to give an outlook of the planned analysis that will delineate the inter-dependence among the key processes and provide clues for improving model performance.
The pan-Arctic ecosystem is characterized by low mean temperatures, snow cover, and seasonal frozen ground or permafrost with a large carbon reservoir, covered by various biomes (plant types) ranging from deciduous and evergreen forests to tundra. The Arctic climate and ecosystem differ from their tropical and temperate counterparts primarily because the Arctic is a frozen world. Moreover, the terrestrial Arctic varies from area to area according to the location, glacial history, and climatic conditions. However, sites, networks, and opportunities for direct observations are still sparse relative to the warmer regions owing to physical and logistical limitations. To investigate the impact of climate change in this region, a number of studies using both analyses of observed data and numerical modelling have been carried out (e.g. Zhang et al., 2005; Brown and Robinson, 2011; Brutel-Vuilmet et al., 2013; Koven et al., 2011, 2013; Slater and Lawrence, 2013). Various numerical modelling schemes have been developed to treat physical and biogeochemical processes on and below the land surface. Some of these schemes are site-specific or process-oriented, while others are implemented as components of atmosphere–ocean coupled global climate models (AOGCMs), or Earth system models (ESMs) to interact with the overlying atmosphere. Among the processes treated, snowpack, ground freezing/thawing, and carbon exchange are the most relevant and important in terrestrial process models (TPMs) for investigating the climate and ecosystem of the pan-Arctic region.
“Pirates of the Arctic” sit at the round table.
The GRENE-TEA model intercomparison project (GTMIP) was originally planned as part of the terrestrial research project of the GRENE Arctic Climate Change Research Project (GRENE-TEA) to achieve the following targets: (a) to pass on to the AOGCM research community possible improvements to the physical and biogeochemical processes for Arctic terrestrial modelling (excluding glaciers and ice sheets) in the existing AOGCM terrestrial schemes, and (b) to lay the foundations for the development of future-generation Arctic terrestrial models. The project, however, involves groups of researchers from different backgrounds/disciplines (e.g. physics/geophysics, glaciology, biogeochemistry, ecosystem, forestry) with a wide range of research methods (e.g. field observations, remote sensing, numerical modelling), target domains (e.g. northern Europe, Siberia, Alaska, northern Canada) and scales (from site-level to pan-Arctic). As is often the case, multidisciplinary opportunities were limited, initially creating a considerable challenge for the project (Fig. 1a). Communications between groups (e.g. modelling and field studies, physical and ecosystem disciplines, process-oriented and large-scale modelling), if any, were inconclusive and sporadic. Observational practices and procedures (e.g. variables to measure, equipment to use, standard zero depth for ground measurements) differed among groups and disciplines and lacked standardization. Although each individual group had the need and the intention to interact with other groups, the requisite collaboration could not be achieved. Opinions gathered in the early stages revealed a latent demand for collaboration on observational data for driving and/or validating models, the use of numerical models to test empirical hypotheses formed in the field, the interpretation of observed phenomena, and the optimization of observation network strategies.
As a result of this situation, the model intercomparison project was deliberately designed to promote communication and understanding between modelling and empirical scientists, and among modellers: the GTMIP protocols and data sets are set to function as a hub for the groups involved in the project (Fig. 1b). It also aimed to enhance the standardization of observation practices among the GRENE-TEA observation sites and to form a tight collaboration between the field and modelling communities, laying a cornerstone for creating the driving data set (details of the Stage 1 driving data and their creation as a product of collaboration between modellers and field scientists are documented by Sueyoshi et al., 2015).
Since the 1990s, a number of model intercomparison projects (MIPs) have been carried out, focusing on the performance of TPMs, AOGCMs, and ESMs; examples include PILPS (Project for Intercomparison of Land-Surface Parameterization Schemes; Henderson-Sellers et al., 1993), SnowMIP (Snow Models Intercomparison Project; Etchevers et al., 2004; Essery et al., 2009), Potsdam NPP MIP (Potsdam Net Primary Production Model Intercomparison Project; Cramer et al., 1999), C4MIP (Coupled Climate–Carbon Cycle Model Intercomparison Project; Friedlingstein et al., 2006), CMIP5 (Coupled Model Intercomparison Project Phase 5; Taylor et al., 2012), and MsTMIP (Multi-scale synthesis and Terrestrial Model Intercomparison Project; Huntzinger et al., 2013), to name a few.
For snow dynamics, SnowMIP2 showed a broad variety in the maximum snow accumulation values, particularly at warmer sites and in warmer winters, although the duration of snow cover was relatively well simulated (Essery et al., 2009). The same study also noted that the SnowMIP2 models tend to predict winter soil temperatures that are too low for cold sites and for sites with shallow snow, a discrepancy arguably caused by the remaining uncertainties in ecological and physical processes and the scarcity of winter process measurements for model development and testing in the boreal zone. The CMIP5 models simulated the snow cover extent for most of the Arctic region well, except for the southern realm of the seasonal snow cover area (Brutel-Vuilmet et al., 2013). The poor performance of some of the TPMs in this region is due to an incorrect timing of the snow onset and possibly to an incorrect representation of the annual maximum snow cover fraction (Brutel-Vuilmet et al., 2013). For ground freezing/thawing processes, Koven et al. (2013) showed the current status of the performance of AOGCMs for permafrost processes based on CMIP5 experiments. There was large disagreement among modelled soil temperatures, which may have been due to the representation of the thermal connection between the air and the land surface and, in particular, its mediation by snow in winter. Vertical profiles of the mean and amplitude of modelled soil temperatures showed large variations, some of which can be attributed to differences in the physical properties of the modelled soils and coupling between energy and water transfer. This appears to be particularly relevant for the representation of organic layers.
For the biogeochemical cycles, a number of studies based on MIPs have been
carried out. The broad global distribution of net primary productivity (NPP)
and the relationship of annual NPP to the major climatic variables coincide
in most areas with differences among the 17 global terrestrial
biogeochemical models that cannot be attributed to the fundamental modelling
strategies (Cramer et al., 1999). The ESMs in CMIP5 use the climate and
carbon cycle performance metrics, and they showed that the models correctly
reproduced the main climatic variables controlling the spatial and temporal
characteristics of the carbon cycle (Anav et al., 2013). However, several
weaknesses were found in the modelling of the land carbon cycle: for example,
the leaf area index is generally overestimated by models compared with
remote sensing data (Anav et al., 2013); NPP and terrestrial carbon storage
responses to CO
At scales from a continental level (including those mentioned above) to site level (model–observation comparisons; e.g. Zaehle et al., 2014), different MIPs have also been conducted, and they generally study physical or ecosystem processes separately. PILPS (Henderson-Sellers et al., 1993) and a series of snow MIPs (Etchevers et al., 2004; Essery et al., 2009) are well-known MIPs for physical processes, targeting hydrology and snow dynamics. Recently, a MIP for tundra sites has been conducted but its focus is limited to soil thermal dynamics (Ekici et al., 2015). In turn, ecosystem MIPs on continental scales have two predecessors: the North American Carbon Program site synthesis (Schwalm et al., 2010) and CarboEastAsia-MIP (Ichii et al., 2013). Although both MIPs applied multiple terrestrial biosphere models to different eddy-covariance measurement sites (Schwalm et al., 2010, with 22 models for 44 sites in North America; Ichii et al., 2013, with 8 models for 24 sites in Asia), boreal and Arctic sites were not the major targets. In other studies targeting specific eco-climatic regions, the Arctic was again not the main domain: Jung et al. (2007) assessed GPPs for Europe and Ichii et al. (2010) for Japan. Rawlins et al. (2015) assessed carbon budget differences among several GCM-compatible models in northern Eurasia, with little examination of the physical processes. In regions other than the Arctic, there have been cross-sectional evaluations of physical and ecosystem processes, such as Morales et al. (2005), evaluating carbon and water fluxes in Europe, and de Gonçalves et al. (2013), the LBA-Data Model Intercomparison Project (LBA-DMIP), analysing water and carbon fluxes in the Amazon.
The GTMIP consists of two stages (Fig. 2): one-dimensional, historical GRENE-TEA site evaluations for examining the model's behaviour and its uncertainty (Stage 1); and circumpolar evaluations using projected climate change data from GCM outputs (Stage 2). Hereafter, we describe the Stage 1 protocol. This stage aims to evaluate the physical and biogeochemical TPMs through 3-decade site simulations driven and validated by the GRENE-TEA site-derived data. It calls for broader participation in the activity from a wider community to assure robust assessments for model-derived uncertainty and to efficiently investigate the terrestrial system response to climate variability considering the diversity of the pan-Arctic sites. Thus, the scope and geographical domain of GTMIP Stage 1 is unique in its target of the Arctic region, including both taiga and tundra, and in its evaluations of the behaviour of the energy–snow–soil–vegetation subsystem, employing a wide range of models from physical land surface schemes to terrestrial ecosystems.
Schematic diagram for stages 1 and 2 of GTMIP.
In GTMIP, a variety of models ranging from specific models that focus on snowpack formation processes to highly complex DGVMs (dynamic global vegetation models) are expected to participate. The following five categories (from “a” to “e”) set the unit for the key processes to assess the performance of the existing TPMs in the pan-Arctic region, to evaluate the variations among the models and the mechanisms behind their strengths and weaknesses, and to obtain information and guidance to improve the next generation of TPMs. The five categories are (a) exchange of energy and water between atmosphere and land, (b) the snowpack, (c) phenology, (d) ground freezing/thawing and the active layer, and (e) the carbon budget. The categories cover the essential processes that make the pan-Arctic region unique compared with other regions: seasonal changes in both physical and biogeochemical processes and the associated strong climate feedback, which are characterized by liquid–ice phase changes, the subsequent ecosystem response, and their interactions.
The key process categories and target processes.
The scientific questions for Stage 1 are the following. How well do the TPMs reproduce target metrics (examples are shown in column B in Table 1) in terms of agreement with observations? How do the reproductions vary among the models? If the reproductions are good or poor in some models, which processes in the TPMs are responsible and why?
Location map of the GRENE-TEA sites.
The target period for Stage 1 was set from 1980 to 2013 to provide at least 30 years of data, the minimum requirement for climatological analyses. The period is also favourable in terms of the accuracy and coherence of the relevant large-scale climate data thanks to the fully fledged operation of various satellite observations (e.g. Dee et al., 2011). We are providing the following driving data for Stage 1: surface air temperature, precipitation, specific humidity, air pressure, wind speed, and incident short-wave and long-wave radiation.
For this stage (site simulations), forcing and validation data have been
prepared, taking maximum advantage of the observation data from GRENE-TEA
sites in operation (Fairbanks (FB) in Alaska; Tiksi (TK), Yakutsk (YK),
Chokurdakh (CH), and Tura (TR) in Russia; and Kevo (KV) in Finland, shown in
Fig. 3), to evaluate the inter-model and inter-site variations for
1980–2013. These sites, the latitudes of which vary from
62 to 71
Because of the severe conditions for maintaining monitoring sites in the Arctic
region, continuous observation data over years are scarce, which makes it
very difficult to create ready-to-drive data directly from observations
(e.g. owing to missing values, discontinuity of measurement periods,
outliers). To overcome this problem, we first constructed the backbone of the
continuous forcing data (called “level 0” or L0; Saito et al., 2014a) from
climate reanalysis products to avoid the issues of limited coverage and/or
missing data, or the lack of consistency inherent in observational data,
using the bias-corrected monthly Climate Research Unit (CRU) for the
temperature data set (Harris et al., 2014) and the Global Precipitation
Climatology Project (GPCP) for the precipitation data set (Adler et al., 2003)
at the respective nearest grid to the sites. The European Centre for
Medium-range Weather Forecasts ReAnalysis (ERA)-interim reanalysis data (Dee
et al., 2011) were chosen from four products (National Centers for
Environmental Prediction/National Center for Atmospheric Research
(NCEP/NCAR), NCEP-Department of Energy (DOE), Japanese Reanalysis
(JRA)-55, and ERA-interim) because they showed the smallest bias relative to
the monthly CRU and GPCP in terms of 2 m air temperature and precipitation in
the pan-Arctic region (north of 60
Assimilation of the observed data was then applied to reflect local characteristics and to derive the primary driving data, “level 1” data (L1; Saito et al., 2014b) and, in addition, the level 1 hybrid data (L1H) by replacing data with observed data when available. The L1 data set was provided for four sites (FB, KV, TK, and YK) owing to the availability of the observed data for validations. For the creation of the site-specific data, collaboration with the field scientists who are in charge of the observation sites and know the circumstances of the data obtained was critical. Further details on the creation of the L0 and L1 data sets, and their basic statistics, are described in Sueyoshi et al. (2015).
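The relationship between the L1 and L1H series can be pictured with a minimal sketch. It assumes, purely for illustration, that both series are aligned lists of equal length and that a missing observation is marked by `None`; the function name `make_hybrid` and the sample values are hypothetical and are not taken from Saito et al. (2014b) or Sueyoshi et al. (2015), where the actual procedure is documented.

```python
from typing import List, Optional, Sequence


def make_hybrid(l1: Sequence[float],
                observed: Sequence[Optional[float]]) -> List[float]:
    """Return an L1H-like series: the observed value where one exists, else L1.

    Assumption: `None` marks a missing observation (hypothetical convention).
    """
    if len(l1) != len(observed):
        raise ValueError("series must be aligned and of equal length")
    return [obs if obs is not None else fit for fit, obs in zip(l1, observed)]


# Hypothetical values: only the second time step has an observation.
hybrid = make_hybrid([1.0, 2.0, 3.0], [None, 2.5, None])
```

In this sketch the hybrid series equals L1 wherever the observation record has gaps, so the continuity of the forcing is preserved.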
The location, dominant vegetation type, soil, climate, fraction of
photosynthetically active radiation (fPAR), possible data for validation,
and references for observed data for
As the warming trend is becoming visible, in particular for northern
high-latitude regions (IPCC, 2013), the 20-year detrended meteorological
driving data set is provided for spin up, allowing biogeochemical models to
set up initial soil carbon conditions without the warming trends and/or ENSO
(El Niño–Southern Oscillation). This data set is based on the L1 data for
the period of 1980–1999 (Saito et al., 2015). The monthly values of the
fraction of
photosynthetically active radiation (fPAR) and leaf area index (LAI) data sets
at GRENE-TEA sites, created based on Moderate Resolution Imaging
Spectroradiometer (MODIS) satellite data (MOD15A2, MYD15A2), are also
provided where required (Saito et al., 2014c). These driving data sets are
provided in the ASCII fixed-length record files and are available through
the Arctic Data Archive System (ADS;
The site description, including locations, dominant vegetation types, soil, climate, fPAR, LAI, data for model validation, and references for observation data, is summarized in Table 2.
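For the detrended spin-up forcing mentioned earlier in this section, one common way to remove a warming trend from a series is to subtract its least-squares linear trend while keeping the series mean. The sketch below illustrates only that generic technique; it is an assumption, not the procedure used to build the provided data set (see Saito et al., 2015), and a linear fit by itself does not remove ENSO-related variability. The function name `detrend_linear` is hypothetical.

```python
from typing import List, Sequence


def detrend_linear(values: Sequence[float]) -> List[float]:
    """Subtract the least-squares linear trend, preserving the series mean.

    Assumption: values are equally spaced in time (e.g. one per year).
    """
    n = len(values)
    if n < 2:
        return list(values)
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    cov = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    # Removing slope * (t - t_mean) leaves the mean of the series unchanged.
    return [v - slope * (t - t_mean) for t, v in enumerate(values)]
```

Centring the time axis on its mean is a deliberate choice here: the detrended series keeps the climatological mean of the original period rather than the value at its first time step.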
As already proposed in existing MIP studies (e.g. Ichii et al., 2010), we set Stage 1 to consist of two further substages: 1A and 1B. Substage 1A, which aims to evaluate the inter-model variations in baseline performance at each site, requested the participants to use the parameters in the default settings for the provided boundary conditions, such as land cover type. In contrast, substage 1B allows tuning for the best reproduction of observations so that the parameter sensitivity among the sites can be evaluated. Substage 1B is particularly important for the pan-Arctic region because many monitoring sites are located in temperate regions and models are generally validated against these environmental conditions.
We set the initial condition date to 1 September 1979, so that simulations started with a no-snow condition. The initial data for the model boundary conditions are available, as most stations can provide observation data for soil temperature and soil moisture profiles. However, each model could use its own method for initialization.
The spin-up process may also differ between models. However, we recommend
continuing spin up until a steady state is achieved for the main variables
(see Sect. 2.5). For example, Takata (2002) defined a threshold of a steady
state in a slowly varying system as
For biogeochemical cycle models, in particular, we recommend maintaining spin
up over at least 2000 years using the detrended meteorological driving data
(also provided through ADS) because soil accumulation is quite slow owing to
the low soil temperature and pre-industrial atmospheric CO
We request participants to submit those variables listed in Table S1 (refer to the Supplement) in ASCII format with CSV-type files. The template file for output submission has been provided through ADS.
The variables for submission are categorized into six groups: (0) model driving, (1) energy and water budget, (2) snow dynamics, (3) vegetation, (4) subsurface hydrological and thermal states, and (5) carbon budget, in parallel to the analysis categories. Since the spectrum of the participating models is expected to be very large (ranging from physical to biogeochemical to ecosystem models; Fig. 4), we made an extensive list of output variables to cover the expected range. However, the actual output variables a model submits will be dependent on the model's specification. Considering this spread, the priority for each variable, classed at three levels, was set according to the necessity and availability for evaluation of the model performance. In addition, participants are requested to provide information on the status of the variables in their model (i.e. model driving, prescribed parameter, prognostic, diagnostic, or not applicable), through the provided questionnaire (Supplement, Table S3; provided through ADS), to identify the characteristics of the model.
The habitat of models participating in the GTMIP. The vertical and horizontal axes show the ratio of the incorporation of biogeochemical processes and physical processes, respectively.
The list of metrics for model performance evaluation for
Although the temporal resolution of a variable should depend on the model, we
request submission of the variables with the minimum temporal resolution
available for the model. For the models that provide daily outputs, the time
for each day should be defined by the local time (FB: UTC
Example comparison of model outputs with observations, and the inter-model range for the annual mean latent heat flux for averages from 1980 to 2013. The results of biogeochemical and physical models are shown by boxes and lines in orange and blue, respectively. The biogeochemical models included are BEAMS, Biome-BGC, CHANGE, SEIB-DGVM, and VISIT, while the physical models are 2LM, JULES, MATSIRO, and PB-SDM. The orange and blue horizontal lines indicate medians. The bottom and top of the boxes correspond to the 25th and 75th percentiles of the average values, for 1980–2013 (except BEAMS, which is for 2001–2011), of model outputs. The bottom and top of the lines show the minimum and maximum outputs from the participating models, respectively. The dots show the observed average values for 2011, 2012, and 2013 at FB and for 1998, 2001, 2003, 2004, 2007, and 2008 at YK.
As for Fig. 5, except the plot displays annual maximum snow depth. The physical models include 2LM, JULES, MATSIRO, PB-SDM, SMAP, and SNOWPACK (for FB and KVTK only). The observation shows the average values for 1980–2012, 1996–2013, 1980–2008, and 1980–2008 at FB, KV, TK, and YK, respectively.
Participation in GTMIP Stage 1 is voluntary and open to any interested modellers or institutions. A total of 16 TPMs have announced their participation in GTMIP Stage 1. These models are the permafrost model (FROST), physical snow models (SMAP and SNOWPACK), land surface models (2LM, HAL, JULES, several versions of MATSIRO, and SPAC-multilayer), a physical and biogeochemical soil dynamics model (PB-SDM), terrestrial biogeochemical models (BEAMS, Biome-BGC, STEM1, and VISIT), dynamic global vegetation models (LPJ and SEIB-DGVM, coupled with a land surface model (Noah-LSM) or stand-alone), and a coupled hydrological and biogeochemical model (CHANGE). The models with higher degrees of complexity in their treatment of physical processes are 2LM, CHANGE, FROST, HAL, JULES, MATSIRO, PB-SDM, SNOWPACK, SMAP, and SPAC-multilayer. The models with higher degrees of complexity in their treatment of biogeochemical processes are BEAMS, Biome-BGC, CHANGE, LPJ, SEIB-DGVM, STEM1, and VISIT. The models enabled to couple with AOGCMs (currently, JULES, HAL, LPJ, MATSIRO, and SMAP) make up about 30 % of the participating models.
As for Fig. 5, except the plot displays annual gross primary production. The relevant biogeochemical models include BEAMS, Biome-BGC, CHANGE, LPJ, SEIB-DGVM, STEM1, and VISIT. The observation shows the average values for 2011–2013 and 2004–2012 at FB and YK, respectively.
As for Fig. 5, except the plot displays annual net ecosystem production.
Example of seasonal transitions in ground temperature, snow, and vegetation among models.
To illustrate the variability of the participating models with respect to the implemented physical and biogeochemical processes, we created a diagram showing the habitat of the currently participating models (Fig. 4) by incorporating the model survey results referred to in the previous section. The spread of the models is large for both physical and biogeochemical process dimensions, which will benefit both the evaluation of the models and the attribution of differences in their ability to reproduce observations.
This section presents the analysis plan for GTMIP Stage 1 and sample outputs
based on already submitted materials. To answer the key questions for the
target processes proposed in Sect. 2.1, we plan to analyse the model output
by describing the model–model and model–observation differences, discerning
the cause of these differences, and investigating parameter sensitivity. The
outputs of multiple models will be compared in terms of the metrics shown in
Table 3. These metrics are divided into five categories (i.e. energy and
water budget, snowpack, phenology, subsurface hydrological and thermal
states, and carbon budget). For terrestrial climate simulations on the
decadal scale, the most important outputs are the latent heat flux (energy
and water budget) and the net ecosystem exchange (carbon budget). The latent
heat flux (evapotranspiration) is the essential driver of precipitation
inland at high latitudes owing to high rates of recycling (e.g. Dirmeyer et
al., 2009; Saito et al., 2006). Net ecosystem exchange (NEE) plays a
fundamental role in determining global CO
Analyses will be organized and conducted in the following manner. Topical analyses, constituting major subsets of the project outcomes, will evaluate characteristics of model performances and their inter-site variations within each of the above five categories, while cross-sectional analyses between categories will explore the functionality and strength of interactions between processes. These analyses will be utilized for mining crucial processes to improve the site-level TPMs as well as large-scale GCM/ESM components.
First, the focus will be on model output variability for both the inter-annual and the inter-decadal timescales, based on the output time series over more than 30 years. Inter-site differences will also be evaluated for the four GRENE-TEA sites in the Arctic region, each of which has distinct characteristics. The vegetation type for three of the four sites is forest (two evergreen conifer: FB and KV; one deciduous conifer: YK) and the remaining site is tundra (TK). Three sites (FB, TK, and YK) are in the permafrost region, while KV is underlain by seasonally frozen ground. Figures 5–8 show statistical summary comparisons of the model outputs by site (the land cover and soil type parameters used for the simulations are shown in Table 2), expressing inter-model variations for physical and biogeochemical models using box plots for four variables of the metrics mentioned above: the annual mean latent heat flux (Qle_total_an), the annual maximum snow depth (SnowDepth_max), the annual gross primary production (GPP_an), and the annual net ecosystem production (NEP_an), respectively. When observed values were available (i.e. latent heat flux for FB for 2011–2013 and YK for 1998, 2001, 2003, 2004, 2007, and 2008), they are shown by black dots.
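The box-plot statistics used in these summary comparisons (minimum, 25th percentile, median, 75th percentile, and maximum across models) can be computed as in the following minimal sketch. The function name `intermodel_summary` and the input values are hypothetical, and the percentile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the convention used for the figures is not stated here.

```python
import statistics
from typing import Dict, Sequence


def intermodel_summary(model_means: Sequence[float]) -> Dict[str, float]:
    """Summarize one metric at one site across models (box-plot statistics).

    Each input value is assumed to be one model's multi-year average.
    """
    if len(model_means) < 2:
        raise ValueError("need at least two model values")
    q25, median, q75 = statistics.quantiles(
        model_means, n=4, method="inclusive"
    )
    return {
        "min": min(model_means),
        "q25": q25,
        "median": median,
        "q75": q75,
        "max": max(model_means),
    }


# Hypothetical multi-year averages from five models at one site.
summary = intermodel_summary([5.0, 1.0, 3.0, 2.0, 4.0])
```

Computing the summary separately for the physical and the biogeochemical model groups would give the two boxes per site described in the figure captions.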
Second, the cause or attributes of the differences among models, or between models and observations, will be explored by employing statistical evaluations such as multivariate analyses and time series analyses on the metrics and individual eco-climate variables. This will improve understanding of the interrelation between the incorporated processes in each model. Figure 9 shows an exemplary comparison of a seasonal transition in the snow–permafrost–vegetation subsystem, expressed similarly by box plots. The figure summarizes the average dates for (from bottom to top) the completion of snowmelt, the thawing of the top soil layer, the start and end of greening, the freezing of the top soil layer, and the start of seasonal snow accumulation. A comparison of the timings of these events over years and sites will illustrate the individual model's characteristic behaviour in seasonal transitions, and their strength regarding process interactions, in combination with ordinary multivariate analysis techniques.
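A transition date of the kind compared above, such as the thawing of the top soil layer, has to be extracted from a daily model time series by some operational rule. The sketch below shows one possible rule, under loudly stated assumptions: the event date is the first day that begins a run of at least `persist_days` consecutive values above `threshold`. The function name, the 0 °C default threshold, and the 5-day persistence criterion are all hypothetical and are not defined in the GTMIP protocol.

```python
from typing import Optional, Sequence


def first_sustained_day(series: Sequence[float],
                        threshold: float = 0.0,
                        persist_days: int = 5) -> Optional[int]:
    """Return the 1-based day that starts the first run of `persist_days`
    consecutive values above `threshold`, or None if no such run exists."""
    if persist_days < 1:
        raise ValueError("persist_days must be at least 1")
    run = 0
    for i, value in enumerate(series):
        run = run + 1 if value > threshold else 0
        if run == persist_days:
            # i is the 0-based index of the last day of the run.
            return i - persist_days + 2
    return None


# Hypothetical daily top-layer soil temperatures (deg C).
temps = [-2.0, -1.0, 1.0, -1.0, 1.0, 2.0, 3.0, 1.0, 2.0]
thaw_day = first_sustained_day(temps, threshold=0.0, persist_days=3)
```

A persistence criterion is used so that a single warm day followed by refreezing (day 3 in the sample series) is not counted as the onset of thaw; applying the same rule to every model keeps the derived dates comparable.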
Finally, sensitivity tests for the model parameters are planned to quantify the effect of parameter sensitivity on the model's reproducibility.
This paper presented an overview of the GTMIP activity and the experiment protocol for the Stage 1 intercomparison, with site simulations using the GRENE-TEA site observation data in the pan-Arctic region for the previous 3 decades. We described the framework of our project including targets, and provided data sets, conditions on model integration, lists of model output variables, and the habitat of currently participating models. We also included analysis plans and exemplary results to give an outlook of the model–model and model–observation comparisons with respect to the major metrics defined for the energy budget, snowpack dynamics, and the carbon budget. This model intercomparison project was realized through a tight collaboration between the GRENE-TEA-participating modelling and field scientists. Additionally, we expect to offer insightful demonstrations of various cold-region terrestrial physical and biogeochemical TPMs and valuable information for future improvements of the relevant models. All meteorological driving data for this project have already been made publicly available through ADS. The model outputs and comprehensive results from the GTMIP, which we hope will provide a useful benchmark data set for the community, will also be available to the public at the end of the project.
This study is supported by the GRENE Arctic Climate Change Research Project, Ministry of Education, Culture, Sports, Science and Technology, Japan. Edited by: D. Roche