The HadGEM2-ES implementation of CMIP5 centennial simulations

The scientific understanding of the Earth’s climate system, including the central question of how the climate system is likely to respond to human-induced perturbations, is comprehensively captured in GCMs and Earth System Models (ESMs). Diagnosing the simulated climate response, and comparing responses across different models, is crucially dependent on transparent assumptions of how the GCM/ESM has been driven, especially because the implementation can involve subjective decisions and may differ between modelling groups performing the same experiment. This paper outlines the climate forcings and setup of the Met Office Hadley Centre ESM, HadGEM2-ES, for the CMIP5 set of centennial experiments. We document the prescribed greenhouse gas concentrations, aerosol precursors, stratospheric and tropospheric ozone assumptions, as well as implementation of land-use change and natural forcings for the HadGEM2-ES historical and future experiments following the Representative Concentration Pathways. In addition, we provide details of how HadGEM2-ES ensemble members were initialised from the control run and how the palaeoclimate and AMIP experiments, as well as the “emission-driven” RCP experiments, were performed.

Correspondence to: C. D. Jones (chris.d.jones@metoffice.gov.uk)

Published by Copernicus Publications on behalf of the European Geosciences Union.


Introduction
Phase 5 of the Coupled Model Intercomparison Project (CMIP5) is a standard experimental protocol for studying the output of coupled ocean-atmosphere general circulation models (GCMs). It provides a community-based infrastructure in support of climate model diagnosis, validation, intercomparison, documentation and data access. The purpose of these experiments is to address outstanding scientific questions that arose as part of the IPCC Fourth Assessment report (AR4) process, improve understanding of climate, and to provide estimates of future climate change that will be useful to those considering its possible consequences and the effect of mitigation strategies.
CMIP5 began in 2009 and is meant to provide a framework for coordinated climate change experiments over a five-year period. It includes simulations for assessment in the IPCC Fifth Assessment Report (AR5) as well as others that extend beyond the AR5. The IPCC's AR5 is scheduled to be published in September 2013. CMIP5 promotes a standard set of model simulations in order to:
- evaluate how realistic the models are in simulating the recent past,
- provide projections of future climate change on two time scales, near term (out to about 2035) and long term (out to 2100 and beyond), and
- understand some of the factors responsible for differences in model projections, including quantifying some key feedbacks such as those involving clouds and the carbon cycle.
A much more detailed description can be found on the CMIP5 project webpages (see URL 1 in Appendix A) and in Taylor et al. (2009).
There are a number of new types of experiments proposed for CMIP5 in comparison with previous incarnations. As in previous intercomparison exercises, the main focus and effort rests on the longer time-scale ("centennial") experiments, now including emission-driven runs of models that include a coupled carbon cycle (ESMs). These centennial experiments are being performed at the Met Office Hadley Centre with the HadGEM2-ES Earth System model (Martin et al., 2011), a configuration of the Met Office's Unified Model. Figure 1 outlines the main experiments and groups them into categories. The inner circle denotes "core" priority experiments with tier 1 (middle circle) and tier 2 (outer circle) having successively lower priority. Experiments are split between climate projections (blue), idealised experiments aimed at elucidating process understanding in the models (yellow), model evaluation, including pre-industrial control runs and historical experiments (red), and additional experiments for models with a coupled carbon cycle (green).
In the following, we briefly describe the HadGEM2-ES ESM, which is documented in detail in Collins et al. (2011). HadGEM2-ES is a coupled AOGCM with atmospheric resolution of N96 (1.875° × 1.25°) with 38 vertical levels and an ocean resolution of 1° (increasing to 1/3° at the equator) and 40 vertical levels. HadGEM2-ES also represents interactive land and ocean carbon cycles and dynamic vegetation, with an option either to prescribe atmospheric CO2 concentrations or to prescribe anthropogenic CO2 emissions and simulate CO2 concentrations, as described in Sect. 2. An interactive tropospheric chemistry scheme is also included, which simulates the evolution of atmospheric composition and interactions with atmospheric aerosols. The model timestep is 30 min (atmosphere and land) and 1 h (ocean). Extensive diagnostic output is being made available to the CMIP5 multi-model archive. Output is available either at certain prescribed frequencies or as time-average values over certain periods, as detailed in the CMIP5 output guidelines (see URL 2 in Appendix A).
The CMIP5 simulations include 4 future scenarios referred to as "Representative Concentration Pathways" or RCPs (Moss et al., 2010). These future scenarios have been generated by four integrated assessment models (IAMs) and selected from over 300 published scenarios of future greenhouse gas emissions resulting from socio-economic and energy-system modelling. These RCPs are labelled according to the approximate global radiative forcing level in 2100 for RCP8.5 (Riahi et al., 2007), during stabilisation after 2150 for RCP4.5 (Clarke et al., 2007; Smith and Wigley, 2006) and RCP6 (Fujino et al., 2006), or at the point of maximal forcing levels in the case of RCP3-PD (van Vuuren et al., 2006, 2007), with PD standing for "Peak and Decline". The latter scenario has previously been known as RCP2.6, as radiative forcing levels decline towards 2.6 W m⁻² by 2100. Note that these radiative forcing levels are illustrative only, because greenhouse gas concentrations, aerosol and tropospheric ozone precursors are prescribed, resulting in a wide spread in radiative forcings across different models. The experimental protocol involves performing a historical simulation (defined for HadGEM2-ES as 1860 to 2005) using the historical record of climate forcing factors such as greenhouse gases, aerosols and natural forcings such as solar and volcanic changes. The model state at 2005 is then used as the initial condition for the 4 future RCP simulations. Further extension of the RCP simulations to 2300 is also implemented as detailed in the RCP White Paper (see URL 3 in Appendix A) and Meinshausen et al. (2011).
Fig. 1. [...] Taylor et al. (2009) with each experiment being represented by an area that is proportional to the experiment's length in model years. The inner circle denotes "core" priority experiments with tier 1 (middle circle) and tier 2 (outer circle) having successively lower priority. Experiments are split between climate projections (blue), idealised experiments aimed at elucidating process understanding in the models (yellow), model evaluation, including pre-industrial control runs and historical experiments (red), and additional experiments for models with a coupled carbon cycle (green). See CMIP5 project webpage (Appendix A) for more detailed information. (D&A: detection and attribution, ECP: extended RCP simulations to 2300).

Many of these experiments require technical implementation by means of either or both of the following:
- code changes to alter the scientific behaviour of the model, such as to decouple various feedbacks and interactions (e.g. the "uncoupled" carbon cycle experiments).
This paper presents in detail the technical aspects of how these model forcings are implemented in HadGEM2-ES. It is not our intention here to present scientific results from the experiments. This analysis will be left for subsequent work.
The CMIP5 experiments performed with HadGEM2-ES are listed in Table 1 along with the relevant forcings for each experiment. How these forcings are implemented is detailed in the following sections. Section 2 describes the atmospheric CO2 concentrations for the concentration-driven runs as well as the CO2 emission assumptions for the emission-driven experiments. Section 3 details the boundary conditions of atmospheric concentrations of the other well-mixed greenhouse gases. Tropospheric and stratospheric ozone assumptions are detailed in Sect. 4. Section 5 details the treatment of aerosols, while Sect. 6 documents the treatment of land-use pattern changes. Natural forcings, both solar and volcanic, are described in Sect. 7. In addition to these historical, centennial 21st century and longer-term experiments, we describe the setup for the palaeoclimatic runs in Sect. 8. The more general issue of how the ensemble members are branched off the control run is described in Sect. 9, and Sect. 10 concludes. A list of URLs for websites holding relevant data is included as an Appendix.

CO2 concentration

Table 1. The table lists the climate forcings required to be changed from the control run (Experiment 3.1) in order to set up and perform each CMIP5 experiment. The presence of a cross denotes that that forcing is changed, and is documented in the section listed in the column title. An absence of a cross does not mean that forcing is missing, but that it is kept the same as in the control run.
a [...] 6.8 which will be documented elsewhere.
b "well mixed GHG" here covers CH4, N2O, and halocarbons, but not CO2, which is treated in Sect. 2. "Ozone" covers tropospheric and stratospheric and includes emissions of pre-cursor gases which affect tropospheric ozone (see Sect. 4).
c "natural" forcing covers both solar and volcanic changes (Sect. 7).
d "geophysical" here is taken to include changes in: prescribed ice sheet extent (incl. height), land-sea mask, ocean bathymetry.
e the control run does have "forcing" in that we prescribe several things to be constant. The relevant sections describe how each climate forcing is set up for the control run. The Table then lists aspects which differ from the control (either by being time varying in scenarios, or by being held constant at different values such as in the 4 × CO2 simulation).
f "(E)" denotes that an initial-condition ensemble is required for these experiments. Section 9 describes how the initial conditions are derived.
g land-use in the AMIP and last millennium experiments is described in their respective Sects. (9.2, 8.3) rather than the land-use Sect. 6.
h palaeoclimate orbital forcing is described in Sect. 8 rather than under solar forcing in Sect. 7.1.
i The "ESM control" simulation actually has an absence of forcing as CO2 is simulated in this experiment, not prescribed.
j "D&A" stands for detection and attribution.

For simulations requiring prescribed atmospheric CO2 concentrations, a single global 3-D constant provided as an annual mean mass mixing ratio was used, linearly interpolated in the model at each timestep. This prescribed CO2 concentration is then passed to the model's radiation scheme, and constitutes a boundary condition for the terrestrial and ocean carbon cycle. The oceanic partial pressure of CO2, pCO2, is always simulated prognostically from this, i.e. it is not itself prescribed.
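As a minimal sketch of the time interpolation of annual-mean concentrations described in this section, the following Python function interpolates annual means linearly to an arbitrary model time. It assumes, purely for illustration, that each annual mean is valid at mid-year and that model time is expressed as a decimal year; the text does not state which reference point HadGEM2-ES uses. The function name and the sample values are hypothetical and are not taken from the CMIP5 dataset.

```python
from bisect import bisect_right


def interpolate_annual(annual_means, decimal_year):
    """Linearly interpolate annual-mean values to a given model time.

    annual_means: dict mapping calendar year -> annual-mean value.
    decimal_year: model time as a decimal year (e.g. 1860.75).
    Assumption (not from the paper): each annual mean is valid at mid-year.
    """
    years = sorted(annual_means)
    mid_points = [y + 0.5 for y in years]
    # Outside the covered range, hold the end values rather than extrapolate.
    if decimal_year <= mid_points[0]:
        return annual_means[years[0]]
    if decimal_year >= mid_points[-1]:
        return annual_means[years[-1]]
    i = bisect_right(mid_points, decimal_year) - 1
    t0, t1 = mid_points[i], mid_points[i + 1]
    v0, v1 = annual_means[years[i]], annual_means[years[i + 1]]
    weight = (decimal_year - t0) / (t1 - t0)
    return v0 + weight * (v1 - v0)


# Illustrative values only, not the CMIP5 concentrations.
co2_ppm = {1860: 286.0, 1861: 287.0, 1862: 289.0}
# 1 January 1861 lies halfway between the 1860 and 1861 mid-year points.
print(interpolate_annual(co2_ppm, 1861.0))
```

The same routine would apply unchanged to the other prescribed annual-mean series, since only the input values differ.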
The CO2 concentrations used were taken from the CMIP5 dataset (see URL 4 in Appendix A). The historical part of the concentrations (1860-2005) is derived from a combination of the Law Dome ice core (Etheridge et al., 1996), NOAA global mean data (see URL 5 in Appendix A) and measurements from Mauna Loa (Keeling et al., 2009). After 2005, CO2 concentrations recommended for CMIP5 were calculated for the 21st century from harmonized CO2 emissions of the four IAMs that underlie the four RCPs. Beyond 2100, these concentrations were extended, so that the CO2 concentrations under the highest RCP, RCP8.5, stabilize just below 2000 ppm by 2250. Both the medium RCPs smoothly stabilize around 2150, with RCP4.5 stabilizing close to the 2100 value of the former SRES B1 scenario (∼540 ppm). The lower RCP, RCP3-PD, illustrates a world with net negative emissions after 2070 and sees declining CO2 concentrations after 2050, with a decline of 0.5 ppm yr⁻¹ around 2100 (see Fig. 2). These CO2 concentrations are prescribed in HadGEM2-ES's historical, AMIP and RCP simulations and the carbon-cycle uncoupled experiments. The detection and attribution experiments with time varying CO2 also use these values, but the detection and attribution experiments with fixed CO2 levels use a constant, pre-industrial value of 286.3 ppm. This CMIP5 dataset also provides the CO2 concentration used for the pre-industrial control simulation (taken here to be 1860 AD), which is 286.3 ppm. CO2 for the palaeoclimate simulations is described in Sect. 8. More details on the CMIP5 CO2 concentrations and how they were derived are provided in Meinshausen et al. (2011).
Fig. 2. [...] and the four RCPs as well as their extensions beyond 2100 (RCP8.5: red; RCP6: yellow; RCP4.5: green; RCP3-PD: blue). See Meinshausen et al. (2011) for further details. Note that the x-axis beyond 2100 is compressed.

Aside from these centennial simulations, idealized experiments are performed with HadGEM2-ES for CMIP5 in order to estimate, inter alia, transient and equilibrium climate sensitivity and the climate-carbon cycle feedback. For the idealised annual 1 % increase in CO2 concentration, we start from the control-run level of 286.3 ppm in 1859 and increase to 4 × CO2 (1144 ppm) after 140 yr (Experiment 6.1). Equivalently, our instantaneous quadrupling to 4 × CO2 uses a concentration of 1144 ppm in order to allow diagnosis of short-term forcing adjustments and equilibrium climate sensitivities (Experiment 6.3).
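The length of the idealised 1 % run follows from compound growth. The short Python sketch below computes how long a 1 % per year compounded increase takes to reach a given multiple of the starting concentration. It assumes annual compounding; the function names are illustrative and do not come from the model code.

```python
import math


def years_to_reach(multiple, annual_rate=0.01):
    """Years of compound growth needed to reach `multiple` times the start value."""
    return math.log(multiple) / math.log(1.0 + annual_rate)


def concentration_after(start_ppm, years, annual_rate=0.01):
    """Concentration after `years` of compound growth at `annual_rate` per year."""
    return start_ppm * (1.0 + annual_rate) ** years


# Quadrupling takes a little over 139 years, hence a 140-yr experiment.
print(f"Years to 4x: {years_to_reach(4.0):.1f}")
# Doubling is reached close to year 70.
print(f"Years to 2x: {years_to_reach(2.0):.1f}")
```

Starting from any initial value, the quadrupled level is therefore reached within the 140 yr quoted for Experiment 6.1.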

Decoupled carbon cycle experiments
Using additional code modifications to the appropriate modules of the HadGEM2-ES model, it is possible to decouple different carbon-cycle feedbacks. For the decoupled carbon cycle experiments (5.4, 5.5) we decoupled the climate and carbon cycle in 2 different ways. The C4MIP intercomparison exercise (Friedlingstein et al., 2006) defined an "UNCOUPLED" methodology in which only the carbon cycle component responded to changes in atmospheric CO2 levels. Gregory et al. (2009) additionally describe the counterpart experiment where only the model's radiation scheme responds to changes in CO2. Gregory et al. (2009) recommend performing both experiments (as the results may not combine linearly to give the fully coupled behaviour) and labelling such experiments in terms of what is, rather than what is not, coupled. Hence we performed two sets of experiments. In the biogeochemically coupled ("BGC") experiments (5.4), the model's biogeochemistry is coupled (i.e. the biogeochemistry modules respond to the changing atmospheric CO2 concentration) and the radiation scheme is uncoupled (it uses the pre-industrial level of CO2, which is held constant). In the radiatively coupled ("RAD") experiments (5.5), the model's radiation scheme is allowed to respond to changes in atmospheric CO2 levels, but the biogeochemistry components (land vegetation and ocean chemistry and ecosystem) use a constant CO2 level, again set to the pre-industrial value. Both decoupled experiments can be achieved with single simulations in which time-varying or time-fixed values of CO2 are used as input data to the respective sections of model code. We performed both BGC and RAD experiments for the idealised (1 %) and transient, multi-forcing (historical/RCP4.5) scenarios.
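The distinction between the two decoupled configurations can be summarised as a choice of which CO2 value each part of the model receives. The Python sketch below is only a schematic of that logic; the mode labels follow the text, but the function and dictionary keys are hypothetical and do not reflect the actual HadGEM2-ES code structure.

```python
PREINDUSTRIAL_CO2_PPM = 286.3  # control-run value quoted in the text


def co2_seen_by_components(mode, transient_co2_ppm):
    """Return the CO2 value (ppm) passed to radiation and biogeochemistry.

    mode: "FULL" (fully coupled), "BGC" (biogeochemically coupled)
          or "RAD" (radiatively coupled).
    """
    if mode == "FULL":
        radiation, biogeochemistry = transient_co2_ppm, transient_co2_ppm
    elif mode == "BGC":
        # Biogeochemistry responds; radiation sees fixed pre-industrial CO2.
        radiation, biogeochemistry = PREINDUSTRIAL_CO2_PPM, transient_co2_ppm
    elif mode == "RAD":
        # Radiation responds; biogeochemistry sees fixed pre-industrial CO2.
        radiation, biogeochemistry = transient_co2_ppm, PREINDUSTRIAL_CO2_PPM
    else:
        raise ValueError(f"unknown coupling mode: {mode!r}")
    return {"radiation": radiation, "biogeochemistry": biogeochemistry}


for mode in ("FULL", "BGC", "RAD"):
    print(mode, co2_seen_by_components(mode, 400.0))
```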

Emissions data
In addition to running with prescribed atmospheric CO2 concentrations, HadGEM2-ES can be configured to run with a fully interactive carbon cycle. Here, atmospheric CO2 is treated as a 3-D prognostic tracer, transported by atmospheric circulation, and free to evolve in response to prescribed surface emissions and simulated natural fluxes to and from the oceans and land. This approach is required for the "Emission-driven" simulations (5.1-5.3) shown in green in Fig. 1, and it also allows additional model evaluation by comparison with flask and station measuring sites such as at Mauna Loa (e.g. Law et al., 2006; Cadule et al., 2010).
A 2-D timeseries of total anthropogenic emissions was constructed by summing contributions from fossil fuel use and land-use change. For the historical simulation, annual mean emissions from fossil fuel burning, cement manufacture, and gas-flaring were provided on a 1° × 1° grid from 1850 to 1949 (Boden et al., 2010), with monthly means from 1950 to 2005 (Andres et al., 2011). For the RCP8.5 simulation, the harmonized fossil fuel emissions for 2005 to 2100 were used (as available in the RCP database, see URL 6 in Appendix A). The land use change (LUC) emissions are based on the regional totals of Houghton (2008), which were provided as annual means for the period 1850-2005. Within each of the ten regions the emissions were linearly weighted by population density on a 1° × 1° grid (for more information, see URL 7 in Appendix A). These population data were also used by Klein Goldewijk (2001) and are linearly interpolated between the years 1850, 1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980, and 1990. After the year 1990 population density is assumed to stay constant. Additionally, population density was capped at 20 persons per km² to avoid large emissions in urban centres. The weighting with population data inhibits land use change emissions in deserts and high northern latitudes, which improves the latitudinal distribution of the emissions. However, the method is insufficient to provide realistic local land use change emissions (e.g. in tropical forests).
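The population weighting of regional land-use emissions can be illustrated with a short Python sketch. It distributes one regional total over the grid cells of that region in proportion to capped population density. Multiplying the capped density by cell area to form the weight is an assumption made here for illustration; the text only states that the weighting is linear in population density. All names and sample numbers are hypothetical.

```python
POP_CAP = 20.0  # persons per km^2, the cap quoted in the text


def distribute_regional_emissions(regional_total, pop_density, cell_area_km2):
    """Spread one region's emission total over its grid cells.

    regional_total: regional emission total (e.g. GtC yr^-1).
    pop_density: population density of each cell (persons km^-2).
    cell_area_km2: area of each cell (km^2).
    Returns per-cell emissions that sum to regional_total.
    Assumption: a cell's weight is its capped density times its area.
    """
    weights = [min(d, POP_CAP) * a for d, a in zip(pop_density, cell_area_km2)]
    total_weight = sum(weights)
    if total_weight == 0.0:
        raise ValueError("region has no populated cells to carry emissions")
    return [regional_total * w / total_weight for w in weights]


# Three equal-area cells: unpopulated, rural, and urban (capped at 20).
cells = distribute_regional_emissions(3.0, [0.0, 10.0, 500.0], [100.0, 100.0, 100.0])
print(cells)
```

In this example the urban cell receives only twice the rural cell's share, rather than fifty times, which is the effect the cap is intended to have.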
The gridded (1° × 1°) fossil fuel and land-use emissions data, originally provided as a flux per gridbox, were converted to flux per unit area, then regridded as annual means onto the HadGEM2-ES model grid. A small scaling adjustment was made after regridding to ensure the global totals matched those of the 1° × 1° data exactly. Future emissions were not provided with spatial information, so we scaled the 2005 geographical pattern for fossil-fuel and land-use emissions to give the correct global total into the future. The CO2 emissions are updated daily in the model by linear interpolation between the annual values (or monthly, from 1950-2005). HadGEM2-ES has the functionality to interactively simulate land-use emissions of CO2 directly from a prescribed scenario of land-use change and simulated vegetation cover and biomass (see Sect. 6). However, the model has not been fully evaluated in this respect, so for CMIP5 experiments we disable this feature and choose rather to prescribe reconstructed land-use emissions from Houghton (2008). By simulating changes in carbon storage due to imposed land use change, but imposing land-use CO2 emissions to the atmosphere from an external dataset, we introduce some degree of inconsistency in this simulation. Work is required to evaluate and improve the simulation of land-use emissions so that they can be used interactively in such simulations in the future.
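The post-regridding adjustment amounts to multiplying the regridded field by a single factor so that its area integral equals the total of the source data. The Python sketch below shows that step only; the regridding itself is not reproduced, and the function names and sample numbers are hypothetical.

```python
def global_total(flux_per_area, cell_area):
    """Area integral of a flux field (flux per unit area times cell area)."""
    return sum(f * a for f, a in zip(flux_per_area, cell_area))


def rescale_to_total(flux_per_area, cell_area, target_total):
    """Scale a regridded flux field so its area integral equals target_total."""
    current = global_total(flux_per_area, cell_area)
    if current == 0.0:
        raise ValueError("cannot rescale a field with zero global total")
    factor = target_total / current
    return [f * factor for f in flux_per_area]


# Hypothetical regridded field whose total (8.0) is slightly below the
# total of the source data (8.4) because of interpolation error.
model_flux = [1.0, 2.0]
model_area = [2.0, 3.0]
adjusted = rescale_to_total(model_flux, model_area, target_total=8.4)
print(global_total(adjusted, model_area))
```

The same scaling, applied to the 2005 pattern with a future global total as the target, gives the pattern-scaled future emissions described in the text.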
The uncertainty in annual land-use emissions of ±0.5 GtC (cf. Le Quéré et al., 2009) is relatively large compared to the total land use emissions (an estimated 1.467 GtC in 2005, Houghton, 2008). The RCP scenarios have been harmonised towards the average LUC emission value of all four original IAM emission estimates, i.e., 1.196 GtC in 2005. This is substantially lower than the value calculated by Houghton (2008) of 1.467 GtC in the same year, although still within the uncertainty. The climate-carbon cycle modelling community preferred to use the original Houghton (2008) estimates for historical emissions. A smooth transition between the historical and the RCP simulations was ensured by scaling the last five years of the historical LUC emissions to factor in a linearly-increasing contribution from the harmonised RCP values. In 2001 the two values were combined in the ratio 80 %:20 % (Houghton: RCP), followed by 60 %:40 % in 2002, and so on until 0 %:100 % (i.e. the RCP value) in 2005, as shown in Fig. 3.
By rescaling the Houghton (2008) data between 2000 and 2005 to merge smoothly with the RCP value in 2005, we lower total emissions in this period by 0.94 GtC compared to the original Houghton estimates (Table 2). In the presence of fossil emissions of more than 40 GtC in this period this difference is small. Total emissions and the relative contribution of fossil fuel and LUC are shown in Fig. 4.
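The transition between the Houghton (2008) and harmonised RCP land-use emissions is a linear blend in which the RCP weight rises by 20 % per year from 2000 to 2005. A minimal Python sketch follows; the function name is illustrative, and the per-year input values in the example are placeholders rather than the actual data.

```python
def blend_luc_emissions(year, houghton, rcp):
    """Blend Houghton (2008) and harmonised RCP land-use emissions.

    The RCP weight rises linearly from 0 in 2000 to 1 in 2005, giving
    80:20 (Houghton:RCP) in 2001, 60:40 in 2002, and so on.
    """
    if year <= 2000:
        return houghton
    if year >= 2005:
        return rcp
    rcp_weight = (year - 2000) / 5.0
    return (1.0 - rcp_weight) * houghton + rcp_weight * rcp


# Placeholder inputs chosen so the weights are easy to read off.
for year in range(2000, 2006):
    print(year, blend_luc_emissions(year, houghton=1.0, rcp=2.0))

# With the 2005 values quoted in the text, the blend returns the RCP value.
print(blend_luc_emissions(2005, houghton=1.467, rcp=1.196))
```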

Carbon conservation
In the emissions-driven experiments, conservation of carbon in the Earth system is required. The concentration of atmospheric CO2 influences the carbon exchange with the oceans and terrestrial biosphere. Any drift in atmospheric CO2 will modify these fluxes accordingly, and thereby impact the land and ocean carbon stores as well as the climate itself. While the transport of atmospheric tracers in HadGEM2-ES is designed to be conservative, the conservation is not perfect and in centennial scale simulations this non-conservation becomes significant. This has been addressed by employing an explicit "mass fixer" which calculates a global scaling of CO2 to ensure that the change in the global mean mass mixing ratio of CO2 in the atmosphere matches the total flux of CO2 into or out of it each timestep (Corbin and Law, 2011). Figure 5 demonstrates HadGEM2-ES's ability to conserve atmospheric CO2, following implementation of the mass fixer scheme described here. The evolution of the atmospheric CO2 burden calculated by the model matches almost exactly the accumulation over time of the CO2 flux to the atmosphere. The lower panel of Fig. 5 shows the difference between the two. This residual difference is most likely explained by changes in the total mass of the atmosphere in HadGEM2-ES over time, since CO2 mass mixing ratio is conserved rather than CO2 mass. CO2 is chemically inert in HadGEM2-ES, so all of the changes in its concentration are driven by surface emissions or sinks.
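The idea of a global scaling "mass fixer" can be sketched in a few lines of Python. This is a schematic under stated assumptions, not the scheme of Corbin and Law (2011) or the HadGEM2-ES code: it assumes the global mean is an air-mass-weighted mean of the mass mixing ratio, that the total air mass is fixed over the step, and that a single multiplicative factor is applied everywhere. All names are hypothetical.

```python
def global_mean(mmr, air_mass):
    """Air-mass-weighted global mean mass mixing ratio (kg/kg)."""
    return sum(q * m for q, m in zip(mmr, air_mass)) / sum(air_mass)


def apply_mass_fixer(mmr_after_step, air_mass, mean_before, net_flux_kg_s, dt_s):
    """Rescale CO2 so its global mean changes only by the net surface flux.

    mmr_after_step: CO2 mass mixing ratio per cell after transport (kg/kg).
    air_mass: air mass per cell (kg).
    mean_before: global mean mixing ratio at the start of the timestep.
    net_flux_kg_s: net CO2 flux into the atmosphere (kg s^-1).
    dt_s: timestep length (s).
    """
    target_mean = mean_before + net_flux_kg_s * dt_s / sum(air_mass)
    scale = target_mean / global_mean(mmr_after_step, air_mass)
    return [q * scale for q in mmr_after_step]


# Toy two-cell atmosphere: transport has spuriously raised the mean
# from 4.0e-4 to 4.1e-4 although the net surface flux is zero.
air = [1.0, 1.0]
fixed = apply_mass_fixer([4.0e-4, 4.2e-4], air, mean_before=4.0e-4,
                         net_flux_kg_s=0.0, dt_s=1800.0)
print(global_mean(fixed, air))
```

Because the quantity being fixed is the mean mixing ratio rather than the CO2 mass, a drift in total air mass would still appear as a small residual, consistent with the explanation given for Fig. 5.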

Non-CO2 well mixed greenhouse gases
Specification of the following non-CO2 well-mixed greenhouse gases is required in HadGEM2-ES: CH4, N2O and halocarbons. For the control run, historical and RCP simulations they are implemented as described below and shown in Fig. 2. The CO2 emissions-driven experiment and the historical/RCP decoupled carbon cycle experiments also use these time varying values, as do the AMIP runs and the detection and attribution experiments which require time variation of GHGs. GHG concentrations during the palaeo-climate simulations are described in Sect. 8.

CH4 concentration
Atmospheric methane concentrations were prescribed as global mean mass mixing ratios. For experiments with time-variable CH4 concentrations (historical and RCP experiments), these were linearly interpolated from the annual concentrations for every time step of the model. These interpolated CH4 concentrations were then passed to the tropospheric chemistry scheme in HadGEM2-ES (United Kingdom Chemistry and Aerosols: UKCA, O'Connor et al., 2011). Within UKCA, the surface CH4 concentration was forced to follow the prescribed scenario and surface CH4 emissions were decoupled. CH4 concentrations above the surface were calculated interactively, and the full 3-D CH4 field was then passed from UKCA to the HadGEM2-ES radiation scheme.
As CH4 concentrations were only prescribed at the surface, CH4 in HadGEM2-ES above the surface is free to evolve in a non-uniform structure and may differ from prescribed, well-mixed historical or RCP CH4 concentrations. The impact of passing a full 3-D CH4 field from UKCA to the radiation scheme, rather than passing a uniform concentration everywhere, was evaluated in a present-day atmosphere-only configuration of the HadGEM1 model (Martin et al., 2006). The full 3-D CH4 field led to the extratropical stratosphere being cooler by 0.5-1.0 K, thereby reducing the warm temperature biases in the model (O'Connor et al., 2009).
The CH4 concentrations used were taken from the recommended CMIP5 dataset. For the historical period (1860 to 2005), these were assembled from Law Dome ice core measurements reported by Etheridge et al. (1998) and prepared for the NASA GISS model (see URL 8 in Appendix A). Beyond 1984, concentrations were provided by E. Dlugokencky and from the NOAA/ESRL global monitoring network (see URL 9 in Appendix A). For more details, see Meinshausen et al. (2011). Figure 2b shows CH4 concentrations over the historical period (1860 to 2005) and for the RCP scenarios up to 2300. This dataset also provided the CH4 concentration used in the pre-industrial control simulation (taken here to be 1860 AD), which was 805.25 ppb.

Table 2. Definition of the manner in which the Houghton (2008) land use emissions data (H08) and RCP data were combined in years 2000 to 2005. The last two columns show how the cumulative emissions from the original H08 data compare with those of the rescaled data, the latter being 0.94 GtC lower over the period considered.

N2O concentration
Atmospheric N2O concentrations were prescribed as a time series of annual global mean mass mixing ratios in the centennial CMIP5 simulations. The annual concentrations were linearly interpolated onto the time steps of HadGEM2-ES and passed to the model's radiation scheme. Figure 2c shows the N2O concentrations over the historical period and for the 4 RCPs from 2005 to 2300. This dataset also provided the N2O concentration used in the pre-industrial control simulation (taken here to be 1860 AD), which was 276.4 ppb.

Atmospheric halocarbon concentration
Atmospheric concentrations of halocarbons were prescribed as a time series of annual global mean concentrations in the centennial multi-forcing CMIP5 simulations and interpolated linearly to the model's time steps. The future concentrations of halocarbons controlled under the Montreal Protocol are primarily based on the emissions underlying the WMO A1 scenario (Daniel et al., 2007). They were calculated with the simplified climate model MAGICC, taking into account changes in atmospheric lifetimes due to changes in tropospheric OH-related sinks and stratospheric sinks due to an enhancement of the Brewer-Dobson circulation.
The CMIP5 dataset provided concentrations of 27 halocarbon species, more than GCMs generally represent separately (for example, HadGEM2-ES explicitly represents the radiative forcing of 6 of these species). The data are therefore also supplied aggregated into concentrations of "equivalent CFC-12" and "equivalent HFC-134a", representing all gases controlled under the Montreal and Kyoto protocols, respectively. These equivalent concentrations were used in HadGEM2-ES (Fig. 2d). Halocarbon concentrations were set to zero for the pre-industrial control run.
The CMIP5 "equivalent" concentrations of CFC-12 and HFC-134a were derived by simply summing the radiative forcing of individual species and assuming linearity of the relationship between the concentration and radiative forcing for a single species and additivity of multiple species. To quantify the difference between using equivalent CFC-12 and HFC-134a and the full set of possible species, a set of five test simulations was completed:
1. Control: halocarbons assumed zero, CO2 at 1 × CO2 (286.3 ppm).
Upward and downward fluxes of longwave radiation were saved on all vertical levels in the atmosphere after the first model timestep (so that the meteorology is identical). It should be noted that species not available in HadGEM2-ES are combined into either HFC-134a or CFC-12 according to their classification. Species are combined into equivalent HFC-134a and CFC-12 by summing their radiative forcing consistently with the CMIP5 methodology. Figure 6 shows excellent agreement between the "equivalent" gases and the more detailed representation from experiments 3, 4, 5 above. Zonal mean differences are within 1 mW m⁻² everywhere, showing that the use of 2 CFC equivalent species in CMIP5 is justified.
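Under the linearity and additivity assumptions stated above, an equivalent concentration is the concentration of the reference species that would produce the same summed forcing. The Python sketch below illustrates the calculation. The radiative efficiencies and concentrations are illustrative round numbers supplied here for the example, not values from the CMIP5 dataset, and the function name is hypothetical.

```python
def equivalent_concentration(concentrations, efficiencies, reference_species):
    """Concentration of the reference species giving the same total forcing.

    concentrations: dict species -> concentration (any consistent unit).
    efficiencies: dict species -> radiative efficiency (forcing per unit
        concentration, same unit basis for all species).
    Assumes forcing is linear in concentration and additive across species.
    """
    total_forcing = sum(concentrations[s] * efficiencies[s] for s in concentrations)
    return total_forcing / efficiencies[reference_species]


# Illustrative inputs only (concentrations in ppt, efficiencies in
# W m^-2 per ppb; the unit of the efficiencies cancels in the ratio).
conc = {"CFC-12": 500.0, "CFC-11": 250.0}
eff = {"CFC-12": 0.32, "CFC-11": 0.25}
print(equivalent_concentration(conc, eff, "CFC-12"))
```

With only the reference species present, the function returns its concentration unchanged, which is a quick consistency check on the definition.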

Tropospheric ozone pre-cursor emissions and concentrations
Tropospheric ozone (O3) is a significant greenhouse gas due to its absorption in the infrared, visible, and ultraviolet spectral regions (Lacis et al., 1990). It has increased substantially since pre-industrial times, particularly in the northern mid-latitudes (e.g. Staehelin et al., 2001), which has been linked by various studies to increasing emissions of tropospheric O3 pre-cursors: nitrogen oxides (NOx = NO + NO2), carbon monoxide (CO), methane (CH4), and non-methane volatile organic compounds (NMVOCs; e.g. Wang and Jacob, 1998).

Although transport and chemistry were calculated up to the model lid, boundary conditions were applied within UKCA. In the case of O3, it was overwritten in those model levels which were 3 levels (approximately 3-4 km) above the diagnosed tropopause (Hoerling et al., 1993) using the stratospheric O3 concentration dataset described in Sect. 4.2. It is this combined O3 field which is then passed to the model's radiation scheme. Furthermore, oxidation of sulphur dioxide and dimethyl sulphide (DMS) into sulphate aerosol (described in Sect. 5) involves hydroxyl (OH), hydroperoxyl (HO2), hydrogen peroxide (H2O2), and O3, whose concentrations are provided to the model's sulphur cycle from UKCA.
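The combination of simulated and prescribed O3 in a single model column can be sketched as follows. This Python example assumes that levels are indexed upward from the surface and that "3 levels above the diagnosed tropopause" means every level from the tropopause level plus three upward is overwritten; the text does not spell out this detail, and the function name and numbers are hypothetical.

```python
def merge_ozone_column(ukca_o3, prescribed_o3, tropopause_level, offset=3):
    """Combine simulated and prescribed ozone in one model column.

    ukca_o3: simulated O3 per level, index 0 at the surface.
    prescribed_o3: stratospheric dataset O3 on the same levels.
    tropopause_level: index of the diagnosed tropopause level.
    offset: number of levels above the tropopause at which overwriting starts.
    """
    if len(ukca_o3) != len(prescribed_o3):
        raise ValueError("columns must have the same number of levels")
    first_overwritten = tropopause_level + offset
    return [
        prescribed if level >= first_overwritten else simulated
        for level, (simulated, prescribed) in enumerate(zip(ukca_o3, prescribed_o3))
    ]


# Six-level toy column with the tropopause diagnosed at level 1.
combined = merge_ozone_column([1, 2, 3, 4, 5, 6], [10, 20, 30, 40, 50, 60],
                              tropopause_level=1)
print(combined)  # levels 4 and 5 come from the prescribed dataset
```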
No prescribed tropospheric ozone abundance data were used within HadGEM2-ES. Instead, the tropospheric evolution of ozone was simulated using surface and aircraft emissions of tropospheric ozone precursors and reactive gases. It is these emissions, rather than tropospheric ozone concentrations, which are held constant in the pre-industrial control simulation. For the palaeoclimate simulations, the same pre-industrial emissions are also used, as described in Sect. 8. For the historical and future simulations (including the emissions driven and decoupled carbon cycle experiments, and AMIP runs) a time-varying data set of emissions is used. As the time evolution of tropospheric ozone is simulated rather than prescribed, it may diverge from historical or RCP supplied tropospheric ozone.

The emissions data used by HadGEM2-ES have been supplied for CMIP5 by Lamarque et al. (2010) and by the IAMs for the 4 RCPs. Speciated surface emissions were provided for the following sectors: land-based anthropogenic sources (agriculture, agricultural waste burning, energy production and distribution, industry, residential and commercial combustion, solvent production and use, land-based transportation, and waste treatment and disposal), biomass burning (forest fires and grass fires), and shipping. They were valid for the specific year provided with a time resolution of 10 years in the case of anthropogenic and shipping emissions, but as decadal means for biomass burning. This was considered appropriate for biomass burning emissions due to their substantial inter-annual variability both globally and regionally (Lamarque et al., 2010). All surface emissions were provided as monthly means on a 0.5° × 0.5° grid. Aircraft emissions were provided as monthly means on a 0.5° × 0.5° horizontal grid and on 25 levels in the vertical, extending from the surface up to 15 km.
For the UKCA tropospheric chemistry scheme used in HadGEM2-ES, surface emissions for the following species were considered: C2H6, C3H8, CH4, CO, HCHO, Me2CO, MeCHO, and NOx. For the CMIP5 simulations, the spatially uniform surface CH4 concentration is prescribed (as described in Sect. 3.1), and hence the surface CH4 emissions are essentially redundant in this case. For each species the provided emissions were re-gridded onto the model's N96 grid (1.875° × 1.25°). A small adjustment was made after regridding to ensure the global totals matched those of the original data.
For emissions of C2H6, it was decided to combine all C2 species (C2H6, ethene (C2H4), and ethyne (C2H2)) and treat them as emissions of C2H6. These were each converted to kg(C2H6) m−2 s−1, added together, and then re-gridded. For C3H8, the C3 species (propane and propene (C3H6)) were similarly combined and treated as emissions of C3H8.
For CO, emissions from land-based anthropogenic sources, biomass burning, and shipping were taken for the historical period from Lamarque et al. (2010). These were added together and re-gridded on to an intermediate 1° × 1° grid in terms of kg(CO) m−2 s−1. Oceanic CO emissions were also added (45 Tg(CO) yr−1); their spatial and temporal distribution was provided by the Global Emissions Inventory Activity (GEIA; see URL 10 in Appendix A), based on distributions of oceanic VOC emissions from Guenther et al. (1995). In the absence of an isoprene (C5H8) oxidation mechanism in the UKCA tropospheric chemistry scheme used in HadGEM2-ES, an additional 354 Tg(CO) yr−1 was added, based on a global mean CO yield of 30 % from C5H8 from a study by Pfister et al. (2008) and a global C5H8 emission source of 506 TgC yr−1 (Guenther et al., 1995). This source is distributed spatially and temporally using C5H8 emissions from Guenther et al. (1995) and added to the other monthly mean emissions on the 1° × 1° grid before re-gridding. For HCHO emissions, the monthly mean land-based anthropogenic sources were combined with monthly mean biomass burning emissions from Lamarque et al. (2010) for the historical period and re-gridded. Similar processing was applied to the future emissions supplied by the IAMs for the 4 RCPs.
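The additional CO source from isoprene can be checked with a short calculation. The sketch below assumes that the 30 % yield is expressed on a per-carbon basis (30 % of the isoprene carbon ends up as CO), which is one reading that reproduces the quoted figure; the function name and the molar masses used are not taken from the paper.

```python
M_CO = 28.010  # molar mass of CO (g mol-1)
M_C = 12.011   # molar mass of carbon (g mol-1)


def co_from_isoprene(isoprene_tgc_per_yr, co_yield):
    """Return the CO source (Tg(CO) yr-1) implied by an isoprene source.

    Assumption: co_yield is the fraction of isoprene carbon converted
    to CO, so the result is that carbon mass re-expressed as CO mass.
    """
    carbon_as_co = isoprene_tgc_per_yr * co_yield  # TgC yr-1
    return carbon_as_co * M_CO / M_C               # Tg(CO) yr-1


extra_co = co_from_isoprene(506.0, 0.30)
print(round(extra_co))  # about 354 Tg(CO) yr-1
```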
For MeCHO, the monthly mean NMVOC biomass burning emissions from Lamarque et al. (2010) for the historical period were used. Using different emission factors from Andreae and Merlet (2001) for grass fires, tropical forest fires, and extra-tropical forest fires, emissions of NMVOCs were converted into emissions of MeCHO (i.e. kg(MeCHO) m−2 s−1). Surface emissions of Me2CO were taken from land-based anthropogenic sources and biomass burning from Lamarque et al. (2010, 2011). These were added together and re-gridded on to an intermediate 1° × 1° grid in terms of kg(Me2CO) m−2 s−1. Then, the dominant source of Me2CO from vegetation was added, based on a global distribution from Guenther et al. (1995) and scaled to give a global annual total of 40.0 Tg(Me2CO) yr−1. The total monthly mean emissions were then re-gridded on to the model's N96 grid. For future emissions, the processing was identical.
Finally, for NOx surface emissions, contributions from land-based anthropogenic sources, biomass burning, and shipping from Lamarque et al. (2010) were added together and re-gridded on to an intermediate 1° × 1° grid in terms of kg(NO) m−2 s−1. Added to these was a contribution from natural soil emissions, using a global and monthly distribution provided by GEIA on a 1° × 1° grid (see URL 10 in Appendix A), which is based on the global empirical model of soil-biogenic emissions from Yienger and Levy II (1995). These were scaled to contribute an additional 12 Tg(NO) yr−1. A similar approach was adopted when processing the future emissions. All emissions provided were processed as above for the years supplied, and a linear interpolation was applied between years to produce emissions for every year. Figure 7 shows the time evolution of tropospheric O3 precursor surface emissions over the 1850-2100 time period. After 2100, tropospheric ozone precursor emissions were kept constant.
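The interpolation between supplied years, and the constant extension after the final year, can be sketched as follows. The function name and the example values are hypothetical; holding the first value constant before the first supplied year is an assumption made only so that the function is defined for any input year.

```python
def emission_for_year(year, supplied_years, supplied_values):
    """Linearly interpolate emissions between the years supplied.

    After the last supplied year the final value is held constant,
    as described for the period after 2100. Clamping before the first
    supplied year is an assumption of this sketch.
    """
    if year <= supplied_years[0]:
        return supplied_values[0]
    if year >= supplied_years[-1]:
        return supplied_values[-1]
    for i in range(len(supplied_years) - 1):
        y0, y1 = supplied_years[i], supplied_years[i + 1]
        if y0 <= year < y1:
            w = (year - y0) / (y1 - y0)  # weight of the later year
            return (1.0 - w) * supplied_values[i] + w * supplied_values[i + 1]


# Hypothetical decadal values (arbitrary units).
years = [2080, 2090, 2100]
values = [30.0, 10.0, 20.0]
mid_decade = emission_for_year(2095, years, values)
after_2100 = emission_for_year(2150, years, values)
```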
In the case of NOx emissions, 3-D emissions from aircraft were also considered. These were supplied as monthly mean fields of either NO or NO2 on a 25 level (L25) 0.5° × 0.5° grid by Lamarque et al. (2010) for the historical period. For HadGEM2-ES we used the NO emissions. They were first re-gridded on to an N96 × L25 grid and then projected on to the model's N96 × L38 grid, ensuring that the global annual total emissions were conserved. A similar approach was adopted when processing the future emissions.
No additional coding in the HadGEM2-ES or UKCA models was necessary for the treatment of tropospheric ozone precursor emissions. The only code change required was for the Detection and Attribution "greenhouse gases only" simulation (7.2). In this case, the UKCA model was modified to maintain the global mean surface CH4 concentration at its pre-industrial level, i.e. 805.25 ppb. This was to ensure that the increase in CH4 concentration as seen by the radiation scheme did not affect concentrations of tropospheric oxidants, and thereby the rate of sulphate aerosol formation.

Stratospheric ozone concentration
HadGEM2-ES requires stratospheric ozone to be input as monthly zonal/height ancillary files. CMIP5 recommends the use of the AC&C/SPARC ozone database (Cionni et al., 2011), which covers the period 1850 to 2100 and can be used in climate models that do not include interactive chemistry. The pre-industrial dataset consists of a repeating seasonal cycle of ozone values, and this is also used for the palaeoclimate simulations described in Sect. 8. For the historical and future simulations (including the emissions-driven and decoupled carbon cycle experiments, and AMIP runs) a time-varying data set of stratospheric ozone is used.
The historical part of the AC&C/SPARC ozone database spans the period 1850 to 2009 and consists of separate stratospheric and tropospheric data sources. The future part of the AC&C/SPARC ozone database covers the period 2010 to 2100 and seamlessly extends the historical database also including separate stratospheric and tropospheric data sources based on 13 CCMs that performed a future simulation until 2100 under the SRES A1B GHG scenario.
The AC&C/SPARC ozone is provided on pressure levels between 1000-1 hPa. The UK National Centre for Atmospheric Science (NCAS) has produced an updated version of the SPARC ozone dataset as follows.
A multiple-linear regression was performed on the historical raw pressure-level data between 1000-1 hPa, consistent with the Randel and Wu (1999) method used to construct the timeseries. The ozone was then represented as: O3(t) = a × SOL + b × EESC + seasonal cycle + residuals. For consistency, the indices of the 11-yr solar cycle (SOL) and total equivalent chlorine (EESC) are identical to those used to prepare the original dataset. The SOL index is a 180.5 nm timeseries provided by Fei Wu at NCAR. The standard SPARC ozone dataset which extends into the future does not include solar cycle variability post-2009. To produce a dataset extending into the future that includes an 11-yr ozone solar cycle, the solar regression index is used to build a future time series consistent with a repeating solar irradiance compiled by the Met Office Hadley Centre (see Sect. 7.1). This index is modelled as a sinusoid with a period of 11 yr, with mean and max-min values corresponding to solar cycle 23, normalised against the 180.5 nm timeseries used in the historical ozone. There is no solar ozone signal in the high latitudes.
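The regression form and the sinusoidal future solar index can be illustrated with a short sketch. All names, the phase year and the numerical values in the example are hypothetical; the paper specifies only the 11 yr period and that the mean and max-min range correspond to solar cycle 23.

```python
import math


def future_solar_index(year, mean, peak_to_peak, period_yr=11.0, phase_year=2009.0):
    """Sinusoidal solar index with a fixed period.

    mean and peak_to_peak stand for the cycle-23 mean and max-min range;
    phase_year (where the sinusoid crosses its mean) is an assumption.
    """
    amplitude = 0.5 * peak_to_peak
    return mean + amplitude * math.sin(2.0 * math.pi * (year - phase_year) / period_yr)


def ozone_from_regression(sol, eesc, a, b, seasonal, residual=0.0):
    """Evaluate O3(t) = a*SOL + b*EESC + seasonal cycle + residuals at one point."""
    return a * sol + b * eesc + seasonal + residual


# Hypothetical index with mean 1.0 and max-min range 0.2.
index_at_phase = future_solar_index(2009.0, mean=1.0, peak_to_peak=0.2)
index_one_cycle_later = future_solar_index(2020.0, mean=1.0, peak_to_peak=0.2)
```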
The data were then horizontally interpolated onto an N96 grid. Vertical interpolation was achieved by hydrostatically mapping the SPARC ozone data from pressure surfaces onto pressure-surface-equivalent levels corresponding to the height-based grid used by HadGEM2-ES, using a scale height of 7 km.
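A minimal sketch of a hydrostatic pressure-to-height mapping with a fixed scale height is given below. It assumes the standard log-pressure relation z = −H ln(p/p0) with a reference pressure of 1000 hPa; the reference pressure and the function names are assumptions, as the paper states only the 7 km scale height.

```python
import math


def pressure_to_height_km(p_hpa, p_ref_hpa=1000.0, scale_height_km=7.0):
    """Log-pressure height z = -H * ln(p / p_ref)."""
    return -scale_height_km * math.log(p_hpa / p_ref_hpa)


def height_to_pressure_hpa(z_km, p_ref_hpa=1000.0, scale_height_km=7.0):
    """Inverse mapping: p = p_ref * exp(-z / H)."""
    return p_ref_hpa * math.exp(-z_km / scale_height_km)


# The 1 hPa surface sits at roughly 48 km under these assumptions.
top_of_dataset_km = pressure_to_height_km(1.0)
```

With such a mapping, each model height level can be assigned an equivalent pressure, and the ozone field interpolated to it.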

Tropospheric aerosol forcing
HadGEM2-ES simulates concentrations of six tropospheric aerosol species: ammonium sulphate, fossil-fuel black carbon, fossil-fuel organic carbon, biomass-burning, sea-salt, and mineral dust aerosols (Bellouin et al., 2007; …). Although an ammonium nitrate aerosol scheme is available to HadGEM2-ES, it was still in its developmental version when CMIP5 simulations started, hence nitrate aerosols are not included in the CMIP5 simulations. In addition, secondary organic aerosols from biogenic emissions are represented by a fixed climatology. All aerosol species can exert a direct effect by scattering and absorbing shortwave and longwave radiation, and a semi-direct effect whereby this direct effect modifies atmospheric vertical profiles of temperature and clouds. In HadGEM2-ES all aerosol species, except fossil-fuel black carbon and mineral dust, also contribute to both the first and second indirect effects on clouds, modifying cloud albedo and precipitation efficiency, respectively. Changes in direct and indirect effects since 1860 are termed aerosol radiative forcing. The magnitude of this forcing depends on changes in aerosols, which are due in part to changes in emissions of primary aerosols and aerosol precursors. Changes in emission rates are either derived from external datasets or due to changes in the simulated climate. Here we document how changes in emission rates are implemented in the HadGEM2-ES CMIP5 centennial experiments. In the control run we specify a repeating seasonal cycle of 1860 emissions, and this is also used in the palaeoclimate simulations (Sect. 8). Historical and future simulations (including the emissions-driven and decoupled carbon cycle experiments and AMIP runs) use time-varying emissions as described in this section.
In HadGEM2-ES sea-salt and mineral dust aerosol emissions are computed interactively, whereas emission datasets drive schemes for sulphate, fossil-fuel black and organic carbon, and biomass aerosols. Unless otherwise stated, datasets are derived from the historical and RCP time series prepared for CMIP5. All non-interactive emission fields are interpolated by the model every five simulated days from prescribed monthly-mean fields. Timeseries of non-interactive emissions are shown in Fig. 10. Aircraft emissions of aerosol precursors and primary aerosols are not included in the model.
The sulphur cycle, which provides concentrations of ammonium sulphate aerosols, requires emissions of sulphur dioxide (SO2) and dimethyl-sulphide (DMS). Sulphur dioxide emissions are derived from sector-based emissions. Emissions for all sectors are injected at the surface, except for energy emissions and half of industrial emissions, which are injected at 0.5 km to represent chimney-level emissions. Sulphur dioxide emissions from biomass burning are not included. The model accounts for three-dimensional background emissions of sulphur dioxide from degassing volcanoes, taken from Andres and Kasgnoc (1998). This represents a constant rate of 0.62 Tg[S] yr−1 on a global average, independent of the year simulated, and is not part of the implementation of volcanic climate forcing which we discuss in Sect. 7.2. Similarly, land-based DMS emissions do not vary in time and give 0.86 Tg yr−1 (Spiro et al., 1992). Oceanic DMS emissions are provided interactively by the biogeochemical scheme of the ocean model as a function of local chlorophyll concentrations and mixed layer depth (based on Simo and Dachs, 2002). In an objective assessment against ship-board and time-series DMS observations, the HadGEM2-ES interactive ocean DMS scheme performs with similar skill to that found in the widely used Kettle et al. (1999) climatology (Halloran et al., 2010). The primary differences between the model-simulated and the climatology-interpolated surface ocean DMS fields are: lower model Southern Ocean DMS concentrations in Southern Hemisphere summer, higher model annual equatorial DMS concentrations, and a reduced model seasonal cycle amplitude. Oxidation of sulphur dioxide and DMS into sulphate aerosol involves hydroxyl (OH), hydroperoxyl (HO2), hydrogen peroxide (H2O2), and ozone (O3); concentrations for those oxidants are provided by the tropospheric chemistry scheme. Emissions of primary black and organic carbon from fossil fuel and biofuel are injected at 80 m.
Emissions of biomass-burning aerosols are the sum of the biomass-burning emissions of black and organic carbon. Grass fire emissions are assumed to be located at the surface, while forest fire emissions are injected homogeneously across the boundary layer (0.8 to 2.9 km).
Sea-salt emissions are computed interactively over open oceans at each model time step from near-surface (10 m) wind speeds (Jones et al., 2001). Mineral dust emissions are also interactive, and depend on near-surface wind speed, land cover and soil properties. The scheme is described in Woodward (2011). It is based on that designed for HadAM3 (Woodward, 2001) with major developments including the modelling of particles up to 2 mm in the horizontal flux, threshold friction velocities based on Bagnold (1941), a modified version of the Fécan et al. (1999) soil moisture treatment and the utilisation of a preferential source multiplier similar to that described in Ginoux et al. (2001).
Finally, secondary organic aerosols from biogenic emissions are represented by monthly distributions of three-dimensional mass-mixing ratios obtained from a chemistry transport model (Derwent et al., 2003). These distributions are constant for all simulated years.

Land-use and land-use change
The HadGEM2-ES land-surface scheme incorporates the TRIFFID DGVM (Cox, 2001), and as such simulates internally the land cover (and its evolution) in response to climate (and climate change). Hence we do not directly impose prescribed land-cover or vegetation types, but rather provide a fractional mask of anthropogenic disturbance as a boundary condition to the dynamic vegetation scheme. Previous Met Office Hadley Centre coupled climate-carbon cycle simulations (e.g. Cox et al., 2000; Friedlingstein et al., 2006) used a static (present day) agricultural mask. However, the dynamic vegetation scheme, TRIFFID, has now been updated to allow time-varying land-use distributions in the CMIP5 simulations.
TRIFFID represents the fractional coverage in each grid cell of 5 plant functional types (PFTs: broadleaf tree, needleleaf tree, C3 grass, C4 grass, shrub) and also bare soil. Prescribed fractions of urban areas, lakes and ice are also included from the IGBP land cover map (Loveland et al., 2000) and do not vary in time. The summed fractional coverage of crop and pasture is provided as a time-varying input. Within a grid box tree and shrub PFTs are excluded from this fraction allowing natural grasses to grow and represent "crops". Abandonment of crop land removes this constraint on trees and shrubs but we do not specify instant replacement by these woody PFTs, but rather their regrowth is simulated by the model's vegetation dynamics. If woody vegetation cover reduces because of a land use change, vegetation carbon from the removed woody PFTs goes partially to the soil carbon pool and partially to a series of wood products pools. These wood products pools have turnover rates of 1, 10 and 100 yr and are not sensitive to environmental conditions. The fraction of vegetation carbon directed into the wood products pool is proportional to the ratio of above ground and below ground carbon pools ((leaf carbon + stem carbon)/root carbon). Distribution of disturbed biomass into the different carbon pools depends on the vegetation type consistent with McGuire et al. (2001) and is shown in Table 3.
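The behaviour of the three wood products pools can be sketched as below, assuming simple first-order decay in which each pool releases carbon at a rate equal to its content divided by its turnover time. The first-order form, the function names and the example inputs are assumptions of this sketch; the allocation of disturbed biomass between pools (Table 3) is not reproduced here.

```python
import math

TURNOVER_YR = (1.0, 10.0, 100.0)  # turnover times of the three pools


def step_pools(pools, inputs, dt_yr=1.0):
    """Advance the pools by dt_yr under dC/dt = I - C / tau.

    pools  -- current carbon content of each pool (kg C m-2)
    inputs -- carbon input to each pool (kg C m-2 yr-1), held
              constant over the step (hypothetical values)
    Uses the exact solution for constant input, so it is stable
    even when dt_yr equals the shortest turnover time.
    """
    updated = []
    for c, i, tau in zip(pools, inputs, TURNOVER_YR):
        decay = math.exp(-dt_yr / tau)
        updated.append(c * decay + i * tau * (1.0 - decay))
    return updated


def release_flux(pools):
    """Instantaneous carbon release from each pool (kg C m-2 yr-1)."""
    return [c / tau for c, tau in zip(pools, TURNOVER_YR)]


# A single clearance event placing 1 kg C m-2 in the 10 yr pool.
pools = [0.0, 1.0, 0.0]
for _ in range(10):
    pools = step_pools(pools, inputs=(0.0, 0.0, 0.0))
```

Because the turnover times are fixed, the release from these pools depends only on the history of land-use change and not on the simulated climate, consistent with the statement that the pools are not sensitive to environmental conditions.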
HadGEM2-ES is therefore able to simulate both biophysical and biogeochemical effects of land-use change as well as natural changes in vegetation cover in response to changing climate and CO2. In this version of the model only anthropogenic disturbance in the form of crop and pasture is represented. Data on within-grid-cell transitions due to shifting cultivation or the impact of wood harvest are not yet used. As described in Sect. 2.2, CO2 emissions from land-use change can be simulated by HadGEM2-ES but are not used interactively in the emissions-driven experiment.
The biophysical impacts of land use change include the direct effect of changes to surface albedo and roughness due to land-cover change and also changes to the hydrological cycle due to changes in evapotranspiration and runoff. There is also an indirect physical effect due to changes in surface emissions of mineral dust caused by changes in bare soil fraction, windspeed and soil moisture, which has a radiative effect in the atmosphere.
Historical and future simulations (including the emissions-driven and decoupled carbon cycle simulations and AMIP runs) use time-varying disturbance from the Hurtt et al. (2011) dataset described below. The pre-industrial control simulation uses an agricultural disturbance mask, fixed in time at 1860 values from this same dataset. The natural and GHG detection and attribution simulations (7.1, 7.2) also use a fixed, pre-industrial land-use disturbance mask, but the land-use only simulations (7.3) use the time-varying historical data as in the full historical simulation. For the mid-Holocene and LGM experiments, there is no agricultural disturbance (which therefore differs from the control run, where a pre-industrial disturbance mask is used). The Last Millennium and AMIP simulations do not use the dynamic vegetation scheme of HadGEM2-ES and instead directly prescribe land cover as described in Sects. 8 and 9, respectively.
The historical land use data are based on the HYDE database v3.1 (Klein Goldewijk et al., 2010, 2011), whilst the future RCP land use scenarios were produced by the respective IAMs and are thus internally consistent with the socio-economic storylines and carbon emissions of the scenarios. A harmonization procedure was performed, as described in Hurtt et al. (2009, 2011), that attempts to preserve gridded and regional IAM crop and pasture changes as much as possible while minimizing the differences in 2005 between the historical estimates and future projections (Fig. 11). The harmonization procedure employs the Global Land-use Model (GLM), which ensures a smooth and consistent transition in the harmonization year, grids (or re-grids) the data when necessary, spatially allocates national/regional wood harvest statistics, and computes all the resulting land-use states and transitions between land-use states annually from 1500-2100 at half-degree (fractional) spatial resolution, including the effects of wood harvesting and shifting cultivation. Both historical and future scenarios were made available at 0.5° × 0.5° resolution with annual increments and were downloaded (see URL 11 in Appendix A). The crop and pasture fractions are re-gridded onto the HadGEM2-ES grid using area average re-gridding. Crop and pasture are then combined to produce a combined "agriculture" mask (Fig. 12). It is assumed that crop and pasture both mean "only grass, no tree or shrub". This assumption is simplistic, as in some regions of the world "pasture" refers to rangeland where animals are allowed to graze on whatever natural vegetation exists there (which may include trees and shrubs). Similarly, woody biofuel crops are treated (erroneously in this case) as non-woody crops. As noted by Hurtt et al. (2011), the definition and reporting of biofuel differs even within the IAMs producing the 4 RCP scenarios.
However, the necessary data to avoid these problems are not available and we expect the impact of any inconsistency to be minor. It remains an outstanding research activity to improve past and present reconstructions of land-cover which can account for temporally and regionally varying changes in definitions and terminology.
Our approach of allocating displaced woody biomass into product pools which subsequently release CO2 to the atmosphere means that our definition of the "land use CO2 flux" that will be reported in this diagnostic is rather limited: it will not contain any subsequent changes in soil carbon, for example, nor will it capture any effects of agricultural abandonment and regrowth. This diagnostic, therefore, should be seen as one part of the complex system of land-use carbon fluxes. A more complete picture of the impact of land-use change on carbon storage in HadGEM2-ES would require further simulations, as discussed in Arora and Boer (2010). For example, calculations could be made with offline simulations of the land-surface model, or by performing two different GCM simulations (with and without land-use changes) and diagnosing the differences between them. It is vital when reporting or analysing land-use emissions from such models, or comparing between different models or techniques, that the precise methodology is described to avoid misunderstanding. It remains a research priority to formally define methodologies for reporting simulated land-use fluxes.
An additional uncertainty in reporting the land use carbon fluxes is that the wood products pools are assumed to be zero everywhere at 1860, whilst the terrestrial carbon cycle (carbon content and vegetation fractions) has been run to equilibrium with 1860 climate and anthropogenic disturbance. Changes in land use cover prior to 1860 involve land use expansion and hence both direct emissions prior to 1860 and some legacy emissions post-1860 due to inputs of disturbed biomass to the soil carbon. No attempt has been made to include these effects in our output, but future work will assess and quantify them.

Natural climate forcing
HadGEM2-ES can simulate the climate response to two aspects of natural climate forcing: changes in solar irradiance and stratospheric volcanic aerosol. In the control experiment these forcings are kept constant in time. For the historical experiments (including the emissions-driven and decoupled carbon cycle simulations and AMIP runs) they are varied according to reconstructions based on observations. For simulations of future periods, where natural forcings are not known, they are varied as described here to minimise the impact of possibly incorrect assumptions about the natural forcings. See Sect. 8 for details on the solar and volcanic forcings applied to the palaeoclimate simulations.

Total solar irradiance
The way the model deals with variations in total solar irradiance (TSI) is the same as in earlier generations of Hadley Centre models, HadCM3 (Stott et al., 2000; Tett et al., 2002) and HadGEM1 (Stott et al., 2006). Annual mean variations in TSI are partitioned across the six shortwave spectral bands (0.2-10 µm) to estimate the spectral changes associated with TSI variations (Lean et al., 1995a). The Rayleigh scattering and ozone absorption properties are varied along with the changes across the spectral bands. See Stott et al. (2006) for further details.
The TSI data used for the historical period were recommended by CMIP5 (Lean et al., 2009, hereafter L09) and are created from reconstructions of solar cycle and background variations in TSI. The solar cycle component is produced from a multiple regression of proxy measures of bright and dark regions of the Sun with satellite reconstructions of TSI (Fröhlich and …). Background variations in TSI are produced from a model of solar magnetic flux incorporating historic sunspot numbers (Wang et al., 2005). The annual mean TSI was processed to force the mean of the 1700-2004 period to be the same as the model control solar constant value (1365 Wm−2).
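The adjustment of the TSI series to the model solar constant can be sketched as follows. The paper does not state whether the adjustment was additive or multiplicative; the sketch assumes a constant additive offset, which leaves the year-to-year variations unchanged. The function name and the example values are hypothetical.

```python
def normalise_tsi(tsi_by_year, target=1365.0, ref_start=1700, ref_end=2004):
    """Shift a TSI series so its mean over the reference period equals target.

    tsi_by_year -- dict mapping year to annual mean TSI (W m-2)
    Assumption: a constant additive offset is applied to every year.
    """
    reference = [v for y, v in tsi_by_year.items() if ref_start <= y <= ref_end]
    offset = target - sum(reference) / len(reference)
    return {year: value + offset for year, value in tsi_by_year.items()}


# Hypothetical annual means (not the L09 data).
raw = {1699: 1360.0, 1700: 1366.0, 2004: 1367.0, 2005: 1366.5}
adjusted_tsi = normalise_tsi(raw)
```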
The annual mean TSI and variations across the UV, visible and IR bands are shown in Fig. 13. For comparison, the TSI series used in previous model simulations are also shown. The TSI now recommended for use in CMIP5 studies is consistent with the latest assessment of TSI variations by the IPCC's Fourth Assessment Report (AR4), which estimated the solar radiative forcing to be 50 % of that given in the previous report. The increase in TSI for L09 between the Maunder minimum in the 17th century and the average over the last 2 solar cycles of the 20th century is 1.11 Wm−2. This compares to 2.73 Wm−2 for the TSI used in the HadGEM1 simulations (Stott et al., 2006) and 2.95 Wm−2 in the HadCM3 simulations (Stott et al., 2000).

Stratospheric volcanic aerosol
How HadGEM2-ES incorporates changes in stratospheric volcanic aerosol is the same as in the HadGEM1 model (Stott et al., 2006). Aerosols in the troposphere are linked to emission sources and sulphur and chemistry feedbacks. Aerosols in the stratosphere are separated from these processes and are prescribed. Stratospheric aerosol concentrations are varied across four equal area latitudinal zones on a monthly timescale. The aerosol is distributed vertically above the tropopause such that the mass mixing ratio is constant across the levels. In this version of the model, volcanic aerosol is not related to, and does not interact with, other simulated aerosol behaviour.

Fig. 13 (caption fragment). … (Lean et al., 1995b) used in HadCM3 (Stott et al., 2000) and SK03 (Solanki and Krivova, 2003) used in HadGEM1 (Stott et al., 2006). Solar irradiance averaged over (b) the ultraviolet band (200-320 nm), (c) the two visible bands (320-690 nm), and (d) infrared bands (690-1190, 1190-2380, and 2380-10 000 nm). Percentages are given with respect to the solar constant (1365 Wm−2), and associated distribution across the shortwave spectral bands.
The dataset used for the historic period was monthly stratospheric optical depths, at 550 nm, from 1850 to 2000 (Sato et al., 1993, see URL 12 in Appendix A) which was averaged over the four equal area latitudinal zones and converted into aerosol concentrations (Stott et al., 2006 and figures therein).
In a previous study (Stott et al., 2006) the data were extended past 2000 by continuing an assumed 1 yr timescale decay, from 1997, to a minimum and then keeping concentrations constant. There is some evidence that background aerosol concentrations are not as low as this assumes (Thomason et al., 2008).

Fig. 14 (caption fragment). … (Stott, 2006), for the period 1990-2040.

Also, future volcanic activity is likely to introduce further aerosol into the stratosphere. There were no specific CMIP5 recommendations, apart from the suggestion that the same concentration of stratospheric aerosol be present in the future simulations as in the control (Taylor et al., 2009), while being aware of any step-change in aerosol. The future dataset of optical depth is constructed as follows. The 1 yr decay timescale constructed for the post-1997 period appears to give a break point in the data. To remove the break point, we reconstruct the data from 1997 to 2002 by continuing the decay timescale of 3.3 yr seen in the 1995-1997 period of the data. A value of stable observed optical depth at 1020 nm since 2000 was found to be 0.001 (Thomason et al., 2008). As optical depth is estimated to vary inversely with wavelength, this suggests a minimum global stratospheric aerosol optical depth of 0.002 at 550 nm, approximately 20 times more than that used in the HadGEM1 study. During the period 2020-2040 concentrations were increased to match those in the control simulation (optical depth 0.0097). This compromise was an attempt to balance the lack of knowledge of when large eruptions would occur in the future with the unlikely possibility of no major volcanic eruptions significantly influencing aerosol amounts in the stratosphere for 100 yr. The global mean of the stratospheric optical depth is shown in Fig. 14, compared with what was used in the HadGEM1 simulation.
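The wavelength scaling and the decay towards a background value can be illustrated with a short sketch. It assumes that the 3.3 yr timescale is an e-folding time and that the background acts as a floor on the decaying value; both are assumptions of this sketch, as are the function names. The form of the 2020-2040 increase is not specified in the text and is not modelled here.

```python
import math


def scale_optical_depth(tau_ref, wavelength_ref_nm, wavelength_nm):
    """Scale optical depth assuming it varies inversely with wavelength."""
    return tau_ref * wavelength_ref_nm / wavelength_nm


def decayed_optical_depth(tau_start, years_elapsed, efold_yr=3.3, floor=0.002):
    """Exponential decay from tau_start, not allowed to fall below floor.

    Assumptions: efold_yr is an e-folding time and the background
    optical depth is imposed as a simple minimum.
    """
    return max(floor, tau_start * math.exp(-years_elapsed / efold_yr))


# 0.001 at 1020 nm corresponds to roughly 0.0019 at 550 nm,
# which rounds to the 0.002 background quoted in the text.
background_550 = scale_optical_depth(0.001, 1020.0, 550.0)
```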

Palaeoclimate boundary conditions, including geophysical changes
In order to complete the palaeoclimate simulations (3.4-3.6) a number of modifications need to be made to the model. The mid-Holocene simulation (3.4) required GHG concentrations of CH4 (650 ppb) and N2O (270 ppb), and halocarbon concentrations of zero (as in the pre-industrial control simulation). Stratospheric ozone and precursor emissions of tropospheric ozone remain the same as in the pre-industrial control run, as do concentrations of CO2 and the land-sea mask. Because tropospheric ozone is calculated interactively in HadGEM2-ES, ozone concentrations in the palaeo simulations may not be identical to those in the pre-industrial control simulation: it is the precursor emissions which we keep the same as in the control run. This is also the case for dust and ocean DMS emissions, which are simulated interactively and may differ due to changes in the simulated climate or vegetation cover.

Mid-Holocene (6 kya)
In the mid-Holocene, Earth's orbit differed from present day affecting the timing and magnitude of solar energy reaching the surface. Orbital parameters were modified to correspond to those required by the PMIP3 protocol (see URL 13 in Appendix A). Figure 15 shows the monthly anomalies of TOA SW radiation relative to the present day for the official PMIP3 requirements (black) and as calculated within HadGEM2-ES (red). The land use disturbance mask is set to zero everywhere for the mid-Holocene thus assuming that there is no human activity which would displace forests in any location.

Last glacial maximum (LGM, 21 kya)
The LGM simulation (3.5) setup requires major changes to the geophysical state of the land and ocean bed. This simulation has not yet been performed and some of these modifications are ongoing. The land-sea mask and orography are changed to increase ice sheet volumes and to represent the associated decrease in sea level. The bathymetry of the ocean model is also changed to reflect the decreased sea level. Figure 16 shows how the land-sea mask changes. GHG concentrations are prescribed as CO2 of 185 ppm, CH4 of 350 ppb and N2O of 200 ppb. Halocarbons are zero (as in the pre-industrial control setup) and O3 is treated the same as in the pre-industrial control run by using the same stratospheric ozone concentrations and tropospheric ozone precursor emissions. Boundary condition files were downloaded from the PMIP3 website (see URL 13 in Appendix A). Orbital parameters will be changed to the required configuration, and the river routing ancillary will also be manually updated to take into account changes in the land-sea mask and ensure that all rivers flow into the ocean rather than terminating at a land point.

Last Millennium (800 AD-present)
Unlike the centennial simulations, the Last Millennium simulation does not use anthropogenic disturbance to update the land cover boundary conditions. Instead the land cover is updated from historical land cover reconstruction data from Pongratz et al. (2008, hereafter P08). The original data are on a grid of 0.5° × 0.5° and provide the spatial distribution of 14 vegetation types from the present day back to year 800 AD. The vegetation types in the P08 database are mapped into the 5 TRIFFID vegetation classes. The details of the reclassification are shown in Table 4. Where there is no one-to-one mapping, the following rules are applied:
-C3/C4 pasture is treated as natural C3/C4 grass.
-Tundra is treated as mixture of shrubs, grass and bare soil. The mixture is chosen to match as close as possible the distribution obtained in Essery et al. (2003) in tundra regions for the present time.
-Crops are treated as in Essery et al. (2003), as a mixture of C3/C4 grass and soil. The ratio between C3 and C4 grass is used as threshold to discriminate between C3 and C4 grass to be used in crop.
After the application of the inland water mask and ice mask, the unclassified fraction of each grid cell is filled with the soil land class. For the urban land class we used the data from HYDE 3.1 (Klein Goldewijk et al., 2010, 2011). The data provide the urban/built-up area on a grid of 0.083° × 0.083°. We used local area-averaging interpolation to re-grid the HYDE 3.1 data onto the P08 grid. In the coastal areas only the grid cells where at least 30 % of the original data showed urban coverage were considered as urban. The half-degree historical land cover data are then re-gridded onto the HadGEM2-ES grid using area average re-gridding. Figure 17 shows the reduction in total woodland (needleleaf + broadleaf trees) with respect to year 800 in years 1000, 1500 and 1990 on the HadGEM2 grid.
For the volcanic forcing we use the reconstruction of aerosol optical depth (AOD) provided by Crowley et al. (2008) and we maintain the same latitudinal distribution as described in Sect. 7.2. The reconstruction is based on icecore records from Antarctic and Greenland calibrated based on the Pinatubo eruption and is validated by comparison to the 20th century instrumental records. The data closely match the Sato et al. (1993) reconstruction for the 20th century. Figure 18 shows the volcanic aerosol optical depth at 0.55 µm integrated across the lower stratosphere between 15 and 25 km for the 4 latitudinal bands from the year 800 to 2000.
For the solar forcing up to 1810 we implemented the data of Steinhilber et al. (2009) observed TSI changes from 1976 onwards; for the period 1810-2000 we used the solar reconstruction of Wang et al. (2005), which is based on a flux transport model of the open and closed flux that uses the observed sunspot record as its main input. For consistency between the two forcings, the Steinhilber et al. reconstruction was normalised to the Wang et al. values from 1976-2006 and also had a synthetic 11 yr cycle overlaid, according to the PMIP3 guidelines (Schmidt et al., 2011). To make the two reconstructions match, a linear combination of the Wang et al. reconstruction with background and without background was used, so that the mean values of the two reconstructions were identical between 1810 and 1820. Finally, the whole TSI series was normalised to a mean value over the whole period of 1365 W m−2. The forcing over the total duration of the simulation is shown in Fig. 19, with different colours to highlight the two reconstructions used.
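The matching of the two reconstructions can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually coded for the simulations: the function names are hypothetical, the series are assumed to be annual, the switch between reconstructions is placed at 1810, and the final normalisation to 1365 W m−2 is assumed to be a multiplicative rescaling (an additive offset would be an equally plausible reading).

```python
import numpy as np

def blend_weight(target_mean, mean_with_bg, mean_without_bg):
    """Weight w such that w*with_bg + (1 - w)*without_bg has the target mean."""
    # Undefined if the two variants have the same mean over the overlap.
    return (target_mean - mean_without_bg) / (mean_with_bg - mean_without_bg)

def splice_tsi(years_a, tsi_a, years_b, tsi_b_bg, tsi_b_nobg,
               overlap=(1810, 1820), switch_year=1810, target_mean=1365.0):
    """Splice an early series (a) onto a blend of two variants of a later series (b)."""
    years_a, tsi_a = np.asarray(years_a), np.asarray(tsi_a, dtype=float)
    years_b = np.asarray(years_b)
    tsi_b_bg = np.asarray(tsi_b_bg, dtype=float)
    tsi_b_nobg = np.asarray(tsi_b_nobg, dtype=float)

    # Choose the blend so both series share the same mean over the overlap.
    in_a = (years_a >= overlap[0]) & (years_a <= overlap[1])
    in_b = (years_b >= overlap[0]) & (years_b <= overlap[1])
    w = blend_weight(tsi_a[in_a].mean(),
                     tsi_b_bg[in_b].mean(), tsi_b_nobg[in_b].mean())
    tsi_b = w * tsi_b_bg + (1.0 - w) * tsi_b_nobg

    # Use series a before the switch year and the blended series b afterwards.
    keep_a, keep_b = years_a < switch_year, years_b >= switch_year
    years = np.concatenate([years_a[keep_a], years_b[keep_b]])
    tsi = np.concatenate([tsi_a[keep_a], tsi_b[keep_b]])

    # Assumed multiplicative normalisation to the target long-term mean.
    tsi = tsi * (target_mean / tsi.mean())
    return years, tsi, w
```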
UKCA is included in these simulations, allowing the simulation of a 3-D methane field and its interaction with O3 and aerosols, but with the concentrations of the well-mixed GHGs CO2, CH4 and N2O prescribed. The set-up follows the PMIP3 standard (Schmidt et al., 2011): data over the post-1860 industrial period (Hansen and Sato, 2004) are linked with splines through the ice core results of the last 2 millennia.

Fig. 19. Reconstruction of total solar irradiance since year 850 AD. The black line indicates data from Steinhilber et al. (2009), the red line data from Wang et al. (2005).
For the pre-1860 period, black carbon aerosols are set to zero, while biomass burning is kept constant at the pre-industrial (1860) value.

CMIP5 requests initial-condition ensembles of simulations for some experiments in order to estimate any component of apparent changes in climate which may be due to internal variability in the model. To produce an ensemble of initial-condition members for the historical simulations it is necessary to perturb the initial conditions. A standard technique for this is simply to choose different points on the control run from which to take the initial conditions for a simulation. GCMs possess sufficient sensitivity to initial conditions that, even for a small perturbation, their day-to-day weather will soon diverge. But they may also possess some long-term "memory", which may mean that ensemble members taken too close together in the control simulation, or from widely separated but similar initial states, are not fully independent. Extensive evidence exists from previous long control simulations showing that simulated climate possesses large-scale variations on decadal to centennial timescales (Delworth et al., 1993; Delworth and Mann, 2000; Latif et al., 2004; Knight et al., 2005). Typically, these variations are associated with the principal modes of decadal variability of the climate system: the Atlantic Multidecadal Oscillation (AMO) (Enfield et al., 2001) and the Pacific Decadal Oscillation (PDO), sometimes referred to as the Interdecadal Pacific Oscillation (IPO) (Power et al., 1999). The AMO is a North Atlantic-centred mode in which sea surface temperatures (SSTs) vary coherently within the basin on multidecadal to centennial timescales, and which can have far-reaching climate impacts (Knight et al., 2006). The PDO/IPO has a characteristic pattern of anomalously warm and cool SSTs in the Pacific Ocean that resembles a modified El Niño pattern, and typically has a shorter timescale of about two decades (Kwon and Deser, 2007). So-called "perfect model" experiments (Collins and Sinha, 2003), in which sections of model control simulations are repeated after small initial perturbations, demonstrate the potential for multidecadal oceanic processes to provide a long-term memory of the initial state. This is undesirable as we would like the ensemble mean to provide an unbiased estimate of the model's response to imposed forcings. For the initialisation of the transient simulations from the model control described here, this implies that care needs to be taken in choosing a sufficient range of initial states with respect to decadal modes.
North Atlantic and Pacific patterns of the decadal-centennial variability in the HadGEM2-ES control simulation were derived from a principal component analysis of low-pass filtered simulated annual mean SST data in each basin. The filter half-power timescales were chosen to preserve only the decadal and longer components of the variability. The patterns derived bear a strong resemblance to those seen in observations (Parker et al., 2007). Projecting these patterns from 500 yr of the control run against the low-pass filtered SST fields, indices of Atlantic and Pacific decadal variability were derived (Fig. 20). The Atlantic index (labelled "AMO") has considerable variability on decadal to centennial timescales, whereas the Pacific mode (labelled "IPO") tends towards variations on shorter timescales. Despite long-term variability, neither index exhibits a long-term drift. Figure 21 shows the trajectory of the control model in the space defined by these two indices, as well as the points at which the ensemble members of the transient simulation were initialised. We wanted to retain an objective method for selecting initial conditions rather than using this metric to subjectively choose years from the control run. We therefore select initial conditions at 50 yr intervals from the control run (as indicated by red dashed lines in Fig. 20), and use these indices of long-term variability to monitor whether these initial states are independent as desired. The range of initial states selected possesses an average that is close to zero compared to the variability in both indices. This indicates that there is no mean signal of the AMO or IPO in the initial conditions, giving confidence that the net long-term signal from the initial state has been minimised in the transient ensemble. We note that the 4 ensemble members chosen span a reasonable range of the IPO variability but a relatively narrow range of AMO variability, clustered close to zero. Future work may explore the response of extra ensemble members which start from deliberately chosen high or low AMO states.
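The check on the initial states described above can be sketched in a few lines. This is a simplified illustration, not the analysis code used here: it assumes that an annual variability index (such as the AMO or IPO index) has already been computed for the control run, and the function names and the choice of first start year are hypothetical.

```python
import numpy as np

def select_start_years(first_year, n_members, spacing=50):
    """Start years taken at a fixed spacing along the control run."""
    return [first_year + k * spacing for k in range(n_members)]

def standardised_samples(index, years, start_years):
    """Index values at the start years, in units of the index standard deviation."""
    index = np.asarray(index, dtype=float)
    standardised = (index - index.mean()) / index.std()
    position = {int(y): i for i, y in enumerate(years)}
    return np.array([standardised[position[y]] for y in start_years])

def summarise(samples):
    """Ensemble-mean initial state and the range spanned by the members."""
    return {"mean": float(samples.mean()),
            "spread": float(samples.max() - samples.min())}
```

A mean close to zero suggests that the ensemble carries no net signal of the mode in its initial conditions, while a small spread (as found here for the AMO) indicates that the members sample only part of the variability of the mode.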

Atmosphere only model (AMIP) simulations (3.3E)
Traditionally, AMIP experiments (Gates, 1992) comprise the atmosphere-only version of a GCM forced only by time-varying fields of prescribed sea-surface temperatures (SSTs) and sea-ice. The atmospheric component of GCMs generally includes the land-surface model, which means that surface properties such as soil temperature and moisture are simulated in AMIP experiments, but the land cover would be prescribed from a climatology and held constant in time.
For CMIP5 the AMIP experiments, 3.3(E), use time-varying datasets of SST and sea-ice at monthly resolution as recommended by CMIP5 (Hurrell et al., 2008). The experiment design also recommends time-varying forcing of the other climate drivers such as GHGs, aerosols and natural forcing as imposed in the coupled historical simulation, 3.2. For ESMs such as HadGEM2-ES which include dynamic vegetation there is a decision to make regarding whether to prescribe or simulate the land cover and, if the latter, how to initialise it for the start of the AMIP period. For the HadGEM2 AMIP simulations we chose to prescribe the land cover, but from a time-varying dataset to represent the impact of historical changes in anthropogenic land use. The land-cover dataset was derived from the IGBP present-day climatology (Loveland et al., 2000) and reconstructions of anthropogenic land use from the HYDE3 dataset (Klein Goldewijk et al., 2010) as processed for CMIP5 by Hurtt et al. (2011). It is thus consistent with the land-use changes imposed in the fully coupled HadGEM2-ES simulations with dynamic vegetation (see Sect. 6). The historical land-use and future projections (see URL 11 in Appendix A) and the dataset of crop, pasture and urban area in version 1 were used to construct the time-varying land cover. Land cover in HadGEM2 consists of nine surface types: broadleaf trees, needleleaf trees, C3 grass, C4 grass, shrubs, urban, water, soil and ice (Essery et al., 2003). Consistent with our use of these data in the HadGEM2-ES simulations, crop and pasture are assumed here to be a combination of C3 and C4 grass. Using the fractions of C3 and C4 grasses derived from the IGBP climatology, crop and pasture are converted into C3 and C4 grass. Changes due to time-variant C3 grass, C4 grass and urban area are matched by removing equally distributed fractions of broadleaf trees, needleleaf trees and shrubs in order to conserve the total vegetated fraction of each grid cell. Water, soil and ice are represented by the IGBP climatology. As a result of the atmosphere-only version of HadGEM2 having a different land cover and a different surface climatology, it was decided to retune the mineral dust emissions scheme to enable better

As long-term memory in coupled GCMs is mainly due to ocean processes, it was not necessary to separate the AMIP ensemble member initial conditions by 50 years in the control run. Rather, in order to initialise the 5 HadGEM2 AMIP ensemble members we chose to perturb the initial conditions in 2 different ways (Table 5). Firstly, we could take the atmospheric (including land surface) state from part-way through a previous AMIP simulation (perturbation method listed as "atmos and surface" in Table 5), or secondly we could reset the land-surface state back to climatological values (listed as "atmos"). As recommended in the CMIP5 experimental design, we used a mix of the two approaches. For all experiments, there was a 3-month spin-up period from September to December 1978. In such atmosphere-only experiments we expect the atmospheric state to adjust rapidly (< 1 month) to the prescribed SST and sea-ice boundary conditions. However, the land surface could exhibit memory on seasonal or longer timescales (Koster and Suarez, 2001). Hence these two approaches to initialising AMIP ensemble members can later be analysed as two sub-ensembles to assess the importance of the land-surface state for predictability on seasonal to decadal timescales.
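The construction of the time-varying AMIP land cover described earlier in this section can be illustrated for a single grid cell. The sketch below is one possible reading of that description and not the processing code itself: the function name, the dictionary keys and the inputs `delta_agriculture` and `delta_urban` (changes in crop plus pasture and in urban fraction relative to the IGBP climatology) are hypothetical, and "equally distributed" is interpreted as an equal split between the three woody types. The sum of all nine surface fractions is left unchanged.

```python
WOODY = ("broadleaf", "needleleaf", "shrub")

def apply_land_use_change(base, delta_agriculture, delta_urban):
    """Return updated surface-type fractions for one grid cell.

    base: climatological fractions keyed by surface type.
    delta_agriculture: change in crop + pasture fraction (assumed input).
    delta_urban: change in urban fraction (assumed input).
    """
    c3, c4 = base["c3_grass"], base["c4_grass"]
    # Split agriculture between C3 and C4 grass using the climatological ratio.
    # Assumption: default to C3 where the climatology contains no grass.
    c3_share = c3 / (c3 + c4) if (c3 + c4) > 0 else 1.0

    out = dict(base)
    out["c3_grass"] = c3 + c3_share * delta_agriculture
    out["c4_grass"] = c4 + (1.0 - c3_share) * delta_agriculture
    out["urban"] = base["urban"] + delta_urban

    # Remove the same amount from each woody type so the cell total is conserved.
    removal = (delta_agriculture + delta_urban) / len(WOODY)
    for name in WOODY:
        out[name] = base[name] - removal
        if out[name] < 0:
            raise ValueError(f"not enough {name} cover to remove {removal:.3f}")
    return out
```

Water, soil and ice are passed through unchanged, consistent with their being taken from the climatology. A production implementation would need a rule for cells where one woody type is exhausted; the sketch simply raises an error in that case.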

Discussion and concluding comments
Arbitrary or subjective decisions in experimental design can cause differences in results which hamper attempts to quantify and understand model spread and uncertainty in future climate projections. CMIP5 represents a coordinated attempt to define a common modelling protocol by which modelling centres worldwide can abide, in order to facilitate comparison of complex GCM experiments and thus avoid the impacts of subjective decisions. However, by necessity there may still be a number of subjective decisions required when elements of the experimental protocol are not applicable to a certain model or model configuration. It is our hope that any such occurrences with HadGEM2-ES will not have a large impact on the interpretation of the results, but we discuss here for completeness some possible impacts.
In the HadGEM2-ES earth system model, some of the components of the earth system are now simulated interactively by the model rather than being prescribed as external boundary conditions. For example, HadGEM2-ES includes interactive tropospheric chemistry and hence can simulate the evolution of atmospheric methane and ozone concentration in response to meteorological conditions and emissions of reactive gases. Therefore atmospheric composition may not follow exactly the CMIP5-prescribed values. In our experiments we have forced the surface methane concentrations to follow the CMIP5 values in order to reduce any model drift away from the scenarios, but there may still be differences in CH4 concentrations in the free atmosphere away from the surface.
Similarly, simulated tropospheric ozone may not follow the exact CMIP5-prescribed concentrations. In general, we see this enhanced, process-based functionality of the model as a benefit: the rationale behind developing such a complex earth system model is precisely to study these interactions and allow them to change consistently with future climate in a way not possible with prescribed concentrations. But we do acknowledge that these differences also represent a divergence from the precise CMIP5 protocol which should be borne in mind during subsequent multi-model analysis.
Other areas where a divergence may occur due to the structure of the ESM include land-use forcing and set-up of detection and attribution experiments.
By prescribing anthropogenic disturbance in addition to simulated, dynamic vegetation we risk diverging from the intended impact of the prescribed land-use change. For example, if the model initially simulates too much or too little forest in a region to be deforested, then the impact of this deforestation on both carbon storage and physical surface properties will be too great or too small. We mitigate the risk of this impact in the emission-driven experiments by overwriting the land-use flux seen by the atmosphere with the CMIP5-prescribed land-use emissions dataset. However, the issue remains for any biophysical effect of land-use change. Future work is required to quantify the impact of this effect.
Detection and attribution studies aim to attribute changes in observed climate to driving processes. Model experiments are designed to do this by varying or holding constant separate forcings such as natural, greenhouse gases or aerosols. In this way the characteristic spatial or temporal patterns of response to each forcing can be derived and an optimal scaling found to best match observations. The scaling is then used to deduce if the signal for the forcing is detected and if the scaled response is consistent with the original signal, thus providing confidence for an attribution statement. In GCMs to date the distinction between, say, natural and greenhouse gas forcing is clear, and it is easy in the model to hold one fixed whilst varying the other. However, in an ESM where simulated GHG concentrations (such as methane or ozone) or aerosols may respond to climate, this distinction becomes slightly blurred. If natural forcings affect atmospheric processes which alter GHG or aerosol amounts then should these be allowed to vary or not in the "natural" detection and attribution experiment? Or should the HadGEM2-ES "natural" simulations be forced to solely consider the direct radiative effects of the natural forcings?
In the HadGEM2-ES natural experiment we varied the TSI and stratospheric volcanic aerosol concentrations as per the historical simulation, kept all other emissions/concentrations the same as in the control simulation and allowed the earth system processes to vary as normal. We decided on a more complicated set-up for the GHG-only forced run. Concentrations of the greenhouse gases (CO2, CH4, N2O and the halocarbon species) were prescribed throughout the atmosphere for the radiation scheme. However, prescribing the methane seen by the chemistry scheme would influence the chemistry and thus species like ozone, which we did not want to vary in this experiment. So in the interactive chemistry part of the model, the set-up was as in the control, i.e. methane emissions at the surface and concentrations in the atmosphere as in the control simulation. This "chemistry methane" does not interact with the radiation scheme. Our choice to do this was therefore a pragmatic compromise between the ESM part of the model and the needs of a detection and attribution study.
In this paper we have documented how we have implemented the CMIP5 experimental protocol for the centennial simulations in the Met Office HadGEM2-ES earth system model. We have successfully set up and performed the experiments as described here and will make the results available via the PCMDI multi-model database. We hope these will form a valuable contribution to the CMIP5 modelling activity.