The DeepMIP contribution to PMIP4: experimental design for model simulations of the EECO, PETM, and pre-PETM (version 1.0)

Past warm periods provide an opportunity to evaluate climate models under extreme forcing scenarios, in particular high (> 800 ppmv) atmospheric CO2 concentrations. Although a post hoc intercomparison of Eocene (∼ 50 Ma) climate model simulations and geological data has been carried out previously, models of past high-CO2 periods have never been evaluated in a consistent framework. Here, we present an experimental design for climate model simulations of three warm periods within the early Eocene and the latest Paleocene (the EECO, PETM, and pre-PETM). Together with the CMIP6 pre-industrial control and abrupt 4×CO2 simulations, and additional sensitivity studies, these form the first phase of DeepMIP, the Deep-time Model Intercomparison Project, itself a group within the wider Paleoclimate Modelling Intercomparison Project (PMIP). The experimental design specifies and provides guidance on boundary conditions associated with palaeogeography, greenhouse gases, astronomical configuration, solar constant, land surface processes, and aerosols. Initial conditions, simulation length, and output variables are also specified. Finally, we explain how the geological data sets, which will be used to evaluate the simulations, will be developed.


Introduction
There is a large community of Earth scientists with strong interests in "deep-time" palaeoclimates, here defined as climates of the pre-Pliocene (i.e. prior to ∼ 5 Ma). A growing community of modelling groups focussing on these periods is also beginning to emerge. DeepMIP, the Deep-time Model Intercomparison Project, brings together modellers, the data community, and other scientists into a multidisciplinary international effort dedicated to conceiving, designing, carrying out, analysing, and disseminating an improved understanding of these time periods. It also aims to assess their relevance for our understanding of future climate change. DeepMIP is a working group in the wider Paleoclimate Modelling Intercomparison Project (PMIP4), which itself is a part of the sixth phase of the Coupled Model Intercomparison Project (CMIP6, Eyring et al., 2016). In DeepMIP, we will focus on three time periods in the latest Paleocene and early Eocene (∼ 55-50 Ma) and, for the first time, carry out a formal coordinated model-data intercomparison. In addition to the experimental design presented here, DeepMIP will synthesize existing climate proxy records, and develop new ones if appropriate. The aim will be to effectively characterize our understanding of the palaeoclimate of the chosen interval through the synthesis of climate proxy records, to compare this with the model simulations, and to understand the reasons for the intra- and inter-model and data differences. The ultimate aim is to encourage model development in response to any robust model deficiencies that emerge from the model-data comparison. This is of particular relevance to models that are also used for future climate projection, given the relative warmth and high CO2 that characterize many intervals of deep time.

Previous work
An informal, post hoc model-data intercomparison has previously been carried out for the early Eocene. This compared the results of four models from five modelling groups with marine and terrestrial data syntheses, and explored the reasons for the model-model differences using energy balance diagnostics. That study contributed to the recent IPCC AR5 report (Box 5.1, Fig. 1), but it also revealed challenging differences between model simulations of this period and intriguing model-data mismatches, as well as inconsistencies between proxies (Fig. 1). For example, proxy-derived SST estimates indicate a weak meridional temperature gradient during the early Eocene which cannot easily be reconciled with the model simulations. Further work resulting from this intercomparison included that of Gasson et al. (2014), who investigated the CO2 thresholds for Antarctic ice sheet inception; Lunt et al. (2013), who compared the ensemble and data to further Eocene simulations; and Carmichael et al. (2016), who investigated the hydrological cycle across the ensemble and compared model results with proxies for precipitation.
The previous exercise points to the need for a more coordinated experimental design (different modelling groups had carried out simulations with different boundary conditions, different initial conditions, etc.), and a greater understanding of the reasons behind differences between different climate proxies. Those challenges provide the motivation for DeepMIP.
Figure 1. Zonal mean Eocene sea-surface temperature warming, presented as an anomaly relative to present or pre-industrial values. Warming from the five models in "Eomip" is shown as coloured lines; for each model only the CO2 concentration that best fits the temperature proxy observations is shown. Warming derived from the proxies is shown as filled circles, with error bars representing the range of uncertainty associated with proxy calibration and temporal variability. Larger symbols represent "background" early Eocene state, smaller symbols represent the EECO. Adapted from Fig. 8a in Lunt et al. (2012).

The chosen intervals: the Early Eocene Climatic Optimum (EECO), the Paleocene-Eocene Thermal Maximum (PETM), and the pre-PETM
The choice of a time interval on which to focus is based on a balance between (i) the magnitude of the anticipated climate signal (larger signals have a higher signal-to-uncertainty ratio, and larger signals provide a greater challenge to models), (ii) the uncertainties in boundary conditions that characterize the interval (small uncertainties result in more robust conclusions as to the models' abilities, and minimize the model sensitivity studies required to explore the uncertainties), and (iii) the amount and geographic distribution of palaeoclimate data available with which to evaluate the model simulations.
We have chosen to focus on the latest Paleocene and early Eocene, ∼ 55 to ∼ 50 Ma (the Ypresian stage), as it is the most recent geological interval characterized by high (> 800 ppmv) atmospheric CO2 concentrations. Within the latest Paleocene and early Eocene, DeepMIP will focus on three periods (see Fig. 2):
1. The Early Eocene Climatic Optimum (EECO, ∼ 53-51 Ma), which is the period of greatest sustained (> 1 Myr) warmth in the last 65 million years.
2. The Paleocene-Eocene Thermal Maximum (PETM, ∼ 55 Ma), which is the event of greatest warmth in the last 65 million years.
3. The period just before the PETM (pre-PETM, or latest Paleocene), which is relatively warm compared with modern, but is cooler than both the PETM and the EECO.
These intervals have been the focus of numerous studies in the geological literature, and some syntheses of proxies from these intervals already exist (e.g. Huber and Caballero, 2011; Lunt et al., 2012; Dunkley Jones et al., 2013). The pre-PETM provides a reference point for both the PETM and the EECO. In addition, all three time periods can be referenced to modern or pre-industrial. This is in recognition that both modelling and proxies are most robust when considering relative changes, as opposed to absolutes.
Compared to earlier warm periods, such as the mid-Cretaceous, the palaeogeography during the early Eocene is reasonably well constrained, and freely available digital palaeogeographic data sets exist; however, there are wide uncertainties in estimates of atmospheric CO2 at this time. Furthermore, due at least in part to interest in the Eocene and PETM for providing information of relevance to the future (e.g. Anagnostou et al., 2016; Zeebe et al., 2016), there is a relative wealth of climate proxy data with which the model results can be compared.

Experimental design
The DeepMIP experimental protocol consists of five main simulations: pre-industrial, future, two in the early Eocene (EECO and PETM), and one in the latest Paleocene (pre-PETM), plus a number of optional sensitivity studies (see Sect. 4.3). The simulations are summarized in Table 1.

Pre-industrial and future simulations
The pre-industrial simulation should be as close as possible to the CMIP6 standard, piControl (Eyring et al., 2016). Many groups will already have carried out this simulation as part of CMIP6. Some groups may need to make changes to their CMIP6 model configuration for the DeepMIP palaeoclimate simulations (for example changes to ocean diffusivity). If this is the case, we encourage groups to carry out a new pre-industrial simulation with the model configuration used for DeepMIP palaeoclimate simulations.
The future simulation is the CMIP6 standard abrupt-4xCO2 simulation (Eyring et al., 2016), which branches off from the piControl simulation, and in which atmospheric CO2 is abruptly quadrupled and then held constant for at least 150 years.

EECO/PETM and pre-PETM simulations
This section describes the DeepMIP palaeoclimate simulations. There are three standard palaeoclimate simulations (deepmip-stand-3xCO2, deepmip-stand-6xCO2, deepmip-stand-12xCO2), which differ only in their atmospheric CO2 concentration, plus a number of optional sensitivity studies. In general terms, we consider the deepmip-stand-3xCO2 simulation as representative of the pre-PETM, and the other two simulations as representing two different scenarios for the EECO and/or PETM.

Palaeogeography and land-sea mask
Herold et al. (2014, henceforth H14) is a peer-reviewed, traceable, freely available digital reconstruction of the early Eocene interval. It includes topography and sub-gridscale topography, bathymetry, tidal dissipation, vegetation, aerosol distributions, and river runoff. The palaeogeography from H14 should be used for all the standard DeepMIP palaeoclimate simulations (see Table 1); it is provided digitally in netcdf format in the Supplement of H14 (see Table 2), at a resolution of 1° × 1°, and is illustrated here in Fig. 3a. The palaeogeographic height should be applied as an absolute, rather than as an anomaly to the pre-industrial topography. Most models additionally require some fields related to the sub-gridscale orography to be provided. Because sub-gridscale orographies are very sensitive to the resolution of the underlying data set, the sub-gridscale orography (if it is required by the model) can be estimated based on fields also provided in the Supplement of H14. This can be implemented as the modelling groups see fit, but care should be taken that the pre-industrial and Eocene sub-gridscale topographies are as consistent as possible. In addition, the code used to calculate the sub-gridscale orographies in the CESM (Gent et al., 2011) model is also provided in the Supplement of H14.
The land-sea mask can be initially calculated from the palaeogeographic height, by assigning ocean to palaeogeographic heights less than or equal to zero. Care should be taken when defining the land-sea mask for the ocean component of the model that the various seaways are preserved at the model resolution; this may require some manual manipulation of the land-sea mask.
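As an illustration of the first-pass rule described above, the following minimal sketch derives a land-sea mask from a palaeogeographic height field. The function name and the small test array are hypothetical; the check that seaways survive at model resolution is not automated here and would still need manual inspection.

```python
import numpy as np

def land_sea_mask(height_m):
    """Return 1 for land and 0 for ocean from palaeogeographic height (m).

    Ocean is assigned wherever the height is less than or equal to zero.
    """
    height_m = np.asarray(height_m, dtype=float)
    return np.where(height_m <= 0.0, 0, 1)

# Hypothetical 3 x 4 patch of palaeogeographic heights (m), for illustration only.
height = np.array([
    [-4000.0, -200.0, 0.0, 15.0],
    [-3000.0, 5.0, 300.0, 1200.0],
    [-50.0, -10.0, 0.0, 2.0],
])
mask = land_sea_mask(height)
print(mask)
```

Cells with a height of exactly zero are treated as ocean, following the "less than or equal to zero" wording.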
Included in the Supplement of this paper are palaeorotations such that the modern location of grid cells in the Eocene palaeogeography can be identified, as can the Eocene location of modern grid cells.
We encourage sensitivity studies to the palaeogeography; see Sect. 4.3.2.

Land surface
i. Vegetation: the vegetation in the DeepMIP palaeoclimate simulations should be prescribed as that in H14, which is included digitally as a netcdf file in the Supplement of H14 (Table 2; note that the BIOME4 vegetation should be used rather than the Sewall vegetation, and that groups may choose to base their vegetation either on the 27 biomes or the 10 megabiomes), and shown here in Fig. 4. Groups should make a lookup table for converting the H14 Eocene data set to a format that is appropriate for their model. To aid in this process, a modern vegetation data set is also provided in the Supplement of H14, using the same plant functional types as in the H14 Eocene reconstruction; in addition, the lookup table for the CLM (Oleson et al., 2010) land model is provided as a guide in the Supplement of this paper.
ii. Soils: parameters associated with soils should be given constant values over the globe, with values for these parameters (e.g. albedo, water-holding capacity, etc.) given by the global mean of the group's pre-industrial simulation.
iii. Lakes: no lakes should be prescribed in the DeepMIP palaeoclimate simulations, unless these are predicted dynamically by the model.
iv. River runoff: river runoff should be taken from the H14 reconstruction, which is included digitally as a netcdf file in the Supplement of H14 (see Table 2).
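The soil prescription in item ii can be sketched as follows. This is only one possible reading: it assumes a regular latitude-longitude grid and an area-weighted (cosine of latitude) mean over pre-industrial land cells, whereas the text specifies only "the global mean". The function name and the albedo values are hypothetical.

```python
import numpy as np

def uniform_soil_value(field, lat_deg, land_mask):
    """Area-weighted mean of a pre-industrial soil parameter over land cells.

    ASSUMPTION: regular lat-lon grid, weights proportional to cos(latitude).
    """
    field = np.asarray(field, dtype=float)
    land = np.asarray(land_mask, dtype=bool)
    coslat = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    weights = coslat[:, None] * np.ones_like(field)  # broadcast over longitude
    weights = np.where(land, weights, 0.0)           # ignore ocean cells
    return float((field * weights).sum() / weights.sum())

# Hypothetical pre-industrial soil albedo on a 3 x 2 grid.
lat = np.array([-60.0, 0.0, 60.0])
albedo_pi = np.array([[0.30, 0.30], [0.10, 0.20], [0.30, 0.30]])
land_pi = np.array([[1, 0], [1, 1], [0, 1]])

soil_albedo = uniform_soil_value(albedo_pi, lat, land_pi)
# The same constant is then applied everywhere in the palaeoclimate simulation.
eocene_soil_albedo = np.full(albedo_pi.shape, soil_albedo)
print(soil_albedo)
```

Groups whose land model stores soil parameters per tile or per soil layer would need to adapt the averaging accordingly.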

Greenhouse gas concentrations
Each group should carry out three simulations at three different atmospheric CO2 concentrations, expressed as multiples of the value in the pre-industrial simulation (typically 280 ppmv, Sect. 4.1): (i) 3× pre-industrial (typically 840 ppmv), (ii) 6× pre-industrial (typically 1680 ppmv), and (iii) 12× pre-industrial (typically 3360 ppmv). Assuming a simple relationship between CO2 and temperature, the benthic oxygen isotope record (see Fig. 2) implies that, within uncertainty of the CO2 proxies, CO2 concentrations in the EECO and PETM were similar. As such, whereas the low-CO2 simulation can be considered as representing the pre-PETM, the two higher CO2 simulations are intended to represent a range of possible PETM and EECO climate states. The values themselves are based primarily on recent work using boron isotopes (Anagnostou et al., 2016), which indicates that EECO CO2 was 1625 ± 760 ppmv (Fig. 5).
It is thought that non-CO2 greenhouse gases during the early Eocene were elevated relative to pre-industrial, especially CH4 (e.g. ∼ 3000 ppbv, Beerling et al., 2011). However, there is considerable uncertainty as to exactly how elevated they were. Given these uncertainties, and the fact that we have chosen to use a modern solar constant as opposed to a reduced solar constant (see Sect. 4.2.5), which would otherwise offset the CH4 increase, all non-CO2 greenhouse gases and trace gases should be set at the CMIP6 pre-industrial concentrations. In effect, we assume that the CO2 forcing represents the CO2, CH4 (and other non-CO2 greenhouse gases), and solar forcings. For reference, the radiative forcing associated with an increase in CH4 concentrations from pre-industrial values to 3000 ppbv is +0.98 W m−2 (Byrne and Goldblatt, 2014), and the radiative forcing associated with a decrease in solar constant from 1361 to 1355.15 W m−2 (see
Some groups may find the higher CO2 simulations problematic as some models are known to develop a runaway greenhouse at high CO2 (M. Heinemann, personal communication, 2012). In this case, in addition to the 3× simulation, groups can carry out simulations at 2× and 4×. In this way, the modelled Eocene climate sensitivity and its nonlinearities can still be investigated.
If groups only have the computational resources to carry out two simulations, they should carry out the 3× and 6× simulations. For groups that can only carry out a single simulation, the analysis of the runs will be limited due to the focus on anomalies in DeepMIP, but we still encourage such groups to participate; in this case they should just carry out the 3× simulation.
We encourage groups with extensive computational resources to carry out additional sensitivity simulations over a range of CO2 values, and in particular at 1× (see Sect. 4.3.1).
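The concentrations and the order of priority described in this section can be summarized in a short sketch. The function names are hypothetical, and 280 ppmv is used only because the text quotes it as the typical pre-industrial value; groups should substitute the value from their own pre-industrial simulation.

```python
PI_CO2_PPMV = 280.0  # typical pre-industrial value quoted in the text

def co2_ppmv(multiple, pi_ppmv=PI_CO2_PPMV):
    """CO2 concentration (ppmv) for a given multiple of pre-industrial."""
    return multiple * pi_ppmv

def standard_multiples(n_sims, runaway_at_high_co2=False):
    """CO2 multiples to run, following the priorities given in the text."""
    if runaway_at_high_co2:
        return [2, 3, 4]   # fallback if the model is unstable at high CO2
    if n_sims <= 1:
        return [3]         # single simulation: 3x only
    if n_sims == 2:
        return [3, 6]      # two simulations: 3x and 6x
    return [3, 6, 12]      # full standard set

concentrations = {m: co2_ppmv(m) for m in standard_multiples(3)}
print(concentrations)
```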

Aerosols
The representation of aerosols (including mineral dust) in Earth system models is undergoing a period of rapid development. Therefore, we leave the implementation of aerosol fields or emissions rather flexible, and give several options. Groups may choose to (i) leave aerosol distributions or emissions identical to pre-industrial (taking account of the changed land-sea mask), or (ii) treat aerosols prognostically, or (iii) use aerosol concentrations (including mineral dust) from H14, or (iv) use aerosol optical depths from H14, or (v) some combination of the above, depending on the aerosol type. The crucial thing is that groups are asked to document exactly how they have implemented aerosols.

Solar constant and astronomical parameters
All simulations should be carried out with the same solar constant and astronomical parameters as in the pre-industrial simulation. The solar constant in the CMIP6 piControl simulation is defined as 1361.0 W m−2 (Matthes et al., 2016). Although the early Eocene (51 Ma) solar constant was ∼ 0.43 % less than this (Gough, 1981), i.e. ∼ 1355 W m−2, we choose to use a modern value in order to (i) aid comparison of any 1×CO2 simulations (see Sect. 4.3.1) with pre-industrial simulations, and (ii) offset the absence of elevated CH4 in the experimental design (see Sect. 4.2.3). As with all of Earth history, astronomical conditions varied throughout the early Eocene. There is some evidence that the PETM and other Paleogene hyperthermals may have been paced by astronomical forcing (Lourens et al., 2005; Lunt et al., 2011), but the phase of the response relative to the forcing is unknown. The modern orbit has relatively low eccentricity, and so represents a forcing close to the long-term average, and also facilitates comparison with the control pre-industrial simulation. However, we do encourage sensitivity studies to astronomical configuration (see Sect. 4.3.3).
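The arithmetic behind the reduced Eocene solar constant can be checked with a few lines. The conversion to a global-mean radiative forcing is an illustration only: it assumes the standard approximation ΔF ≈ ΔS (1 − α) / 4 with a planetary albedo α of 0.3, neither of which is specified in the text.

```python
S_MODERN = 1361.0   # W m-2, CMIP6 piControl value quoted in the text
REDUCTION = 0.0043  # 0.43 % (Gough, 1981), as quoted in the text
ALBEDO = 0.3        # ASSUMPTION: illustrative planetary albedo, not from the text

# Solar constant at ~51 Ma implied by the quoted fractional reduction.
s_eocene = S_MODERN * (1.0 - REDUCTION)

# Approximate global-mean forcing: spread over the sphere (factor 1/4)
# and reduced by the reflected fraction (1 - albedo).
delta_s = s_eocene - S_MODERN
forcing = delta_s * (1.0 - ALBEDO) / 4.0

print(f"Eocene solar constant: {s_eocene:.2f} W m-2")
print(f"Approximate forcing:   {forcing:.2f} W m-2")
```

Under these assumptions the solar reduction and the CH4 forcing quoted in Sect. 4.2.3 are of similar magnitude and opposite sign, which is the offset the protocol relies on.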

Initial conditions
i. Atmosphere and land surface: simulations may be initialized with any state of the atmosphere and land surface, as long as the initial condition would not typically take longer than ∼ 50 years to spin up in a model with fixed sea-surface temperatures; for example, initial snow cover should not be hundreds of metres depth.
ii. Ocean: given that even with relatively long simulations, some vestiges of the initial ocean temperature and salinity structure will remain at the end of the simulations, we recommend that all groups adopt the same initialization procedure for the ocean, but encourage groups to carry out sensitivity studies to the initialization (see Sect. 4.3.7). The ocean should be initialized as stationary, with no initial sea ice, and a zonally symmetric temperature (T, °C) and globally constant salinity (S, psu) distribution given by the following:
$$T = \begin{cases} \dfrac{5000 - z}{5000}\, 25 \cos(\phi) + 15, & z \le 5000\ \mathrm{m} \\ 15, & z > 5000\ \mathrm{m} \end{cases} \qquad S = 34.7,$$
where φ is latitude, and z is depth of the ocean (metres below surface).
Some groups have previously found that initializing the model with relatively cold (< 10 °C) ocean temperatures at depth results in a relatively long spinup (> 5000 years), due to the suppression of convection; hence the relatively warm initial temperatures at depth prescribed here. Groups for which the recommended initial temperature structure still results in a stratified ocean with little convection, and hence are likely to have long equilibration timescales (for example those with a model with a particularly high climate sensitivity), may wish to initialize their model with warmer deep ocean temperatures. If so, this should be clearly documented.
The value of 34.7 psu is the same as the modern mean ocean value. Although the lack of ice sheets in the Eocene would result in a decrease in mean ocean salinity relative to the modern of about 0.6 psu, on these timescales long-term geological sources and sinks of NaCl associated with crustal recycling also play an important role; Hay et al. (2006) estimate mean ocean salinity to be between 35.1 and 36.5 during the Eocene. Given the uncertainties, we choose a modern value for simplicity. If groups prefer to initialize salinity with a non-homogeneous distribution, or with a different absolute value, they may do this, but it should be documented.
For simulations in which oxygen, carbon or other isotopic systems or passive tracers are included, these can be initialized as each individual group sees fit.
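A zonally symmetric initial ocean state of this kind could be generated as in the sketch below. It assumes a temperature profile of the form T = ((5000 − z)/5000) · 25 cos(φ) + 15 °C above 5000 m and 15 °C below, with a uniform salinity of 34.7 psu; the coefficients are stated here as an assumption and should be checked against the protocol's own equation before use. The function name and the sample grid are hypothetical.

```python
import numpy as np

def initial_ocean_state(lat_deg, depth_m):
    """Return (temperature degC, salinity psu) arrays of shape (depth, lat).

    ASSUMPTION: linear decay with depth of a 25*cos(lat) surface anomaly
    on top of a 15 degC deep-ocean value, uniform below 5000 m.
    """
    lat = np.deg2rad(np.asarray(lat_deg, dtype=float))[None, :]
    z = np.asarray(depth_m, dtype=float)[:, None]
    frac = np.clip((5000.0 - z) / 5000.0, 0.0, None)  # zero below 5000 m
    temp = frac * 25.0 * np.cos(lat) + 15.0
    salt = np.full(temp.shape, 34.7)                  # modern mean ocean value
    return temp, salt

temp, salt = initial_ocean_state([-90.0, 0.0, 90.0], [0.0, 2500.0, 5000.0, 6000.0])
print(temp.round(2))
```

The result is independent of longitude by construction, so it can be broadcast along the longitude axis of the model grid; velocities and sea ice are simply set to zero.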

Length of simulation
Simulations should be carried out for as long as possible. Ideally, simulations should (a) be at least 1000 years in length, (b) have an imbalance in the top-of-atmosphere net radiation of less than 0.3 W m−2 (or have a similar imbalance to that of the pre-industrial control), and (c) have sea-surface temperatures that are not strongly trending (less than 0.1 °C per century in the global mean). Climatologies should be calculated based on the final 100 years of the simulation.
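The three criteria can be checked from annual global-mean time series, as in this minimal sketch. Evaluating the radiation imbalance and the SST trend over the final 100 years is an assumption (the text does not specify the window), and the synthetic exponential spin-up curves are purely illustrative.

```python
import numpy as np

def spinup_diagnostics(toa_net, sst, window=100):
    """Check the three spin-up criteria from annual global means.

    toa_net: top-of-atmosphere net radiation (W m-2), one value per year.
    sst:     global-mean sea-surface temperature (degC), one value per year.
    ASSUMPTION: criteria (b) and (c) are evaluated over the final `window` years.
    """
    toa_net = np.asarray(toa_net, dtype=float)
    sst = np.asarray(sst, dtype=float)
    n_years = len(sst)
    tail = slice(n_years - window, n_years)
    # Least-squares linear trend over the final window, in degC per year.
    slope_per_year = np.polyfit(np.arange(window), sst[tail], 1)[0]
    return {
        "long_enough": n_years >= 1000,
        "toa_balanced": abs(toa_net[tail].mean()) < 0.3,
        "sst_steady": abs(slope_per_year * 100.0) < 0.1,
    }

# Synthetic spin-up: both quantities relax with a 200-year e-folding time.
years = np.arange(1200)
toa = 1.5 * np.exp(-years / 200.0)
sst = 30.0 - 2.0 * np.exp(-years / 200.0)

full_run = spinup_diagnostics(toa, sst)
short_run = spinup_diagnostics(toa[:300], sst[:300])
print(full_run)
print(short_run)
```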

Output format
We strongly recommend that DeepMIP model output should be uploaded to the anticipated PMIP4 component of the CMIP6 database (Eyring et al., 2016), distributed through the Earth System Grid Federation (ESGF). However, if this is not possible, then netcdf files of the variables in Appendix A, including Tables A1-A3, should be uploaded to the DeepMIP Model Database, which will be set up if and when required. In any case, for the "highest priority" variables in Appendix A, Tables A1-A3, all months of the simulations should be retained, such that averages can be calculated from arbitrary years of the simulation, and such that equilibrium states can be estimated using the approach of Gregory et al. (2004).
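The approach of Gregory et al. (2004) regresses the top-of-atmosphere net radiation against the surface temperature change and extrapolates to zero imbalance. The sketch below shows the idea on synthetic data; the function name is hypothetical, and a real analysis would use annual means from the retained monthly output and would need to consider non-linearity in the relationship.

```python
import numpy as np

def gregory_equilibrium(delta_t, toa_net):
    """Estimate equilibrium warming by linear regression of N on dT.

    Returns (equilibrium dT, slope, intercept); the equilibrium is the dT
    at which the fitted line crosses N = 0.
    """
    slope, intercept = np.polyfit(delta_t, toa_net, 1)
    return -intercept / slope, slope, intercept

# Synthetic example: N = 7.0 - 1.0 * dT, so the line crosses zero at dT = 7.
delta_t = np.linspace(1.0, 5.0, 50)
toa_net = 7.0 - 1.0 * delta_t

dt_eq, slope, intercept = gregory_equilibrium(delta_t, toa_net)
print(dt_eq, slope, intercept)
```

In this framing the intercept estimates the effective forcing and the slope the climate feedback parameter, which is why retaining all months of the highest-priority variables is useful even for simulations that have not fully equilibrated.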

Sensitivity studies
Sections 4.1 and 4.2 give a summary of the five main simulations. Here we outline some optional sensitivity studies that groups may wish to carry out, although there is no guarantee that other groups will do the same simulations.

Sensitivity to CO2
Groups may wish to explore more fully the sensitivity of their model to CO2, and associated non-linearities (e.g. Caballero and Huber, 2013), by carrying out additional simulations over a range of CO2. Normally these would be multiples of the pre-industrial concentration, in addition to the standard 3×, 6×, and 12× simulations. In particular, we encourage groups to carry out a 1× simulation, for comparison with the pre-industrial control; this simulation enables the contribution of non-CO2 forcings (palaeogeography and ice sheets) to early Eocene warmth to be evaluated.

Sensitivity to palaeogeography
Getech Group plc (www.getech.com) have provided an alternative palaeogeographic reconstruction that may be used for sensitivity studies, in particular the simulation deepmip-sens-geoggetech (see Tables 1, 2). It is included digitally in Lunt et al. (2016) as a netcdf file at a resolution of 3.75° longitude × 2.5° latitude, and is shown in Fig. 3c. Because a high-resolution version of this topography is not available, groups will need to use the sub-gridscale palaeogeography from the H14 reconstruction, and interpolate to the new land-sea mask as appropriate. The vegetation, river routing, etc. from H14 will also need to be extrapolated to the new land-sea mask. Ideally, groups would carry out these simulations at the same three CO2 levels as in the standard simulations, but if groups can only carry out a limited number of simulations with this palaeogeography, they should carry them out in the following order of priority (highest priority first): 3×, 6×, 12×.
Both Getech and H14 use the plate rotation model of Müller et al. (2008), which is derived from relative plate motions tied to a mantle reference frame. In their recent study, van Hinsbergen et al. (2015) argue that for palaeoclimate studies, plate motions should be tied to the spin axis of the Earth using a palaeomagnetic reference frame in order to obtain accurate estimates of palaeolatitude. For this reason, we also provide an additional version of the H14 palaeogeography, but rotated to a palaeomagnetic reference frame based on the methods outlined by van Hinsbergen et al. (2015) and Baatsen et al. (2016), for use in sensitivity study deepmip-sens-geogpalmag (see Tables 1, 2). This is shown in Fig. 3b, and provided in the Supplement to this paper.
Furthermore, some of the topographic features could have evolved significantly throughout the ∼ 55-51 Ma period of interest, making it unlikely that a single palaeogeography can represent all the DeepMIP time periods to the same extent.
Groups are therefore encouraged to carry out sensitivity studies around the H14 palaeogeography, to explore the uncertainties in climate which may result from uncertainties in the spatial and temporal evolution of different topographic features. These studies may include the widening or constricting and shallowing or deepening of key ocean gateways, changing the bathymetry and extent of ocean shelves, and raising or lowering mountain ranges. In particular, we encourage groups to carry out sensitivity studies in which the NE Atlantic-Arctic gateway to the east of Greenland is closed. This is because there is evidence that a short, transient period of approximately kilometre-scale tectonic uplift of NW Europe and Greenland, associated with the North Atlantic Large Igneous Province, severely restricted the NE Atlantic-Arctic oceanic gateway during the PETM period in comparison with the pre-PETM and EECO periods (Hartley et al., 2011;Jones and White, 2003;Maclennan and Jones, 2006;Saunders et al., 2007).

Sensitivity to astronomical parameters
Evidence of cyclicity during the Paleocene and early Eocene indicates that a component of the warmth of the PETM may be astronomically forced (Lourens et al., 2005;Westerhold et al., 2007;Galeotti et al., 2010). As such, we encourage sensitivity studies of astronomical configuration. As the standard DeepMIP palaeoclimate simulations are configured with a modern orbit, which has relatively low eccentricity, we suggest groups carry out additional simulations with high eccentricity (e = 0.054 compared with a modern value of e = 0.017), with Northern Hemispheric winter corresponding with both aphelion and perihelion.
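To give a sense of scale for the suggested eccentricity change, the sketch below compares top-of-atmosphere insolation at perihelion and aphelion. It uses only the inverse-square dependence of insolation on the Earth-Sun distance, with perihelion and aphelion distances of a(1 − e) and a(1 + e); the function name is hypothetical.

```python
def perihelion_to_aphelion_ratio(eccentricity):
    """Ratio of insolation at perihelion to that at aphelion.

    Insolation scales as 1/r**2, with r = a(1 - e) at perihelion and
    r = a(1 + e) at aphelion, so the semi-major axis a cancels.
    """
    return ((1.0 + eccentricity) / (1.0 - eccentricity)) ** 2

modern = perihelion_to_aphelion_ratio(0.017)  # modern eccentricity from the text
high = perihelion_to_aphelion_ratio(0.054)    # suggested high-eccentricity case

print(f"modern orbit:      {modern:.3f}")
print(f"high eccentricity: {high:.3f}")
```

The contrast between perihelion and aphelion insolation is roughly 7 % for the modern orbit and roughly 24 % for the high-eccentricity case, which is why the season in which perihelion falls matters for the two suggested configurations.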

Sensitivity to vegetation
Those groups which have a model that includes dynamic vegetation may carry out sensitivity studies with dynamic vegetation turned on. The initial condition should be broadleaf or needleleaf trees at all locations. Ideally groups would carry out these simulations at the same three CO2 levels as in the standard simulations, but if groups can only carry out a limited number of simulations with the dynamic vegetation, they should carry them out in the following order of priority (highest priority first): 3×, 6×, 12×. Groups with models that include a dynamic vegetation component can choose to pass to their vegetation model either the ambient atmospheric CO2 or a lower concentration if required for model stability.

Sensitivity to solar constant
Groups may wish to explore the relative radiative forcing of the solar luminosity compared with other forcings, by carrying out an Eocene simulation with a reduced solar luminosity. The suggested reduction is 0.43 % (Gough, 1981), which would normally be from 1361.0 W m−2 in the modern to 1355.15 W m−2 in the Eocene. This would typically be carried out at a CO2 level of 3×.
Geosci. Model Dev., 10, 889-901, 2017, www.geosci-model-dev.net/10/889/2017/

Sensitivity to non-CO2 greenhouse gases
Groups may choose to explore sensitivity to non-CO2 greenhouse gases (see Sect. 4.2.3 for discussion of CH4), in particular if these can be predicted by the model interactively.

Sensitivity to initialization
We encourage groups to carry out sensitivity studies to the initialization of the ocean temperature and salinity. It is possible that models will exhibit bistability with respect to initial condition, and as discussed in Sect. 4.2.6 we expect that the equilibration time will be a function of the initial conditions and will be different for different models.

"Best in show"
Participants are invited to carry out simulations in which they attempt to best-match existing climate proxy data. This may be done in a number of ways, for example by modifying the aerosols (Huber and Caballero, 2011), cloud properties, physics parameters, using very high CO2 (Huber and Caballero, 2011), incorporating dynamic vegetation (Loptson et al., 2014), modifying gateways (Roberts et al., 2009), modifying orbital configuration, including non-CO2 greenhouse gases, or a combination of the above and other modifications.

Climate proxies
A major focus of DeepMIP will be to develop a new synthesis of climate proxy data for the latest Paleocene and early Eocene, focussing on the three targeted time intervals: pre-PETM, PETM, and EECO. The main focus of DeepMIP will be on temperature and precipitation proxies. Two working groups have been set up to compile these data from marine and terrestrial records. These groups will also work together to generate new data sets for poorly documented regions, such as the tropics, and will seek multiple lines of evidence for climate reconstructions wherever possible. The marine working group is excited by the possibility of using innovative analytical techniques (e.g. Kozdon et al., 2013) to recover robust estimates for sea-surface temperature from planktic foraminiferal assemblages within legacy sediment cores of the International Ocean Discovery Program. Published data sets will be combined into an open-access online database. The EECO and PETM or pre-PETM marine compilations of Lunt et al. (2012), Hollis et al. (2012), and Dunkley Jones et al. (2013), and EECO terrestrial compilations of Huber and Caballero (2011) provide a starting point for this database. One of the great challenges for these working groups will be to develop new ways to assess climate proxy reliability and quantify uncertainties. In some cases, it may be more straightforward to consider relative changes in proxies rather than report absolute values. Climate proxy system modelling (Evans et al., 2013) coupled with Bayesian analysis (e.g. Khider et al., 2015;Tierney and Tingley, 2014) has great potential for improving estimation of uncertainties and directly linking our climate proxy compilation with the climate simulations. In addition to these quantitative estimates of uncertainty, all data will be qualitatively assessed based on expert opinion, for example by characterizing proxies as high, medium, or low confidence (as has been done in PlioMIP, see Dowsett et al., 2012). 
We anticipate a companion paper to this one in which we will give more details of the DeepMIP data and associated protocols.

Products
In addition to this experimental design paper, and papers describing the new climate proxy syntheses, once the model simulations are complete we anticipate the production of overarching papers describing the "large-scale features" of the model simulations, and model-data comparisons. Following this, we anticipate a number of spin-off papers looking at various other aspects of the model simulations (e.g. ENSO, ocean circulation, monsoons). In particular we expect papers that explore the relevance of the DeepMIP simulations and climate proxy syntheses for future climate, for example through model developments that arise as a result of the model-data comparison, or emergent constraints (Bracegirdle and Stephenson, 2013) on global-scale metrics such as climate sensitivity. Furthermore, we will encourage modelling participants to publish individual papers that describe their own simulations in detail, including how the boundary conditions were implemented. In this respect, we are basing our dissemination strategy on that of PlioMIP (Haywood et al., 2013); see their Special Issue at http://www.geosci-model-dev.net/special_issue5.html.

Data availability
The boundary conditions for the standard DeepMIP palaeoclimate simulations are supplied in the Supplement of H14 (Herold et al., 2014); see Table 2. For availability of boundary conditions for DeepMIP sensitivity studies, also see Table 2. Data held in both the CMIP6 and DeepMIP Model databases, when these are operational, will likely be freely accessible through data portals after registration. As stated in Sect. 4.2.8, we strongly recommend that model output is uploaded to the CMIP6 database. If the CMIP6 database cannot be used, the variables in Tables A1-A3 should be submitted to the DeepMIP Model Database, which will be set up if and when required. Climatological averages of the final 100 years of the simulation should be supplied for each month (12 fields for each variable). In addition, for the highest priority variables, all months of the simulation should be supplied.
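The monthly climatology requested above (12 fields per variable from the final 100 years) can be computed as in the following sketch. It assumes the input is a series of consecutive monthly means whose length is a whole number of years starting in January; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def monthly_climatology(monthly, n_years=100):
    """Average each calendar month over the final n_years of a simulation.

    monthly: array of shape (n_months, ...) of consecutive monthly means.
    ASSUMPTION: the series starts in January and covers whole years.
    Returns an array of shape (12, ...).
    """
    monthly = np.asarray(monthly, dtype=float)
    n_months = monthly.shape[0]
    if n_months % 12 != 0 or n_months < 12 * n_years:
        raise ValueError("need whole years and at least n_years of data")
    last = monthly[-12 * n_years:]
    # Reshape to (year, month, ...) and average over the year axis.
    return last.reshape(n_years, 12, *monthly.shape[1:]).mean(axis=0)

# Synthetic example: 150 years of a repeating seasonal cycle at 2 grid points.
data = np.tile(np.arange(12.0), 150)[:, None] * np.ones((1, 2))
clim = monthly_climatology(data)
print(clim.shape)
```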
Furthermore, as many groups are interested in hydrological extremes, groups should aim to produce 10 years of hourly precipitation, evaporation, and runoff data.
Author contributions. A first draft of this paper was written by Dan Lunt and Matt Huber. It was subsequently edited based on discussions at a DeepMIP meeting in January 2016 at NCAR, Boulder, Colorado, USA, and following further email discussions with the DeepMIP community. All authors contributed at the meeting and/or in the subsequent email discussions.