Climate SPHINX: evaluating the impact of resolution and stochastic physics parameterisations in climate simulations


Introduction
The simulation and prediction of climate is one of the scientific and computational grand challenges. In order to make quantitative projections of future climate, it is necessary to use climate models that simulate all the important processes governing the evolution of the climate system. Over the past few decades, climate models have developed considerably - increasing both in complexity and resolution - as computational power has increased. Yet there is a notable difference in the horizontal resolution

2 The EC-Earth Global Climate Model
We use version 3.1 of the state-of-the-art atmosphere-ocean Earth System Model EC-Earth (Hazeleger et al., 2010, 2012). EC-Earth is based on the Integrated Forecast System (IFS, cycle 36r4) (ECMWF, 2009) atmospheric circulation model, developed by ECMWF. In order to represent land surface dynamics, IFS integrates the Hydrology Tiled ECMWF Scheme of Surface Exchanges over Land (H-TESSEL) land surface scheme (Balsamo et al., 2009).
In coupled mode, the NEMO 3.3.1 oceanic circulation model (Madec, 2008), which includes the LIM3 sea ice model (Vancoppenolle et al., 2012), is used. The atmospheric and oceanic components are coupled through OASIS3 (Valcke, 2013).

The IFS atmospheric model
IFS is the current atmospheric model developed at ECMWF and, given the predominance of atmosphere-only simulations run within SPHINX, it is the core of the project. The version that is part of EC-Earth is derived from cycle 36r4 and has been tuned and improved for climate purposes by the EC-Earth Consortium. IFS uses a combination of spectral and reduced Gaussian grids (where, in the latter, the number of longitudinal grid points decreases towards the poles). Physical parameterisations and advection are computed on the reduced Gaussian grid and then, using the spectral transform, semi-implicit time stepping is performed in spectral space.
Traditionally, the spectral harmonic at which truncation occurs defines the horizontal resolution: IFS uses a linear triangular truncation, for which retaining N harmonics corresponds to 2(N+1) grid points in longitude along the Equator. At a resolution of T255, this means that post-processed output will have 512x256 grid points on a regular Gaussian grid, which corresponds to a resolution of about 80 km at the equator.
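As a concrete illustration of the truncation-to-grid correspondence, the numbers for each SPHINX resolution can be reproduced with a few lines (a sketch; the equatorial circumference value is the only assumption):

```python
# Linear triangular truncation TN: the post-processed regular Gaussian
# grid has 2(N+1) longitudes (and half as many latitudes), so the grid
# spacing at the Equator follows directly.

EQUATOR_KM = 40075.0  # Earth's equatorial circumference

def gaussian_grid(n):
    """Grid size and approximate equatorial spacing for truncation TN."""
    nlon = 2 * (n + 1)
    nlat = nlon // 2
    return nlon, nlat, EQUATOR_KM / nlon

for trunc in (159, 255, 511, 799, 1279):
    nlon, nlat, dx = gaussian_grid(trunc)
    print(f"T{trunc}: {nlon}x{nlat}, ~{dx:.0f} km")
# T255 gives 512x256 and ~78 km (the "about 80 km" quoted above);
# T1279 gives 2560x1280 and ~16 km.
```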
For Climate SPHINX, the horizontal resolutions T159, T255, T511, T799 and T1279 have been explored. In contrast, the number of vertical levels is fixed at 91 in all configurations: these are hybrid levels with the last full level at 0.01 hPa. A description of the main parameterisation schemes can be found in Beljaars et al. (2004): the parameterisations are in general independent of resolution, with the only exception of the convective adjustment time, which decreases with increasing resolution as reported in Table 1.

The stochastic physics parameterisation
We consider two complementary approaches to stochastic parameterisation, both developed at ECMWF for use in the IFS. The first is the Stochastically Perturbed Parameterisation Tendencies (SPPT) scheme (Palmer et al., 2009), which uses multiplicative noise to represent model uncertainty due to the parameterisation process:

∂X/∂t = D + K + (1 + μe) Σᵢ Pᵢ

where ∂X/∂t is the modelled total tendency in X. This is the sum of D, the dynamical tendency, K, the horizontal diffusion, and each Pᵢ term, with Pᵢ being the tendency from the ith physics scheme. The zero-mean random perturbation, e, is constant in the vertical, but μ tapers the perturbation to zero close to the surface and in the stratosphere. The scheme acts on the tendencies of the physical fields (i.e. temperature, winds and specific humidity) resulting from the five main parameterisation schemes: radiation, turbulence and gravity wave drag, non-orographic gravity wave drag, convection, and large-scale water processes.
All variables are perturbed with the same random number.
The perturbation, e, is generated using a spectral pattern generator (Berner et al., 2009), ensuring that it varies smoothly in space, while the patterns evolve in time following an AR(1) process. The perturbation at each timestep is the sum of three independent random fields which represent uncertainties on different temporal and spatial scales. The fields have horizontal correlations of 500 km, 1000 km and 2000 km and temporal decorrelations of 6 h, 3 days and 30 days respectively, with associated standard deviations of 0.52, 0.18 and 0.06. While the smallest scale (500 km and 6 h) dominates on weather forecasting timescales, it is expected that the larger scales will be important on climate timescales.
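As a minimal sketch of this pattern evolution (not the IFS implementation), each coefficient of the perturbation can be advanced as an AR(1) process with the decorrelation times and standard deviations quoted above; the 3-hourly update interval below is an assumption for illustration:

```python
import math
import random

def ar1_step(x, tau, sigma, dt, rng):
    """One AR(1) update with decorrelation time tau and stationary
    standard deviation sigma."""
    phi = math.exp(-dt / tau)
    return phi * x + sigma * math.sqrt(1.0 - phi * phi) * rng.gauss(0.0, 1.0)

rng = random.Random(0)
dt = 3 * 3600.0  # an illustrative 3-hourly update (seconds)
# (decorrelation time, standard deviation) for the three scales in the text;
# the 500/1000/2000 km spatial correlations would determine which
# spherical-harmonic coefficients carry the variance.
scales = [(6 * 3600.0, 0.52), (3 * 86400.0, 0.18), (30 * 86400.0, 0.06)]

x = [0.0, 0.0, 0.0]
for _ in range(1000):
    x = [ar1_step(xi, tau, sig, dt, rng) for xi, (tau, sig) in zip(x, scales)]

perturbation = sum(x)  # the total pattern is the sum of the three scales
```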
In contrast, the Stochastic Kinetic Energy Backscatter (SKEB) scheme aims to represent a physical process that is otherwise absent from the model (Shutts, 2005; Palmer et al., 2009). Kinetic energy loss is common in models due to numerical integration schemes and the parameterisation process (Berner et al., 2009). To counteract this, the SKEB scheme represents upscale transfer of kinetic energy by randomly perturbing the streamfunction.
Similar to SPPT, the SKEB scheme uses a spectral pattern generator to produce a spatially and temporally correlated perturbation field which is added at each timestep to the deterministic streamfunction tendency:

ψ_p = ψ_dyn + f

where ψ_p is the streamfunction tendency after the perturbation, ψ_dyn is the tendency before the perturbation and f is the additive perturbation field. The perturbation field is expressed in spherical harmonics, and each coefficient is evolved separately in time following an AR(1) process. A tuning parameter for the SKEB scheme, the backscatter ratio, is set to increase with increasing resolution (see Table 1) in order to improve the slope of the kinetic energy spectrum.
Hereafter, SPHINX simulations where the stochastic parameterisations are operational are referred to as "stochastic" runs, while simulations where the schemes are deactivated are referred to as "deterministic" runs.

Model tuning
With respect to the previous version (v3.0.1), EC-Earth 3.1 shows a reduced radiative imbalance and an improved hydrological cycle. However, it still exhibits a cold bias in both its atmosphere-only and coupled configurations and a small imbalance in precipitation minus evaporation (P-E). In a "present day" AMIP configuration, the 3.1 version is too cold, extracting heat from the underlying sea surface temperatures (SST) by about 1.5 W/m² and showing unrealistically high values for net SW and LW fluxes at TOA (around 243-244 W/m²). Thus the first goal of the tuning was to provide reasonable radiative fluxes at TOA and at the surface for the standard deterministic version of the model (T255, see the next paragraph for a further description of the configurations adopted).
To improve the radiation budget, some of the convection and microphysical parameters from a more recent version of IFS (cy40r1) were adopted. In addition, two standard tuning parameters have been modified (see Mauritsen et al., 2012). The optimal choice of tuning parameters provides reasonable fluxes at the TOA (around 240 W/m²) and a positive flux at the surface of about 0.6 W/m², in accordance with the best estimates from observations (Wild et al., 2013). It is important to note that the tuning of the radiative fluxes has been carried out only for the T255 deterministic model version.
Conversely, the stochastic simulations required further tuning: the standard SPPT and SKEB schemes were designed for use on NWP timescales. When implemented in a climate model, the SPPT scheme in particular showed a strong imbalance involving the water cycle, with a negative P-E that was ten times larger than in the deterministic model, associated with an anomalous latent heat flux and a negative net surface flux of about 2 W/m². This was mainly because the SPPT scheme was not designed specifically to conserve water vapour, leading to a water vapour sink in the atmosphere. A fix has been implemented, requiring that the global average of the tendencies (i.e. winds, temperature and, more importantly, specific humidity) be the same before and after the SPPT perturbation. The new scheme removes the imbalance in P-E, which is now equal to that in the run where the SPPT scheme is disabled. However, the SPPT scheme leads to a negative bias in the surface heat fluxes of about 0.8 W/m², likely caused by a different distribution of the clouds.
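The conservation fix can be sketched as follows (illustrative only: a uniform additive correction restoring the area-weighted global mean, which may differ in detail from the actual IFS implementation):

```python
import numpy as np

def conserve_global_mean(tendency, perturbed, area_weights):
    """Shift `perturbed` so its area-weighted global mean matches
    that of the unperturbed `tendency`."""
    w = area_weights / area_weights.sum()
    deficit = np.sum(w * tendency) - np.sum(w * perturbed)
    return perturbed + deficit

rng = np.random.default_rng(0)
tend = rng.normal(size=96)                # unperturbed physics tendency
noise = 1.0 + 0.52 * rng.normal(size=96)  # multiplicative SPPT factor
pert = conserve_global_mean(tend, noise * tend, np.ones(96))

# global means now agree, so no spurious global moisture source/sink
assert abs(pert.mean() - tend.mean()) < 1e-10
```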
The main radiative fluxes resulting after the complete tuning procedure are reported in Table 2. As shown in this table, the model shows larger TOA SW and LW fluxes with increasing resolution (and with stochastic physics). Net surface fluxes are highly variable, with higher values at coarser resolutions.
Further tuning has been performed in order to produce a realistic Quasi-Biennial Oscillation (QBO) at higher resolutions.
The EC-Earth 3.1 non-orographic gravity wave scheme is characterised by a momentum flux that is continuously launched in the mid-troposphere to simulate the effect of gravity waves. The latitudinal profile of this momentum flux governs the correct parameterisation of gravity waves: too high an amplitude of the momentum flux will disturb the QBO in equatorial zones, particularly at high resolutions, while too low a value will lead to unrealistic eddy-driven jets, especially in the Southern Hemisphere, where orographically induced wave drag is low. With the current latitudinal profile, the QBO was simulated only at the standard resolution (T255 with 91 vertical levels). Following advice from ECMWF staff, a resolution-dependent parameterisation of non-orographic gravity wave drag replaced the parameterisation present in EC-Earth 3.1. Namely, instead of using a low average momentum flux (GFLUXLAUN=0.02) with a positive Gaussian peak at 50°S, we use a higher value (GFLUXLAUN=0.0375) which is reduced with a Gaussian shape at the Equator. This equatorial reduction is slightly deeper for stochastic runs than for deterministic simulations, to compensate for the effect of the stochastic noise. The average value of the momentum flux is further reduced with increasing resolution (starting from T799), according to the ECMWF specification for IFS cy40r1 (see Table 1).
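The shape of the launch-flux profile can be illustrated schematically. The text fixes the mean value (GFLUXLAUN=0.0375) and the Gaussian reduction at the Equator, but the depth and width of the dip below are placeholder values, not the EC-Earth settings:

```python
import math

def launch_flux(lat_deg, mean_flux=0.0375, dip_depth=0.5, dip_width=15.0):
    """Non-orographic GW launch momentum flux vs latitude: a constant
    background reduced by a Gaussian dip centred on the Equator.
    dip_depth and dip_width are illustrative placeholders."""
    dip = dip_depth * math.exp(-(lat_deg / dip_width) ** 2)
    return mean_flux * (1.0 - dip)

# the flux is lowest at the Equator and recovers towards the poles,
# protecting the QBO without weakening the extratropical drag
assert launch_flux(0.0) < launch_flux(50.0)
```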
The new non-orographic gravity wave scheme now allows the simulation of the QBO at all the resolutions explored in Climate SPHINX, without deteriorating the jet streams. Given these positive results, the new parameterisation will be implemented in the upcoming EC-Earth 3.2 version.
3 Science configuration: the SPHINX v1.0 protocol

The following sections describe the scientific configuration, including the simulations performed, the initial and boundary conditions, and the SST and SIC used, which together define the SPHINX v1.0 protocol.
PDA and FSA are atmosphere-only simulations: 20 ensemble members are run at T159, 20 at T255, 12 at T511, six at T799 and two at T1279 for both PDA and FSA experiments. For each resolution, half of the ensemble members have the stochastic physics parameterisations activated. The number of ensemble members run and their resolutions are also reported in Table 1.
The atmosphere-only experiments extend for 30 consecutive years, from 1979 up to 2008 for PDA, while FSA experiments are run from 2039 up to 2068.
PFC simulations are run with IFS in the T255L91 configuration, coupled with NEMO using the ORCA1 grid (a tripolar grid with a resolution of 1° longitudinally and a refinement to 1/3° at the Equator) with 46 vertical levels. Six ensemble members are run, three with the stochastic parameterisations active and three control members, from 1850 up to 2100.

Initial conditions
The Initial Conditions (ICs) for both the PDA and FSA experiments are taken from the ECMWF ERA-Interim Reanalysis for 01/01/1979. A first experiment is run at each resolution for a few days. The ICs for the different ensemble members are extracted using the midnight values (00:00) of the first 10 days, and then reassigned to the 1st of January.
The same ICs are also used for FSA: in order to account for the land-surface adjustment to the new forcing, a 1-year spin-up has been carried out for FSA (which therefore starts from 2038).
For PFC simulations, given the different expected climatologies of integrations with/without stochastic physics, two 320-year spin-ups are carried out in coupled mode to equilibrate the ocean to the atmospheric forcing. Having spun up, three oceanic states - from spin-up years 300, 310 and 320 - are coupled with three different atmospheric ICs: these are run in coupled mode for a further ten years with fixed greenhouse gas (GHG) forcing for the year 1850. In this way the phase-space distance between the simulations is 20 years and the atmosphere and land surface have had enough time to adjust to the new oceanic state.

Forcing and boundary conditions
Greenhouse gases (GHGs) and ozone concentrations, as well as volcanic aerosol, are set according to the CMIP5 historical forcing for PDA experiments. For the FSA experiments, the RCP8.5 scenario is used. PFC simulations use the historical CMIP5

Present-day SST and SIC
Given that both FSA and PDA simulations are atmosphere-only runs, a special effort has been made to provide reliable SSTs in order to fully exploit the high resolution.
For PDA, SSTs have been obtained from the daily SST and sea ice concentration (SIC) dataset HadISST2.1.1, a pentad-based dataset with a resolution of 0.25° x 0.25° for SSTs (Kennedy et al., 2016) and 1° x 1° for SIC (Titchner and Rayner, 2014). These are bilinearly interpolated onto the required reduced Gaussian grid for each resolution: climatologies of SST and SIC for the 1979-2008 period are shown in the upper panels of Figure 1.
A number of inconsistencies are found between the land-sea masks of IFS and of HadISST2.1.1: these are due to slightly different coastlines and a different representation of lakes. For the different coastlines, linear extrapolation from HadISST2.1.1 has been performed. For the interior (i.e. lakes), a methodology similar to the one used in the ERA20CM dataset (Hersbach et al., 2015) has been adopted: one-month-lagged 2m temperatures from the ERA-Interim monthly climatology of 1979-2008 are used as SST. Where the temperature is below zero, SIC is set to one; otherwise it is left at zero. This is interpolated in time on a daily basis and in space onto the needed grid to create a smoothed seasonal cycle for lakes.
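The lake procedure can be sketched as follows (a simplification: monthly values only, with the daily interpolation step omitted; the climatology numbers are invented for illustration):

```python
# SST for an interior (lake) point in month m is taken from the
# 2m-temperature climatology of month m-1, and SIC is set to 1 where
# that lagged temperature is below freezing.

T_FREEZE_C = 0.0

def lake_sst_sic(t2m_clim_c, month):
    """t2m_clim_c: 12 monthly-mean 2m temperatures (deg C), Jan..Dec.
    Returns (sst, sic) for 1-based `month`, using the lagged month."""
    lagged = t2m_clim_c[(month - 2) % 12]  # previous month, Jan wraps to Dec
    sic = 1.0 if lagged < T_FREEZE_C else 0.0
    return lagged, sic

clim = [-5, -3, 2, 8, 14, 18, 21, 20, 15, 8, 2, -3]  # an illustrative lake
print(lake_sst_sic(clim, 1))  # January uses December: (-3, 1.0)
```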
For all PDA simulations, well mixed greenhouse gases, stratospheric ozone and volcanic aerosol concentrations have been set according to the historical scenario of the CMIP5 protocol (Taylor et al., 2012).

Future scenario SST and SIC
The creation of SST and SIC for the FSA experiment is more complex. We would like to account for the high-resolution However, this reconstruction misses an important element: there is no information on the sea ice cover in the future. To account for this, we took data from the CMIP5 EC-Earth simulations as a reference for SIC. CMIP5 EC-Earth simulations show a considerable cold bias in SST with respect to HadISST2.1.1, but they show good ice coverage, especially for the Northern Hemisphere (see average NH and SH SIC in Figure 3).
Considering that an ensemble mean would be unrealistic, especially for a field with large spatial variance such as sea ice, we select a single ensemble member representative of the ensemble. Member "r8i1p1" has been chosen to characterise the ensemble, since its climatology shows the smallest SIC Root Mean Square Error (RMSE) when compared to the ensemble-mean climatology in the time window 2038-2068.
As a last step, we must evaluate SST for points where SIC coverage has disappeared in the future scenario. The lack of information about historical SST under sea ice results in undefined SST at these points when using the methodology outlined above.

OpenMP/shared-memory parallelisation has also been tested, without showing any significant computational benefit.
An accurate scaling analysis of the performance was carried out during the first months of the simulations. However, a conservative choice was made, considering that the walltime needed to run the simulations was not the main concern for the project's success. The number of cores assigned to each experiment was selected according to the resolution of the model considered. Although stochastic physics experiments showed about a 5-10% decrease in performance (depending on resolution), the same number of cores was retained.
In summer 2015 the new SuperMUC-II platform, based on Haswell Xeon E5-2697 v3 processors, was made available by the LRZ. The new machine granted a reduction of about 5% in the total core hours used, without affecting the walltime. About 75% of the simulations have been run using the Haswell nodes. Details on the processor decomposition, computational costs and data outputs are reported in Table 3.

Data output and postprocessing
In Climate SPHINX, IFS has been set up to provide output in GRIB format every 3 hours; however, the four T1279 simulations alone sum up to about 200 TB of raw data output. Summing together the restart files and the output of all experiments, the total amount of space occupied at the peak of the project (February 2016) reached about 1 PB. In order to reduce the size of the output and to increase data accessibility for a larger audience, automatic post-processing routines have been implemented.
At the end of each simulation leg, a post-processing script is launched: the script handles both the spectral and reduced

Results overview
In this section we present a brief overview of the preliminary findings of the Climate SPHINX project. Considering that the number of aspects of climate that could be analyzed in such a large dataset is extremely large, we decided to focus on a few selected features of climate variability. Indeed, larger benefits from increasing resolution and using stochastic physics are expected in terms of variability rather than in terms of the mean state. More specifically, we investigate the improvements and/or deteriorations, following resolution increases and the inclusion of stochastic parameterisations, of three different phenomena in the present-day climate (i.e. in PDA experiments): the distribution of the intensity of tropical rainfall, the tropical variability related to the Madden-Julian Oscillation and the mid-latitude variability associated with atmospheric blocking.

Tropical rainfall variability
Climate models generally have too little tropical variability on timescales of several days (e.g. Hung et al., 2013). One aspect of the variability of particular interest is the occurrence of heavy precipitation events, which can result in flooding, affect disease incidence and reduce crop yields (IPCC, 2015). Changes in the frequency of these events can also affect trends in total precipitation due to non-linearity in land surface processes (Saeed et al., 2013).
Figure 5a shows the frequency distribution of daily-mean precipitation rates averaged over 2.5° x 2.5° gridboxes between 10°S-10°N over the period 1998-2008, in data from the Global Precipitation Climatology Project (GPCP) (Huffman et al., 2001), the Tropical Rainfall Measuring Mission (TRMM) 3B42 Version 7 product (Huffman et al., 2007) and one ensemble member for each PDA run. The results are very similar in separate subperiods, so sampling variability is small. At all resolutions, rain rates below 15 mm/day occur too often in the model data and rain rates between 25-60 mm/day occur too infrequently compared to both observational datasets. Figure 5b shows the frequency in each rain rate interval as a fraction of that in GPCP for each resolution. At rain rates near 30 mm/day, the simulated frequencies are between about 35-50% of the frequency in GPCP. At higher rain rates, the frequency differences between TRMM and GPCP become comparable in size to, or larger than, the differences between the modelled frequencies and the observational datasets. We do not know of a reason to strongly prefer one dataset over the other, so we consider the model bias to be uncertain at these rain rates.
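The diagnostic can be sketched as follows (synthetic gamma-distributed rain rates stand in for the real GPCP/TRMM and model data; only the binning and ratio logic reflects the analysis above):

```python
import numpy as np

def rain_rate_frequencies(rates, bin_width=5.0, max_rate=100.0):
    """Relative frequency of daily rain rates in fixed-width bins."""
    edges = np.arange(0.0, max_rate + bin_width, bin_width)
    counts, _ = np.histogram(rates, bins=edges)
    return counts / counts.sum()

rng = np.random.default_rng(0)
obs = rng.gamma(shape=0.7, scale=8.0, size=20000)    # stand-in for GPCP
model = rng.gamma(shape=0.7, scale=6.0, size=20000)  # drizzles too much

f_obs = rain_rate_frequencies(obs)
f_model = rain_rate_frequencies(model)
# model frequency as a fraction of the "observed" one, per bin
ratio = np.where(f_obs > 0, f_model / np.maximum(f_obs, 1e-12), np.nan)
```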
The frequency of rain rates above 30 mm/day in the T159 and T255 models is below about 40% of that in GPCP. At T511, T799 and T1279 the relative frequency difference compared to GPCP and TRMM decreases as the rain rate increases, and becomes comparable to that in GPCP in the 60-65 mm/day interval, though still much smaller than that in TRMM. Therefore increasing the model resolution from T159 to T511 improves the simulated frequency of heavy rainfall events compared to observational datasets, with the further improvements from increasing the resolution to T799 or T1279 being considerably smaller.
Figures 5c,d show the same data for the stochastic PDA runs.Stochastic physics has a similar effect at all resolutions.
Frequencies of rain rates between 5-15 mm/day are reduced by about 10% compared to those in the deterministic models, reducing the model bias. Frequencies above ∼20 mm/day are substantially increased, by a larger factor at larger rain rates, up to a factor of ∼2.5 at rain rates around 60 mm/day. This reduces the difference from GPCP and TRMM up to rain rates of 45 mm/day at all resolutions.
The higher resolution stochastic models have rain rate frequencies between those of GPCP and TRMM at rates above 45 mm/day, so they seem consistent with the observations given the observational uncertainty.The T255 stochastic model has rain rate frequencies closer to those in GPCP than any of the deterministic models in all but two of the 5 mm/day rain rate intervals shown.One hypothesis to explain this effect is that the stochastic perturbations sometimes reduce the amount of water vapour converted into rain, so that more is available for producing heavy rainfall events at later times.
Therefore stochastic physics brings this aspect of the simulations into better agreement with observations, suggesting that including a representation of unresolved variability and model error is important for simulating the statistics of extreme tropical precipitation events.

The Madden-Julian Oscillation variability
The Madden-Julian Oscillation (MJO) is the dominant mode of variability in the tropical region on sub-seasonal timescales (Madden and Julian, 1994). It is characterised by a strong interaction between tropical convection and the large-scale environment, manifest as a coherent eastward-propagating pattern of precipitation followed by subsequent rainfall suppression (Kim et al., 2014; Raymond et al., 2015). It is a challenge for the current generation of global climate and weather models to represent the dynamics and thermodynamics of the MJO realistically (Slingo et al., 1996; Lin et al., 2006; Kim et al., 2009; Sperber et al., 2011; Klingaman et al., 2015).
We here use the Wheeler and Hendon (2004) Real-time Multivariate MJO (RMM) index, based on the two principal components RMM1 and RMM2. The amplitude A of the MJO is defined as A = sqrt(RMM1² + RMM2²), and the phase Φ of the MJO is defined from the angle of the (RMM1, RMM2) vector. MJO occurrence is defined when the MJO amplitude A is > 1. Conventionally, eight phases of the MJO are defined (Gottschalck et al., 2010). We reduce the eight phases to four, corresponding respectively to the MJO being active in the Indian Ocean, the Maritime Continent, the West Pacific and the Western Hemisphere. More importantly, increasing horizontal resolution does not seem to improve the representation of the phenomenon significantly. This may be explained by considering that the simulation of the MJO in GCMs is influenced primarily by the representation of mesoscale dynamics and of convection. The simulation of mesoscale dynamics can be helped by increasing the resolution, while improvements in convection are driven by changes in the physical parameterisations of the model. Yet the coupling between mesoscale dynamics and convection is key for convectively coupled waves in the tropics (Raymond et al., 2015).
Therefore, increasing resolution alone may not be sufficient to improve the simulation of the MJO.
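The amplitude, occurrence and phase-to-region reduction described above can be sketched as follows (assuming the standard Wheeler-Hendon RMM phase convention; the collapse of eight phases to four regions follows the text):

```python
import math

REGIONS = ["Indian Ocean", "Maritime Continent", "West Pacific",
           "Western Hemisphere"]

def mjo_state(rmm1, rmm2):
    """Return (amplitude, active, region) for one day of RMM indices."""
    amplitude = math.hypot(rmm1, rmm2)
    active = amplitude > 1.0
    # Wheeler-Hendon phases 1..8, counted anticlockwise starting from
    # the negative RMM1 axis
    angle = math.atan2(rmm2, rmm1)                       # in (-pi, pi]
    phase = int((angle + math.pi) / (math.pi / 4.0)) % 8 + 1
    # phases 2-3: Indian Ocean, 4-5: Maritime Continent,
    # 6-7: West Pacific, 8-1: Western Hemisphere
    region = REGIONS[((phase - 2) % 8) // 2]
    return amplitude, active, region
```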
Conversely, the stochastic physics parameterisations improve the MJO frequency in all regions and at all resolutions except T1279.
It must be noted that this latter run was done with only one ensemble member compared to the other runs with 3 or more ensemble members over the same period, therefore a sampling error due to natural variability should be considered.Above all, the best results are obtained for the T255 with stochastic physics, suggesting that the tuning of the mean state of the model might play a relevant role for a better MJO simulation.
Additionally, the stochastic physics climate runs show an improvement in the representation of the MJO propagation over the Maritime Continent (not shown). The lack of propagation of the MJO over the Maritime Continent into the West Pacific region is a known problem in GCMs (Zhang et al., 2013). Such improved propagation in stochastic runs indicates either that there is an impact of the stochastic physics on the mean state in the region or that the variability in the region helps maintain the intraseasonal signal. The reasons for the change in MJO representation due to stochastic physics will be explored further in a more detailed future study by the authors.

Mid-latitude atmospheric blocking variability
One of the most important challenges for the current generation of climate models is the simulation of atmospheric blocking (Anstey et al., 2013; Masato et al., 2013; Dunn-Sigouin and Son, 2013). Blocking is a recurrent weather pattern typically occurring at the exit of the Atlantic and Pacific jet streams, more frequently during the winter season but observed throughout the year (Rex, 1950; Tibaldi and Molteni, 1990). It is characterised by a high-pressure, long-lasting low-vorticity anomaly that "blocks" the mid-latitude westerly flow, diverting synoptic disturbances poleward or equatorward (Tyrlis and Hoskins, 2008; Davini et al., 2012). A blocking event can last several days or even weeks, and it may be associated with cold spells in winter and heat waves in summer (Sillmann et al., 2011; Dole et al., 2011).
Blocking is here diagnosed using the simple index introduced by D'Andrea et al. (1998), an extension of the better-known Tibaldi and Molteni (1990) index. This 1-d blocking index detects the reversal of the zonal flow by measuring the geopotential height gradient at 500 hPa around 60°N, providing a binary blocking timeseries for each longitude. Given that in this specific diagnostic no statistically significant difference emerges when comparing deterministic and stochastic simulations, the two sets of simulations are combined to provide a single ensemble.
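A minimal sketch of this kind of flow-reversal test is shown below, with fixed reference latitudes as in Tibaldi and Molteni (1990); the D'Andrea et al. (1998) variant additionally scans small latitudinal offsets, which is omitted here:

```python
def is_blocked(z500_40n, z500_60n, z500_80n,
               ghgs_threshold=0.0, ghgn_threshold=-10.0):
    """Flow reversal at one longitude: the southern 500 hPa geopotential
    height gradient is reversed (GHGS > 0) while the northern gradient
    stays strongly negative (GHGN < -10 m per degree latitude)."""
    ghgs = (z500_60n - z500_40n) / 20.0   # m per degree latitude
    ghgn = (z500_80n - z500_60n) / 20.0
    return ghgs > ghgs_threshold and ghgn < ghgn_threshold

# a blocked profile: ridge at 60N, strong drop towards the pole
assert is_blocked(5500.0, 5600.0, 5200.0)
# a typical westerly profile: height decreasing poleward everywhere
assert not is_blocked(5700.0, 5500.0, 5300.0)
```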
The upper panel of Figure 7 shows the blocking frequency for the ERA-Interim Reanalysis (black) and the ensemble means of the different horizontal resolutions (colours) of the PDA experiments over the DJF period. The common negative bias over the Atlantic and Pacific basins is clearly evident. Increasing the horizontal resolution leads to benefits over both basins, with marked improvements especially over the Atlantic: here the T799 and T1279 runs show values comparable to the reanalysis.
The largest improvement is however seen upgrading from T255 to T511, where the bias - measured as the relative difference in the blocking frequency averaged between 10°W and 30°E - is reduced from 18% to 3%.
Interestingly, these clear improvements in blocking frequency are reflected in a change in the mean state. A simple way to represent the flow variability is to highlight a few isopleths of geopotential height, as done in the lower panel of Figure 7. Indeed, the higher-resolution models show a strengthened pattern of the dominant Northern Hemisphere planetary waves, with marked ridges over the Rockies and Europe. The ridge over the Rockies in particular (Brayshaw et al., 2009) suggests an important role of orography resolution in the representation of the eddy-driven jet stream and, indirectly, of Euro-Atlantic blocking frequencies.
Indeed, improvements following resolution increase for Atlantic blocking in GCMs have been associated with both improved transient eddy activity -that should sustain the blocking persistence (Shutts, 1983;Berckmans et al., 2013) -and with higher orography variance -which affects the mean state through planetary waves shaping (Jung et al., 2012;Berckmans et al., 2013).
Conversely, Pacific blocking has been shown to be phenomenologically different (Pelly and Hoskins, 2003; Davini et al., 2012) and to be strongly affected by tropical dynamics (e.g. Renwick and Wallace, 1996), therefore it is not surprising that it would be less affected by horizontal resolution changes.
A more detailed analysis of blocking and mid-latitude variability will be carried out by the authors in future studies.

Conclusions
In the present work we have described the scientific configuration and technical setup/tuning of the EC-Earth Earth System Model used for the Climate SPHINX project, which define the SPHINX v1.0 protocol. More than 120 climate simulations have been run, making use of more than 20 million core hours and producing about 140 TB of post-processed data. Climate SPHINX includes both present-day (PDA simulations, 1979-2008) and future scenario (FSA simulations, 2038-2068) atmosphere-only simulations, according respectively to CMIP5 historical and RCP8.5 forcing. These have been run at five different horizontal resolutions. Together with similar endeavours such as that of Mizielinski et al. (2014) or ATHENA (Kinter III et al., 2013), this demonstrates the ability of the climate community to exploit the most recent HPC machines.
Details on the tuning procedure (aimed at providing a correct radiation budget in the standard T255 configuration) have been presented. Moreover, a comprehensive description of the methodology adopted for the creation of the present-day and future scenario SST and SIC (starting from the HadISST2.1.1 dataset) has been provided. A novel method aimed at estimating SST where SIC has disappeared in future climate simulations has been introduced.
More importantly, Climate SPHINX post-processed outputs are freely accessible to the climate community. This has been made possible by an EUDAT pilot project, which makes available a THREDDS server operational at CINECA from which data can be easily downloaded.
Preliminary results show the importance of both resolution and stochastic perturbations for the representation of climate variability, although different phenomena show different sensitivities. Tropical rainfall variability seems to benefit from both increased horizontal resolution and stochastic parameterisation, whereas the Madden-Julian Oscillation shows improvements only when the stochastic perturbations are added. In general, in the tropics, applying stochastic schemes at low resolution leads to interesting improvements; on the other hand, increasing resolution beyond T511 does not seem to further improve the tropical variability.
Conversely, in the mid-latitudes, where we analyzed atmospheric blocking frequencies, no statistical difference is found between stochastic and deterministic runs. However, increased horizontal resolution seems extremely important here for decreasing the blocking bias: this is true especially over the Euro-Atlantic sector, where the T799 resolution (∼25 km) reduces it to negligible values.
To summarise, the largest improvements are seen upgrading from T255 to T511, whereas only minor improvements are observed at higher resolutions. This may be associated with the absence of specific tuning for both the deterministic and stochastic higher-resolution configurations, which can affect the mean climate and consequently partially degrade the simulated climate variability.
Indeed, such tuning involves not only the surface and TOA radiative fluxes, but also dynamical components of the climate model. Some schemes, e.g. the deep and shallow convection parameterisations, may be satisfactory at coarse resolutions but may perform poorly at finer ones.
Climate SPHINX draws further attention to the long-standing question of whether to increase resolution or to increase ensemble size while keeping the same computing time available (e.g. Buizza et al., 1998). Indeed, running 30 years of one member at T1279 on the SuperMUC Petascale System costs about 1.4 million core hours; with the same amount of time it would be possible to run 9-10 simulations at T511. However, the benefits of the two pathways differ: a single member at 16-km resolution can provide local information at a topographic scale, useful for instance for hydrological models, particularly in areas with complex topography; by contrast, many ensemble members at 40-km resolution can provide a correct assessment of the natural variability, a key element for instance for mid-latitude climate (Deser et al., 2012). However, we must keep in mind that the computational constraints become particularly relevant for coupled simulations, in which the computing time devoted to the oceanic model and, above all, to the spin-up of the coupled system considerably inflates the number of core hours needed. Stochastic physics parameterisations, especially at lower resolution, seem able to provide an interesting alternative for tackling this trade-off, improving model performance without increasing the nominal resolution or the overall computational cost.
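The trade-off above amounts to simple core-hour bookkeeping, sketched below. Only the 1.4 million core-hour figure for one 30-year T1279 member comes from the text; the T511 per-member cost is back-derived from the quoted 9-10 member equivalence, so it should be read as an estimate rather than a measured number.

```python
# Core-hour bookkeeping for the resolution-vs-ensemble-size trade-off.
# Only the 1.4 Mcore-hour cost of one 30-year T1279 member is taken
# from the text; the T511 cost below is back-derived from the quoted
# 9-10 member equivalence and is therefore only an estimate.

T1279_MEMBER_COST = 1.4e6          # core hours for 30 years at T1279

def members_affordable(budget_core_hours, member_cost):
    """How many 30-year members fit in a given core-hour budget."""
    return int(budget_core_hours // member_cost)

# If ~9.5 T511 members cost as much as one T1279 member, one T511
# member costs roughly 147,000 core hours.
t511_member_cost = T1279_MEMBER_COST / 9.5

# Spending the T1279 budget on T511 members instead:
print(members_affordable(T1279_MEMBER_COST, t511_member_cost))  # 9
```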

Data availability
Post-processed data have been transferred from LRZ to CINECA via GridFTP, where they are permanently stored. More importantly, free data access for the climate user community is granted through a dedicated THREDDS web server hosted by CINECA (https://sphinx.hpc.cineca.it/thredds/sphinx.html), where it is possible to browse and directly download SPHINX data. Details on data accessibility and on the Climate SPHINX project itself are available on the project website (http://www.to.isac.cnr.it/sphinx/).
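For programmatic access, THREDDS servers conventionally expose datasets through an OPeNDAP service under a `dodsC/` path. The sketch below builds such a URL from the server root given above; the dataset path in the example is purely hypothetical, and the real catalogue must be browsed to find valid paths.

```python
# Sketch of programmatic access to the SPHINX THREDDS server.  The
# server root is taken from the text; the example dataset path is
# hypothetical and must be replaced with a real path from the catalogue.

THREDDS_ROOT = "https://sphinx.hpc.cineca.it/thredds"

def opendap_url(dataset_path):
    """Build an OPeNDAP access URL for a dataset on a THREDDS server."""
    return f"{THREDDS_ROOT}/dodsC/{dataset_path.lstrip('/')}"

# With xarray one would then open the remote dataset lazily, e.g.:
#   import xarray as xr
#   ds = xr.open_dataset(opendap_url("sphinx/example.nc"))
print(opendap_url("sphinx/example.nc"))
```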
The set-up of this infrastructure for data sharing has been made possible by DATA SPHINX, an EUDAT Data Pilot project, which will allow long-term storage of high-resolution climate model output data and its sharing among a wide scientific user community. DATA SPHINX aims at building a repository serving the climate change impact modelling community, providing selected variables at high temporal and spatial resolution, with a focus on climate extremes and the hydrological cycle in areas with complex orography.
Geosci. Model Dev. Discuss., doi:10.5194/gmd-2016-115, 2016. Manuscript under review for journal Geosci. Model Dev. Published: 23 June 2016. © Author(s) 2016. CC-BY 3.0 License.

The forcing follows the CMIP5 historical specification from year 1850 up to year 2005 included; after that, the forcing is taken from the RCP8.5 scenario. Albedo, land use and vegetation patterns are set using the standard configuration of EC-Earth 3.1, which uses a MODIS-derived fixed climatological seasonal cycle for the snow-free albedo and the Leaf Area Index. The average yearly solar irradiance is set to 1368.2 W/m², with intra-annual variations, following the standard EC-Earth 3.1 setup. All the PDA and FSA simulations use this setup. For the PFC simulations, interannual variations following the CMIP5 prescriptions (i.e. the 11-year solar cycle) have been added.
The SST and SIC changes between FutureHadISST 2.1.1 and HadISST 2.1.1 are reported in Figure 1: as expected, the largest warming and sea ice retreat are seen at the Northern Hemisphere high latitudes. Figure 3 reports the time series and trends for SST (between 45°S and 45°N) and SIC (for both the Northern and Southern Hemisphere) for both FutureHadISST 2.1.1 and HadISST 2.1.1.

4 Technical configuration

4.1 High-Performance Computing details

Simulations have been run on the 6.8-Petaflop SuperMUC IBM Petascale System at LRZ. The initial setup and configuration have been performed on the SuperMUC-I platform, based on Sandy Bridge-EP Xeon E5-2680 8C processors. For processor decomposition, the Message Passing Interface (MPI) parallelism paradigm has been used. EC-Earth also allows for automatic post-processing: Gaussian data from IFS are extracted and the requested variables converted from the default ECMWF format to a user-friendly, CMOR-like format on a regular Gaussian grid. With this automatic procedure, more than 140 TB of post-processed data have been produced. A significant reduction of the data volume was obtained by making use of the NetCDF-4 Zip format (HDF5). Monthly (MON), daily (DAY) and 6-hour (6HRS) data for different subsets of variables have been produced. More than 50 fields have been stored at monthly frequency. In order to further reduce the space requirements, daily and 6-hour 3D fields have been degraded to the spectral resolution of T255. Additional data at 3-hour frequency have been stored for the Euro-CORDEX domain (3HRS-CDX) and for a sub-domain including India, Tibet and Pakistan (3HRS-ITP). Total precipitation has also been saved over the global grid at full resolution with 3-hour frequency (3HRS). Finally, synoptic monthly means have been stored for the main radiative variables (SMON). A few fields that are nonlinear functions of the output
(e.g. specific humidity) have been computed from the original 3-hour output and then averaged at the required frequency, in order to record them accurately. In addition to the atmospheric data, about 10 TB of oceanic output have been stored for the PFC simulations. Data at daily and pentad frequency have been retained. All the data, including raw output, post-processed data and restart files, have been archived on the tape archives of the Tivoli Storage Management (TSM) infrastructure at LRZ.
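The averaging step described above, computing a field at the native 3-hour frequency and only then reducing it to the requested output frequency, can be sketched with NumPy as follows. Shapes and variable names are illustrative, not those of the actual SPHINX post-processing chain.

```python
import numpy as np

# Minimal sketch of the post-processing step described in the text: a
# field is produced at the native 3-hour frequency and only afterwards
# averaged to the requested (here daily) frequency.

def average_to_daily(field_3h):
    """Average a (time, lat, lon) array of 3-hourly values to daily
    means (8 timesteps per day)."""
    nt = field_3h.shape[0]
    assert nt % 8 == 0, "time axis must contain whole days"
    days = nt // 8
    return field_3h.reshape(days, 8, *field_3h.shape[1:]).mean(axis=1)

x = np.arange(16, dtype=float).reshape(16, 1, 1)  # two days, 3-hourly
print(average_to_daily(x).squeeze())  # daily means: 3.5 and 11.5
```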
technique to identify the dominant modes of variability in the zonal winds and Outgoing Longwave Radiation (OLR) in these model runs. Combined Empirical Orthogonal Functions (CEOFs) of intraseasonal OLR, U850 (zonal wind at 850 hPa) and U200 (zonal wind at 200 hPa) are computed for each of the runs. The first two leading modes (RMM1 and RMM2) correspond to MJO signatures in the tropical wind field and OLR. The amplitude A of the MJO is defined as:

A = \sqrt{\mathrm{RMM1}^2 + \mathrm{RMM2}^2}    (3)
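Equation (3) is straightforward to evaluate once the two RMM indices are available; a minimal sketch:

```python
import math

def mjo_amplitude(rmm1, rmm2):
    """MJO amplitude A = sqrt(RMM1^2 + RMM2^2), as in Eq. (3)."""
    return math.hypot(rmm1, rmm2)

def mjo_phase_angle(rmm1, rmm2):
    # Phase angle in the (RMM1, RMM2) plane, in radians.  How this
    # angle is mapped onto the eight conventional MJO phase regions is
    # a convention not specified in the text above.
    return math.atan2(rmm2, rmm1)

print(mjo_amplitude(0.6, 0.8))  # 1.0
```

Days are then classified by this amplitude (the Figure 6 statistics, for instance, use only days with A > 1).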
Figure 6 shows the frequency of occurrence vs. the mean amplitude of the MJO in four different regions around the Tropics for all the different runs (colours) and for the ERA-Interim reanalysis (grey; Dee et al., 2011) over the 1980-2001 period. Overall, the frequency of occurrence of the MJO in the different tropical regions is underestimated at all model resolutions with respect to ERA-Interim. The MJO amplitude in the model simulations is lower than in the reanalysis over the Indian Ocean and the Western Hemisphere.
The simulations have been run at five horizontal resolutions, spanning from 125 km up to 16 km, with several ensemble members. Furthermore, a smaller set of transient coupled simulations (PFC simulations, 1850-2100) at T255-ORCA1 (∼80 km for the atmosphere and about 1° for the ocean) has been run. Each deterministic experiment included in Climate SPHINX has a counterpart in which the unresolved sub-grid scales have been parameterised with two different stochastic physics schemes (namely the SPPT and SKEB schemes). This makes Climate SPHINX the first climate dataset that includes a large number of ensemble members with stochastic parameterisations at different horizontal resolutions.

Figure 2. Scheme representing the methodology adopted to create FutureHadISST 2.1.1. The new dataset is a combination of the detrended daily variability from HadISST 2.1.1, the CMIP5 EC-Earth mean change and the CMIP5 EC-Earth RCP8.5 trend.
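The combination stated in the caption (detrended observed variability + scenario mean change + scenario trend) can be sketched schematically for a single grid point. This is a minimal one-dimensional reading of the method, with entirely synthetic inputs; the real procedure operates on full daily SST fields.

```python
import numpy as np

# Schematic of the FutureHadISST construction as stated in the Figure 2
# caption: detrended daily HadISST variability + CMIP5 EC-Earth mean
# change + CMIP5 EC-Earth RCP8.5 trend.  Inputs here are a synthetic
# 1-D time series at one grid point.

def future_sst(hadisst, mean_change, trend_per_step):
    t = np.arange(hadisst.size, dtype=float)
    # Remove the linear trend from the observed series, keeping the
    # daily variability around it.
    slope, intercept = np.polyfit(t, hadisst, 1)
    variability = hadisst - (slope * t + intercept)
    # Recombine: observed baseline + scenario mean change + scenario
    # trend + observed (detrended) variability.
    return intercept + mean_change + trend_per_step * t + variability

obs = np.array([20.0, 20.3, 19.9, 20.2, 20.1])
fut = future_sst(obs, mean_change=2.0, trend_per_step=0.01)
print(fut.round(2))
```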

Figure 4. Scheme representing the methodology adopted to fill the "bare points", i.e. the points where sea ice has retreated in the CMIP5 EC-Earth RCP8.5 simulation. Each line represents an SST profile from the equator to the pole.

Figure 5. (a) and (c) show the frequency of occurrence of daily-mean rain rates averaged over 2.5° × 2.5° grid boxes between 10°S-10°N in different datasets, in 5 mm/day intervals, with rates below 0.1 mm/day omitted. (a) shows data for GPCP, TRMM and the deterministic SPHINX PDA simulations, and (c) shows the same for the PDA simulations with stochastic physics. Note that the vertical axis is logarithmic. (b) and (d) show the rain rates in each simulation as a fraction of those in GPCP, for the deterministic and stochastic runs respectively. Horizontal dashed lines indicate a fraction of 1, which would correspond to perfect agreement with GPCP. The frequency in (a) and (c) corresponds to that for an individual grid box, if all grid boxes were statistically equivalent. Data are shown for 1998-2008, the time period common to all datasets, and for the first ensemble member of the model datasets only.
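The binning described in the caption above can be sketched with NumPy. The rain-rate sample and the 50 mm/day upper bound below are illustrative; normalising by the full sample size (so that the omitted dry days still count in the denominator) is one reasonable choice, not necessarily the exact one used for the figure.

```python
import numpy as np

# Sketch of the Figure 5 binning: frequency of occurrence of daily-mean
# rain rates in 5 mm/day intervals, omitting rates below 0.1 mm/day.
# Sample values and the 50 mm/day cap are illustrative only.

def rain_rate_frequency(rates_mm_day, bin_width=5.0, max_rate=50.0,
                        wet_threshold=0.1):
    rates = np.asarray(rates_mm_day, dtype=float)
    wet = rates[rates >= wet_threshold]
    edges = np.arange(0.0, max_rate + bin_width, bin_width)
    counts, _ = np.histogram(wet, bins=edges)
    # Normalise by the full sample so omitted dry days still count in
    # the denominator (an assumption, see the lead-in).
    return counts / rates.size, edges

rates = [0.0, 0.05, 2.0, 7.0, 12.0, 3.0, 48.0]
freq, edges = rain_rate_frequency(rates)
print(freq)
```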

Figure 6. MJO frequency of occurrence vs. mean amplitude for the PDA experiments in the four different phases, for days when the MJO amplitude is > 1. The four phases are classified as Indian Ocean, Maritime Continent, West Pacific and Western Hemisphere; their geographical locations are shown by the boxes at the bottom of each panel, with anomalous positive/negative precipitation patterns (green/yellow regions). Colours indicate the ensemble mean of the different resolutions as shown in the legend, where circles are the deterministic runs and diamonds the stochastic runs. ERA-Interim is reported in grey. Statistics are shown for the period 1980-2001.

Figure 7. Upper panel: ensemble-mean blocking frequencies following D'Andrea et al. (1998) for the different PDA experiments. Members of the deterministic and stochastic experiments have been combined for each resolution. ERA-Interim for the 1979-2008 period is shown in black for comparison. Lower panel: DJF climatological mean of the geopotential height at 500 hPa for the ensemble mean of the PDA experiments. Only the 5200, 5300, 5400 and 5500 m isopleths are shown.

Table 1. Resolution-dependent scientific configuration for EC-Earth in the Climate SPHINX experiments. The same number of ensemble members has been run for the PDA and FSA experiments. T255C is the coupled configuration used for the PFC simulations. Resolution is estimated at the Equator. The number of members indicates the deterministic and stochastic members. The backscatter ratio (tuning parameter for the SKEB stochastic scheme), convective adjustment time (tuning parameter for deep convection) and momentum launch (tuning parameter for non-orographic gravity waves) are unitless.

Table 2. Radiative fluxes, expressed in W/m², for the reference experiment (i.e. the first simulation run) at the different resolutions for the PDA simulations. D stands for deterministic simulation, S for stochastic. The fluxes have been tuned for T255D.

Table 3. Resolution-dependent technical details for EC-Earth in the Climate SPHINX experiments. T255C is the coupled configuration used for the PFC simulations. Walltime has been measured on the SuperMUC-II Haswell platform and is evaluated for the deterministic simulations; the walltime of the stochastic simulations is about 5% higher.