Historical greenhouse gas concentrations for climate modelling (CMIP6)

Abstract. Atmospheric greenhouse gas (GHG) concentrations are at unprecedented, record-high levels compared to the last 800 000 years. Those elevated GHG concentrations warm the planet and – partially offset by net cooling effects by aerosols – are largely responsible for the observed warming over the past 150 years. An accurate representation of GHG concentrations is hence important to understand and model recent climate change. So far, community efforts to create composite datasets of GHG concentrations with seasonal and latitudinal information have focused on marine boundary layer conditions and recent trends since the 1980s. Here, we provide consolidated datasets of historical atmospheric concentrations (mole fractions) of 43 GHGs to be used in the Climate Model Intercomparison Project – Phase 6 (CMIP6) experiments. The presented datasets are based on AGAGE and NOAA networks, firn and ice core data, and archived air data, and a large set of published studies. In contrast to previous intercomparisons, the new datasets are latitudinally resolved and include seasonality. We focus on the period 1850–2014 for historical CMIP6 runs, but data are also provided for the last 2000 years. We provide consolidated datasets in various spatiotemporal resolutions for carbon dioxide (CO2), methane (CH4) and nitrous oxide (N2O), as well as 40 other GHGs, namely 17 ozone-depleting substances, 11 hydrofluorocarbons (HFCs), 9 perfluorocarbons (PFCs), sulfur hexafluoride (SF6), nitrogen trifluoride (NF3) and sulfuryl fluoride (SO2F2). In addition, we provide three equivalence species that aggregate concentrations of GHGs other than CO2, CH4 and N2O, weighted by their radiative forcing efficiencies. For the year 1850, which is used for pre-industrial control runs, we estimate annual global-mean surface concentrations of CO2 at 284.3 ppm, CH4 at 808.2 ppb and N2O at 273.0 ppb. The data are available at https://esgf-node.llnl.gov/search/input4mips/ and http://www.climatecollege.unimelb.edu.au/cmip6 . While the minimum CMIP6 recommendation is to use the global- and annual-mean time series, modelling groups can also choose our monthly and latitudinally resolved concentrations, which imply a stronger radiative forcing in the Northern Hemisphere winter (due to the latitudinal gradient and seasonality).


Introduction
Emissions from the burning of fossil fuels, deforestation, agricultural activities and the production of synthetic greenhouse gases (GHGs) are the primary reasons for the observed increases in GHG concentrations, defined as mole fractions in dry air. The elevated GHG concentrations induce a radiative forcing that in turn would cause more than the observed recent global warming if it were not for the cooling effect by aerosols (Fig. TS.10 in IPCC WG1 AR5;IPCC, 2013). An accurate quantification of anthropogenic and natural climate drivers is crucial for general circulation and Earth system models (ESMs). Simulations by these models for the historical time periods, e.g. since 1850, can only be meaningfully compared to observations (e.g. surface temperature, ocean heat uptake) to the degree that input forcings are an accurate representation of the past. The difficulty with many anthropogenic climate drivers is that their global-mean magnitude, their latitudinal gradient and seasonal cycle are uncertain further back in time, even for the main GHGs carbon dioxide (CO 2 ), methane (CH 4 ) and nitrous oxide (N 2 O). Systematic observational efforts started in 1957-1958, measuring CO 2 at the South Pole and Mauna Loa observatories (Keeling et al., 2001). Measurements of archived air, firn air and ice cores from both polar regions provide records for the pre-observational time. To date, reconstructions of millennial global-mean time series based on ice and firn data have been performed, e.g. for CO 2 over recent millennia (Ahn et al., 2012;MacFarling Meure et al., 2006;Rubino et al., 2013). For the more recent past, several studies investigated firn and ice data to constrain halo-carbons (Buizert et al., 2012;Martinerie et al., 2009;Mühle et al., 2010;Sturrock et al., 2002;Trudinger et al., 2016), some of them with hemispheric resolution. In terms of latitudinally resolved monthly data, there have only been a few synthesis products, namely for CO 2 , CH 4 and N 2 O over the instrumental record over the past 20-40 years (NOAA, 2013;NOAA ESRL GMD, 2014a, b, c). For this recent past, the World Data Centre for Greenhouse Gases (WD-CGG) (ds.data.jma.go.jp/gmd/wdcgg/) also provides a synthesis with global and hemispheric means for CO 2 , CH 4 and N 2 O (Tsutsumi et al., 2009). In light of the observational gaps further back in time, some studies, such as Keeling et al. (2011), used linear regressions between fossil fuel use and latitudinal CO 2 concentration trends to separate natural from anthropogenically induced effects, which allows us to infer latitudinal gradients back in time.
In previous climate model inter-comparison projects (Meehl et al., 2005), global-mean concentrations have been prescribed (Meinshausen et al., 2011), with some models constraining internally generated fields of GHG concentrations to match those global-mean values. Here, we update those global-mean and annual-mean GHG concentration time series for the historical period over years 0-2014, with "historical" simulations in the CMIP6 model intercomparison  focussed on the most recent period, 1850-2014. In addition, we provide hemispheric and latitudinal monthly-resolved fields for 43 GHGs in total. In the past, the large latitudinal and seasonal gradient of GHG radiative forcing has not been consistently applied to model radiative forcing and climate change. The new datasets provide a more consistent starting point for climate model experiments. The monthly and latitudinal resolution of this new GHG dataset is designed to have a similar resolution to the monthly solar forcing (Matthes et al., 2016) and monthly and latitudinally resolved ozone and aerosol abundances. Many GHGs also have significant longitudinal (land-ocean) and diurnal variations but we do not attempt to resolve them. Neither do we provide vertical gradients of the GHG concentrations, and we only discuss possible vertical extension methods (Sect. 4.1) in case models do not have their own methods to derive vertical gradients.
In this study, we compile one possible reconstruction of latitudinally and monthly resolved fields, as well as global annual means of surface GHG concentrations for 43 gases from year 0 to 2014, as input for the forthcoming model inter-comparison experiments that are part of the Phase-6 Coupled Model Intercomparison Project (CMIP6) . Specifically, we provide the pre-industrial control runs at 1850 forcing levels ("picontrol"), the experiment with abruptly quadrupled CO 2 concentrations ("abrupt4x"), the standard experiment of a 1 % annual CO 2 concentration increase ("1pct2co2"), and the historical runs that are driven with best-guess estimates of historical forcings since 1850. Species that are radiatively less important than CO 2 , CH 4 and N 2 O ("importance" here being measured as radiative forcing exerted in year 2014 compared to 1750) are provided individually as well as aggregated as HFC-134a and CFC-12 equivalent concentrations. The description of the datasets geared towards CMIP6 modelling groups is provided in Sect. 4, including a description of available data formats and CMIP6 minimum recommendations.
The design principle for this long-term dataset is to provide a plausible reconstruction of past GHG concentrations to be used in climate models. Using various gap-filling procedures, reconstruction and extensions, this dataset aims to reflect observational evidence of both recent flask and in situ observations from the worldwide network of NOAA ESRL and AGAGE stations, as well as Antarctic and Greenland ice core and firn data over the last 2000 years, where available. Furthermore, many detailed literature studies (Arnold et al., 2013Aydin et al., 2010;Ivy et al., 2012;Martinerie et al., 2009;Montzka et al., 2015;Mühle et al., 2010;Oram et al., 2012;Sturrock et al., 2002;Trudinger et al., 2004Trudinger et al., , 2016Velders and Daniel, 2014;Vollmer et al., 2016;Worton et al., 2006) for radiatively less important species are compared with our data product in the fact-sheet figures for the specific gases (Table 12 and Figs. S1-S40 in Supplement), or synthesized where direct observational records from the above networks were not available.
The predominant climate effect of GHG increases is captured by the global-and annual-mean concentrations throughout the atmosphere. The surface global-and annualmean concentrations provided here, in combination with the models' approximations for the vertical concentration profile, are the minimum standard for CMIP6 models. Assimilating a latitudinally and seasonally resolved data product serves two purposes. Firstly, to derive the global and annual means from sparse observations rests on knowledge or assumptions about spatial and seasonal distributions. Secondly, to open the opportunity for some modelling groups to go beyond the prescription of global-and annual-mean concentrations.
Undoubtedly, some of the assumptions stretch into unknown territory, such as the seasonality of the CO 2 concentrations in pre-observational times or the time variability of latitudinal gradients, let alone the higher-frequency fluctuations of global-mean concentrations during the time when only ice core data are available. Errors in the historical forcing do propagate and can hinder the comparison between observations and models. This study therefore had to find a workable compromise between providing a complete dataset that covers the whole time and space domain and being as close as possible to sometimes sparse observations. Hence, the remaining uncertainties in concentration gradients should be kept in mind, although they might not be of primary concern in regard to the inter-comparison aspect of the multimodel ensemble runs. Thus, while our CMIP6 community dataset will improve on the global-and annual-mean timeseries prescribed for the last set of CMIP5 experiments on a number of key aspects, many research questions remain open.
The underlying reasons for meridional gradients of annual-mean concentrations are manifold (Keeling et al., 1989a, b;. For one, the sources of anthropogenic GHGs from fossil fuel burning and cement production or industrial activities are not evenly distributed with latitude, but concentrated in the mid-northern land masses. In the case of CO 2 , emissions from deforestation are not uniformly distributed with latitude either. The pattern of land-use-related emissions is even less stationary, with CO 2 uptakes and sources predominantly focussed in the midnorthern latitudes up until earlier in the 20th century, shifting more towards lower latitudes in recent decades (Hurtt et al., 2011). This study uses an approach based on simple regressions that implicitly rest on the assumption of a fixed pattern approximation (such as Keeling et al., 2011). One complication to retrieving the latitudinal pre-industrial CO 2 concentration profile is that CO 2 fertilization and temperature effects on the carbon cycle, over both ocean and land, change both the magnitude and spatial patterns of natural CO 2 fluxes. Lastly, both the diurnal and seasonal cycle of photosynthesis and its covariance with vertical atmospheric mixing can have a pronounced effect on measured surface concentrations (the so-called "rectifier" effect), increasing annual mean northern hemispheric CO 2 surface concentrations by up to 2.5 ppm (Denning et al., 1999).
To dissect and analyse the different causes for temporal and spatial heterogeneity in surface concentrations, a rich body of literature has analysed observed latitudinal and seasonal gradients with various inversion techniques. Recent research provides a clearer picture in regard to the causes of the change in seasonality of CO 2 concentrations (Forkel et al., 2016), a topic researched already in 1989 (Kohlmaier et al., 1989) based on the CO 2 fertilization effect on northern hemispheric terrestrial biota. Generally, the research into meridional and seasonal variations employs various atmospheric inversion techniques Mansbridge, 1991, 1989;Enting et al., 1995;Enting, 1998;Rayner et al., 1999) to match observed concentrations with source and sink pattern estimates (Baker et al., 2006;Enting et al., 1995;Gurney et al., 2002Gurney et al., , 2003Gurney et al., , 2004Keeling et al., 1989a, b;Peylin et al., 2013;Rayner et al., 1999;Tans et al., , 1990a. Similarly to CO 2 , the spatial variation in CH 4 concentrations is used for model inversions to infer sources and sinks (Fung et al., 1991;Kirschke et al., 2013).
There is a substantial lack of observational evidence of both seasonality and latitudinal CO 2 gradients in preindustrial times. Given that atmospheric CO 2 is not well preserved in the Greenland ice , the pre-observational north-south gradient cannot be inferred or derived from the Greenland and Antarctic ice core records. Alternatively, understanding biospheric sink and source dynamics could provide vital evidence to infer pre-industrial surface concentration patterns. In this study, we do not employ any such inversion models or results, and only note that our pre-industrial meridional and seasonal variations should be regarded as highly uncertain. However, some plausibility of the CO 2 gradients is gained by comparison with some model studies (Sect. 5). High-latitude records of CH 4 are available from both hemispheres (Mac-Farling Meure et al., 2006;Mitchell et al., 2013;Rhodes et al., 2013), allowing us to estimate pre-industrial large-scale CH 4 concentration gradients.

Methods
To achieve the goals of this study, several analytical steps were taken to assimilate the observational data. Global-mean and annual-mean concentrations are of primary interest, but the discussion also covers latitudinal and seasonal variations. The assimilation procedure for sparse observational data requires this spatio-temporal heterogeneity to be accounted for to derive global and annual means.
All concentrations given here are dry air mole fractions and we use "mole fractions" and "concentrations" interchangeably and synonymously with "molar mixing ratios". For simplicity, we denote the dry air mole fractions "µmol mol −1 ", "nmol mol −1 " and "pmol mol −1 " as parts per million (ppm), parts per billion (ppb) and parts per trillion (ppt), respectively. Note that dry air mole fractions are independent of temperature and pressure, while volume mixing ratios (e.g. ppmv) for mixtures of non-ideal real gases are not, and at standard temperature and pressure conditions can differ significantly from their corresponding mole ratios.

Summary of assimilation approach
We perform three consecutive steps to synthesize the global mole fraction fields over the full-time horizon from year 0 to year 2014. First, we aggregate the available observational data over the recent instrumental period. Second, we estimate three components of the global surface concentration fields from these data, namely global-mean mole fractions, latitudinal gradients and seasonality. Third, we extend those components back in time with -inter alia -ice core or firn data. The full historical GHG concentration field can then be generated by the time-varying components.
Under this basic assimilation model, the concentration C (l, t) at any point in time t and in a latitudinal band l can be written as follows: C (l, t) = C global (t) +Ŝ l,m (y) +L l (y) , where C global (t) is the global-mean dry air mole fraction at time t,Ŝ l,m is the seasonality in each latitude l and month m, andL l (y) is the latitudinal annual-mean deviation in year y at latitude l. With this assimilation model, and the optimal low rank approximations of seasonality and latitudinal gradients, a regularization of the data is performed by a principal components analysis, which creates a degree of robustness against data gaps or outliers. Other methods, like a harmonic representation of station data, have, in principle, a similar smoothing and regularization effect (Masarie and Tans, 1995), although quantitative differences exist (Sect. 5.4).
A detailed data-flow diagram of how the historical GHG mole fractions are derived in this study is provided in Fig. 1. The subsequent section will describe the method step-by-step as indicated by the green circles in Fig. 1 and also tabulated for the three main GHGs in Table 1.

Step 1: aggregating raw station data
Atmospheric measurements are taken in remote environments or locations that are closer to pollution sources, in continental or marine areas, at different times of the day or night, at different altitudes, and in different seasons of the year, often using different calibration scales. This poses challenges for any synthesis of observational data.
The observational station data over the recent decades used in this study are predominantly sourced from the networks operated by NOAA (Earth System Research Laboratories: ESRL) and AGAGE. In general, we use monthly station data provided by the respective networks as a starting point. In the case of the AGAGE network, monthly averages are provided with and without pollution events (http://agage.eas. gatech.edu/data_archive/agage/ and http://cdiac.ornl.gov/ftp/ ale_gage_Agage/AGAGE/). We chose the monthly averages that include pollution events (file-endings ".mop", with the exception of CH 2 Cl 2 , in which case data issues warranted the use of monthly station averages without pollution events). The approach that we do not restrict our source data to background conditions is consistent with our approach elsewhere -and the NOAA network monthly station averages -which do not screen out pollution events (although the dominant number of NOAA flask measurements will likely be biased towards background conditions rather than pollution events owing to their location and sampling protocols at most sites focussed on collecting background air). In total, CO 2 data from 81 stations from the NOAA flask network and 3 stations from the NOAA in situ data stations are used (  For CH 4 , 87 sampling stations from the NOAA flask network and 5 stations from the AGAGE in situ network are compiled (Table 3). For N 2 O, data from flask and in situ measurements at 13 stations of the NOAA HATS global network are combined with data from 5 stations from the AGAGE network (Table 4). For other gases, the AGAGE and NOAA coverage and time frames vary, with individual station's codes provided in the "f" panels of the individual gases' fact sheets (Figs. S1-S40). We provide references to the used NOAA and AGAGE data in Table 12. Calibration scales, i.e. the standardized gas mixtures that allow us to calibrate the instrumentation used for in situ or flask measurements, differ between the NOAA and AGAGE networks. Gas measurements on different measurement scales, even when using the same scales by different laboratories, are subject to uncertainties (Hall et al., 2014). For halocarbons, the difference in calibration scales has been estimated as small, but not negligible, i.e. within 2.5 %, often within 1 % (Rhoderick et al., 2015).  1968-2014. 1984observations Version: 2015-08-03, (2014 scores optimised monthly station averages to match observations) (Dlugokencky, 2015b;NOAA ESRL GMD, 2014a, b, c of Law Dome record surface air fossil fuel and Rubino et al., 2013) and(0-1966) and Mauna Loa temperature change industry emissions annual-mean MLO station data record  since pre-industrial (Boden et al., 2013). (Keeling et al., 1976 (Dlugokencky, 2015a). relative optimized to match seasonality.
Before Updated Law Dome Optimized to match The score for EOF1 is 1985 (Etheridge et al., 1998;smoothed Law Dome regressed against global MacFarling Meure et al., 2006) record and NEEM annual fossil fuel and and NEEM (Rhodes et al., 2013). firn data. industry emissions (Gütschow et al., 2016 While we use the station data that have already been converted to the latest scales of the respective networks, some older comparison data products use previous scales (like the one published in the latest ozone assessment report; WMO, 2014 Etheridge et al. (1998, for piecewise Rubino et al. (2013), third-degree polynomial MacFarling Meure et al. (2006) smoothing over remainder of years 0 to 1966. * See station descriptions here: http://www.esrl.noaa.gov/gmd/dv/site/site_table.html.
without the need for a conversion factor (WMO, 2012). The Law Dome data used here (Etheridge et al., 1998Mac-Farling Meure et al., 2006;Rubino et al., 2013) have been updated for minor dating changes and placed on current NOAA scales (http://www.esrl.noaa.gov/gmd/ccl/index.html). Apart from those three main gases, we do not apply further scale conversions. Thus, given that our results are based on a mixture of the AGAGE and NOAA networks, they are de facto a weighted average between the respective two standard scales (SIO and NOAA) for each gas. The effective weight in this "weighted mean" depends on the station numbers and each network's station distribution, given that our assimilation method implicitly gives less weight to stations that are geographically close, i.e. in the same latitude-longitude box. This mixture of scales is different from previous studies that either applied empirical scale conversions (so that globalmean or station averages are identical) or used both scales in parallel to estimate a measurement uncertainty (WMO, 2014), for example when estimating emissions with inverse techniques. Mathematically, our approach is similar to an approach where a station-by-station scale conversion would be applied towards an intermediate scale between NOAA and AGAGE. However, for some applications, this approach is clearly a limitation as it hides the uncertainty and would for example warrant a new data assimilation if one network updates its scales (Sect. 6). The reason this "weighted mean" approach is chosen in the context of this study is that we intend to reconstruct a single concentration history making use of the station data from both major measurement networks without giving preference to one or the other measurement scale. Given that different scales between the two major networks result in differences that are generally less than 2 % (and are often for radiatively less important substances), this "middle of the road" approach seems justified given the other uncertainties in climate model forcings (vertical distributions, radiative forcing routines, other radiative forcings such as aerosols). Any conversion to a single scale would ease comparisons, but would not be able to address the inherent measurement uncertainty, and might even face a stronger bias (if the two scales SIO and NOAA are equally plausible representations of the "truth") (Sect. 6).
However, in regard to the time of the day, month or year, we do not apply interpolation or adjustment techniques other than a simple monthly binning of all available data (see Sect. 2.1.2). The spatial and temporal coverages of the raw  Lang (1990aLang ( , b, 1992

Step 2-4: binning and spatial interpolation
We employ a simple monthly mean binning of all available data, separately averaged for each station. Stations with more than one measurement program, e.g. with flask and in situ programs, are treated as distinct stations. Thus, the monthly average of an in situ data series with 1000 measurement points gets the same weight as the monthly average from a flask measurement program with few observations. In each latitudinal-longitudinal box, all available monthly mean station data are averaged, with the mean being assigned to the grid box centre before employing a 2-D spatial interpolation to extend available data points to longitudinal and latitudinal grid points that do not have observed data for any particular month. Our method provides equal weight to each station within a longitude-latitude box, no matter whether the station reports a few flask measurement samples or sub-hourly in situ instrument readings in each month. The chosen assimilation grid has 72 boxes with 12 equal-latitude bands of 15 • and 6 longitudinal bands of 30 • . Following the temporal monthly binning and subsequent spatial linear interpolation, we average all data across the longitudes to obtain 12 latitudinally resolved monthly time series of surface concentrations.

Step 5: global-mean mole fractions
The annual global mean concentration C global (y) is derived as the area-weighted arithmetic mean of the binned latitudinal data (small grey "5" in Fig. 1). In addition to the annual global mean, a time series of monthly values is derived as a smooth spline interpolation between the annual data points, with the constraint of being mean-preserving, i.e. that the average of the 12-monthly values is again the global annual average value initially derived. Thus, the trend in the mole fraction data is reflected in the global-mean time series from month to month.
with imax being 1 or 2 if only the leading or the two leading EOFs are taken into account, respectively.

2.1.5
Step 7-10: seasonality The seasonality fulfils the condition that the sum of seasonal variations at each latitude is zero over the year, i.e. 12 m=1Ŝ l,m = 0.
This seasonalityŜ l,m (t) at time t is calculated for most gases as the relative seasonality dŜ l,m dC global , i.e. the monthly deviation in mole fraction divided by the global-mean mole fraction, multiplied by the global-mean mole fraction at time t (steps 7 and 10 in Fig. 1).
An exception is the case of CO 2 (steps 8 and 9 in Fig. 1). In this case, the seasonality pattern over the observational period is held fixed as absolute mole fractions, i.e. not relative to the global mean. However, the residuals between this fixed seasonality and the seasonality, which is derived from the observations by subtracting the latitudinal averages, are used for a singular value decomposition. Let R l,m (t) be the residuals at latitude l and month m at time t; the optimal lower rank representation of this seasonal change is then given by the first EOF of the gram matrix 1/n × R R, with n being the number of observational data points. The derived score, i.e. the projection of the residuals onto the first EOF, is regressed against a time series P , a composite of global-mean CO 2 concentration and historical observed global-mean surface air temperatures. This simplified choice is made because previous studies identified warmer temperatures and elevated CO 2 mole fractions as dominant reasons for increased seasonality (Forkel et al., 2016;Graven et al., 2013;Welp et al., 2016), although anthropogenically induced cropland productivity increases are also suggested to play some role (Gray et al., 2014). Specifically, P is assumed to be a composite of the product and the sum of normed global-mean surface air temperature and normed CO 2 mole fraction deviations from pre-industrial levels. The temperature and mole fraction deviations are normalized such that the 2000-2010 deviation from the 1850-1880 base period is set to 1. Thus, the regressor P can be described as follows: with T being the temperature deviation from the 1850-1880 period, specifically And C being the normed mole fraction deviation. Note that this regressor P is one of multiple options that were tested and could be regarded as a plausible regressor for seasonality changes. Specifically, we tested global-mean CO 2 concentrations, global-mean annual average surface air temperatures and lagged averages of surface air temperatures as regressors (see Fig. 5). The R-squared values of the regressions over the 1984-2014 period are relatively similar across all regressors, around 0.8. The marked difference is that the regression with only CO 2 concentrations would result in a stronger reduction of seasonality around 1940-1960 and before 1900. By 1850, the reduction of summertime CO 2 concentrations in the zonal band around 52.5 • N would be around 8.6 ppm compared to 2014 (multiply the differences of the seasonality scaling difference between 1850 and 2014, about 21, with the 0.41 ppm maximum of the EOF pattern, shown in Fig. 9a.2). In contrast, the other regression options would limit the maximal seasonality change to about 5.7 ppm, closer to the maximal seasonality change detected within the period 1984-2014, of 4.5 ppm (cf. Fig. 5e). Given the uncertainty in regard to pre-1960 seasonality, we opted for the more conservative extrapolation method that implies a less significant change outside the observational period and chose the regressor with the least variability, namely our composite regressor combining temperature and CO 2 concentrations. Despite the differences in the regressors, it should be noted that early CO 2 observations are too sparse to come to a definite conclusion in regard to which regressor is best suited -given that the induced differences around the 1960s and 1970s are fairly small compared to the noise in the observations (see Fig. 5f and g). Furthermore, seasonality changes in the case of CO 2 depend on a number of factors, inter alia complex interaction of CO 2 fertilization of temperate, seasonal gross primary productivity, the influence of temperature, precipitation on biomass growth and respiration, and directly human-induced changes in land use areas and their productivity. Therefore, this extension of the observed seasonality changes beyond the observational period based on a regression with temperatures and CO 2 concentrations is just that: a plausible extrapolation that needs to be refined by further research to replace this study's ad hoc assumption.
The measured seasonality of CH 4 and N 2 O over the observational time period is found to be closely approximated by our default assumption of a seasonality that is proportional to global-mean mole fractions. For several other substances, however, seasonality has been assumed to be zero -either because the diagnosed seasonality was very small or due to a lack of observational data.   Figure 5. Comparison of various scaling options for the change of seasonality of CO 2 concentrations over time. The first EOF of the residual fields of observations minus the mean 1984-2014 CO 2 seasonality ( Fig. 9a.2) is scaled with an EOF score. Before 1984, this EOF score is regressed against a composite of global-mean CO 2 concentrations and global-mean surface air temperatures (see text and panel b).
Alternative regressors include global-mean CO 2 concentrations (a), lagged averages of monthly global-mean surface air temperatures (c) and raw global-mean annual average surface air temperatures (HadCRUT4v) (Morice et al., 2012) (d). The regressed EOF score back in time is shown in (e). A comparison to the first CO 2 measurements of higher northern latitudes at so-called Station P (STP) and Point Barrow in Alaska (PTB), where the seasonality change is most pronounced, is provided in (f) and (g), respectively (see text for discussion).

2.1.6
Step 11-13: extension of latitudinal gradients and global means with ice core and firn data Historical GHG records from ice and firn provide highlatitude estimates of atmospheric GHG mole fractions before the instrumental record from air sampling stations. We rely mainly on the Law Dome data (Etheridge et al., 1998MacFarling Meure et al., 2006;Rubino et al., 2013), updated for minor dating changes and placed on current NOAA scales, and, for northern hemispheric CH 4 , Greenland NEEM ice core data (Rhodes et al., 2013). Although we did not directly use their data, we acknowledge multiple other efforts, including but not limited to Mitchell et al. (2013), Bauska et al. (2015), Schilt et al. (2010b and Sowers et al. (2003) (Fig. 6). Law Dome atmospheric composition records have the advantage of a very narrow air age spread that provides measurements with high temporal resolution and mean air ages up to the 1970s, where they overlap with the beginning of atmospheric observations for many gases.
Having obtained estimates of the latitudinal gradients over the observational period and having derived approximations back in time by regressing latitudinal gradients EOF scores with emissions (step 11 in Fig. 1, Table 4), we can estimate global-mean mole fractions based on the Law Dome data for both CO 2 and N 2 O (step 12 in Fig. 1). In the case of CH 4 , the advantage is that there are northern hemispheric data available from NEEM (Greenland) (Rhodes et al., 2013) over the past 2000 years. This NEEM record hence allows an optimization of both the EOF scores and global means at past time points to match both the Law Dome and NEEM records (step 13 in Fig. 1). Some data gaps in the NEEM record are filled by linearly interpolating the optimized EOF scores of the latitudinal gradient. With an interpolated EOF score, the global-mean mole fraction can then be directly inferred from the Law Dome record. All optimizations are performed by minimizing area-weighted squared residuals.
The Law Dome ice core data are smoothed with a piecewise local third-degree polynomial median regression, using ad hoc expert judgement assumptions of errors and smoothing window widths specific to each gas in order to approximately reflect their long-term median evolution. In the case of CO 2 , a random error of 2 ppm was assumed, a percentage age error (reaching a maximum of 60 years at age 2000 years before present) with a bagging of 250 ensembles, a kernel width of 120 years, minimal number of data points of 7 and maximum of 25 (Fig. 8a). Likewise, CH 4 Law Dome data ice core data are smoothed with a third-degree polynomial median regression with a maximum kernel width of 100 years, 4 minimal data points (a constraint that overwrites the maximum kernel width, if necessary) and 10 maximal data points. As for CO 2 , 250 ensembles were averaged, after adding noise of 3 ppb, and an age uncertainty of 50 years per 2000 years. For N 2 O, a kernel width of 300 years was chosen with a minimum number of 7 and maximum number of 15 data points to be included in the piecewise third-degree polynomial regression. As for CO 2 and CH 4 , 250 ensembles were used for bagging after injecting a random noise of 3 ppb and an age-dependent x axis uncertainty of 90 years per 2000 years. The higher age uncertainty for N 2 O in comparison to CO 2 and CH 4 was chosen to account for the larger age gaps in the N 2 O Law Dome data that required a stronger horizontal smoothing for the median regression to converge. For CO 2 , the slightly higher age uncertainty in comparison to CH 4 was chosen so that the smoothed record displays a comparable time evolution to the WAIS CO 2 record (Fig. 6).
The Greenland NEEM ice core CH 4 data (Rhodes et al., 2013) exhibits some outliers in the recent period ( Fig. 6d) due to the incursion of modern air into still-open pores of shallow ice. Spikes in deeper ice are likely due to impurities. Hence we use the 5-year smoothed data provided by Rhodes et al. (2013) as a proxy for Greenland atmospheric background mole fractions (open red circles in Fig. 6b and d). We used the NEEM CH 4 firn measurements from Buizert et al. (2012Buizert et al. ( ) (2008, with effective ages from Ghosh et al. (2015) based on the iterative dating method of Trudinger et al. (2002b), corrected for the effect of gravity (as applied in other firn data) and put onto the NOAA 2006 primary calibration scale.

2.1.7
Step 14: extension of latitudinal gradients and global means with literature data For several gases, including ODSs, halons and PFCs, the available AGAGE and NOAA station data is spatially sparse. Before the start of systematic instrumental measurements, we use literature studies which make use of various data sources, such as air sample archives or firn records (step 14 in Fig. 1). Specifically, if a global mean is provided, we use that global mean in conjunction with our derived and regressed latitudinal gradients. In the case of hemispheric data points, we adapt the latitudinal gradient to match the literature studies, as in the case of C 4 F 10 , C 5 F 12 , C 6 F 14 , C 7 F 16 or C 8 F 18 , where we based both the global mean and latitudinal gradients on the data of Ivy et al. (2012). Other key studies used were Velders and Daniel (2014), the data underlying the WMO Ozone Assessment Report (2014), Arnold et al. (2013Arnold et al. ( , 2014, Trudinger et al. (2004), Mühle et al. (2010Mühle et al. ( , 2009), Montzka et al. (2011), updated time series by  (updated at ftp://ftp.cmdl.noaa.gov/hats/Total_Cl_ Br/), the recent study by Vollmer et al. (2016) in regard to Halons and by Trudinger et al. (2016) in regard to PFCs, and others (Arnold et al., 2013Ivy et al., 2012;Montzka et al., 2015;Mühle et al., 2010;Oram et al., 2012;Trudinger et al., 2016;Velders and Daniel, 2014;Vollmer et al., 2016;Worton et al., 2007), as indicated in the gas-specific fact-sheet figures (Figs. S1-S40 with references provided in Table 12). In the case of N 2 O and CH 2 Cl 2 , we assumed a constant latitudinal gradient back in time before ongoing measurement records are available (Figs. 12 and S7, respectively).

Step 15: extrapolation
For some limited data segments, an extrapolation has been used: either a piecewise smoothing spline to converge concentrations back to zero or pre-industrial background concentrations, e.g. before the WMO (2014) or Velders and Daniel (2014) data started in 1978 or 1951, respectively. The three radiatively most important fluorinated species, CFC-12, CFC-11 and HCFC-22 (Table 5), follow the global mean concentrations provided by Velders and Daniel (2014), in conjunction with separately derived latitudinal gradients and seasonality. Furthermore, a linear extrapolation was applied when there were not sufficient 2014 data available. 2.1.9 Step 16-19: creating the composite surface concentration field Following Eq. (1), the surface mole fraction fields over the full-time span are now synthesized from the lower rank  1850-2014 (c, d, e). The shown data is for CO 2 : Mauna Loa data by Keeling et al. (1976); the Law Dome ice record (Etheridge et al., 1998;MacFarling Meure et al., 2006;Rubino et al., 2013), updated for minor dating changes and placed on current NOAA scales; NOAA ESRL station data (NOAA, 2013;NOAA ESRL GMD, 2014a, b, c); the EPICA composite data (Ahn and Brook, 2014;Bereiter et al., 2015;Bereiter et al., 2012;Lüthi et al., 2008;MacFarling Meure et al., 2006;Marcott et al., 2014;Monnin et al., 2004;Petit et al., 1999;Rubino et al., 2013;Schneider et al., 2013;Siegenthaler et al., 2005); and the WAIS data (Bauska et al., 2015). For CH 4 , the shown data is the Law Dome data (Etheridge et al., 1998;MacFarling Meure et al., 2006), the instrumental data from the NOAA and AGAGE networks (see Table 3), NEEM ice core measurements (Rhodes et al., 2013), the EPICA Dronning Maud Land ice core record (Barbante et al., 2006;Capron et al., 2010;Schilt et al., 2010b), and the long record by Loulergue et al. (2008) as well as the GISP2D, WDC05A and WDC06A records by Mitchell et al. (2013). In case of N 2 O, the shown data is the Law Dome record (MacFarling Meure et al., 2006), the Talos Dome record (Schilt et al., 2010b), the GISPII record (Sowers et al., 2003) and the EPICA Dome C record Schilt et al., 2010a;Spahni et al., 2005;Stauffer et al., 2002) in addition to the H15 ice core record from Antarctica (Machida et al., 1995), the South Pole firn record (Battle et al., 1996), the Law Dome firn record "Park" (Park et al., 2012) and a modelling synthesis by Ishijima (2007). For data sources behind "this study's" composite product, see Tables 2, 3 and 4. Table 5. Options for reducing the number of GHGs to be taken into account in climate models to approximate full radiative forcing of all GHGs. The GHGs are ranked by their radiative forcing, with CO 2 having the highest radiative effect change between 1750 and 2014. The stated percentages in this table's rows are cumulative, i.e. the radiative forcing stated in a row is the radiative forcing of the GHG stated in that row plus the sum of all higher-ranked GHGs in the rows above, expressed as percentage of total anthropogenic GHG radiative forcing. In Option 1, a climate model explicitly resolves actual GHG concentrations. With 8 and 15 species, 99.1 and 99.7 % of the total radiative effect can be captured, respectively. In Option 2, only CFC-12 is modelled next to CO 2 , CH 4 and N 2 O; all other gases are summarized in a CFC-11-equivalence concentration. In Option 3, all ODSs are summarized in a CFC-12-equivalence concentration, and all other fluorinated substances are summarized in HFC-134a-equivalence concentrations. Note that below shares are approximations, as linear radiative forcing efficiencies are assumed here for all gases, and also for CO 2 , N 2 O and CH 4 . representations of seasonality, latitudinal gradient and the smooth monthly representation of global-mean mole fractions. As per the original station data aggregation, the latitudinal resolution is 15 • and the time resolution is monthly. In order to assist with application in climate models with finer grids, we also produced a finer grid interpolation to 0.5 • latitudinal resolution using a mean-preserving smoothing. This finer grid interpolation should not be mistaken as a mole fraction field containing actual information at 0.5 • level. The purpose is simply to offer a smooth interpolation that avoids errors that will arise from, for example, a linear interpolation between the provided 15 • latitude points, as the mean across those (linearly) interpolated values would not match the original field. The mean-preserving smoothing code is available from the authors on request. Finally, the 15 • fields are aggregated into global, Northern and Southern Hemisphere monthly and annual means.

2.1.10
Step 20: aggregating equivalent mole fractions It is computationally inefficient to model the radiative effect of 43 individual GHGs in today's ESMs or general circulation models. Climate models use different pathways to approximate the radiative effects of the full set of GHGs. As one strategy, only the radiatively major GHGs are explicitly modelled, such as CO 2 , CH 4 , N 2 O, CFC-12 and CFC-11, which together cause 94.5 % of the GHG warming effect (measured in radiative forcing) in 2014 relative to 1750 and 98 % of the total radiative effect compared to the full set of 43 GHGs (Table 5). Alternatively, radiatively minor GHGs can be approximated by equivalent GHG concentrations of a marker gas. In this way, the radiative effect of the group of gases is expressed by a single gas mole fraction. One definitional issue is whether the radiative forcing since 1750, i.e. only the changes since pre-industrial levels, are expressed by the marker gas (here called "marginal equivalence" C eq,i ).
In this case, the marker gas' concentrations C eq,i are sought that would exert the same aggregate radiative forcing since 1750 as the group of summarized gases. Thus, let C j (t) be the concentration (mole fraction in dry air) of a GHG and C 0,j the pre-industrial level, i.e. in year 1750, which is routinely used as base year for radiative forcing (IPCC, 2013). A marker equivalence mole fraction by gas C eq,1 for group C j with j = 1, . . .n is then given by the following: with R j (C) being the radiative forcing function relating concentrations C(t) at time t to radiative forcing for gas j , in the linear case R j (C) = C · E j , with E j being the radiative efficiency. R −1 i (F ) is the inverse of this radiative forcing function, so that the concentration C that corresponds to a forcing F is given by C = R −1 i (F ). In contrast, equivalent concentrations can express the radiative effects of the summarized GHGs including their natural background levels (here called "full equivalence" C eq,i ).
While the former definition, "marginal equivalence", is often used to express the total GHG forcing in CO 2 equivalence concentrations, the latter "full equivalence" is the more appropriate quantity to drive climate models, given that natural background concentrations of not-explicitly-considered gases should nevertheless exert a radiative effect even in a pre-industrial control, even though that radiative effect does not count under a radiative forcing definition that looks at changes from 1750. In the linear case, in which case radiative forcing is proportional to the gas' concentrations, Eq. (7) can be written as follows: with r eff i being the radiative efficiency of gas i (W m −2 per ppb).
Thus, climate models have the option to reduce the complexity of 43 GHGs and the associated computational burden by reducing the number of GHGs that are taken into account. With the top 5 GHGs, CO 2 , CH 4 , N 2 O, CFC-11 and CFC-12, climate models would capture 98 % of the total radiative effect in year 2014 and 94.5 % of the radiative forcing since 1750, i.e. the change of the radiative effect between 1750 and 2014 (see Table 5). As an alternative, there is the option to use equivalent concentrations. For two such equivalence options, this study provides input datasets. Modelling groups should indicate the combination of files they employed: 1. Option 1: climate models implement a subset of 43 GHGs.
2. Option 2: climate models implement the four most important GHGs with their actual mole fractions explicitly, namely CO 2 , CH 4 , N 2 O and CFC-12, and summarize the effect of all other 39 gases in an equivalence concentration of CFC-11. For this purpose, we provide CFC-11 eq. concentrations ("full equivalence").
3. Option 3: like option 2, but with a different split up of gases other than CO 2 , CH 4 and N 2 O. Climate models implement the three most important GHGs with their actual mole fractions explicitly, namely CO 2 , CH 4 and N 2 O, and summarize the radiative effect of the ODSs in a CFC-12 eq. concentration and the radiative effect of all other fluorinated gases as a HFC-134a eq. concentration. For this purpose, we provide CFC-12 eq. and HFC-134a eq. concentrations ("full equivalence").

Data analysis for comparison with CMIP5 ESMs
We compare our results to various other datasets (see Sect. 5), inter alia to CO 2 fields from CMIP5 ESMs (Sect. 5.3).
Here, we briefly describe the analytical steps that we performed for retrieving the ESM data. We analyse 10 CMIP5 ESMs that have an interactive carbon cycle model and provided the mole fraction of carbon dioxide in the air as function of different pressure surfaces for the esmhistorical experiment. We diagnosed those esmhistorical experiments in terms of the simulated CO 2 mole fraction at surface pressure (1 bar = 100 000 Pa) for 10 CMIP5 ESMs, for which data were available: (1) BNU-ESM (BNU, China),

Results
Here, we describe the historical concentrations of the main GHGs and provide a fact sheet for all 43 individual gases.

Carbon dioxide
The 800 000-year EPICA composite ice core record (Ahn and Brook, 2014;Bereiter et al., 2015Bereiter et al., , 2012Lüthi et 1950Lüthi et 1955Lüthi et 1960Lüthi et 1965Lüthi et 1970Lüthi et 1975Lüthi et 1980Lüthi et 1985 1990 300  (Keeling et al., 2001) for each 15 • latitudinal band. Also, the Law Dome ice record data is shown (k) with our third-degree polynomial smoothing. This study's monthly CO 2 zonal means were derived from station data from 1984 onwards. Before that, this study used Mauna Loa MLO annual average and smoothed Law Dome data (see Table 1 and Sect. 2). The shown comparison with monthly Scripps station data before 1984 is a qualitative validation of the applied methodology to regress latitudinal gradient and seasonality changes to times before 1984. See text. Figure 8. Historical GHG concentrations from 1750 to 2014 as global-mean (right panels), northern hemispheric (middle panels) and southern hemispheric averages (right panels). The top row comprises all GHGs, the middle row comprises HFCs, PFCs, SF 6 , NF 3 and SO 2 F 2 .
From the year 0 to 1000, our piecewise fit of the third-degree polynomial of Law Dome ice core data allows a derivation of global mean concentrations of around 278.6 ppm (minimummaximum range of 277.0-280.2 ppm).
Our smoothed Law Dome results do not reflect the higherfrequency variations suggested by the individual data points MacFarling Meure et al., 2006;Rubino et al., 2013) and are comparable to the frequency spectrum that would result from a smoothed median estimate of WAIS data by Bauska et al. (2015) and Ahn et al. (2012). The WAIS record is generally 3-6 ppm higher than the Law Dome record and is also higher than South Pole and EPICA DML ice cores (Ahn et al., 2012) and the Dronning Maud Land ice (Rubino et al., 2016). The cause for this difference is not yet known (Fig. 6b). The differences between the WAIS and the Law Dome record persist in 1850-1890 with subsequent data points being more aligned with each other (Fig. 6c). CMIP6 modelling groups might want to test an alternative dataset that captures those higher-frequency characteristics of the Law Dome record (data can be generated by the authors on request). In that higher-frequency dataset, the minimum of global mean CO 2 concentrations is close to 270 ppm around the year 1610. The smoother version provided for CMIP6 has its minimum in year 1666 at 276.27 ppm (Fig. 6b). The reason for the 1610 dip in the Law Dome record and why this does not show in the WAIS record is not yet fully understood. The current understanding of how the age kernel (to estimate the distribution of age of air at the time of bubble trapping) is different for the two sites cannot yet explain this difference in concentrations around 1610.
In regard to the latitudinal gradient, we explored various options. If we regress the scores of the first EOF of the latitudinal gradient ( Fig. 9d) against global fossil CO 2 emissions, the pre-industrial latitudinal minimum of surface CO 2 concentrations would be estimated in the mid-northern latitudes (approximately 1.8 ppm below the global mean), where the maximum was observed in recent decades (e.g. 4.8 ppm above the global mean in 2010). Previously a similar regression approach between concentrations and CO 2 emissions was used by Keeling et al. (2011) to separate the anthropogenic from the natural component in the concentration difference between Mauna Loa and the South Pole. This approach is not perfect due to the covariance of regional fossil fuel emissions with natural sinks over the same period, different patterns of anthropogenic land-use emissions and a latitudinal gradient component that merely results from seasonal CO 2 exchange (e.g. Denning et al., 1995). Nevertheless, this regression approach can provide a first indication of the influence of anthropogenic emissions on the latitudinal gradient. Furthermore, this approach would result in an approximately 0.4 ppm higher pre-industrial Antarctic CO 2 concentration compared to the global mean, coinciding with the assumption taken by Rubino et al. (2013). The Scripps station data not used in this calibration (i.e. all Scripps stations except for Mauna Loa) turn out to be matched rather closely by our approach over the 1950-1990 period (see Fig. 7).
However, given the evidence by CMIP5 ESMs of a slight tropical local maximum (Fig. 9b) and large uncertainties regarding pre-industrial sinks and source distributions and hence the latitudinal gradients of CO 2 , we assumed a zero pre-industrial latitudinal gradient. Thus we performed a zerointercept regression of the scores of the latitudinal gradient EOF1 with global fossil CO 2 emissions and converged the score of the second EOF towards zero, resulting in a flat latitudinal gradient in pre-industrial times.
The second EOF of the latitudinal gradient of CO 2 does not exhibit the same linearity over time as the first EOF, and the reasons are currently unknown. Potential candidates for this pronounced spike (Fig. 9c) of mid-northern latitude concentrations in the case of CO 2 are a shift in station sam-pling locations with more "polluted" land stations coming online after 1995, the "rectifier" effect due to an enhanced seasonal cycle (Denning et al., 1995) and the rise of Chinese emissions (the onset around year 2003 of the recent surge in Chinese CO 2 emissions is approximately coinciding with the respective EOF score becoming strongly positive; Fig. 9d; Francey et al., 2013). One suggested explanation for this 2010 change in north-south gradients are changes in interhemispheric transport (Francey and Frederiksen, 2016). Recently, i.e. after 2010, this spike in mid-latitude northern concentrations seemed to somewhat subside again according to our analysis (see scores for EOF1 and EOF2 in Fig. 9d). Future research could address the underlying reasons of this change in latitudinal patterns, and a physical explanation will allow a more appropriate backward extension in time.
The diagnosed average seasonality of atmospheric CO 2 concentrations over the observational period reflects the standard carbon cycle pattern of strong CO 2 uptake in spring and release in autumn due to photosynthesis and heterotrophic respiration in the Northern Hemisphere's ecosystems. Our EOF analysis of the residuals shows ( Fig. 9a.2 and 9a.3) that the seasonality has increased over recent decades in line with previous studies, which explore the link to increased ecosystem productivity (Forkel et al., 2016;Graven et al., 2013;Welp et al., 2016) and increased cropland productivity (Gray et al., 2014). Specifically, our analysis shows a slight shift of the seasonality to earlier months in the year, i.e. the negative and positive deviations of the EOF pattern are shifted by a month compared to the average seasonality (cf. Fig. 9a.1 and 9a.2). The strongest change in CO 2 seasonality is derived for the latitudinal bins centred at 37.5 • N and up to 67.5 • N bins with a maximum strengthening of negative deviations in the latitudinal band centred at 52.5 • N in July by around 4 ppm over 1984-2013 (4 ppm results from multiplying the EOF pattern value in July in the 52.5 • bin with the EOF score difference of around 10, see Fig. 9a.2 and a.3). The maximum strengthening of the seasonal cycle happens in July in the 52.5 • latitudinal band; however, the maximum seasonal cycle deviation is still observed slightly later in August and also extends slightly more towards the northern latitudes ( Fig. 9a.1).
In 1850, the start of the historical CMIP6 simulations, the estimated global-mean CO 2 concentration is 284.32 ppm, rising to 295.67 ppm in 1900, 312.82 ppm in 1950, 369.12 ppm in year 2000 up to 397.55 ppm in 2014 (Table 6). Here and elsewhere (e.g. Table 6) we provide more significant figures than is customary -not to claim a 5-digit precision of the data, but to avoid unnecessary (even if small) step changes in concentrations between the pre-industrial run and the historical and other runs. Our methodology does not include a formal uncertainty analysis. As a minimum uncertainty for the 1850 pre-industrial values, we refer to the 1.2 ppm variability stated by , also used in Rubino et al. (2013) and Trudinger et al. (2002a), as minimum uncertainty for that period.     (c) The first and second EOF of latitudinal variation. The second EOF exhibits a strong signal around mid-northern latitudes (d), the EOF scores derived from the observational data (dots) and regression (dashed line) as well as the ultimately used EOF score (solid line). The second EOF's score indicates that the mid-latitude northern spike was only a recent phenomenon and the score is here assumed to linearly converge to zero. The first EOF's score is more linearly increasing, and regressed against global fossil emissions.  Global-mean surface CO 2 concentration growth slightly flattens off in the 1930s, and a stronger flattening occurs during World War II until the 1950s (Bastos et al., 2016). The increase from 1970 onwards has a slightly positive curvature (accelerating trend) with small deviations around 1973, 1981 and the temporary flattening of CO 2 concentrations after the 1991 Pinatubo eruption (Jones and Cox, 2001;Peylin et al., 2005) (Figs. 9 and 10).

Methane
Over the 800 000 years before year 0, atmospheric CH 4 concentrations varied between 348.7 and 728.4 ppb according to the EPICA ice core composite (Barbante et al., 2006;Capron et al., 2010;Loulergue et al., 2008) (Figs. 6c and 11). The Law Dome record (Etheridge et al., 1998;MacFarling Meure et al., 2006) indicates an onset of increasing concentrations around the year 1720 ( Figs. 6d and 11). From year 1850 with slightly higher than 800 ppb concentrations, a slight rise is observed until the 1950s, when CH 4 concentrations markedly increase first in the latter half of the 1950s, then again from 1965 onwards. The Greenland firn and ice core data (Rhodes et al., 2013) are more difficult to interpret because part of the record is affected by high-frequency ice core CH 4 signals, possibly of non-atmospheric origin. CH 4 spikes are accompanied by elevated concentrations of black carbon, ammonium and nitrate, suggesting that biological in situ production may be responsible -particularly in the later years of the record since 1940. We use the 5-yearly average measurement values, that have outliers removed (Rhodes et al., 2013), and which approximate the lower bounds of the raw data points until 1942. Using these values, we can then infer global gradients back in time and derive an estimate of global-mean concentrations. These global-mean concentrations are estimated to be around 30 ppb higher than the Law Dome record by 1850, with the difference growing to 45 ppb by 1940s, increasing further from there (Fig. 6d). This approximately matches the findings by Mitchell et al. (2013) of interpolar differences between about 35 and 45 ppb between 800 BC and 1700 AD.
Our analysis of CH 4 concentrations in the recent decades is based on a large number of stations (Table 3 and Fig. 11f). While the annual increase of global CH 4 concentrations slowed over the 1980s and slowed markedly after 1992 towards stabilized concentrations between 1999 and 2005, CH 4 increased again after 2006 at about 5.4 ppb yr −1 ( Fig. 11f; Nisbet et al., 2016Nisbet et al., , 2014. We retrieve a recent seasonal cycle of CH 4 that is similar in the latitudinal-temporal seasonality pattern as that of CO 2 (Fig. 11a). Each hemisphere exhibits its lowest CH 4 concentrations just after the summer solstice, up to 1.6% or 28 ppb lower than the global mean in the case of the high-latitude northern summer (Fig. 11a). Quantifying the underlying reasons is beyond the scope of this study, although the seasonally varying atmospheric sink by OH oxidization is likely the main contributor to that seasonal pattern -in combination with seasonally varying natural and anthropogenic sources.
The latitudinal annual-mean gradient of CH 4 concentrations is separated into its first two EOFs, with the first EOF being a continuous north-south gradient of about 90 ppb in the recent observational period (combination of EOF and its score, see Fig. 11c and d). The second EOF is a distinct midnorthern latitude local maximum with a high-latitude low, showing a slight but marked rise in 2008 within the 1985-2014 observational data window. Quantifying the reasons for this hump are again beyond the scope of this study, with the Table 6. Historical: global-and annual-mean surface concentrations for the historical CMIP6 experiments. The year-to-year and monthly resolved global, hemispheric and latitudinally resolved concentrations for 43 GHGs and three aggregate equivalent concentrations are provided in the accompanying datasets over the time horizon year 0 (1 BC) to year 2014 AD. The complexity reduction options for capturing all GHGs with fewer species than 43 are indicated in the table as Option 1, Option 2 and Option 3, with "x" denoting relevant columns under each option (Sect. 2.1.10).
Years CO 2 CH 4 N 2 O CFC-12 eq. HFC-134a eq. CFC-11 eq. CFC-12 Other      possibility of a shift in locations of sampling stations or coalseam gas-fracking-related fugitive emissions being possible contributors. We optimize the first EOF, which is the general north-south gradient, to match the firn and ice-core Greenland and Antarctic Law Dome data. The second EOF of the latitudinal gradient is kept constant at its 1985 value. As a result of the constant extrapolation of the second EOF, and the optimization of the first EOF's score (Fig. 11d), we yield a total annual-mean meridional gradient for recent decades that features around 80 ppb higher surface CH 4 concentrations in mid-to-high northern latitudes compared to the global mean and around 60 ppb lower CH 4 concentrations at the high southern latitudes (Fig. 11b). In pre-industrial times, our approach of regressing the score of EOF1 with global emissions (Gütschow et al., 2016) suggests that this gradient is smaller, with only approximately 20-30 ppb higher northern and 20 ppb lower southern latitude surface concentrations (Fig. 11b). These mean interpolar differences and their variations have earlier been quantified by Etheridge et al. (1998) and Mitchell et al. (2013), yielding similar results (between 30 and 60 ppb) compared to our 40-50 ppb estimate.

Nitrous oxide
N 2 O concentrations from ice cores dating back 800 000 years Schilt et al., 2010b) varied approximately between 200 and 300 ppb, with most recent glacial concentration minima of 180 ppb around 23 000 years ago (Sowers et al., 2003) (Fig. 6a). The ice core record over the last 2000 years indicates a marked difference between the Law Dome and GISPII record (Sowers et al., 2003), with the latter being up to 10 ppb lower. Here, as with CH 4 , we use again a median quantile piecewise polynomial regression on the Law Dome record, assuming constant N 2 O concentrations between year 0 and the first Law Dome data point in year 154. In contrast to CH 4 , there is not a monotonic increase of concentrations, but rather an initial slight decrease until year 630 down to a minimum concentration of 265 ppb in our smoothed time series, with a subsequent slow increase until the 9th century AD, then a slight decrease until 1650 in the smoothed global-mean mole fraction. A temporary local maximum indicated by individual Law Dome data in the 15th century is not resolved by our smoothing, and a similar spike in the 17th century is only just reflected (Fig. 6b). Several data points indicate a small decrease after a 1750 maximum with a minimum in 1850 of around 273.02 ppb. This maximum around 1750 and subsequent minimum around 1800-1850 is also apparent in the H15 ice core record by Machida et al. (1995) (we scale-corrected the Machida data downwards by 1 ppb as in Battle et al., 1996) (Fig. 6b). After 1850After , N 2 O concentrations increased markedly, reaching 1900After , 1950After , 2000After and 2014.0 ppb, respectively (Table 6). Comparing the different firn and ice records, the 1920-1940 period seems particularly uncertain, with some high measurements close to and beyond 290 ppb from both Law Dome and H15, while some of the Law Dome data is still at levels around 285 ppb or even 280 ppb in the case of H15 (Fig. 6e). The South Pole firn data (Battle et al., 1996) suggest lower N 2 O concentrations in the 1920s and around 1960 -compared to both the smoothed Law Dome data (thin dashed line in Fig. 6e) and consequently our even higher global-mean estimate. Although the Ishijima estimate (Ishijima et al., 2007) (their Fig. 6a) around 1952 is almost identical to our global mean, their modelling study suggests slightly lower values around 1960 before being closely matched again from 1970 onwards. The Law Dome firn record (Park et al., 2012) suggests slightly higher N 2 O concentrations for the high southern latitudes compared to our global mean (Fig. 6e).
The variability of our derived N 2 O global-mean concentrations, in particular the steps in the 1920s and 1940s, reflect the smoothing algorithm choices to noisy data (Sect. 2.1.6), but should not be over-interpreted. Our algorithm does not, for example, include information on the lifetime of N 2 O that would guard against inferring too-rapid declines of N 2 O mole fractions and mole fraction growth rates. The fit of the smoothing algorithm was chosen to balance the resolution of smaller-scale features with the uncertainty present in the input data sources for the full-time horizon from year 0 to year 2014. Given overall uncertainties (Fig. 6e), a smoother representation between 1900 and 1980 seems equally justified.
Compared to CH 4 and CO 2 , the seasonality and latitudinal gradient of N 2 O are relatively small. The N 2 O seasonality is only 0.1 % of global mole fractions and is almost symmetric and seasonally time-synchronized between the Northern and Southern Hemispheres, with minima in the Southern Hemisphere late autumn and northern Hemisphere summer-autumn (Fig. 12a). The seasonality is currently of the same size as the underlying trend, leading to global mean N 2 O mole fractions increasing in the latter months of any year, with a subsequent flattening in the first half of any calendar year (e.g. Fig. 12h). Given a counterintuitive slight decrease of the north-south gradient with flat or slightly increasing global N 2 O emissions (Gütschow et al., 2016) in recent years (Fig. 12d), we assumed constant scores for the latitudinal gradient EOFs for times before 1996 (Fig. 12d). Due to measurement fluctuations in the first years when systematic measurements started in 1978 that are larger compared to the recent period, we chose to interpolate N 2 O global-mean mole fractions over 1966-1987. For the period between 1978 and 1987, this interpolation is closely aligned with a smooth representation of the atmospheric measurements (Fig. 12f,

Ozone-depleting substances (ODSs) and other chlorinated substances
ODSs, i.e. the substances destroying ozone and being controlled under the Montreal Protocol, also have a large warming effect (Velders et al., 2007). In particular, CFC-12 and CFC-11 are important GHGs, as well as the replacement substance HCFC-22, which, unlike CFCs, continues to increase in the atmosphere, albeit at a declining rate. The radiative forcing of CFC-12 alone since 1750 is equivalent to that of N 2 O, which is usually considered the third most important GHG after CO 2 and CH 4 ( Table 5). The impact of ODSs on climate is somewhat complicated by their destruction of stratospheric ozone, which induces dynamical effects on circulation patterns, and has a net cooling effect on the global climate. The latest estimates suggest that this cooling might offset roughly two-thirds of the warming of the entire class of ODSs . Note that here we also consider methylene chloride and methyl chloride, although these chlorinated substances are not controlled by the Montreal Protocol and are hence often not termed ODSs (WMO, 2014). The most abundant ODSs in the atmosphere (in 2014) were CFC-12 (520.6 ppt), CFC-11 (233.1 ppt) and HCFC-22 (229.5 ppt), with their mole fractions being about 6 orders of magnitude lower than current measurements for CO 2 (Table 7). In addition, methyl chloride CH 3 Cl has a high mole fraction (539.54 ppt), although it is not considered an ODS here as it is not controlled by the Montreal Protocol. Out of the 17 considered chlorinated and ODSs, only 6 have currently increasing concentrations. Those are the three HCFCs, of which the increase in HCFC-22 alone has offset the reducing radiative forcing of all other ODSs over the past decade (Fig. 8m). The other three substances that are still increasing are Halon-1301, methylene chloride (CH 2 Cl 2 ) and chloroform (CHCl 3 ). Chloroform had been decreasing in the 1990s and stabilized in the 2000s, but again recently showed an increase (Fig. S11).
Four of the considered chlorinated and ODSs are assumed to have natural emissions and hence above-zero preindustrial concentrations. We estimate those pre-industrial natural background concentrations using a simple budget equation under the assumption of a constant lifetime (IPCC, 2013) of 1 year for CH 3 Cl and 0.8 years for CH 3 Br -minimizing the error term when taking into account anthropogenic emission and atmospheric concentration estimates over 1950 to 1990 by Velders and Daniel (2014). Specifically, methyl chloride (CH 3 Cl) is assumed to have pre-industrial global-mean concentrations of 457 ppt, and methyl bromide (CH 3 Br) with that of 5.3 ppt. Chloroform (CHCl 3 ) is assumed to have a pre-industrial concentration of about 6 ppt, approximately in line with findings by Worton et al. (2006) and the estimation by Aucott et al. (1999) that in 1990 CHCl 3 was at about 8 ppt, with 80 % of emissions assumed to be of natural origin. Lastly, in the absence of other information (a good understanding of the natural versus anthropogenic source fraction or historical industrial production records) the available firn measurements (e.g. Trudinger et al., 2004) supplying information about methylene chloride (CH 2 Cl 2 ) mole fractions in the early 20th century are used to suggest a 6.9 ppt pre-industrial mean concentration with a strong latitudinal gradient that results in northern (southern) hemisphere average concentrations of 12.8 (1.0) ppt. The transition of concentrations of some species between the observational station data and pre-industrial levels are also uncertain. For CH 2 Cl 2 , our derivation is in line with the smooth trajectory of Trudinger et al. (2004), indicating an almost monotonic transition between 1997 values and pre-industrial concentrations (Fig. S7f). Our assimilation approach (which is based on the Walker et al., 2000 data) causes our carbon tetrachloride (CCl 4 ) reconstruction to have a near-zero pre-industrial concentration of 0.025 ppt (0.025 % of its peak value of 100 ppt). We note that Walker et al. (2000) suggest zero pre-industrial concentrations before 1910, although the lowest empirical evidence from firn records suggest < 5 ppt  or 3-4 ppt as measured by S. Montzka for 1863 firn air and reported in Liang et al. (2016).
The seasonal cycle of ODSs and other synthetic GHGs can be influenced by seasonally varying stratospherictropospheric air exchanges, interhemispheric transport, tropopause heights, emissions and, for those substances with OH-related sinks, the seasonally varying OH concentrations. For 11 out of the 17 considered ODSs we find some indication of seasonal cycles based on the analysed station data, namely for CCl 4 , CFC-11, CFC-12, CFC-113, CH 2 Cl 2 , CH 3 Br, CH 3 CCl 3 , CH 3 Cl, CHCl 3 , Halon-1211 and HCFC-22. Our analysis indicates that HCFC-141b also shows some signs of a seasonal cycle, although we here assumed a zero seasonal cycle due to data sparsity (see Fig. S16a). We find the strongest seasonal cycles in the case of the short-lived species CH 3 Cl, CHCl 3 , CH 3 Br and CH 2 Cl 2 with absolute maximal seasonal deviations of −11, −12 and ±9, −32 % compared to the annual mean, respectively. For the radiatively important and longer-lived species CFC-12, CFC-11 and HCFC-22, the seasonal cycle is much smaller, with ±0.2, ±0.4, ±0.8 %, respectively.
Similar to the seasonality, the latitudinal gradient is found to be especially pronounced for the short-lived substances. Specifically, CH 2 Cl 2 with a lifetime of 0.4 years, CH 3 Br with a lifetime of 0.8 years, CH 3 CCl 3 with a lifetime of 5 years, CH 3 Cl with a lifetime of approximately 1 year and CHCl 3 with a lifetime of 0.4 years show substantial latitudinal gradients due to spatially heterogeneous sinks and sources (lifetimes following Table 8.A.1 in IPCC WG1 AR5, 2013). While chemicals with predominantly anthropogenic sources normally exhibit the highest mole fractions at midnorthern to high northern latitudes, the observations for several substances with substantial natural sources exhibit highest mole fractions in the tropics or lower northern latitudes in the recent observational period (e.g. CH 3 Cl in Fig. S10b and c).

Other fluorinated GHGs
The 23 other gases in this study are the hydrofluorocarbons (HFCs), which have recently been added to the substances controlled under the Montreal Protocol (Kigali amendment in October 2016), and those substances whose production and consumption is not controlled under the Montreal Protocol, namely perfluorocarbons (PFCs) as well as sulfur hexafluoride (SF 6 ), nitrogen trifluoride (NF 3 ) and sulfuryl fluoride (SO 2 F 2 ). Except for the latter, the emissions of all these species are controlled under the Kyoto Protocol and covered by most "nationally determined contributions" (NDCs) under the Paris Agreement. However, currently the aggregated greenhouse effect of this group of synthetic GHGs is still almost a factor of 10 smaller compared to the ODSs (cf. Fig. 8g and m). In contrast to the ODSs, nearly all of these other fluorinated gas concentrations are still rising; the exception is HFC-152a, which has stopped growing since 2012 and may now be in decline, (Fig. S33f). Thus, a primary concern with these gases is the potential for substantial climate forcing in the future if uncontrolled growth continues.
The most abundant of these gases is the refrigerant HFC-134a, with 2014 concentrations estimated to be 80.5 ppt, followed by HFC-23 (26.9 ppt), HFC-125 (15.4 ppt) and HFC-143a (15.2 ppt). At the other end of the concentration spectrum, we include results from Ivy et al. (2012) for some PFCs that exhibit low concentrations of 0.13 ppt (C 5 F 12 and C 7 F 16 ) or 0.09 ppt (C 8 F 18 ) ( Table 7). The only fluorinated gas considered to have substantial natural sources, and hence a preindustrial background concentration, is CF 4 with an assumed pre-industrial concentration of 34.05 ppt (see Fig. S26), in line with findings by Trudinger et al. (2016) and Mühle et al. (2010).
For a number of substances, especially the PFCs with lower abundances, there were not sufficient data available to estimate the seasonality of atmospheric concentrations. We consider seasonality only for 3 of the 23 species. HFC-134a has a somewhat atypical pattern of lowest mole fractions in the spring Northern Hemisphere (−2.6 % compared to annual mean) as other gases normally show a summer or autumn low point of concentrations. This spring minimum results from a seasonality of sources of this refrigerant (Fig. S31a), although seasonality in loss also likely plays a role (Xiang et al., 2014). Secondly, the short-lived HFC-152a (lifetime 1.5 years) shows seasonal variations of up to ±13 % while the very long-lived SF 6 (lifetime of 3200 years) exhibits a much smaller seasonality of up to ±0.5 %.
For most of the considered substances, the latitudinal gradient is rather small. Exceptions are the shorter-lived species like HFC-32, whose concentration has risen relatively quickly since 2000 due to rapidly increasing northern hemispheric sources (Fig. S28b), HFC-152a and some other shorter-lived HFCs. For the three heavier PFCs with very low abundances of well below 1 ppt in 2014, namely C 6 F 14 , C 7 F 16 and C 8 F 18 , we incorporated hemispheric data from Ivy et al. (2012). Before about 1990, those three gases are suggested to have reversed latitudinal gradients with higher southern hemispheric concentrations. Due to the very low mole fractions near the limit of measurement, future studies may need to confirm whether those reverse gradients existed (and if so, why). Given the negligible radiative forcing from these gases to date, this uncertainty does not affect the overall results.

The CMIP6 recommendation and data format
We present the community CMIP6 datasets of historical GHG mole fractions. In conjunction with other data, these GHG surface mole fraction datasets are to be used in the historical concentration-driven runs for the Climate Model Intercomparison Project Phase 6 (CMIP6) . Depending on the specific CMIP6 experiment, different protocols and recommendations can apply. Modellers should hence also check the experiment specific descriptions (see the special issue available at http://www.geosci-model-dev. net/special_issue590.html), including protocols regarding the other important forcing input datasets like aerosols, their emissions and optical properties, and land-use patterns, but also short-lived GHGs like tropospheric and stratospheric ozone for models without interactive ozone chemistry.
The historical GHG concentrations of this study are specifically designed to be useful for the historical run, as well as the idealized runs of abrupt4x, 1pctCO2 and picontrol. Also, the PMIP4-related last-millennium experiment will be based on the GHG concentrations of this study (Jungclaus et al., 2017;Kageyama et al., 2016).
Regarding the historical runs of the DECK simulations, the CMIP6 recommendation as decided by the CMIP Panel is as follows: "In the CO 2 -concentration-driven historical simulations, time-varying global annual mean mole fractions for CO 2 and other long-lived GHGs are prescribed. If a modelling center decides to represent additional spatial and seasonal variations in prescribed GHG forcings, this needs to be adequately documented" .
This study provides the data for both the global annual mean mole fractions as well as the mole fraction histories that take latitudinal and seasonal variations into account (see data description further below). CMIP6 modelling groups should indicate which time and space resolution of the data version they applied. All data are freely available via the PCMDI servers (https://esgf-node.llnl.gov/ search/input4mips/) as netcdf files. The data is also available via ftp servers, and in multiple data formats (netcdf, csv, xls and MATLAB mat) as described at climatecollege.unimelb.edu.au/cmip6. In terms of the spatio-temporal resolution, four files for each of the 43 GHGs and the three equivalence species CFC-12 eq., HFC-134a eq. and CFC-11 eq. (Sect. 2.1.10) are provided as follows:  The area-weighted mean over 15 • latitudinal bands is the same as the files under variant I above; III. global and hemispheric means with monthly resolution (filename-code "_GMNHSHmeanXmonth"); IV. global and hemispheric means with annual resolution (filename-code "_GMNHSHmeanXyear").
Given that climate effects will vary depending on whether global, annual-mean or seasonally varying latitudinally resolved surface mole fractions are prescribed, modelling groups are asked to document which dataset(s) they choose. The CMIP6 recommendation for the picontrol experiment are to use the 1850 GHG mole fractions with annual means as provided in Table 8 (CO 2 annual-mean mole fractions of 284.32 ppm, CH 4 mole fractions of 808.25 ppb and N 2 O mole fractions of 273.02 ppb). Other gases are covered, depending on the choice of the modelling group by either following Option 1, Option 2, or Option 3 described in Table 5, or an equivalently suited method that aggregates the radiative effect of the remaining 40 GHGs or a large fraction thereof.
The abrupt4x experiment should keep all GHG mole fractions unchanged from the picontrol run except for the CO 2 mole fractions, which should be increased instantaneously in year 1 (= 1850) of the experiment to 4 times the 1850 value, namely to 1137.27 ppm (Table 10).
The 1pctCO2 experiment should also keep all GHG mole fractions unchanged from the picontrol run except for CO 2 mole fractions. Starting in year 1 of the experiment, CO 2 mole fractions should increase by 1 % per annum, reaching slightly over double the CO 2 mole fractions in year 70 (or 1920, if the start year is set to 1850) with 570.56 and 1264.76 ppm in year 150 (or year 2000) ( Table 9).
As with the abrupt4x and 1pctCO2 scenarios, the historical experiment should diverge from the picontrol run. GHGs should then follow the historical observations as derived in this study, reaching, for example, CO 2 mole fractions of 397.55 ppm in 2014, and CH 4 and N 2 O mole fractions of 1831.47 and 326.99 ppb, respectively. Modelling groups should document which spatial and temporal resolution (see above) of the provided data they use, as the climate effect will likely be different with different resolutions.
The future concentration pathways, the so-called "SSP-RCP" scenarios, considered under ScenarioMIP (O'Neill et al., 2016) are planned to provide the same data formats and spatio-temporal resolutions. The methodological approach to derive and adapt both seasonality and latitudinal gradients in this study was designed such that a future extrapolation will be possible.

The vertical dimension
The purpose of our reconstructions is to provide radiative forcing for climate models. This radiative forcing depends on the vertical as well as horizontal distribution of a gases' mole fraction. Our reconstructions describe only surface concentrations and modellers need some method for calculating the 3-D distribution. If the model is capable of calculating tracer transport, includes any sinks and sources in the free atmosphere, and has an appropriate treatment of the boundary layer, we recommend using this study's surface reconstruction as a mole fraction lower boundary condition for a mass balance inversion. If this is not possible, we propose a simple equation to reflect the relaxation of horizontal gradients with height and the upward propagation of mole fraction changes from the surface.
In the case of CO 2 , there are no sinks in the middle and upper troposphere or stratosphere and only slight sources due to the oxidization of CH 4 and carbon monoxide (CO). Evidence from ESMs (Fig. 13) indicates an almost well-mixed tropospheric column in the tropics and little or partly reversed vertical gradient in the southern troposphere, while the annualmean gradient in the Northern Hemisphere is -depending on the season -variable. The annual average vertical gradient in the Northern Hemisphere is decreasing in all CMIP5 ESMs analysed here (Fig. 13). Table 9. 1pctCO2: global-mean annual-mean surface CO 2 concentrations for idealized CMIP6 experiments 1pctCO2. All other gases, as in picontrol run (see Table 8 Fig. 14a and b). Note that tropospheric columns of non-CO 2 gases are -for simplicity -assumed to be well-mixed. The assumed age of air at the 1 hPa level for CO 2 is 5 years.
In order to enable the implementation of surface mole fractions in models that do not have an inherent transport model to capture vertical gradients, we offer here simplified parameterizations as default options. While an assumption about a well-mixed atmospheric vertical column seems to be a justifiable simplification, these simple vertical extensions could increase the realism, vertical heating struc-ture and overall climatic effect. Specifically, modelling teams could use the following approximation to extend surface concentration fields (at the 1000 hPa level) towards higher tropospheric and stratospheric levels. First, a bell-shaped concentration distribution is assumed at the 100 hPa level for the  C (l, 100 hPa, t) = C (global, 1000 hPa, t) ...
with C (global, 1000 hPa,t) indicating global-average, annual-average concentrations at the surface 1000 hPa level at time t. Ideally, a smoothed mean-preserving monthly time series of these annual-average global averages is used to prevent step changes from calendar month 12 to 1. Equivalently, C (global, 1000 hPa,t − 5 yrs) indicates the global-average, annual-average surface mole fraction 5 years earlier. The sin(l) 2 2 factor depends on the latitude l and results in the bell-shaped concentration curve with concentrations at the tropical 100 hPa level to be identical to the global average surface concentrations, while the polar mole fractions are effectively of a medium age (2.5 years in the case of linearly increasing concentration history). Having defined this 100 hPa concentration level, the tropospheric mole fractions at latitude l and pressure level p (with p > 100 hPa) can then be assumed to be a simple linear interpolation between the surface mole fraction level at latitude l and the 100 hPa level, so that C (l, p, t) = C (l, 100 hPa,t) + (C (l, 1000 hPa,t) −C (l, 100 hPa,t)) · (p − 100 hPa) (1000 hPa − 100 hPa) .
Above 100 hPa, i.e. in the tropical upper troposphere and stratosphere, the mole fraction is a simple linear interpolation between the 100 hPa level and the top-of-the atmosphere 1 hPa level that is assumed to have a median age of air of 5 years, so that for p < 100 hPa C (l, p, t) = C (global, 1000 hPa,t − 5 yrs) ...
This equation captures the general form of the vertical CO 2 mole fraction gradient observed in CMIP5 ESMswith the 100 hPa being an approximate division line of the vertical CO 2 gradient in all CMIP5 models (see bold red line in Fig. 13). The annual-average vertical gradient in the Northern Hemisphere will somewhat reduce the effect of the strong surface latitudinal gradient. The idealized shaped of the above parameterization for a hypothetical flat surface mole fraction of 100 ppm is shown in Fig. 14b. Assuming linearly increasing surface mole fractions from a South Pole minimum towards a 3 ppm higher North Pole maximum will -under this simplified parameterization -result in an almost zero vertical tropospheric gradient in the Southern Hemisphere (Fig. 14a).
The stratospheric concentration can then be modelled for p < p tropopause as follows: C (l, p, t) = C global, 1000 hPa, t − 1 yrs · p p tropopause (l) s , with C (global, 1000 hPa, t) being the global-mean and annual-mean surface mole fraction of the previous year, p/p tropopause (l) being the ratio of the pressure at level p and the tropopause pressure at that latitude, and s being a gasdependent scaling factor (Table 11).

Discussion
We compare our results with a number of other data products. First, a comparison with the previous CMIP5 recommendation for historical GHG concentrations is provided (Sect. 5.1). Second, we analyse and compare our CMIP6 recommendations to what the ESMs from the previous CMIP5 intercomparison produced in terms of CO 2 concentration fields in the emissions-driven runs (Sect. 5.3). Third, we compare our datasets to the other global-mean, hemispheric and latitudinally resolved datasets, namely the NOAA Marine Boundary Layer product and the WDCGG time series (Sect. 5.4).

Comparison to CMIP5 input datasets
For the CMIP5 inter-comparison, GHG concentrations were specified for historical times until 2005, followed by the Representative Concentration Pathways (RCPs) and their extensions until 2300. The recommendations for GHG concentrations were global-and annual-mean time series (Meinshausen et al., 2011), not including a seasonal cycle or latitudinal gradient. Those historical time series were composite products of existing ice core and instrumental data annual means (see references in Meinshausen et al., 2011). Global, annual-mean CO 2 concentrations over 1975 to 2005 were very close (< 0.7 ppm different) to our current recommendations for CMIP6. The CMIP5 time series did not show the slight maximum in CO 2 concentrations around 1973 (difference 1.2 ppm), and was generally lower between 1940 and 1956 at about the time of the World War II, when CO 2 concentrations briefly plateaued (differences between 1.0 and 2.3 ppm) (Fig. 15). While the CMIP5 historical GHGs were an ad hoc extension to the RCP pathways, our CMIP6 recommendation advanced the integration of historical data by accounting for latitudinal gradients (ice core data in CMIP5 has not been adjusted for the latitudinal gradients) and by taking into account a large array of additional data beyond a single network average for more recent times.
Recommended global-mean CH 4 concentrations for CMIP5 were generally lower than derived here, up to 50 ppb around 1910 and between 25 and 30 ppb more recently (2000)(2001)(2002)(2003)(2004)(2005). The primary reason is that the CMIP5 data did not take into account the strong latitudinal gradient of CH 4 concentrations. For N 2 O concentrations, the CMIP5 historical time series did not capture some higher-frequency variability, which caused the CMIP6 recommendation for the picontrol 1850 global-mean concentration being lower by around 2.5 ppb, and N 2 O concentrations in the 1910s being higher by up to 2.3 ppb (Fig. 15).
Overall, CMIP5 and CMIP6 recommendations are relatively similar. The 1850 picontrol values at the time of CMIP5 were slightly higher for CO 2 and N 2 O (0.14 % or 0.4 ppm and 0.87 % or 2.4 ppb, respectively), countered to some degree by slightly lower values for CH 4 (2.18 % or 17.3 ppb). This is equivalent to a small net change in base year radiative forcing of 0.0065 W m −2 , when applying linear radiative efficiencies of IPCC AR5 (Appendix 8.A in IPCC WG1 AR5).

Comparison to CO 2 station data between 1958 and 1984
As our data synthesis used monthly station data only from 1984 onwards (except for Mauna Loa annual averages back to 1958), a comparison to available station data from before 1984 is useful to qualitatively validate the extension method applied in this study. While latitudinal gradients (or rather their first two EOFs) and seasonality changes are extended by regression (Sects. 2.1.4-2.1.7), the CO 2 fields' global-mean has been optimized to match both the annual average Mauna Loa record and the Law Dome ice record, specifically our smoothed version thereof (see Table 1 and Fig. 7k). Thus, it is informative to compare our data product to available station data from the period before 1984 both in terms of seasonality and the absolute amplitude (which is derived from the global-mean and the regressed latitudinal gradient) (see Fig. 7). We here use the Scripps CO 2 data series, available at http://scrippsco2.ucsd.edu/data/ atmospheric_co2/sampling_stations. In general, the comparison suggests that this study's data product matches earlier station data rather closely, thereby validating our chosen extension approach to some degree. There are two noteworthy issues arising from this comparison though. For high southern latitudes, both Law Dome as well as SPO in situ and flask station data are available. It seems that our CMIP6 high-latitude data in the Southern Hemisphere could be ∼ 1 ppm too low over the period 1959-1972 (Fig. 7l). Earlier, in 1958, and subsequently from 1973 onwards, the match is rather close between SPO station data at −90 • and our latitudinal average for the −90 to −75 • zonal mean. Given our data product matches the MLO record quite closely (somewhat by design, given the optimization to match the annual-average MLO record over that time), this points to a slightly exaggerated latitudinal gradient between 1959 and 1972.
The second issue relates to a bump in the concentration series centred around 1974. In our data assimilation, this bump is a propagation of an anomaly in the MLO record over that time and seems to a lesser degree to also show up in other Northern Hemisphere records. However, the southern hemispheric SPO record does not show (or only minimally shows) this slight upwards aberration from 1972 to 1974 and subsequent slowing and stagnating growth from 1974 to 1976 (while the lower precision Law Dome data would be consistent with that MLO pattern, see Fig. 7k).
To what extent this bump has been present in the Southern Hemisphere is unknown, although earlier studies  relate the increased atmospheric CO 2 concentrations to decreased oceanic uptake during the El Niño back then. Such a process explanation would suggest that the atmospheric signal is also present throughout large parts of the Southern Hemisphere, while a predominantly extra-tropical land-related respiration increase during El Niño could imply that the signal is predominantly present in the Northern Hemisphere. In summary, our assimilation's hemispheric upwards anomaly around 1974 of around ∼ 2 ppm could largely be an artefact of our methodology, which propagates the MLO anomaly globally under the assumption of exogenously emission-regressed latitudinal gradients.

Comparison to CMIP5 ESM CO 2 concentration fields
Several ESMs during CMIP5 used prescribed CO 2 emissions instead of CO 2 concentrations and derived CO 2 concentration fields endogenously. For the year 1875, we see that models vary greatly, with some showing reverse latitudinal gradients with higher concentrations in the south (e.g. CanESM2), almost no gradient (CESM1-BCC), a local maximum in the tropics with lower poleward concentrations (MIROC-ESM) and very heterogeneous fields with high concentrations over the tropical rainforests (NorESM1-ME) (see Fig. S41). Similarly, for 1990 (Fig. S42), the fields are dissimilar, with some models exhibiting very strong north-south gradients (MPI-ESM-LR), while others show only subtle gradients (CanESM2), although all models indicate an increase of northern hemispheric concentrations compared to the global mean between 1875 and 1990 (Fig. S45).
Though not as strong as NorESM1-ME, most models show a slight tropical maximum in the latitudinal gradient (exceptions are CanESM2, MIROC-ESM) during both 1875 and1990 (Figs. S46 andS47). The high-latitude southern concentration deviations from the global-mean in the 1875 time slices have different signs across the models, with some indicating clearly lower concentrations (BNU-ESM, MPI-ESM-LR, NorESM1-ME) and others suggesting slightly positive concentrations (CanESM2, MIROC-ESM in 1875). The average of three CMIP5 ESMs with full CO 2 data coverage at the surface 1000 hPa level and global mean CO 2 mole fraction values in line with observational records (CanESM2, MPI-ESM-LR and NorESM1-ESM) shows a latitudinal gradient for 1990 comparable to the observed one derived in this study (Fig. 9b). Thus, given that the preindustrial latitudinal gradient is almost flat for the models with the highest skill to replicate current observations, we assumed constant mole fractions with latitude for pre-industrial times.
In general, all ESMs show climatological seasonal cycles of CO 2 concentrations similar to the seasonality derived in this study (Fig. 9a). The climatological 1861-1890 average concentrations across the models clearly exhibit higher seasonality in the Northern Hemisphere, especially above 40 • N. While the seasonality in some models is weaker, especially CESM1-BCC, others show variations of up to ±10 ppm (MPI-ESM-LR). In addition, the latter model exhibits a larger Southern Hemisphere seasonality than other models and what we observe. As expected from our analysis of observational data, this seasonality strengthens up to 1990 across all models (cf. Figs. S43 and S44). The latitudinal spread of the northern hemispheric minimum extends southwards towards the equator in August, September and October as we observe (Fig. 9a), with the exception of the BNU-ESM (Fig. S44), which indicates a northward propagation of the minimum summer concentration values.
Overall, the basic features of the latitudinal gradient and seasonal cycle are represented in the ESMs as seen in the observational data. However, the variation across the models is substantial. This difference of several parts per million (ppm) in the latitudinal gradient or seasonal cycles will lead to follow-on differences in the climate response observed in those models.
As common input for the CMIP5 concentration-driven experiments, all models were provided with the same historical global-and annual-mean CO 2 concentrations. Some models had the capability to nudge internally generated CO 2 concentration fields to match the prescribed annual and global mean CO 2 concentrations. Nevertheless, the differences in those internally generated fields can be substantial, as our analysis from CMIP5 shows, and different from the observations.
For future model inter-comparisons, it seems preferable that any concentration-driven runs would use the same starting point. Of course, the longer-term aspiration has to be that emission-driven ESMs reliably reproduce observational concentration patterns. For CMIP6, modelling groups are requested to document their choice of concentration input data, specifically in relation to the chosen temporal and spatial resolutions.

Comparison of global means to NOAA marine boundary layer products and WDCGG
The primary observational data product with coverage across all latitudes is the marine boundary layer (MBL) or GLOB-ALVIEW fields (NOAA, 2013;NOAA ESRL GMD, 2014c), produced by the NOAA based on the Cooperative Global Air Sampling Network (Conway et al., 1994;Dlugokencky et al., 1994b;Trolier et al., 1996) for CO 2 , CH 4 and N 2 O (available at http://www.esrl.noaa.gov/gmd/ccgg/mbl/ mbl.html, with N 2 O data from P. Tans, personal communication, 2016). The aggregation method used to produce this dataset is to first fit parametric functions to the weekly data of each station, thereby providing a gap-filling method. In a next step, the procedure fits smooth weekly latitudinal distributions to the various station data points .
These latitudinal distributions are then combined into a 2-D field of latitude versus time, comparable to this study's data product. The time period of these NOAA MBL data products is 1979-2014for CO 2 , 1983-2014for CH 4 and 2001-2014 for N 2 O. The four main methodological differences between the NOAA MBL data product and ours are as follows: (1) the NOAA data product has a higher resolution in time (weekly instead of monthly) and latitudes; (2) the NOAA MBL data product includes only a subset of the NOAA network data (sites within the marine boundary layer), while this study mixes both NOAA and AGAGE network data in the case of CH 4 and N 2 O; (3) this study characterizes the global fields by lower rank representations (EOFs) of annual mean latitudinal gradients and seasonality, while the NOAA product derives latitudinal gradients (and seasonality thereby only implicitly) directly from the observations at each time step; and (4) this study is extended by ice core and firn data, regressions, and extrapolation or interpolation to span the full-time period between year 0 and 2014. Thus, this study seamlessly merges in situ observational, air archive, ice and firn data to generate a comprehensive data product.
For several applications, the NOAA data product has clear advantages. However, with the task of producing a continuous data product beyond the instrumental observations, this study had to choose a method that was readily extendable. Hence, this study chooses the characterization of global fields into global means, latitudinal gradients and seasonality. This implies a high degree of regularizations by relying on EOFs and corresponding scores. By regression, these EOF scores for latitudinal gradients or seasonality changes can be easily extended to cover the full-time period of interest. Hence, our method allows an estimate of global-means even if there is only a single data point (such as a Law Dome ice core record for a specific year), under the assumption that latitudinal gradients and seasonality are captured by the derived EOFs and regressed EOF scores.
Global-average time series of monthly GHG mole fractions are also provided by the World Data Center for Greenhouse Gases (WDCGG) (Tsutsumi, 2009). The WDCGG product uses similar smoothing techniques to the NOAA product, but include, like this study, a broader set of measurement stations, both in terms of regional coverage (including continental stations) and different networks that use different calibration scales, sampling, gas handling, etc.
We compare the results of this study and NOAA MBL and WDCGG products. Overall, our monthly hemispheric averages of CO 2 closely match the NOAA MBL product. The NOAA MBL product (which is not the same as NOAA network monthly averages) suggests a slightly faster increase of northern hemispheric concentrations in the latter months of each calendar year (cf. thick and thin orange lines in Fig. 16a). Specifically, this difference results from the midlatitude northern hemispheric bands from about 1995 onwards (with monthly-average differences of up to 4 ppm) where our study is higher than the NOAA MBL product. This could be because this study does not screen out land stations closer to the pollution sources, as the NOAA MBL product does.
Likewise, the WDCGG includes a broader set of stations and matches very closely with our global-mean time series, with our study being very close to WDCGG or in between NOAA MBL and WDCGG (Fig. 16a). Given that the difference between the NOAA study and our study has a strong seasonality, the nature of those pollution sources and how they become mixed in the atmosphere, if these effects contribute to the differences, could be a combination of fossil-fuel-related and (more seasonally varying) biospheric sources (Fig. 17c). The southern hemispheric means of our study and NOAA MBL are very closely matched (cf. thick and thin blue lines in Fig. 16a). Consequently, the globalmean concentrations from NOAA MBL and our study are closely matched, although again our data suggests NH autumn concentrations rising slightly faster than the NOAA MBL product, reflecting the northern hemispheric difference (cf. thick and thin black lines in Fig. 16a).
For CH 4 , the differences between this study and the NOAA MBL data are more systematic and stronger (∼ 10 ppb), with generally higher surface CH 4 concentrations implied by this study (Fig. 16b). Again, this study's global mean matches the WDCGG closely or sits in between the NOAA MBL and the WDCGG data products. There are some differences in the seasonality compared to the NOAA MBL product though. The seasonal variation is similarly shaped between our study and the NOAA MBL for the Southern Hemisphere, although there seems to be a slight phase shift of about a month with the NOAA MBL product in the Southern Hemisphere, assuming a slightly earlier increase and decrease and slightly higher amplitude (Fig. 16b). This phase shift of the Southern Hemisphere, together with sometimes lower peak northern hemispheric concentrations in the NOAA MBL product, suggests global-mean NOAA MBL CH 4 concentrations that show a double peak within any year, while our data assimilation and the WDCGG product suggests a smoother single-peak oscillation of globalmean CH 4 concentrations (Fig. 16b). This peak results from the mid-northern latitudes, where in the summer months, our study suggests up to 40 or 50 ppb higher concentrations (Fig. 18c).
For N 2 O, the WDCGG global mean and our data match very closely, with our implicit smoothing due to our lower rank representation of seasonal cycles and latitudinal means resulting in a smoother global mean compared to WDCGG (Fig. 16c). Similarly, the draft data product of the NOAA MBL indicates almost identical mole fractions to our concentration fields over the available time period from 2001 to 2014, with maximal differences being 0.8 ppb (Fig. 19).
In summary, our dataset closely matches the global means of WDCGG in many years, but provides a complete 2-D field of mole fractions. In comparison to the NOAA MBL prod-ucts, there is one more systematic difference. Our CMIP6 GHG concentration fields are meant to represent the mean monthly state of the latitudinally averaged surface atmosphere, including land and polluted areas, i.e. not confined to areas with background concentrations (Sect. 6). This is a key difference to the NOAA Marine Boundary Layer product, which is a consistent background concentration product, resulting in slightly lower global-mean concentration estimates.

Comparison to mid-troposphere CO 2 concentrations by NASA Aqua satellite
Since its launch in 2002, the Aqua satellite and its infrared sounder provides an additional independent data product to estimate tropospheric CO 2 mole fractions. Rather than at ground level, this sensor provides an estimate of tropospheric concentrations with a maximum sensitivity around 7 km height, i.e. in the mid-troposphere. In the tropics and the parts of the Southern Hemisphere that are covered by the Aqua satellite product, the agreement between our data and the AIRS level 3 data (available at ftp://acdisc.gsfc.nasa.gov/ ftp/data/s4pa/Aqua_AIRS_Level3/AIRX3C2M.005/) is encouraging, although the overall gradient is lower in line with 3-D atmospheric transport model results (Olsen and Randerson, 2004). In the Northern Hemisphere, the difference in the phase and amplitude of the seasonal cycle is most apparent, with satellite data showing a later onset of the autumn concentration increase by about 4 months, while the drawdown of concentrations seems closer in phase between midtroposphere and surface concentrations (Fig. 16a). Overall the amplitude is less than half of the surface hemispheric mean amplitude, leading to seasonally higher winter and lower summer concentrations of our surface data product in the Northern Hemisphere by up to 10 ppm (Fig. 17e). This systematic difference between ground-level and midatmosphere concentrations, supported by 3-D transport modelling studies (Olsen and Randerson, 2004), has ramifications for the implementation of vertical concentration profiles in climate models. Without taking into account the dampened seasonal cycle and latitudinal gradient in the mid-and higher troposphere, the models could overestimate the variations in the radiative effects, if our latitudinally and monthly resolved surface concentration fields are prescribed. On the other hand, if global-annual mean values are prescribed, the radiative forcing effect variations over latitudes and within a year will obviously be underestimated.

Comparison to other literature studies
Our GHG derivations over the recent instrumental periods are based on the AGAGE and NOAA station-by-station data and we extended our 2-D concentration field results back in time by using, for example, global-mean estimates of previous studies (Sect. 2). The AGAGE and NOAA networks themselves publish global-mean results, and WMO as well as other literature studies produce composite long-term globalmean and/or hemispheric concentration estimates. Thus, while often not entirely independent, as the studies use the same original data sources or we rely on some studies' previous derivations, we here provide a comparison to a selection of the literature. Specifically, in addition to the comparisons with NOAA marine boundary layer, WDCGG and NASA Aqua satellite data, we discuss some instances where our results show substantial differences compared to earlier studies that have derived hemispheric or global means from instrumental data Rigby et al., 2014), from firn data Trudinger et al., 2016) or are themselves composites of multiple data sources (Martinerie et al., 2009;Velders and Daniel, 2014;WMO, 2014). The comparisons are shown in the panels (f), (g), and (h) of the fact sheets for each gas (Figs. 9,11,12, and S1-S40) with the comparison data described in Table 12. High-latitude Northern Hemisphere data for atmospheric mole fractions is reported in the supplement of Buizert et al. (2012), provided by Vas Petrenko and Patricia Martinerie (Table 12). For CO 2 , the Petrenko dataset has, as expected for the high northern latitudes, a very strong seasonal cycle, consistent with our less pronounced northern-hemispheric-average cycle, as the data represents higher northern latitudes (Fig. 9f, g, and h). The long-term concentration trend over time in the Petrenko CO 2 record seems similar to the global CMIP5 dataset which in turn was based on previous Law Dome data, indicating a slight local maximum in 1890 and lower 1940s plateau (cf. Figs. 9g and 15).
For CH 4 , the Petrenko record shows a comparable, yet again stronger, seasonality. The annual means are very comparable to our derivation (compare the high-latitude red circles, indicating annual-mean station averages of our analysis and Petrenko data as shown in Fig. 11f), although there are some steps in annual means in the Petrenko dataset around 1956 and 1975, which are not present in our dataset (Fig. 11f). For earlier times, i.e. between 1860 and the 1920s, the Petrenko annual mean is closer to our global mean, not the high-latitude estimates, as our study assumes a large latitudinal gradient based on the NEEM and Law Dome data differences (Sect. 2) (Fig. 11g).
For CCl 4 , the Martinerie data show a lower increase from 1955 to the late 1960s and strong increase around 1970. The firn data by  suggest an earlier start of atmospheric concentration increases around 1890, and then slightly lower levels over 1960-1990compared to the WMO (2014 and Velders and Daniel (2014) time series which we use as an optimization target for our 2-D fields. The difference between the Butler and Velders datasets can probably be explained by the wider firn air age distribution in the study by Butler. The findings by Sturrock et al. (2002) suggest an onset of detectable atmospheric concentrations around 1920 (Fig. 5f therein). The NOAA global mean that is available from 1992 onwards (Montzka et al., 1999 updated at http: //www.esrl.noaa.gov/gmd/hats/combined/CCl4.html) and indicates initially slightly higher global mean estimates than our derivation, which is for the instrumental period based on 6 AGAGE and 13 NOAA HATS stations (Fig. S1f, g, h).
For CFC-11 (Fig. S2g), the NOAA Montzka-ODS reconstruction of the global-mean is slightly higher (1 ppt) than ours, which is almost identical to WMO (2014) and data by Velders and Daniel (2014). Those differences presumably result from differences in station coverage, different calibration scales, and air sampling and analysis techniques between the NOAA and AGAGE networks. The seasonalities show comparable amplitudes, as they do for CFC-12 (Fig. S3h). With CFC-115, our study follows the historical shape of the WMO (2014) record, with Velders and Daniel (2014) being slightly lower (∼ 0.5 ppt) (Fig. S6f).
For CH 2 Cl 2 , the in situ instrumental record we use only reaches back to 1994, although the Cape Grim air archive record goes back to 1978. From 1994 to 2003, the northern latitude measurements imply a mole fraction reduction from 40 to 30 ppt, whereas the southern hemispheric measurements are almost flat during that time (also shown in Trudinger et al., 2004) (Fig. S7f). We note that there are substantial uncertainties in the pre-1995 concentrations, as for example Koppmann et al. (1993) reported 18 and 36 ppt average concentrations for the southern hemispheric and northern hemispheric measurements from a 1989 Atlantic transect ship measurement campaign (not shown in the figure). This could imply a global average value of approximately 27 ppt in 1989, instead of the 20 ppt assumed in this study -although different calibration scales might contribute to this difference. Recent seasonality and increases of CH 2 Cl 2 are closely matching other time series, such as the AGAGE and NOAA results from GCMS measurements (Fig. S7f). However, there is a slight offset in the absolute level, possibly caused by our study not sorting out data points from so-called pollution events in the case of AGAGE data for CH 2 Cl 2 , whereas NOAA results are from flasks collected only in baseline-air conditions .
For CH 3 Br, our CMIP6 recommendations match the NOAA very closely (Montzka et al., 2003 updated on ftp://ftp.cmdl.noaa.gov/hats/methylhalides/ch3br/flasks) and AGAGE global means (2014) after 1995. Before then, the  global-mean firn reconstruction coincides closely with our southern hemispheric mean. The 2004 firn reconstruction by Trudinger et al.(2004) is close to the southern hemispheric mean, but shows somewhat more variation than the smooth exponential increase assumed by this study, WMO (2014), and Velders and Daniel (2014).
For CH 3 CCl 3 , the overall agreement between the different (although not independent) studies considered here is excellent, for example the high northern latitude data from Martinerie (Buizert et al., 2012;Martinerie et al., 2009) in the South Pole firn data reconstruction , approximately in line also with the findings by .  (Cunnold et al., 2002(Cunnold et al., , 1997Fraser et al., 1996;Prinn et al., 1990Prinn et al., , 2001Prinn et al., , 2005Reimann et al., 2005;Simmonds et al., 1998)   are updated from data displayed in Fig. 1 in Montzka et al. (1999), with CH 3 Br data published in Montzka et al. (2003) and with HFC data published in Montzka et al. (2015). The global-mean annual average concentrations that were used as default recommendation for concentration-driven runs in the CMIP5 experiment (Meinshausen et al., 2011).

CMIP5 CTRL. Many
The global-mean annual average concentrations in 1850 that were recommended as picontrol concentrations in the CMIP5 experiment (Meinshausen et al., 2011).
As briefly discussed in Sect. 3.4, the CHCl 3 history in this study relies on the Worton et al. (2006) reconstruction, whose shape is similar to Trudinger et al. (2004), although the latter indicates lower global mean concentrations and not the diminishing latitudinal gradient suggested by Worton et al. (2006). As with other gases (e.g. CH 2 Cl 2 ), the implied pre-industrial value of around 6 ppt should be investigated in the future (Fig. S11).
For Halon-1211, the recent study by Vollmer et al. (2016) and the earlier study by Sturrock et al. (2002) (not shown) suggest slightly higher initial concentrations (around 1975-1988) compared to the initially lower and then larger exponential increase we assumed by following Velders and Daniel (2014). We follow the global-mean derivation in the CSIRO inversion from Vollmer et al. (2016) in the case of Halon-1211. After 1990 the southern hemispheric reconstruction by the Bristol and CSIRO inversions  are slightly lower and hence the latitudinal gradient slightly larger than what we derived from the AGAGE and NOAA station data, but the differences are small (Fig. S12f). The Cape Grim measurements analysed on the UEA volumet-ric scale (Newland et al., 2013) are also in good agreement with the small offset to our global mean, consistent with the derived latitudinal gradient (Fig. S12f). Similarly to Halon-1211, the very early concentration increases of the Halon-1301 between 1970 and 1978 are higher in the  study than in Velders and Daniel (2014), and again the more recent years from 2007 onwards (Fig. S13h) are higher in Vollmer et al. (2016). In those latter years, our aggregation of AGAGE and NOAA station data, however, suggests slightly lower concentrations, although the absolute difference (0.05 ppt) is within the measurement uncertainty and the overall agreement is very good. The Newland et al. (2013) study of southern hemispheric concentrations at Cape Grim would suggest slightly lower concentrations, although part of the slight offset could be related to differences in scales. However, our Halon-1301 record suffers from a potentially inadequate scaling of the latitudinal gradient. A low gradient around 2000-2002 ( Fig. S13d and f) results from our scaling with global emissions that are assumed to drop in that period (Velders and Daniel, 2014) although subsequent station data suggest again a slightly stronger gradient. Furthermore, a second issue with our Halon-1301 record is a slight drop in the monthly data in year 2014 (Fig. S13f), which is likely an artefact of our assimilation procedure, to be corrected by assimilations that consider observational data beyond 2014.
Halon-2402 is likely the most obvious example where a shifting spatial coverage density of measurements can lead to small jumps in latitudinal gradients or global means ( Fig. S14f and h). The overall mole fractions are very small and the early agreement between the WMO (2014) time series and the Vollmer et al. (2016) findings is very good. In 2009, when data coverage increased, the latitudinal gradient is suggested to suddenly decrease, which is likely an artefact of the assimilation procedure that is only able to cope with time-varying data coverage to a certain degree (Sect. 2). However, overall, the implied shifts of 0.02 ppt are negligible in the larger picture, and certainly negligible for radiative forcing, as the shift in southern hemispheric radiative forcing is equivalent to only about 0.000003 W m −2 (Fig. S14h). Halon-2402 is also an illustration of how big differences in some measurement scales can potentially be. The Cape Grim data analysed by Newland with a volumetric UEA scale indicates 10-15 % lower concentrations (Fig. S14f) (Newland et al., 2013).
For HCFC-142b our derived global-mean is in the middle of the AGAGE and NOAA network averages, despite our study including those data points that are subject to "pollution" events in the case of HCFC-142b, with large positive outliers (Fig. S17f), similar to in the case of HFC-134a (Fig. S31f). Pollution events might, however, be contributing to the difference between our HFC-152a globalmeans and the two independently derived network global means for AGAGE and NOAA, which largely exclude pollution events by using statistical methods or conditional sam-pling ) (see Fig. S33f). Two more issues can be observed with HCFC-142b data. Firstly, our end of 2014 concentrations are somewhat uncertain and in this case possibly incorrectly decreasing, which results from the smooth annual mean representation and our assimilation procedure. The differences are again very small and negligible in radiative forcing terms, but a smooth connection will have to be designed for the adjacent datasets representing SSP-RCP scenarios. Secondly, since 2010, our estimates for the HCFCs, namely HCFC-22 (Fig. S15f), HCFC-141b (Fig. S16f) and HCFC-142b (Fig. S17f), indicate smaller increases than implied by the post-2010 non-observational scenario data represented by Velders and Daniel (2014). As in the early study by Sturrock et al. (2002), our study represents the slow onset of HCFC-142b concentrations in between 1960 and 1990 as shown in WMO (2014) and Velders and Daniel (2014).
For the three main PFCs, i.e. CF 4 (Fig. S26), C 2 F 6 (Fig. S18) and C 3 F 8 (Fig. S19), we find a similar and good agreement of the main studies. The outliers are the previously recommended CMIP5 concentrations (Meinshausen et al., 2011) for these gases, which were at the time not yet based on either the Trudinger et al. (2016) or Mühle et al. (2010) studies. As mentioned above, the concentrations of the lesser important PFCs, C 4 F 10 (Fig. S20), C 5 F 12 (Fig. S21), C 6 F 14 (Fig. S22), C 7 F 16 (Fig. S23) and C 8 F 18 (Fig. S24) are based on the Ivy et al. (2012) reconstructions, with reversing latitudinal gradients in the case of C 6 F 14 , C 7 F 16 , and C 8 F 18 , which are unexplained so far and require further confirmation. Our historical c-C 4 F 8 concentrations are based on the study by Oram et al. (2012) with assumed conversions of the Cape Grim measurements to northern hemispheric and global averages.
For HFC-43-10mee, we based our trajectory on the Northern and Southern Hemisphere estimates of  with relatively small latitudinal gradient and hemispheric means being informed by the recently available observations since 2010 from the AGAGE Medusa instruments (Fig. S29f). Note that for HFC-365mfc data (Fig. S37), the difference between the station data and those published in Montzka et al. (2015) reflects a difference that is now much smaller after a calculation-related correction was applied to the NOAA calibration scale after the publication of Montzka et al. (2015). All studies are now in relatively close alignment with the shown AGAGE network average, the Vollmer et al. (2011) study and our derivation (which is slightly lower, < 0.1 ppt). In addition, the air archive and AGAGE network analysis by Vollmer et al. (2011) investigated the HFCs HFC-236fa, HFC-227ea and HFC-245fa. Those results are closely aligned with the ones constructed here based on the WMO AGAGE network average estimates (Figs. S35, S34, S36).
Like our study, there are also studies that assimilate a wide range of gases with latitudinal and seasonal variation. For example, the AGAGE network assimilation with a 12-box model and optimization approach to reconcile emissions and concentrations  produces four semihemispheric concentration time series with three vertical levels (Rigby et al., 2014). Those studies based on AGAGE data are more comprehensive than this one, as both emissions and concentrations as well as lifetimes are optimized and reconciled. In our case, we only assimilate AGAGE and NOAA observations to derive atmospheric mole fractions in 15 • latitudinal bands (Sect. 2).

Limitations
Even though the presented dataset of historical surface GHG concentrations is -to our knowledge -more comprehensive than other composite datasets before it, there are several key limitations.

Specific use of dataset
First, the dataset was assimilated from several sources to provide a common starting point for global climate models as part of the CMIP6 experiments. Thus, for example, the data was not designed as a starting point for inversion studies, which estimate emissions, or studies of biogeochemical processes. Those studies tend to require pure observations, or at least products with appropriate uncertainty information (including auto-correlations) attached to it, rather than partly interpolated composite products. As mentioned earlier, our assimilation does not incorporate early atmospheric CO 2 measurements from the South Pole, which might result in a systematic bias for that latitude for some years of ∼ 1 ppm (Fig. 7l). This warning in terms of our data use is especially important for the fine-grid interpolation we present. The 0.5 • mean-preserving smooth interpolation should not be misinterpreted to portray measurement information at such a fine scale.

No vertical and longitudinal resolution
The purpose of forcing climate models correctly would best be accomplished by vertically resolved latitudinal and longitudinal fields, which (in the case of CO 2 ) even include a diurnal cycle. Our latitudinally and monthly resolved dataset offers climate models an option to capture some key variability compared to the global-and annual-mean CMIP5 concentration recommendation (Meinshausen et al., 2011). However, a correct implementation of this additional monthly and latitudinal variability is also dependent on an appropriate propagation of the surface signal throughout the troposphere and stratosphere. For example, some studies (Olsen and Randerson, 2004) find that column CO 2 is found to only exhibit roughly half of the latitudinal gradient and seasonal variation compared to the surface concentrations. In the CESM1 model (Hurrell et al., 2013) with prescribed surface GHG concentrations, the vertical propagation of the CO 2 concentration is assumed to be constant. In the case of the other GHGs (CH 4 , N 2 O and CFCs) a constant concentration in the troposphere and a decrease of the concentration in the stratosphere is assumed in CESM1. In particular, the scale heights in the stratosphere of these trace gases depend on latitude, which produces a more realistic stratospheric distribution. We recommend vertical extensions to our surface concentration reconstructions only in the case that the model has no intrinsic transport model or extension parameterization. Furthermore, we do not include the longitudinal variation. Again, specifically for CO 2 , this longitudinal variation might be systematic given the land-ocean contrast. For example, the MPI-ESM-LR model indicates systematically higher surface CO 2 concentration over land, which in turn would have a radiative effect (Figs. S41 and S42).

Limited filtering of station measurements
Our assimilation procedure is a rather simple one and does not attempt to offset potential biases due to day-and nighttime sampling biases for CO 2 in the case of some flask measurements, or whether including pollution events would bias the latitudinal averages towards higher-than-current-average values. In a world with continuing point sources, screening out pollution effects might cause proposed averages to lag slightly behind the true average concentration. The question is whether the correlation between sampling locations and source locations will inherently bias the average concentrations towards higher-than-true-average values in our assimilation for species, where we include pollution events. For most substances, we do not find any systematic difference between the network averages from AGAGE or NOAA, although there are some species (e.g. HFC-152a, see Fig. S33) for which our higher concentration reconstructions could in part be explained by this different method.
The opposite might also be the case, i.e. that despite including some pollution events, there could still be an inherent underestimation of true zonal means. That is because the NOAA and AGAGE sampling stations, which we are sourcing our raw data from, tend to be biased towards remote, clean-air, or well-mixed conditions and this will have implications for our latitudinal gradient and seasonal cycle. Where there are continental sites, they are often at altitude, and when flasks are sampled, they are generally for midafternoon when mixing is largest. Hence the fitted latitudinal gradient for CO 2 at least might be closer to the NOAA marine boundary layer product than to a true zonal mean. Also, the seasonal cycle will be more representative of marine conditions than continental ones (where a diurnal rectifier could potentially dampen or offset seasonally low concentrations in summer in the case of CO 2 ). This bias towards remote measurements tends to increase the further back in time we go.

Calibration scales
Another limitation of our study is related to the different calibration scales of atmospheric gas measurements. In our data assimilation method with no scale conversion between the SIO and NOAA scales of the AGAGE and NOAA networks (Sect. 2), a time-varying difference between the scales or time-varying coverage from one network to another can lead to spurious trends in the derived concentrations. We argue that our "middle of the road" data assimilation method across the two networks is, however, one justifiable (albeit not the only viable) assimilation method. The reasons for our chosen approach are as follows: (a) uncertainties in absolute mole fractions estimates are small compared to other uncertainties that would affect the radiative forcing in climate models; (b) alternative "pure" scale data assimilation could only deal with the trend uncertainty, not with the uncertainty arising for absolute mole fraction values (assuming that both the SIO and NOAA scales are equally sound); (c) we intend to be "network"-neutral; and (d) a single "in-between" concentration estimate is likely the most appropriate for the primary application purpose (historical simulations of climate models) of the provided data. However, future researchers are encouraged to work directly with the principal investigators of the two networks to devise data assimilation methods that would be better suited for alternative applications, such as uncertainty estimates of inverse emissions etc. A clear limitation of our data product is hence our implicit "in between" scale, with time-varying influences from measurements under one or the other network. Thus, differences to "pure" SIO or NOAA scale will partly arise from this "scale" issue.

No uncertainty estimates
Another important limitation of our study is that we do not provide uncertainty estimates. This is primarily related to the fact that the purpose of this study was to provide a consolidated dataset for CMIP6 climate model experiments. Those model experiments can only be performed a limited number of times given today's computational resources. The experimental protocol hence does not foresee an ability to vary GHG mole fractions within its uncertainties, given that many aspects of climate models are affected by more substantial uncertainties, such as aerosols. The original AGAGE and NOAA (sometimes monthly averaged) sampling data points shown in the fact sheets (see panels f, g and h) can, however, provide an indication of uncertainties and the spread in observations.

Uncertain scaling of seasonality changes and latitudinal gradients back in time
Our choice of predictor for the CO 2 seasonality change (namely the product of CO 2 concentration and global-mean temperature deviation since pre-industrial times) is subjec-tive, and using only CO 2 concentration or temperature would have yielded a larger seasonality difference between current and pre-industrial times. Further research will be necessary to obtain an optimal proxy for presumed pre-observational CO 2 seasonality changes. Similarly, our common explanatory variable for regressions of latitudinal gradients, i.e. global emissions (Boden et al., 2013), is an approximation. Ideally, the time-changing latitudinal distribution of emissions would be considered in those backward extensions of the latitudinal gradient over time. More generally, further research into observational and modelling-derived constraints regarding pre-1950 latitudinal gradients of CO 2 could allow future studies to go beyond our simplified assumption of a zero pre-industrial gradient in light of the uncertainty.

Broad, but not comprehensive data coverage
For the recent instrumental period, our study is predominantly based on the NOAA and the international AGAGE network data. Consistent quality control and consistent scales are advantages of that approach. Ideally, however, our study should have started out from a yet more inclusive representation, e.g. including the multiple additional station datasets gathered and archived by the WDCGG that are part of neither the AGAGE nor NOAA networks. The WDCGG station raw data is available at http://ds.data.jma.go.jp/gmd/wdcgg/ cgi-bin/wdcgg/catalogue.cgi. While the methodology of our study could be maintained or built upon, we hence recommend for any future updates that those additional datasets are considered -with the appropriate quality control and scale conversion efforts.

Known issues
There is one known issue in the historical data series before the year 2002 for CF 4 , C 2 F 6 and C 3 F 8 . We use the Trudinger et al. (2016) datasets and our algorithm categorized them as mid-year values, but the data were estimates for start-of-year values. Thus, while Trudinger et al. (2016) is well aligned with the Mühle et al. (2010) over that time period (given that the same in situ and archive data was used), our historical time series suggest half a year's growth rate, i.e. up to maximum 0.63, 0.065 and 0.015 ppt, too-low mole fractions for CF 4 , C 2 F 6 , C 3 F 8 , respectively for the pre-2002 time frame. In terms of radiative forcing, this difference amounts to approximately 0.00022, 0.000016 and 0.0000043 W m −2 in the years with the maximal growth rates (1980, 1999 and 2002, respectively). Given that some CMIP6 models had started using the historical data by the time of discovering this error (which will have no significant effect on CMIP6 outputs), we opted for not revising this study's CMIP6 datasets.
Ice core measurements over the past 800 000 years reveal how atmospheric GHG concentrations of CO 2 , CH 4 and N 2 O varied. These variations indicate various feedback mechanisms connected to the glacial and inter-glacial cycles driven by Milankovich cycles. With the arrival of homo sapiens, the atmospheric composition changed, initially through activities such as deforestation and agriculture, and then through fossil-fuel driven industrial activities from the start of the industrial revolution. Unprecedented over the 800 000 years of the ice core record, CO 2 , CH 4 and N 2 O concentrations suddenly rose to record levels, with global-mean CO 2 reaching a historical mark of 400 ppm in 2015 (Fig. 6). Recently, synthetic GHGs arising from refrigerants, solvents, foamblowing agents and even gas-cushioned shoe soles added to the warming effect, the radiative forcing. As the IPCC AR5 found, the most likely warming contribution from these GHGs is now higher than the observed warming ( Fig. TS.10 in IPCC AR5; IPCC, 2013). That means that without the human activities that happen to cool the planet, namely the aerosols we emit, observed warming would have been even greater than what has already been experienced.
In this study, we compile a set of GHG histories over the last 2000 years -based on numerous efforts by the scientific community to retrieve firn samples and ice cores in the most remote places on Earth, unlock their secrets by analysing the enclosed air and by investing in a large network of in situ and flask measurement stations across the planet. Our understanding of past climate change is vital to developing scenarios of the future and designing humanity's response strategies in terms of mitigation and adaptation. The ongoing efforts to retrieve and monitor the composition of the planet's atmosphere efforts are sometimes threatened (Lewis, 2016). Without those efforts, the future ahead of us would remain shrouded in even greater uncertainty.
In this dataset, we attempted to provide a solid base for the next generation of climate and ESMs to further our understanding of past and future climate changes. Providing seasonal and latitudinal differences of the radiative forcing that drives the climate change across the globe, we can hope for an even more appropriate comparison between models and past land-ocean, regional land and oceanic temperature observations. Ignoring these seasonal and latitudinal differences can lead to different calculated climate impacts of GHG emissions. Thus, accurately including this variability is a necessary condition to accurately compare model calculations and observations and to understand the reasons for the differences. Those agreements and disagreements between what models and past observations tell us will then allow us to calibrate our understanding of the Earth system, its nonlinearities and its many feedback cycles, the human influences and natural variabilities -called "detection and attribution".
We have been engaging in a unique experiment with our climate. In order to stay below the warming limits, that were set forth in the Paris Agreement in 2015 (i.e. well below 2 and 1.5 • C relative to pre-industrial levels), the next generation of climate models and the examination of their response to climate drivers will be vital as an information basis for decision makers. This study into the main past driver of humaninduced climate change will hence contribute to our collective examination of the tremendous challenge in which we find ourselves.
Code availability. The MATLAB and R code that was used to assimilate the raw data is available from the authors on request.
Data availability. A supplementary data table is available with global-and annual-mean mole fractions. The complete dataset with latitudinally and monthly resolved data in netcdf format is available via https://esgf-node.llnl.gov/search/input4mips/. Additional data formats, i.e. CSV, XLS, MATLAB .mat files of the same data, are also available via www.climatecollege.unimelb.edu.au/cmip6. The respective raw data used in this study is available from the original referenced data providers on request or can be found at the web locations indicated in Table 12.
The Supplement related to this article is available online at doi:10.5194/gmd-10-2057-2017-supplement.
available NOAA network data: Ed Dlugokencky, Pieter Tans, David Nance, Bradley Hall, Geoff Dutton, James Elkins, Debra Mondeel, Carolina Siso, Ben Miller. We would also like to thank the Editor O. Morgenstern for helpful suggestions and Zebedee Nicholls for his comments. Ray Langenfelds and Paul Steele (CSIRO) are thanked for their long-term support of the Cape Grim, Cape Grim Air Archive and AGAGE activities.
Furthermore, we thank the ESM CMIP5 modelling groups who contributed to the 5th Climate Model Intercomparison project and whose data we analysed. Also, the reviewer comments from one anonymous reviewer and Piers Forster were most helpful in improving the quality of the paper.
MM thankfully acknowledges the support by the Australian Research Council Future Fellowship grant FT130100809. This work was undertaken in close collaboration with partners in the European Union's Horizon 2020 research and innovation programme CRESCENDO (grant no. 641816), of which the University of Melbourne is an unfunded partner. CSIRO's contribution was supported by the Australian Government, in part through the Australian Climate Change Science Program.
Edited by: O. Morgenstern Reviewed by: P. Forster and one anonymous referee