CHIMERE-2016: From urban to hemispheric chemistry-transport modeling

CHIMERE is a chemistry-transport model initially designed for box-modelling of regional atmospheric composition. In the past decade, it has been converted into a 3D eulerian model that could be used at a variety of scales from local to continental domains. However, due to the model design and its historical use as a regional model, major limitations had remained, hampering its use at hemispheric scale, due to the coordinate system used for transport as well as to missing processes that are important in regions outside Europe. Most of these limitations have been removed in the CHIMERE-2016 version, 5 allowing its use in any region of the world and at any scale, from the scale of a single urban area up to hemispheric scale, including or not polar regions. Other important improvements have been made in the treatment of the physical processes affecting aerosols and the emissions of mineral dust. From a computational point of view, the parallelization strategy of the model has also been improved in order to improve model numerical performance and reduce the code complexity. The present article describes all these changes. Statistical scores for a model simulation over continental Europe are pre10 sented, and a simulation of the circumpolar transport of volcanic ash plume from the Puyehue volcanic eruption in June 2011 in Chile provides a test case for the new model version at hemispheric scale.


Introduction
Deterministic chemistry-transport modelling is now widely used for the analysis of pollution events, scenarios and forecast (Monks et al., 2009). Numerous models exist and are used from local to global scale, both for gaseous and aerosols modelling 15 (Simpson et al. (2012); Inness et al. (2013) among many others).
While models were previously dedicated mainly to specific processes, the last generation of chemistry-transport models (CTMs) aims at representing the complete set of all processes leading to changes in the atmospheric composition in terms of aerosols and trace gases. For regional air quality in the troposphere, several CTMs are currently developed and are able to include all types of emissions: anthropogenic, biogenic, mineral dust, sea salt, vegetation fires and volcanos. Even though 20 Finally, Section 8 presents the conclusions of the present study, in terms of applications made possible by this new model version, as well as the outlines for future developments of the CHIMERE model.

Optimizations
Several technical changes were made in the CHIMERE code to improve code scalability: these changes regard the inclusion of preprocessors into the CHIMERE core (and therefore their parallelization) along with improvement of the parallelization in 5 order to improve code scalability.

Inclusion of preprocessors into CHIMERE core
Compared to the previous model version, several programs that used to be sequential preprocessors executed before running the model core have now been parallelized and included into the model core. This is the case of the interpolation and treatment of the input meteorological fields. In the new model version, these fields are read and processed at each hourly time step (instead 10 of being processed once and for all in a sequential way at the beginning of the run). This new design has no impact on the model outputs but has two advantages: 1. It allows a reduction of computation time by parallelization of this calculation step 2. It enables the possibility to develop an online coupled version of the model, in which case the meteorological fields would not be available before the beginning of the simulation. 15 Note that this "real-time" processing of the meteorological fields is only available for users who use meteorological fields from WRF. For users of other sources of meteorological data such as ECMWF products, offline meteorological preprocessors are still provided with the model.

Improvement of the parallelization
In 2006, the CHIMERE core was parallelized using a master/worker pattern. A cartesian division of the simulation domain 20 into several sub-domains is done, each sub-domain being attributed to one worker process. Each worker performs the model integration in its own geographical sub-domain as well as boundary condition exchanges with its neighbours in order to permit transport from one worker to the next. In addition, in former CHIMERE versions, a master process was needed in order to gather and scatter data from the workers and to perform initializations and file input/output. The use of a master process limited the efficiency of the parallelized code, since the master process did not perform any 25 computation except gathering and scattering the data to and from the workers, and that it totally centralized the input and output tasks, a bottleneck effect which limited the gains realized by parallelization, particularly when the simulation domains were very large and split between many workers. Therefore, in the CHIMERE-2016 version, this master process has been removed: using the parallel input/output routines of the Parallel-Netcdf library (Li et al., 2003), each worker process now reads the netcdf input files and writes the output data for its own sub-domain, removing the bottleneck effect due to the centralization of input/output tasks.
This induces some major simplifications of CHIMERE code, including reduction of inter-process communications related to the parallelization of the input/output processes, which were performed in a central way by the master process in previous  Historically, CHIMERE has been first designed as a box model for the region of Paris (Menut et al., 2000). Rapidly, it has 10 been transformed into a cartesian model on curvilinear Arakawa C-grids (Arakawa and Lamb (1977), see Fig. 1). However, the formulation of the transport scheme on these curvilinear grids up to CHIMERE-2014b was still based on a lat-lon formulation, which implied the impossibility to include poles in the domain.
These restrictions have been removed in CHIMERE-2016 by switching from a representation of the grid points in a spherical coordinate systems, singular at the pole, to a 3d cartesian coordinate system, which has no singularity at the poles. In the former 15 4 Geosci. Model Dev. Discuss., doi:10.5194/gmd-2016-196, 2016 Manuscript under review for journal Geosci. Model Dev. Published: 7 September 2016 c Author(s) 2016. CC-BY 3.0 License. CHIMERE versions the grid centers were represented by their geographical coordinates λ ij , φ ij , and the wind vectors by their projection on the local frame (u λ , u φ ) (Fig. 2). In the present version, the points are represented by their cartesian coordinates in the frame centered at the Earth center and with unit vectors (u 1 , u 2 , u 3 ), and the wind vectors are represented by their projections on these unit vectors.
This change in the internal representation of spherical geometry has only a small impact on the simulated values, in the 5 sense that it corrects some geometrical errors that appeared due to the assumptions made in the old coordinate system, but these differences have been found to be of very small amplitude, except in the vicinity of the pole where distorsions due to the lat-lon system became critical. The new coordinate system allows domains that include the pole, without the need for any particular filtering. This strategy permits to build regional domains from local to hemispheric scale anywhere on the globe, including one pole or even both of them, which opens possible application of CHIMERE-2016 for studies in the polar areas, 10 including circumpolar transport of polluted air masses, as will be shown in Section 7. An example grid on which CHIMERE-2016 can be run is shown on Fig. 3. This grid is a polar stereographic grid centered at the north pole, entirely covering the northern hemisphere, and with the four corners of the domains extending slightly into the southern hemisphere (as far south as 19.47 • S). With this projection and this number of points, the horizontal model resolution varies from 140x140 km 2 at the pole to 70x70 km 2 at the Equator. 15
In this new coordinate system, the transport is calculated as follows. First, the coordinates of every grid center M ij are converted from their geographical coordinates λ ij , φ ij to cartesian coordinates x ij 1 , x ij 2 , x ij 3 on a unit sphere as follows: The horizontal wind vector U ij at the grid center is initially represented by the two classical wind components: where the zonal and meridional wind components u ij and v ij are obtained from the meteorological inputs.

5
Since this representation splitting the horizontal wind into a zonal and a meridional component is singular at the geographical poles, before performing the transport operations, the horizontal wind is split into its three components on the cartesian frame (u 1 , u 2 , u 3 ) using the following formulae for projecting the wind on the cartesian frame (u 1 , u 2 , u 3 ): Once the cartesian coordinates of the grid centers x ij 1 , x ij 2 , x ij 3 and of the wind-speed vectors U ij 1 , U ij 2 , U ij 3 are computed at the grid centers, it is easy to obtain the values of the speed vectors at the staggered cells ( Fig. 1) with the following formulae: This new formulation with the use of cartesian coordinates instead of geographical latitude-longitude coordinates for the 20 transport of pollutants removes the constraints that prevented the use of CHIMERE on domains including a geographic pole and/or a date-change line. This new formulation has been tested on the case of the eruption of the Puyehue volcano, in June 2011, a case during which the ash plume from the volcano went around the south pole through the southern Atlantic, Pacific and Indian Oceans back to South-America after 15 days (Section 7). This case is a perfect testbed for the ability of the model so simulate circumpolar movements, and evaluate its ability to represent the location of an aerosol plume after several days/weeks 25 of travel.

Vertical mesh calculation
The vertical discretization of CHIMERE needs to obey twofold requirements. First, as it has been the case since the beginning of the development of the model, the vertical mesh needs to be very refined in the lowest atmospheric layers because these layers are critical for the modelling of boundary layer contamination, particularly in urban areas, but also in marine areas in order to model correctly the sea-salt emissions, and in arid areas, for mineral dust emissions. On the other hand, the CHIMERE 5 model is now used not only for studies at urban/regional scale, but also for studies at continental and, from the present version, hemispheric scale. Therefore, a relatively fine vertical resolution is also needed in the free troposphere to be able to simulate the transport of trace gases and aerosols over large distances avoiding excessive numerical diffusion. Therefore, due to this twofold requirement, the CHIMERE-2016 vertical mesh is defined as described below.
Regarding the vertical discretization, the user has three degrees of freedom: -The number of layers, typically from 8 to 20 layers for the most common configurations of the model.
-The pressure of the top of the model, p top , typically from 500 hPa for studies at urban/regional scales to 200 hPa for continental/hemispheric scale studies. 15 From these user-defined parameters, a preprocessing tool calculates a vertical grid as follows: -From the surface to 800 hPa, the layer thickness (in hPa) increases exponentially -From 800 hPa to the top of model, the layers are evenly distributed, with equal thickness for each layer.
This procedure outputs the pressure of the level tops, for a reference surface pressure p ref of 1000 hPa. However, the model levels need to adapt themselves to the variations of the surface pressure, essentially due to orography. This is ensured by scaling 20 linearly the pressure levels between the surface pressure and the pressure at the top of model, p top , producing two sequences of coefficients a i and b i , such that the pressure at the top of level i is given by p i = a i p ref + b i p surf . These coefficients are given by the following expressions: The linear scaling of the pressure levels by these two sequences of coefficients ensures that the pressure levels never cross each other, and that their relative thickness stays the same even above high topography, as shown in Fig. 4.
scaling and reformatting of the raw emissions from the EMEP emission inventory at 50 km resolution, but can be adapted by users to any other raw dataset they need to use. The main steps for this are described in Menut et al. (2012Menut et al. ( , 2013a: -A first step projects the annual masses from the "raw" EMEP grid to the CHIMERE grid. The spatial emission distribution from the EMEP grid to the CHIMERE grid is performed using proxys like population density, as described by Figs. 5a-d. Proxies used by emisurf for this process include land use data (either GLCF, USGS or GlobCover), large 5 point source database (such as the EPER database for Europe), etc.
-Second, monthly, weekly and hourly profiles are prescribed to convert annual totals to hourly fluxes used as input for CHIMERE. These factors are derived largely from data provided by the University of Stuttgart (IER) as part of the GEN-EMIS project (Friedrich and Reis, 2004), and are available as data files from the EMEP model website, www.emep.int.
-A last step consists in converting the species available in the raw data into the model species. should assigned to NO 2 for modern fleet (post 2010). For NMVOC, the VOC data used are derived from the detailed United Kingdom speciation given in Passant (2002). For SO x , 99% is assigned to SO 2 and 1% for primary sulphate to 15 account for very fast and local sulphate production. The lumping procedure accounts for the reactivity of VOC species following Middleton et al. (1990).
The vertical distributions were originally based upon plume-rise calculations performed for different types of emission source which are thought typical for different emission categories, under a range of stability conditions (Vidic, 2002), but have since been simplified and adjusted to reflect the more recent findings of (Bieser et al., 2011). The main changes have been for 20 the residential sector where now 100% of the emissions are placed in the lowest model layer, reflecting the large dominance of domestic combustion for this emission category. Also, emissions from large combustion facilities in SNAP sectors 1 and 4 corresponding to large industrial facilities burning fossil fuels are attributed to lower layers than in Vidic (2002), resulting in enhanced concentrations of primary species such as NO x and SO x in the boundary layer, in better agreement with routine surface observations, as discussed in Mailler et al. (2013).

Recent changes
The main recent changes have been focused on the use of proxies to better reallocate in space the raw emissions. This specialization can be performed from the raw gridded data or directly from the annual country totals (Terrenoire et al., 2015).
The European Pollutant Release and Transfer Register (E-PRTR) data are used to precisely place the emissions from the main industrial sources. E-PRTR is the Europe-wide register that provides easily accessible key environmental data from industrial   -Other countries are only covered by the world roadmap and population data.

Mineral dust emissions
Mineral dust modelling is an important process for climate processes but also for air quality regional modelling. For many 5 regions over the world, it becomes necessary to manage air pollution knowing the relative part of anthropogenic and natural contributions. For this, even over small regions, it is important to have the same level of knowledge on mineral dust emissions as for anthopogenic or biogenic emissions. In this new model version, many improvements were done for mineral dust emissions.
They are related to input databases, the emission schemes themselves and additional options to better take into the impact of meteorological conditions on emissions.  (Homer et al., 2004) and STATSGO-FAO soil dataset (Wolock, 1994). The roughness length is estimated using the global 6km horizontal resolution GARLAP (Global Aeolian Roughness Lengths from ASCAT and PARASOL) dataset (Prigent et al., 2012). 20 In addition to these changes, the possibility to evaluate the soil erodibility based on satellite data was added. In the previous versions, the erodibility was based on the landuse database: cropland, grassland, shrubland and barren or sparsely vegetated areas were considered as erodible. In this case, constant percentages are applied for each landuse. While this possibility is maintained in CHIMERE-2016, a new option uses the global erodibility dataset derived from MODIS, as presented in Grini et al. (2005), and already included in CHIMERE and used, as described by Beegum et al. (2016). A mix between the two 25 options was also added, making possible to use MODIS only over desert areas and the USGS landuses categories elsewhere.

The Kok's scheme for mineral dust emissions
In this model version, the Kok mineral dust emissions parameterization is proposed, in addition to the Marticorena and Bergametti (1995) and Alfaro and Gomes (2001) schemes.
The Kok scheme is fully described in the articles Kok et al. (2014b), Kok et al. (2014a) and Mahowald et al. (2014). The vertical dust flux is calculated as: where f bare and f clay represent the relative fraction of bare soil and clay soil content, respectively. The flux is calculated only if u * > u * t . The threshold friction velocity, u * t , is calculated using the Iversen and White (1982) or the Shao and Lu (2000) 5 scheme (a user's choice). The corresponding u * st is this friction velocity but for a standard atmospheric density ρ a0 =1.225 kg m −3 : u * st0 represents u * st for an optimally erodible soil and was chosen as u * st0 =0.16 m s −1 in Kok et al. (2014b). The dimen- The dust emission coefficient C d represents the soil erodibily as: with the constant dimensionless coefficients C e =2.0 and C d0 =4.4 10 −5 .
The vertical dust flux is integrated over the whole size distribution. This flux is thus redistributed into the model dust size distribution as: with V d the volume of mineral dust aerosols for each mean mass median diameter and λ=12.0 µm.

Impact of vegetation, humidity and rain on dust emissions
As possible emissions areas are extended, it is necessary to revise the potential impact of vegetation, soil humidity and rain on 20 emissions.
To take into account the impact of vegetation, representing the variations of the aeolian roughness length becomes necessary for simulations areas with a significant annual cycle of the vegetation. In this model version, the monthly vegetation fraction is used to ponderate the emission flux when vegetation grows.
It is also necessary to take into account the possibility to inhibit or moderate dust erosion in case of rainfall. This is done in the following two ways. First, during a precipitation event, the mineral dust emissions are completely cancelled. Second, dust erosion not reinitiated immediately after a precipitation event, the soil being potentially crusted (Ishizuka et al., 2008). To take into account this latter effect, a simple function describing a factor f p is applied to moderate the dust emissions fluxes when a precipitation is diagnosed as: with ∆t p the time since the last precipitation event and τ the period after which the surface mineral dust fluxes E dust is fully taken into account, considering that the inhibiting effect of precipitation is finished. For this study, ∆t p is in hours and τ =12.
This function is displayed in Fig. 6. In the absence of precipitation, the soil moisture may also inhibit mineral dust erosion. This effect is taken into account 10 using the Fecan et al. (1999) parameterization. This scheme considers that soil moisture will increase the threshold friction velocity, u T * , used to determine if erosion occurs or not. To distinguish between soil conditions, the dry and wet threshold friction velocities are defined, and noted u T d * and u T w * , respectively. u T w * is estimated as a possible increase of u T d * depending on the modelled gravimetric soil moisture w (in kg.kg −1 ): 15 In the model, the dry treshold friction velocity, u T d * is calculated following the scheme of Shao and Lu (2000). The f (w) factor is estimated as: where A and b are constants to estimate, and w corresponds to the minimum soil moisture from which the threshold velocity increases. The values of A, b and w are dependent on the soil texture. For A and b , the values are fixed to A=1.21 and b =0.68. Using measurements data, Fecan et al. (1999) showed that the value of w is mainly dependent on the clay content of the soil and proposed the following fit: Note that in equation 15, the gravimetric soil moisture w has to be expressed in %, w being in % in equation 16 (a conversion is done from kg/kg to %).

Primary particulate matter resuspension
The resuspension process is important for particulate matter and may induce a large increase of the emission flux in case of dry soils, for locations where traffic and industries produce particles that may be deposited on the ground and therefore become 10 available for resuspension. In this model version, the resuspension flux is active only for cells containing an urbanized surface.
The flux is affected to Primary Particulate Matter (PPM) emissions only and thus considered in the model as an anthropogenic process.
The formulation is derived from the bulk formulation originally proposed by Loosmore (2003). The resuspension rate λ, in s −1 , is expressed as: where τ is the time after the start of resuspension. This time is taken into account considering that particles are first deposited then resuspended. The detail of the processes leading to resuspension are essentially unknown, and we assume here that the available concentration of particulate matter depends only on the wetness of the surface. In this empirical view, the resuspension flux is assumed to be: where f (w) is a function of the soil water content and P is a constant tuned in order to approximately close the PM 10 mass budget over Europe estimated in Vautard et al. (2005). It was found to give a correct amount of additionnal P M 10 . In this model version, P is approximated as P = 4.72 10 −2 µgm −2 s −1 if we consider European mean conditions with a soil water content of 25% and a friction velocity of u * =0.5 m.s −1 .

25
The soil water function f (w) is estimated as: where w t = 0.1 is a soil moisture threshold below which resuspension is activated, and w s is the maximum of soil moisture ponderated by the ratio of water and soil densities, as: with w max = 0.3 is a constant value representing the maximum soil moisture value, D water is the water density (assumed to be unity) and D soil is the dry porous soil density. D soil is itself estimated as: Two gas-phase chemical schemes were implemented in the CHIMERE model. The most detailed chemical scheme, called MELCHIOR1, represents the oxidation of around 80 gaseous species according to 300 reactions. The other mechanism, called MELCHIOR2, is a reduced version of MELCHIOR1 developed using chemical operators (Derognat et al., 2003;Carter, 1990).
MELCHIOR2 represents the oxidation of around 40 gaseous species according to 120 reactions. These chemical mechanisms are described in detail in Menut et al. (2013b). Comparisons between MELCHIOR2 and tree detailed mechanisms (MCM, Jenkin et al. (2003) ; SAPRC99, Carter (2000) ; GECKO-A, Aumont et al. (2005)) show a good agreement between the 20 chemical schemes, with differences in HCHO yields under low and high NO conditions lower than 20% between the simulated results (Dufour et al., 2009). SAPRC99 chemical mechanism had already been used in CHIMERE for a few studies (Lasry et al., 2007;Coll et al., 2009) but had never been distributed in a previous CHIMERE releases.
Since the development of the MELCHIOR mechanisms in 2003, progresses have been made in atmospheric chemistry, particularly concerning the VOC ozonolysis. One of the most up to date chemical scheme currently available in the literature is 25 the SAPRC-07 (Carter, 2010a). This mechanism is widely used and evaluated against chamber data (≈2400 experiments). The detailed SAPRC-07 chemical mechanism contains 207 species and 466 reactions. This detailed mechanism has been used to develop several reduced mechanisms designed for CTM applications (Carter, 2010b). The less reduced mechanism, SAPRC-07A, has been implemented in the 2016 CHIMERE model. This chemical scheme contains 72 species and 218 reactions. Two CHIMERE simulations using SAPRC-07A and MELCHIOR2 chemical schemes respectively were compared with AirBase measurements of NO x and ozone over Europe during summer 2005. The two chemical schemes were found to provide a good correlation with ozone measurements (Pearson's correlation rate 0.71 for both mechanisms), with a slightly smaller bias for ozone concentrations obtained using SAPRC-07A (8.19 ppb versus 9.29 ppb, Menut et al. (2013b)).

The chlorine mechanism 5
Over the past decade, several studies have shown that halogens (chlorine, bromine, iodine) chemistry could influence ozone concentrations in the troposphere. A recent review (Simpson et al., 2015) presents the state of art on this topic. If the role of halogen chemistry was traditionally considered limited to the marine boundary layer, recent observations have shown that significant ClNO 2 concentrations (from 80 ppt to 2000 ppt) can occur in various environments. This compound can act as a nitrogen reservoir with a long lifetime allowing its long-range transport. In previous version of CHIMERE model, it was 10 possible to have the chemical composition (Na, Cl, H 2 SO 4 ) of sea salt emissions based on mean composition described in Seinfeld and Pandis (1997). The chlorine chemistry is not described in MELCHIOR chemical schemes but Carter (2010b) proposed in SAPRC-07A a chlorine mechanism with 9 inorganic species and 3 products formed by the reactions with VOCs.
In SAPRC-07A, the chlorine chemistry is represented by 68 reactions, which have been implemented in CHIMERE-2016 only if the SAPRC-07A chemichal mechanism is chosen by the user. 15 5.2 Evolution of the aerosol scheme

Disretization of the aerosols size distribution
The CHIMERE model accounts for the size distribution of the aerosols using a size-bin approach: the aerosol particles for each of the model species are distributed in N size bins, covering a diameter range from D min to D max . Given these three userdefined parameters, a preprocessor compute a sequence (d i ) i=1,N +1 of cutoff diameters that meets the following requirements: The first requirement is set to allow a meaningful evaluation of PM 2.5 and PM 10 in the model, since these quantities are 25 typically available from routine measurements. calculation speed is a critical requirement, for example for operational prevision, the number of size bins could be lowered to N = 6, still ensuring that d i+1 /d i ≤ 4

Wet diameter and wet density of aerosols
In many processes, the diameter and the density of aerosols are used (deposition, absorption, coagulation, etc...). These processes have to take into account that the diameter and the density of aerosols change with humidity due to the amount of water 5 absorbed into the particles. Therefore, the notion of wet diameter and wet density was introduced in CHIMERE-2016. Particles are distributed between bins according to their dry diameter. The wet diameter of the particles is calculated as a function of humidity and the composition of the particle.
To compute the wet density and wet diameter for each aerosol size bin, the amount of water in each bins is computed with the "reverse mode" of ISORROPIA (Nenes et al. (1998)) by using the composition of particles, assuming that only sulphate, 10 nitrate, ammonium and sea salts have a high enough hygroscopicity to absorb a significant amount of water. The density of the aqueous phase of particles is computed according to composition following the method of Semmler et al. (2006). The density and mass of the inorganic aqueous phase (sulphate, nitrate, ammonium and sea salts and water) and the density and mass of other compounds (dust, organics, black carbon, etc...) are used to compute the total density of the particle and then its wet diameter, assuming internal mixing for each size bin.

Absorption
Absorption is described by the "bulk equilibrium" approach of Pandis et al. (1993). In this approach, all the bins for which condensation is very fast are merged into a "bulk particulate phase". Following Debry et al. (2007), a cutting diameter of 5 1.25 µm is used to separate bins which are inside the "bulk particle" (with a diameter lower than the cutting diameter) from other bins.
Thermodynamic models are used to compute the partitioning between the gas phase and the bulk particle phase and estimate the gas-phase concentrations at equilibrium. For semi-volatile inorganic species (sulphate, nitrate, ammonium), concentrations G eq at equilibrium are calculated using the thermodynamic module ISORROPIA (Nenes et al. (1998)). This model also de-10 termines the water content of particles. Equilibrium concentrations for the semivolatile organic species are related to particle concentrations through a temperature dependent partition coefficient K p (in m 3 µg −1 ) (Pankow (1994)).
Following Pandis et al. (1993), the mass of compounds condensing into particles, ∆A p , is redistributed over bins according to the kinetic of condensation into each bin. For evaporation, the mass of compounds evaporating from each bin is proportional to the amount of the compounds in the bin. 15 If the variation of particulate bulk concentration of compound i, ∆A p,i , is greater than 0 (condensation): with k bin i the kinetic of condensation given by Seinfeld and Pandis (1997): with N bin the number of particles inside the bin, D bin p the mean diameter of the bin, D i the diffusion coefficient for species i in 20 air, M i its molecular weight and f (Kn, α) is the correction due to noncontinuum effects and imperfect surface accomodation.
f (Kn, α) is computed with the transition regime formula of Fuchs and Sutugin (1971).
If the variation of particulate bulk concentration of compound i, ∆A p,i , is lower than 0 (evaporation): If a particle shrinks or grows due to condensation/evaporation, the mass of this particle has to be redistributed over diameter 25 bins. The mass redistribution algorithm of Gelbard and Seinfeld (1980); Seigneur (1982) is used.

Coagulation
The flux of coagulation J b coag,i of a coumpound i inside a bin b is computed with the size binning method of Jacobson et al. (1994): 25) with N k the volumic number of particles in bin k, K j,l the coagulation kernel coefficient between bins i and j and f b j,k the 5 partition coefficient (the fraction of the particle created from the coagulation of bins j and k which is redistributed into bin b).
The coagulation kernel and the partition coefficients are calculated as described in Debry et al. (2007).

Wet deposition
For the in-cloud scavenging of particles, the deposition of particles is assumed to be proportional to amount of water lost by precipitations. The deposition flux is written as: with P r the precipitation rate released in the grid cell (kg m −2 s −1 ), w l the liquid water content (kg m −3 ), h the cell thickness (m) and ε l an empirical uptake coefficient (in the range 0 -1) currently assumed to be 1. l and k are respectively the bin and composition subscripts.
For the below-cloud scavenging of particles, particles are scavenged by raining drops following Henzig et al. (2006). A 15 polydisperse distribution of raining drops is applied: where A = 1.047 − 0.0436 ln P + 0.00734 (ln P ) 2 with P the precipitation rate in mm/h and R the radius of the droplet. The below-cloud scavenging rate is written: with R, the radius of the raindrop (in m), r l the radius of the particle (in m), u g the terminal drop velocity (in m/s), E(R, r l ) the collision efficiency of a particle with a raindrop, N (R) (in m −4 ) the raindrop size distribution.

20
Geosci. Model Dev. Discuss., doi: 10.5194/gmd-2016-196, 2016 Manuscript under review for journal Geosci. Model Dev. 10 CHIMERE-2013 did not take into account all of these processes (Menut et al., 2013b), relying instead on a very simplified calculation of the photolysis rates, as shown in Table 2. The photolysis rates were evaluated from tabulated values using TUV (Madronich, 1987), depending only on the solar zenithal angle and the altitude. These tabulated values were calculated assuming a vertical profile for ozone that was typical of the northern hemisphere midlatitudes, neglecting the effect of the aerosols, and assuming a constant and uniform surface albedo. The effect of clouds was parameterized as an exponential 15 reduction of the photolysis rates as a function of the cloud optical depth. While this set of approximations was acceptable when the CHIMERE model was used as boundary-layer regional CTM for locations in Europe, this had strong limitations for its use for longer-term simulations including long-range transport in the free troposphere over geographical domains including polar and/or tropical zones. Therefore, the Fast-JX module has been included in CHIMERE-2016. Photolysis rates for the photodissociation of ozone and nitrogen dioxide as computed by the Fast-JX model inside CHIMERE have been compared 20 favorably to in situ measurements at the island of Lampedusa (Italy), including in presence of aerosols (Mailler et al., 2016)

surface albedo
The surface albedo in the near-UV spectral region, which is determinant for the calculation of photolysis rates Dickerson et al. (1982), is highly variable according to the landuse and to the presence or absence of snow. It is worth noting at that point that the albedo of all the continental and oceanic surfaces is smaller than 0.1, while the albedo of snow ranges from 0.3 to over 0.8 according to the type of landuse. Therefore, the absence/presence of snow will modulate very substantially the values of 5 the modelled photolysis rates, and therefore the concentration of trace gases such as ozone. Even though strong ozone peaks generally occur in summertime in a context of strong anthropogenic NO x production and in the absence of snow, it has been shown recently that strong ozone peaks can occur in wintertime over the continental United States in zones of oil and gas extraction due to the combination of the strong anthropogenic concentrations of VOCs in a very shallow boundary layer with relatively strong photolysis rates due to the high surface albedo (Edwards et al., 2014;Schnell et al., 2009). It is therefore 10 important that CTMs take into account the impact of snow on surface albedo, in order to be able to reproduce correctly such cases.
The surface albedo in the UV band in CHIMERE-2016 is evaluated according to Laepple et al. (2005) in the absence of snow (tested as snow depth less than 1 cm), and from Tanskannen and Manninen (2007) in the presence of snow, tested as snow depth greater than 10 cm. Values are displayed in Table 3 Table 3. Tabulated values from Laepple et al. (2005) and Tanskannen and Manninen (2007) used for the calculation of the albedo in the UV band. In the presence of sea-ice over ocean, the albedo of the ice surface is assumed equal to the Tanskannen and Manninen (2007) value for > 10 cm of snow on barren land.
The snow depth is read from the WRF or ECMWF meteorological inputs, if available. If any other model is used, the snow cover will be assumed inexistent. If the snow-cover is thinner than 1 cm in the model, the albedo is assumed to be that of dry land. If the snow-cover is thicker than 10 cm, the albedo is assumed to be that of snow-covered land. In-between, a linear interpolation is performed. Even though the case of sea-ice is not explicitly treated in Tanskannen and Manninen (2007), the 22 Geosci. Model Dev. Discuss., doi:10.5194/gmd-2016-196, 2016 Manuscript under review for journal Geosci. Model Dev. Published: 7 September 2016 c Author(s) 2016. CC-BY 3.0 License. assumption is made in CHIMERE-2016 that the albedo of sea-ice is the same as that of a thick layer of snow covering barren land.

Implementation
The physical calculations performed by Fast-JX are split in two steps.
First, the Legendre coefficients for the scattering phase function for all aerosol species and diameter bin are calculated using 5 Michael Mischenko's spher.f code (Mischenko et al., 2002), assuming sphericity of the aerosol particles. This calculation is performed for each of the n spec × n bins species, and for the five wavelengths that are used for the Mie scattering processes in Fast-JX. This step is performed once and for all before the first simulation step, and lasts from a couple of seconds to a couple of minutes according to the number of aerosol species and diameter bins. The refractive indices reproduced in Table 4 are the ones provided along with the model, essentially based on results from the ADIENT project. However, the specification of these 10 parameters is in a parameter file, and can be changed by the user to other values. In the same way, the user can easily introduce more species in the optical treatment for specific studies, e.g. volcanic ashes.

Species
Real part of the refractive index Imaginary part of the refractive index  After the preprocessing phase, at each time step and in each model column, the Fast-JX module resolves the radiative transfer in the model atmospheric column, computing the actinic fluxes at each model level and integrating them over N wavelength bins in order to produce accurate photolysis rates. In the configuration adopted for CHIMERE-2016, N is set to 12, which 15 is the value recommanded by Fast-JX developers for tropospheric studies. These 12 wavelength bins include the 7 standard Fast-J wavelength bins from 291 nm to 850 nm, as described in Wild et al. (2000). The 7 standard Fast-J wavelength bins are essentially concentrated from 291 nm to 412.5 nm which is the spectral band relevant for tropospheric photochemistry. Following the recommandations of Fast-JX model developers, these 7 standard wavelength bins are complemented by 5 additional wavelength bins, from 202.5 nm to 291 nm, which are only relevant in the upper tropical troposphere. In a typical simulation framework, it has been found that the increase in computational time relative to the simulation with tabulated photolysis rates is below 10% (Mailler et al., 2016).

Online calculation of lidar profiles
During the model integration, some additional diagnostic variables are estimated: (i) the Clouds Optical depth (COD) and the 5 Aerosol Optical Depth (AOD) using the FastJX module, and (ii) the lidar profiles.
The lidar profiles are calculated using the aerosols concentrations contributions only, as detailed in Stromatas et al. (2012).
They are proposed as output after a simulation and are designed to be directly comparable to ground-based or spatial lidars.
Three different profiles are calculated both in Nadir and Zenith lidar configurations: (i) the Attenuated Scattering Ratio, R (z), (ii) β (z, λ) and β m (z, λ), respectively the total and molecular attenuated backscatter signal.

10
By definition, R (z) is equal to 1 in absence of aerosols/clouds and when the signal is not attenuated. In the presence of aerosols, R (z) would be greater than one. Following Winker et al. (2009), this ratio is expressed as: The total attenuated backscatter signal β (z, λ) is calculated as: and the molecular attenuated backscatter signal β m (z, λ) as: σ sca/ext p (z, λ) and σ sca/ext m (z, λ) are the extinction/scattering coefficients for particles and molecules (in km −1 ). S m and S p 20 are the molecular and particular extinction-to-backscatter ratios (in sr). η (z) represents the particles multiple scattering and z represents the distance between the emitter and the studied point. Note that for the case of a space lidar the integration begins from the top of the atmosphere (TOA) while for a ground lidar the integration begins from 0 (ground level) to z. Further details about these calculations are provided in Stromatas et al. (2012).
The performance of CTMs is often evaluated by comparing simulation results to data of measurements, either from routine networks (Solazzo et al., 2012a, b) or from dedicated field campaigns (e.g. Menut et al. (2015); Petetin et al. (2015)). Simon et al. (2012) presented an overview of performance evaluation studies for a large set of models and studied cases. In this study statistical validation with data of measurements was performed for two 3-month long simulations with CHIMERE-

25
Geosci. Model Dev. Discuss., doi: 10.5194/gmd-2016-196, 2016 Manuscript under review for journal Geosci. Model Dev.  Table 5. Number of EMEP stations per species and per season used for performance statistics. Stations CH01 CH04 CH05 DE03 DE08 AT05 AT48 IT01 IT04 ES78 DE44 were excluded from the analysis due to their topography difficult to simulate with a 0.5 • resolution.
The (Wesely, 1989) aerosol dry deposition and (Loosmore, 2003) resuspension schemes were used. The online coupling with ISORROPIA model was not activated, and the precalculated lookup tables were used instead (Menut et al., 2013b).
The statistical scores were computed against the EMEP surface concentration measurements for main gaseous and aerosol species, after filtering out the stations with complex topography (CH01 CH04 CH05 DE03 DE08 AT05 AT48 IT01 IT04 ES78 DE44) that cannot be simulated appropriately at 0.5 • resolution 5 Figure 7 shows the performance statistics for the main model species. The number of EMEP stations used for each species for winter and summer is shown in Table 5. The standard metrics used for air quality modeling (Simon et al., 2012) were employed, namely the Pearson's correlation P COR, the mean fractional error M F E, and the mean fractional bias M F B.
Ozone shows the best scores among all the species, both for summer and winter, with P COR = 0.69, M F E = 19.6%, M F B = 11.4% in summer and P COR = 0.72, M F E = 24.9%, M F B = 14.8% in winter. It also shows the smallest vari-10 ability of scores among the stations (96 available stations in summer and 98 in winter). As noted by Simon et al. (2012), the ozone overestimation often reported for CTMs is related to the averaging over the hours with high and low concentrations, so the scores are dominated by performance at low concentrations, which occur much more often than high concentrations.
Indeed, the MFB computed from daily maximum ozone concentrations in this study is quite lower: 1.5% for summer and 3.6% for winter. 15 The NO 2 shows quite larger MFE: 54.9 % in winter and 60.4% in summer with a large variability of both MFE and MFB between stations. The bias is negative in winter, slightly positive in summer but with a high negative values at some stations.
The model resolution of 0.5 • might be not enough to represent the large NO x emission gradients, even when measured at the background rural sites. Also, as discussed by refbterrenoire2015, the negative bias can be related to the general underestimation of the emissions in the inventory used. This is in agreement with the relatively high correlation: 0.65 in winter and 0.41 in 20 summer.
The SO 2 shows the largest MFE for both summer (74.5%) and winter (80.2%) and quite low correlation in summer (0.20).
It shows positive bias in winter: M F B = 35.5%. The difficulty of SO 2 simulation can be related to the emission profile, in particular the vertical distribution of emitted species (Pirovano et al., 2012;Mailler et al., 2013).
The performance for PM is affected by compensating effects of several chemical components, such as dust, primary organics The PM 25 concentrations show slightly larger overestimation than PM 10 . The winter correlation is higher though (0.60 vs 5 0.46), and its variablility between the stations is smaller. This might indicate that the dust, whose emissions are very sensitive to the wind speed, contribute to the PM 10 errors in winter. As suggested by refbterrenoire2015, the PM 25 overestimation is likely to be related to the overestimation of sulphate.
The EPA guidelines, (e.g. Boylan and Russel (2006)), define the performance goals and criteria to be attained by air quality models. The performance goal is attained for particulate matter when the MFE is less or equal to 50%, and |MFB| is less than 10 30%. The performance criteria are attained when the MFE is less or equal to 75%, and |MFB| is less than 60%.
The PM 10 simulation satisfies the performance goal for both summer and winter. As for PM 25 , it satisfies the performance goal in terms of MFB for both summer and winter and in terms of MFE in summer, but fails in terms of MFE in winter (M F E = 53.4%). However it does satify the performance criteria for both seasons.  Global Volcanism Program, 2013;Klüser et al., 2013). This volcanic eruption case provides a perfect testbed to evaluate the new abilities of the CHIMERE model to simulate as accurately as possible transport at hemispheric scale, including cases where the transported plume undergoes a complete circumpolar trajectory around the South Pole.

Model configuration
The meteorological simulation has been performed using the WRF meteorological model, version 3.5.1, on a simulation domain
-Including volcanic emissions of SO 2 and volcanic ashes

Volcanic emissions
The total mass flux emitted in the form of particles has been represented according to Mastin et al. (2009), using the following equation, where H is the column height expressed in km,V is the volume flux expressed in m 3 s −1 ,Ṁ is the mass flux in kg s −1 and ρ = 2500 kg m −3 is the ash density. The altitude of the ash column has been taken from Collini et al. (2013), and is reproduced here in Table 6. Only the fine fraction of the emissions, with particle diameter smaller than 63 µm has been included. The conversion from the total emitted mass flux has been performed using a conversion factor m 63 taken from Mastin et al. (2009) for S2 type volcanoes. The particles emitted with a diameter greater than 80µm have not been considered because they are not The granulometry of the ashes are taken as 80% in a coarse mode, with a lognormal distribution centered at 30 µm and 20% in a finer mode with a lognormal distribution centered at 4 µm, consistent with the results of Durant et al. (2009). 20 The SO 2 mass flux has been adapted from Theys et al. (2013) for the first 48 hours of the eruption. Since these authors do not provide an estimation for the subsequent part of the eruption, we completed the time series for SO 2 emission by assuming a constant ratio between the mass flux of the ash, provided by Mastin et al. (2009), and the SO 2 mass flux (see Tab. 7).

Anthropogenic and natural emissions
Anthropogenic and natural emissions have been modelled as described above. June 4, the model undergoes a spinup period, with the concentrations of gaseous and particulate species building up due to the emissions of sea-salt and anthropogenic contaminants (Fig. 8a). At the end of this spin-up period, significan AOD values, from 5 0.05 to 0.20 appear over the southern ocean from 30 to 70 • S, mostly due to sea-salt emissions, consistent with the findings of Jaeglé et al. (2011), and consistent with the satellite-based climatology of these authors, which represent a mean value about 0.15 in these areas. In the subsequent time steps, the volcanic ash plume from the Puyehue volcano becomes the dominant feature of the AOD structure in the southern hemisphere. While it is difficult to compare the simulated values to measured ones because of the large uncertainties on the mass flux and size distribution of the volcanic ashes, it is possible to compare 10 the modelled trajectory of the ash plume with spaceborne observations. For this purpose, we will rely on the space images and analyses provided by Klüser et al. (2013) and Global Volcanism Program (2013). Fig. 8b for June 6 at 12UTC (8AM local time) can be compared to Figure 2 of Klüser et al. (2013)  after the onset of the eruption, the initial direction of the volcanic plume is eastward, with a slight southward tilt, consistent with the CHIMERE simulations. On June 8 (Fig. 8d), the simulated pattern for ash transport also fits very well the pattern that is visible on Fig. 3 by these authors (upper left panel), with the initial portion of the ash plume travelling southward over the southern Atlantic and reaching towards the southern Pacific ocean over cape Horn, a pattern that is observed in both CHIMERE observations and the satellite observations. The older parts of the plume are located either off the Atlantic coasts of 5 Argentina, covering a large part of southern Brazil in the model but not so in the infrared AOD data provided by Klüser et al. (2013). Finally, the plume from the initial explosions are located at that time in the souther ocean, in-between the southern tip of the African continent and the Antarctic. It can also be observed that while the ash plume is continuous in the CHIMERE simulation, it is not so in the observations. This reflects the succession of explosive phases and quiet phases of the volcanic eruption, while the flux imposed to the CHIMERE model is continuous, as discussed in Boichu et al. (2013), who also present ash plume were located further to the South, close to the Antarctic peninsula, consistent with Fig. 8f. On June 14 and during the following days, the plume from the initial explosion of June 4 and the following days is overpassing the Puyehue volcano again, a fact that is correctly captured by the CHIMERE model.

Lidar observables
As described in Section 5.4, the main LIDAR observables are now provided as model outputs by the CHIMERE model. As an 5 example of the results of this treatment for the case of the Puyehue eruption, the zenithal and nadir simulated backscattering ratios are presented on Fig. 9. Even though a validation of these observables by confrontation to observations is not presented here, the simulated patterns are physically reasonable, with backscattering rations reaching maximum values in the lowest part of the aerosol plumes (for the zenithal backscattering ratio, Fig. 9a) and low values above due to absorption and scattering of the beam by the lowest part of the plume. Of course, the contrary is observed for the simulated space-based nadir backscatter 10 ratio ( Fig. 9b)

Conclusions
CHIMERE-2016 is a model version which presents several major improvements compared to the earlier version described in Menut et al. (2013a). The version presented here has the ability to include all types of emissions. Compared to the previous model version, anthropogenic emissions can be generated anywhere in the world from the HTAP emission inventory, as well 15 as mineral dust emissions, which were available only for North Africa and the Arabian peninsula in previous model versions.
With the same objective of permitting the use of the model in any part of the world and at any scale from urban to hemispheric scale, an important limitation of the model has been removed by improving the internal treatment of the transport on the sphere, allowing for domains up to the hemispheric scale, and possibly including a geographic pole. Much attention has also been paid to the physical processes, including a major update in the representation of the physical processes affecting the aerosols, as 20 well as the effect of the modelled aerosol on the photolytic reaction rates. Other efforts have been made to improve the user's experience with the model: this includes improvements in the parallelization of the model in order to reduce computation time, as well as providing key observable variables such as the Aerosol optical depth and LIDAR backscatter coefficients, which permits the user to compare the outputs of the model directly with the results of remote-sensing observations.
These improvements pave the way to many applications that were out of reach for the CHIMERE model up to now: 25 CHIMERE 2016 has the necessary abilities to give new insights on questions such as the radiative impact of aerosols on photochemistry, at all scales, from urban to hemispheric, including mineral dust emissions and deposition anywhere in the world. The possibility to run hemispheric simulations also allows the use of this CTM for the study of transport of aerosol and gaseous contamination plumes between the different continent within a hemisphere. It contributes to bridge the gap between global chemistry-transport models such as LMDz-INCA, MOZART or Geos-CHEM and regional models: while CHIMERE 30 has already been used successfully for the evaluation of the decadal trends in air quality over Europe (Colette et al., 2011), as shown by the study Xing et al. (2015) with the hemispheric version of CMAQ, hemispheric versions of regional CTMs are (a) (b) Figure 9. (a) Zenithal backscattering ratio as simulated by CHIMERE at 1000 nm wavelength above the city of Buenos Aires ; and (b), same as (a) but for the Nadir backscattering ratio tools that can be used successfully to study long-term trends in regional air quality with added value from models simulated in regional domains only because they can perform a consistent simulation over the entire hemisphere without relying on boundary conditions provided by global CTMs relying on different assumptions and parameterizations.

Code availability
The present article refers to the CHIMERE-2016 release, which is freely available and provided under the GNU general public Acknowledgements. For anthropogenic emissions, EuroglobalMap products includes Intellectual Property from European National Mapping and Cadastral Authorities and is licensed on behalf of these by EuroGeographics. Original product is freely available at www.eurogeographics.org.
The MACC boundary conditions data set was provided by the MACC-II project, which is funded through the European Union Framework 7 programme. It is based on the MACC-II reanalysis for atmospheric composition; full access to and more information about this data can 5 be obtained through the MACC-II web site http://www.copernicus-atmosphere.eu. We acknowledge C.Prigent for providing the global high resolution aeolian aerodynamic roughness length.