GEM-MACH-PAH ( rev 2488 ) : a new high-resolution chemical transport model for North American polycyclic aromatic hydrocarbons and benzene

Environment and Climate Change Canada’s online air quality forecasting model, GEM-MACH, was extended to simulate atmospheric concentrations of benzene and seven polycyclic aromatic hydrocarbons (PAHs): phenanthrene, anthracene, fluoranthene, pyrene, benz(a)anthracene, chrysene, and benzo(a)pyrene. In the expanded model, benzene and PAHs are emitted from major point, area, and mobile sources, with emissions based on recent emission factors. Modelled PAHs undergo gas–particle partitioning (whereas benzene is only in the gas phase), atmospheric transport, oxidation, cloud processing, and dry and wet deposition. To represent PAH gas–particle partitioning, the Dachs–Eisenreich scheme was used, and we have improved gas–particle partitioning parameters based on an empirical analysis to get significantly better gas–particle partitioning results than the previous North American PAH model, AURAMS-PAH. Added process parametrizations include the particle phase benzo(a)pyrene reaction with ozone via the Kwamena scheme and gas-phase scavenging of PAHs by snow via vapour sorption to the snow surface. The resulting GEM-MACH-PAH model was used to generate the first online model simulations of PAH emissions, transport, chemical transformation, and deposition for a high-resolution domain (2.5 km grid cell spacing) in North America, centred on the PAH data-rich region of southern Ontario, Canada and the northeastern US. Model output for two seasons was compared to measurements from three monitoring networks spanning Canada and the US Average spring–summertime model results were found to be statistically unbiased from measurements of benzene and all seven PAHs. The same was true for the fall–winter seasonal mean, except for benzo(a)pyrene, which had a statistically significant positive bias. We present evidence that the benzo(a)pyrene results may be ameliorated via further improvements to particulate matter and oxidant processes and transport. Our analysis focused on four key components to the prediction of atmospheric PAH levels: spatial variability, sensitivity to mobile emissions, gas–particle partitioning, and wet deposition. Spatial variability of PAHs /PM2.5 at a 2.5 km resolution was found to be comparable to measurements. Predicted ambient surface concentrations of benzene and the PAHs were found to be critically dependent on mobile emission factors, indicating the mobile emissions sector has a significant influence on ambient PAH levels in the study region. PAH wet deposition was overestimated due to additive precipitation biases in the model and the measurements. Our overall performance evaluation suggests that GEM-MACH-PAH can provide seasonal estimates for benzene and PAHs and is suitable for emissions scenario simulations. Published by Copernicus Publications on behalf of the European Geosciences Union. 2610 C. H. Whaley et al.: GEM-MACH-PAH: regional PAH and benzene modelling


Introduction
Polycyclic aromatic hydrocarbons (PAHs) are semi-volatile atmospheric pollutants that have numerous negative health effects (some are carcinogenic, mutagenic, and teratogenic) (Kim et al., 2013).Measurements of PAHs in North America are sparse in both time (typically 24 h averages, every 6 days) and space (limited surface measurement networks), yet show ambient concentrations that regularly exceed the Ontario provincial government's health-based threshold (Galarneau et al., 2016).Similarly, benzene is a gas-phase single-ring aromatic hydrocarbon, is a known carcinogen, and also exceeds atmospheric health-based guidelines (Galarneau et al., 2016).Accurate, 3-D modelling of PAHs and benzene can fill in the space-time gaps of the measurements, identify atmospheric processes that are responsible for the threshold exceedances, and simulate effects of emissions scenarios.
AURAMS (A Unified Regional Air-quality Modelling System) was an offline (meteorology from a weather forecast model used as an input), Eulerian 3-D chemical transport model (CTM) developed by Environment and Climate Change Canada (ECCC).In Galarneau et al. (2014), AU-RAMS was modified to include seven PAH species (phenanthrene, anthracene, fluoranthene, pyrene, benz(a)anthracene, chrysene, and benzo(a)pyrene -hereafter abbreviated to PHEN, ANTH, FLRT, PYR BaA, CHRY, and BaP, respectively).AURAMS-PAH included emissions, transport, gasparticle partitioning, oxidation of the gas-phase PAHs with OH, dry deposition, and wet deposition of the particle-phase PAHs.This model was able to accurately simulate the 2002 annual average PAH concentrations in North America when compared to 45 measurement sites, located in Ontario, the northeastern US, and California.However, the AURAMS-PAH gas-particle partitioning over-predicted the gas phase for the lighter PAH species, and was employed at relatively poor time and spatial resolutions.It was also missing two known PAH loss processes: the surface reaction of O 3 on particulate BaP (Kwamena et al., 2004(Kwamena et al., , 2007;;Ringuet et al., 2012;Keyte et al., 2013;Liu et al., 2014), and snow scavenging of gas-phase PAHs (Franz and Eisenreich, 1998;Daly and Wania, 2004;Lei and Wania, 2004;Skrdlíková et al., 2011).These missing processes, along with the coarse (42 km) spatial resolution, may have contributed to the differences between model results and measurements.Also, this model used PAH emission factors for mobile emissions which are now out of date for representing the modern vehicle fleet.
Other PAH CTMs include the following: GEOS-Chem (Friedman and Selin, 2012;Thackray et al., 2015), which is a global model; CMAQ, which was run on the Europe continental domain in Aulinger et al. (2007) and in North America in Zhang et al. (2016Zhang et al. ( , 2017)); and FARM (Flexible Air quality Regional Model) (Gariazzo et al., 2007), which is a regional model, applied for the Rome region in Italy.The most relevant of these model studies to our own is that by Zhang et al. (2016Zhang et al. ( , 2017)), whereby they ran CMAQ with 16 PAH species added, at 36 km resolution in a mainly US domain (that included parts of Canada and Mexico), evaluated their model results against NATTS measurements, and used their results to determine the cancer risk to the US human populations from various sources.
Therefore, the goal of this study is to update and improve ECCC's PAH modelling capabilities by using a more advanced model framework, updating emission inventories, and utilizing better process representation of PAHs than were used in AURAMS-PAH to allow better exploration of PAH processes and scenarios.To achieve this goal, GEM-MACH (Global Environmental Multiscale model -Modelling Air quality and CHemistry), ECCC's next generation, online air quality forecasting model (meteorology and air quality are predicted in the same code) was modified to include the same seven PAH species, as well as benzene.The paramaterization of PAH processes was improved in the following ways: (1) on-road mobile PAH emissions were updated with more recent data, to better represent the modern vehicle fleet; (2) gas-particle partitioning parameters were improved based on empirical results and analysis of AURAMS model output; (3) process representation for the on-particle O 3 -particulate BaP reaction was added to the model; and (4) process representation for in-and below-cloud wet scavenging (including scavenging by snow) were added for gasphase PAHs and benzene.Simulations using GEM-MACH-PAH were then carried out at high (2.5 km) spatial resolution in a small, but densely populated North American domain including southern Ontario, and most of the northeastern US (Fig. 1) for summer and winter of 2009.We refer to this region as the "Pan Am" domain because it was created for high-resolution air quality modelling during the 2015 Pan American Games in Ontario (Joe et al., 2017).This domain contains approximately 109 million people, including about 38 % of the Canadian population and 30 % of the US population.The model results were evaluated using measurements from a high spatial resolution campaign in Hamilton, Ontario (Anastasopoulos et al., 2012), as well as the binational Integrated Atmospheric Deposition Network (IADN), the Canadian National Air Pollution Surveillance network (NAPS), and the US National Air Toxics Trends Stations network (NATTS).We focus our model evaluation on spatial variations at high resolution, estimating the level of model sensitivity to uncertainties in the PAH emission factors, gasparticle partitioning, and wet deposition, which are all related to novel aspects of the GEM-MACH-PAH model.
The following sections will further describe the GEM-MACH-PAH model (Sect.2), the measurements used for evaluation (Sect.3), the results of the model evaluation (Sect.4), and the conclusions (Sect.5).

Model description
In the present study, we have modified ECCC's highresolution air quality forecasting model, GEM-MACH (hereafter called "GEM-MACH-PAH"), to include benzene and seven PAHs (in both gas and particle-phases) and have carried out 6 months of simulations in 2009 at the highest resolution (2.5 km grid cell size) in a North American domain yet reported for PAH simulations (to our knowledge).We have also tracked PAH wet deposition, and gas-particle partitioning, and have attempted to qualify model sensitivity to uncertainty in mobile emission factors, which has not been reported in other model studies.

GEM-MACH overview
GEM-MACH is an online, 3-D chemical transport model, which is embedded in GEM, ECCC's operational numerical weather prediction model (Côté et al., 1998b, a;Moran et al., 2010) (available online here: https://github.com/mfvalin?tab=repositories, last access: 3 July 2018).Online models such as GEM-MACH improve air-quality chemical prediction performance by reducing interpolation errors between different model coordinate systems and removing the input/output time and disk storage required for the transfer of meteorological input files to their offline CTM counterparts (e.g., Baklanov et al., 2014).The coupling to meteorology is a one-way process in this version, whereby chemistry does not influence the meteorology.More detailed description of the gas-, aqueous-, and particle-phase process representations of GEM-MACH, and an evaluation of its performance for common pollutants such as ozone, particulate matter (PM), and ammonia appears in Moran et al. (2013), Makar et al. (2015a, b), Gong et al. (2015), and Whaley et al. (2018b).Here we will focus on the model changes made to include PAH species and processes.
GEM-MACH is used to provide ECCC's twice daily, 48 h operational public forecasts of criteria air pollutants (ozone, nitrogen oxides, and PM), as well as the Air Quality Health Index (https://ec.gc.ca/cas-aqhi/, last access: 3 July 2018.).To reduce the computational burden for forecasting, the PM size distribution is represented using a simplified sectional treatment consisting of two size bins, a fine-fraction bin for particles with Stokes diameter from 0 to 2.5 µm and a coarsefraction bin for particles with Stokes diameter from 2.5 to 10 µm (Moran et al., 2010), with sub-binning used for those particle processes requiring a finer particulate size distribution.Here, we utilize the research version of GEM-MACH version 2, revision 2476, with this two-size-bin representation as our starting point for PAH modifications.The model grid used corresponds to a rotated latitude-longitude map projection with 2.5 km horizontal grid spacing and a hybrid vertical coordinate with 80-level vertical discretization spanning the atmosphere from the surface to 0.1 hPa.

Model modifications for benzene and PAH species
Our modifications to GEM-MACH include adding benzene and 7 gas-phase and 14 particle-phase (7 species × 2 size bins) PAHs to the species arrays.The gas-particle partitioning subroutine described in Galarneau et al. (2014) was also added, but with updated partitioning parameters (see Sect. 2.2.1).Since PAHs have very small concentrations relative to criteria air contaminants, as in Galarneau et al. (2014), we assume they do not have a significant effect on oxidant concentrations (O 3 and OH).Thus, the PAHs in GEM-MACH-PAH make use of the outcomes of the model's gas-and aqueous-phase chemistry in a diagnostic fashion for PAH oxidation.Processes in which the PAHs directly participate include advection, vertical diffusion, plume rise of major point source emissions, aerosol particle microphysics, in-and below-cloud scavenging, and dry and wet deposition of both gas and particle phases.Some of these processes and/or their controlling parameters were updated relative to Galarneau et al. (2014) and are described in the subsections below.Like AURAMS-PAH, the total (gas+particle) PAH emissions were treated as gas-phase emissions in GEM-MACH-PAH, since these quickly repartition between parti-

Gas-particle partitioning
As PAHs are semi-volatile organic compounds that partition between the particulate and gaseous phases of the atmosphere, their partitioning is a major determinant of their atmospheric fate (Bidleman, 1988).Despite decades of study (Junge, 1977;Yamasaki et al., 1982;Bidleman and Foreman, 1987;Pankow, 1987;Smith and Harrison, 1996;Dachs and Eisenreich, 2000;Lohmann and Lammel, 2004;Keyte et al., 2013), the mechanisms responsible for PAH partitioning and its spatiotemporal variability are not well-understood.AURAMS-PAH included two parametrizations to calculate gas-particle partitioning: Junge-Pankow, JP (Junge, 1977;Pankow, 1987) and Dachs-Eisenreich, DE (Dachs and Eisenreich, 2000), both of which, when applied for partitioning in AURAMS-PAH, assigned too much PAH mass to the gas phase.The two schemes resulted in surprisingly similar gasparticle partitioning (Galarneau et al., 2014).We carried out post-processing and analysis on the AURAMS-PAH model output from both schemes as well as the observations of gas and particle PAHs from the Galarneau et al. (2014) study, in order to determine which scheme to proceed with in GEM-MACH-PAH, and how it could be improved.Measured PAH partitioning typically takes the linear form of where K p,k is the partitioning coefficient for each PAH species, k: and C p , C TSP , and C g are the concentrations of the particulate PAH, the total suspended particles, and the gas-phase PAH, respectively.p • L,k is the sub-cooled liquid vapour pressure of the kth gas, and m k and b k are empirically derived coefficients.This linear relationship (where the logK p of all PAH species in any given measurement sample fall on a line relative to their logp L ) is common among homologous compound groups such as PAHs, polychlorinated biphenyls (PCBs), and polychlorinated dioxins and furans.However, the JP formulation only allows for m K = −1 (see Sect.S2 in the Supplement for more detailed information and a derivation).Conversely, observation-based estimates show a wide variety of |m K | values that are usually less than 1 (e.g., Fig. S2a in the Supplement), and this could be the reason why the AURAMS-PAH JP model results under-predicted the particulate fraction.Therefore, we proceeded with the Dachs-Eisenreich formulation in GEM-MACH, but with improved parameters.The Dachs-Eisenreich (DE) partitioning formulation was adapted from work examining water-sediment partitioning (Dachs and Eisenreich, 2000).The DE expression for K p (Eq. S2.2.1) is related to the octanol-air and soot-air (K SA,k ) partitioning coefficients, the latter depending on the soot-water (K SW,k ) and air-water partitioning coefficients.The soot-water partitioning coefficients are highly uncertain.Their values in the literature span two orders of magnitude for the same compound (Dachs and Eisenreich, 2000;Bucheli and Gustafsson, 2000;Jonker and Koelmans, 2002;Xu et al., 2012).K SW,k from Jonker and Koelmans (2002), was used in AURAMS-PAH.However, using the 2002 measurement data and their average m K , we have determined new K SW,k values based on ambient observations that improve the DE particulate fraction representation (see Sect.S2.3 in the Supplement for this process, and Table 1 for the original and new values).The purple boxes in Fig. 2 represent the results of the AURAMS-PAH partitioning module, making use of our new K SW,k values instead of the originals.
While the adjusted K SW values in Table 1 are significantly different from those in the original model (based on Jonker and Koelmans, 2002 as adjusted by the relative contributions of PM mass to the domain total in the inventory of Galarneau et al., 2007), particularly for lower molecular-weight species,  Dachs and Eisenreich (2000) 126 631 501 3160 Jonker and Koelmans (2002) 1. 86-37.2 3.80-12.6 5.01-91.2 5.13-89.7 40.7-182086-37.2 3.80-12.6 5.01-91.2 5.13-89.7 40.7- 32.4-3390 257-11700 Xu et al. (2012) ) 4. 79-5.13 12.9-19.1 1.86-37.2 30.9-40.7 631-794 389-575 2000-3390 Bucheli andGustafsson (2000) 2.57-58.9 2.82-16.227.5-155 they fall within the range of values found in the literature (e.g., Dachs and Eisenreich, 2000;Bucheli and Gustafsson, 2000;Jonker and Koelmans, 2002;Xu et al., 2012).The large correction and large range of published K SW,k values (which lack a temperature dependence) lends further support for the need to measure K SA,k directly on soots of atmospheric relevance so that they need not be estimated based on the highly uncertain/variable K SW,k .(2007,2014)).Below we outline the further modifications and updates that we made to this existing emissions database, in order to generate updated PAH emissions for modelling.PAH stationary emissions.Most of the PAH emission factors (EFs) used for the 2002 AURAMS-PAH model were compiled from the US EPA's Locating and Estimating Series (US EPA, 1998), the AP-42 document (US EPA, 1995), and the 1999 National Emissions Inventory (NEI99), (Galarneau et al., 2007).PAH EFs for stationary sources that were published between 1999 and the present are not substantially different from those already being used in SMOKE.For example, recent literature on agricultural burning (e.g., Dhammapala et al., 2007;Hall et al., 2012) reported EFs that were close (within a factor of two) to those already in the inventory.Only emissions from iron and steel production were updated to those in Odabasi et al. (2009) for electric arc furnaces, as the values used in Galarneau et al. (2007) were derived from literature published before 1990, and were 1-2 orders of magnitude larger (hence likely represented outdated (or absent) pollution control equipment).

Emissions
PAH mobile emissions.On-road mobile PAH emission factors in AURAMS-PAH were taken from NEI99 (Galarneau et al., 2007).The mobile emissions in this inventory may no longer be relevant as the values compiled were from an older (1990s) vehicle fleet.Therefore, we employed updated EFs for more current on-road mobile emissions in Canada and the US for 2009 modelling.Also, some off-road emissions, such as emissions from helicopters and marine sources (large ships) were not previously considered and were added to the inventory (from Chen et al., 2006 andAgrawal et al., 2008, respectively) in this study.
MOVES 2014, the latest version of the US EPA's motor vehicle emissions simulator (https://www.epa.gov/moves, last access: 3 July 2018) contains a more recent standard set of mobile EFs; these are separated into one set of factors for gasoline vehicles, based on one large study of vehicles in the US (Kishan et al., 2008), and one set of factors for diesel vehicles, based on another large study in the US (Khalek et al., 2009).In order to investigate whether these US values would be representative of conditions in Canada and whether only having these two fuel-type categories is adequate, when this would neglect studies that have reported different EFs for several different vehicle/fuel categories (e.g., cars, trucks, buses, motorcycles; light-or heavy-duty; gasoline or diesel), we compiled and researched PAH-to-TOG emission factors for these classes of mobile sources from over 30 recent (1999 to present) publications, as well as from the US EPA's SPECIATE v4.4 database (containing data from 1990 to 2012: https://www.epa.gov/air-emissions-modeling/speciate-version-45-through-40, last access: 3 July 2018).Please refer to Section C in the supplemental material for this mobile emission factor analysis.In this analysis, we found that the MOVES2014 EFs provided the best results in the model and they were selected for use in our simulations.

On-particle BaP-O 3 reaction
The only PAH oxidation reactions included in AURAMS-PAH were temperature-independent OH reactions with each gas-phase PAH species (Galarneau et al., 2014), which were also added to the GEM-MACH-PAH model.Temperaturedependent OH reaction rates were not pursued because Brubaker and Hites (1998)  oranthene has a slight temperature dependence, but the dependence was smaller than their error levels.Also, Friedman and Selin (2012) performed a phenanthrene sensitivity study with their model, and determined that including temperature dependence in k OH did not affect their mean non-urban midlatitude concentrations.
The AURAMS-PAH model overestimated BaP concentrations compared to measurements (Galarneau et al., 2014).This could be due to two O 3 -related factors: (1) particulate BaP measurements are known to be affected by on-filter O 3 degradation, causing measured particulate BaP measurements to be biased low (Menichini, 2009); (2) heterogeneous BaP degradation by O 3 in ambient air (Keyte et al., 2013) was not simulated in AURAMS-PAH, thereby biasing modelled concentrations high.Therefore, we added a particlephase BaP-O 3 reaction in GEM-MACH-PAH to account for the latter atmospheric process as described next.For the former, on-filter O 3 reaction, we have attempted to correct the measurements as described in Sect.3.
In GEM-MACH-PAH we used the Kwamena scheme (Kwamena et al., 2004) for the atmospheric on-particle BaP-O 3 reaction, as this scheme produced the best results in Friedman and Selin ( 2012)'s global model, and according to our sensitivity calculations, other schemes either overestimate (e.g., Pöschl et al., 2001) or underestimate (e.g., Kahan et al., 2006) the amount of BaP destroyed by this reaction.The Kwamena scheme was used because it produced BaP loss consistent with measurement studies (Ringuet et al., 2012;Jariyasopit et al., 2014;Liu et al., 2014).The reaction rate, k, is expressed as follows: where, k max = 0.060 ± 0.018 −1 and K O 3 = (2.8 ± 1.4) × 10 −15 cm 3 .The Kwamena scheme is expressed in the model as where δt is the model time step in seconds.This formulation does not depend on the particle size, but rather on the overall bulk particulate concentration, and the concentration of O 3 .

Dry and wet removal of PAHs and benzene
Gas-phase dry deposition follows a multiple resistance approach and single-layer "big leaf" approach (Wesely, 1989;Zhang et al., 2002;Makar et al., 2018) with a temperature dependency for Henry's law constants and for water solubility (Sander, 1999;Ma et al., 2010).Dry deposition of benzene and PAHs is also output by the model; however, measurements of the deposition flux of these species were unavailable during the study period.PAH particle-phase dry deposition is treated following Zhang et al. (2001), resulting in size-dependent particle deposition velocities.
Gas-phase benzene and PAHs undergo cloud and rain scavenging via Henry's law.Henry's law partition coefficients (K AW ) for the seven PAHs are proportional to inverse temperature.The mass of benzene and PAHs in the gas-phase (as opposed to the aqueous phase in cloud droplets and raindrops) is derived from and solving for m gas : where, m gas i is the initial mass of the gas-phase PAH before the Henry's law partitioning, and the remaining PAH mass is scavenged to the liquid rain or cloud phases (m aq ).Once the gaseous PAHs are scavenged to cloud-water (which is recalculated at each time step), they are subject to rain-out (cloud-to-rain conversion) process.At the end of each chemistry step, the fraction of the dissolved tracers not removed by the rain-out process will be returned to gas phase.For the fraction that do go into rain water, there is a parametrization in the rain scavenging code for re-evaporation of the rain before it reaches the ground.That fraction of PAHs would also return to the atmosphere.The remaining fraction that is not re-evaporated is counted towards the wet deposition in the model, which is irreversible (i.e., no re-emission of PAHs from the ground after they are deposited).
Note that Okochi et al. (2004) reported that assuming Henry's law equilibrium for benzene under-predicts the extent of wet-deposition.In the absence of a suitable alternative parameterization, we used Henry's law partitioning and obtained a conservative estimate of wet deposition for benzene.Note that benzene wet deposition is not evaluated in this paper as no measurements are available.
Where temperatures are < 0 • C below-cloud, or < −15 • C in-cloud, scavenging of gas-phase benzene and PAHs by snow and cloud-ice is done via surface adsorption following the formulation used in Franz and Eisenreich (1998), which was also used by Wania et al. (1999), Lei and Wania (2004) and Friedman and Selin (2012): In Eq. ( 7), W g is the gas scavenging ratio (equal to the concentration of PAH in snow over the concentration of PAH in air -both in moles m −3 ), K ia,k is the interfacial adsorption coefficient (equal to the mass adsorbed per surface area of snow to the atmospheric vapour phase concentration -both in ng m −3 ), SA is the specific surface area of the snow crystal, for which we use a constant 1 m 2 g −1 based on literature values for fresh snow precipitation, which are highly variable, and for which no clear relationship with temperature or wind speed has been found (Hoff et al., 1998;Hanot and Dominé, 1999;Domine et al., 2007;Hachikubo et al., 2014), ρ is the density of ice (0.917 g cm −3 ), and K ia,k is calculated from the following (Franz and Eisenreich, 1998): Eq. ( 7) is used to determine the fraction of PAH mass in the gas and snow/cloud-ice phases.
Particle-phase PAHs are treated as passive tracers that undergo wet removal along with the modelled aerosol particles (Gong et al., 2006;Wang et al., 2010).The cloud and precipitation processes above are applied sequentially in the model using operator splitting, and the amount of PAH deposited from wet deposition is output by the model.Table S4 in the Supplement provides all of the constants used in the model.

Model setup for two three-month simulations
GEM-MACH-PAH (rev2488) was run from 8 May to 13 August 2009 and from 18 October 2009 to 5 January 2010, where the first week from each period is treated as a spin-up period (for chemical concentrations to stabilize and for the initial condition effects to be negligible: e.g., Samaali et al., 2009), and were not used in our evaluation.The time periods were chosen to coincide with as many PAH concentration and deposition measurements as possible, while limiting simulation duration to reduce computational expenses.
The chemical initial and boundary conditions for the outer nest North American domain were taken from a 1-year MOZART simulation for all pollutants (Emmons et al., 2010;Pendlebury et al., 2017), except for benzene and PAHs.Initial and boundary conditions for PAHs and benzene were set to zero for the North American domain as its boundaries are generally away from PAH and benzene sources (e.g., over the ocean), and are also very distant from the Pan Am domain.
The simulations for the nested Pan Am region were run using the chemical initial and boundary conditions from the 10 km North American model run.
The model simulation was carried out in sequence of 27 h staggered simulations starting at 00:00 UTC, in order to reinitialize meteorology with the analysis at 10 km resolution.The first 3 h in the 2.5 km domain were discarded as spinup to reduce the dependency on the 10 km resolution meteorological initial conditions.Each 27 h simulation used the chemical concentrations from the end of the previous simulation as initial conditions for the next 27 h, and this sequence continued until each 3-month period was complete.
GEM-MACH-PAH was run in the two-size-bin mode to represent the PM size distribution, which means that particles fall in either fine mode (PM 2.5 -diameter 2.5 µm or less) or coarse mode (PM 10 -PM 2.5 -diameter 2.5-10 µm).

Measurement description
We compare the GEM-MACH-PAH predictions to all of the benzene and PAH measurements available in the Pan Am domain during the two time periods in 2009.These include a high spatial resolution urban measurement campaign in the Hamilton, Ontario region, as well as network monitoring stations from NAPS, NATTS, and IADN.Locations of PAH and benzene measurement stations are plotted in Figs. 3 and 4a.Note that all measurement stations were not equipped with oxidant removal technology; therefore, all measured PAHs, especially benzo(a)pyrene (which has the highest particulate fraction, and is the most reactive with O 3 ), would have had losses due to reaction with ozone on the filters (Menichini, 2009;Liu et al., 2014); thus, measured PAHs would be biased low compared to concentrations in ambient air.Accordingly, we have applied an O 3 correction to the BaP measurements in this study, as the literature suggests that the BaP sampling artifact is substantial, with around 20-72 % lost on average during sampling (Menichini, 2009;Liu et al., 2014).Note that our correction follows the linear method recommended by Schauer et al. (2003), which is dependent only on O 3 concentrations.However, other studies state that the O 3 degradation of BaP is more complex, with additional dependencies on the resident atmospheric lifetime of BaP (Goriaux et al., 2006), and relative humidity (Pitts Jr. et al., 1986;Umwelterhebungen and Gerätesichereit, 2002;Menichini, 2009).However, those studies did not provide an alternative correction equation.Therefore, in our results, we will present both the Schauer-corrected BaP measurements (for sites that had O 3 monitors nearby), as well as the original reported BaP from the measurements, given the lack of a better correction for the sampling artifact.

Hamilton measurement campaign
Ambient measurements of PM 2.5 and 16 PAH species were collected from a dense network of measurement sites in Hamilton, Ontario during June/July 2009 and December 2009.These measurements are described in Anastasopoulos et al. (2012), where they found a high level of intraurban variability for the PAHs; 3-4 times more variable than PM 2.5 concentrations.
There were 43 measurement sites operating during the summer period (see Fig. 4a), and 46 sites during the winter period.All measurements are from 2-week integrated time frames (24 June to 8 July and 2 to 16 December) taken with URG personal pesticide samplers, which collected gas and particle-phase PAHs less than 2.5 µm in diameter in 40 m 3 of sample air.The PM 2.5 measurements were made at the same sites using a three-stage Harvard cascade impactor.
The particle-phase PAHs (up to 2.5 µm in diameter) were collected on a Teflon filter, gas-phase PAHs were collected in polyurethane foam (PUF), and total (gas+particle) PAHs concentrations were reported in ng m −3 as determined by gas chromatography/mass selective detection.PM sample filter masses were determined by gravimetric analysis (Anastasopoulos et al., 2012).
O 3 measurements that were used to correct the Hamilton BaP measurements came from three monitoring sites in the Hamilton region ("Downtown", "Mountain", and "West") from the Ontario Ministry of Environment and Climate Change (MOECC) website for historical air quality data (http://www.airqualityontario.com/history/,last access: 3 July 2018).

National Air Pollution Surveillance program
NAPS is a Canadian program to provide accurate and long-term air quality data of a uniform standard across the country.NAPS is managed under a cooperative agreement between ECCC and the provinces, territories, and some municipal governments.There are currently 286 NAPS measurement sites in 203 communities located in every province and territory (https://open.canada.ca/data/en/dataset/1b36a356-defd-4813-acea-47bc3abd859b, last access: 3 July 2018).
Under this program, PAH samples were collected over 24 h, beginning and ending at midnight (local time; LT), typically every 6 days, with a sample volume range of 600-800 m 3 (Environment Canada, 2013).Benzene samples were collected in 6 L stainless steel canisters over 24 h, starting at midnight, every 3 days (Galarneau et al., 2016).
Within the Pan Am domain, total PAHs (gas+particlephases combined) and benzene were measured at eight NAPS sites (listed in Table S1 in the Supplement; Fig. 3), and their 2009 data were downloaded from the following url: http://maps-cartes.ec.gc.ca/rnspa-naps/data. aspx (last access: 3 July 2018).
For BaP measurement corrections, the NAPS network also measures hourly O 3 at four of these eight PAH/benzene sites (Windsor, Hamilton, Simcoe, and Egbert).Two of the missing sites (Toronto and Etobicoke) had nearby O 3 measurements from MOECC, but the last two (Burnt Island, and Point Petre), which are rural sites, had no O 3 measurements nearby.Therefore, BaP could only be corrected at six of the eight NAPS sites in the Pan Am domain.

National Air Toxics Trends Stations network
NATTS is a US program to monitor toxic air pollutants in accordance with the US Government Performance Results Act, which requires the US EPA to reduce the risk of cancer and other serious health effects associated with hazardous air pollutants (HAPS) by achieving a 75 % reduction in toxic chemical emissions in air, based on 1993 levels (US EPA, 2009).
Regulated under the Clean Air Act are 188 HAPS species including benzene and the 7 PAHs in this study.
Every 6 days, 24 h ambient air samples are collected starting at midnight (00:00 LT).Analysis of the samples is done by high-resolution gas chromatography/mass spectrometry (GCMS) selective ion monitoring (SIM) mode to get total (gas+particle) PAH concentrations, and benzene concentrations (Eastern Research Group, Inc., 2009).
There are 115 NATTS sites within the model domain (Table S1 in the Supplement, Fig. 3), but only 21 sites measured PAHs, while 113 sites measured benzene.Those data were downloaded from the following url: https://www3.epa.gov/ttnamti1/toxdat.html#data(last access: 3 July 2018).Measurement methods in NATTS are very similar to those of NAPS.
Since O 3 was not measured at the NATTS sites, NATTS BaP was corrected with the nearest O 3 monitor data found at the US EPA and CASTNET websites: https://www.epa.gov/outdoor-air-quality-data/download-daily-data (last access: 3 July 2018) and https://java.epa.gov/castnet/reportPage. do (last access: 3 July 2018), respectively.

Integrated Atmospheric Deposition Network
IADN was mandated by the 1987 Great Lakes Water Quality Agreement, and was initiated in 1990 to measure atmospheric concentrations of persistent toxic pollutants in the Great Lakes basin.There are nine IADN sites total within our Pan Am model domain, and they are listed in Table S1 of the Supplement (see also Fig. 3).Six of the nine IADN sites report gas-and particle-phase PAH atmospheric concentrations separately (labelled "PAHs" in Table S1), and a different set of six sites report wet deposition of PAHs (labelled "PAH wet dep" in Table S1) using sampled precipitation concentrations (Blanchard et al., 2005).Thus, these data can be used to evaluate the model's gas-particle partitioning and deposition, respectively.Benzene was not measured by IADN, nor was C TSP or PM 10 in 2009, with the unfortunate result that K p can not be directly calculated from the IADN measurements.O 3 was also not measured by IADN, necessitating the use of the nearest O 3 monitors in order to carry out the BaP oxidation correction.This latter step was only possible for observation stations at Cleveland and Chicago.The other four IADN air sites were rural/background locations, and did not have any O 3 measurements nearby.
PAHs were collected by a high-volume sampler for periods of 24 h beginning at 08:00 EST, every 12 days.At Canadian IADN sites, glass fiber filters and PUF sorbent collected the particulate and gaseous fractions, whereas the US stations collected PAHs with quartz fiber filters and XAD resin (Blanchard et al., 2005).Sample volume for the US method is about 800 m 3 , but is 400 m 3 for the Canadian method to minimize the breakthrough of volatile species during warm summer months (Blanchard et al., 2005).
Wet deposition of PAHs are measured with MIC-B precipitation collectors.The US stations used XAD-2 resin column cartridges for accumulating the organics on a 28-day cumulative basis, while the Canadian stations use a dichloromethane solvent extraction system, also on a 28-day cumulative basis.Both countries collect samples on a monthly basis.Note that one of the six wet deposition sites, St Clair, Ontario (STC), only had valid measurements during February 2009, which was not a time period simulated here.Therefore, only five IADN sites appear in our wet deposition analysis in Sect.4.4 below.

Model evaluation
In this section we evaluate GEM-MACH-PAH's performance for benzene and PAH surface concentrations, their spatial variation, gas-particle partitioning, and wet deposition.We also assess the sensitivity of the model output to PAH emission factors for mobile sources.

PAH concentrations in the Hamilton region
GEM-MACH-PAH output for gas+fine-PM PAH were compared to measurements of same from the 2009 Hamilton campaign (Anastasopoulos et al., 2012).Figure 4a shows a map of measured and modelled fluoranthene concentrations (14-day average) in the summer time period, as well as their differences and ratios.Here we see that GEM-MACH-PAH has captured intra-city variability, and that the differences between observations and simulated values are, at a maximum, 2.8 times too high.The model is biased low in the upwind/background areas of the city, and a high in the eastern areas of the city (Fig. 4a); this pattern is seen across all seven PAHs.The spatial pattern in the PAH bias is less apparent when PAH / PM 2.5 ratios are plotted (in ng µg −1 -shown in Fig. S6.1 in the Supplement) -removing the dependency on modelling PM correctly (since fractions of the PAH are particulate).Therefore, the spatial pattern in the PAH bias is mainly due to the pattern in the model PM bias, which is shown in Fig. S6.2 in the Supplement.
When the spatial variability is represented by the standard deviation over the mean (σ /mean), the model achieves very similar spatial variability to the measurements (Fig. 4b).The scatter plot of model vs measurements for summertime fluoranthene concentrations (Fig. 5a, FLRT selected as a good example) has a correlation coefficient (R 2 ) of 0.57, and the slope of the best-fit line is 0.96.The other PAH species had similar results, where, except for FLRT, the slopes and R 2 values were better in the winter than in the summer.The model bias (given as a model/measurement ratio) for all PAHs is shown as box and whiskers in Fig. 5b.Here we see that wintertime biases are smaller than those in the summertime for all PAHs except for ANTH.The four lightest PAHs (left side of Fig. 5b) have model/measurement ratios near one (except summertime PHEN), but the three heaviest PAHs, are biased high (except for wintertime CHRY).We will see this same pattern for the model bias (small for lighter PAHs, high for BaA and BaP) in the next sections as well.
BaA and BaP are the most reactive of the heavier species; thus, the lack of O 3 correction to the BaA measurements may be partially responsible for the model-measurement differences.However, mean O 3 for the three measurement stations in the Hamilton region was only 20-26 ppbv day −1 , which meant that the O 3 -corrected BaP was approximately 20 % greater than the reported BaP concentrations.The median BaP bias was brought down to 6.2 from 7.6 in the summer, and 5.5 from 6.3 in the winter -these are shown as the purple boxes in Fig. 5b.Additional reactions with BaA and BaP (e.g., with NO 3 ) are noted in the literature (Keyte et al., 2013), but were not included in GEM-MACH-PAH at this time, given larger uncertainties in those reactions.However, our model biases appear to indicate that those missing reactions may need to be considered for further model improvement.Indeed, a back-of-the-envelope calculation using a reaction rate reported in Liu et al. (2012) implied that the lifetime of particulate BaA with respect to a heterogeneous reaction with NO 3 (using mean nighttime concentrations of NO 3 from the model) was about 1 h.More research should be undertaken to determine the range of uncertainty for this reaction and whether it should go into PAH models.Additionally, Mu et al. (2018) suggest that the heterogeneous BaP-O 3 reaction should be temperature-, humidity-, and organic aerosol phase state-dependent (none of which were taken into account in the Kwamena scheme used in our work).However, it has been shown that the Kwamena scheme and the Mu scheme produce similar results in mid-latitudes (where our study is located) (Mu et al., 2018).Spring/summertime BaP would be minimally affected, as outdoor temperatures at that time of year resemble the room temperature laboratory conditions that the Kwamena scheme was based on.Additionally, our positive model bias would likely increase in the fall-wintertime, when low temperatures and humidity would increase BaP lifetime in the Mu scheme.
In order to remove the impact of the model's PM predictions on the PAH comparison, we also plotted the PAH / PM 2.5 model-over-measurement ratios (shown in Fig. S6.3).There we see all of the ratios reduced -which improves results for the heavier PAHs, but increases the low bias for the lighter PAHs.The reason the bias decreases for all seven PAHs is that the model PM 2.5 is overestimated by a factor of 2 in the summertime, and a factor of 1.4 in the wintertime (average across all sites in the Hamilton region).

PAH and benzene concentrations from the NAPS, NATTS, and IADN networks
Modelled 24 h-average total (gas+particle) PAH and benzene concentrations can be evaluated with measurements (taken every sixth day) from the NATTS, NAPS, and IADN surface measurement networks, which sample much of the model domain well (Fig. 3).As with the Hamilton evaluation, the model has very good agreement for seasonal averages at the monitoring network sites for benzene, phenanthrene, anthracene, fluoranthene, and pyrene, which all have model/measurement ratios (red and blue boxes) close to one, and their concentrations (green and orange boxes) overlapping in Fig. 6a.The 24 h average model (daily) and measurements (every sixth day) have been averaged over each of the 3-month time periods.BaA and CHRY are within a factor of 5 of the measurements in the summertime, but worse in the wintertime.BaP is overestimated by the model in both summer and winter by about a factor of 10, although the measurement-corrected BaP has a slightly reduced bias.We have not shown the O 3 -corrected BaP measurements in the plots because the changes are small, similar to the Hamilton plot (Fig. 5a).
When the model biases are examined more closely we find a few patterns to determine the cause(s).The following list summarizes some observations from our evaluation of each model-measurement pair (24 h averages, not seasonal averages): -By site -overestimates: all PAHs are significantly overestimated at the Kennedy Township, Pennsylvania (northwest of Pittsburgh) NATTS site (Fig. 6b).There appears to be a major emissions point source near that station that is emitting too much PAH in our model compared to reality.There are in fact hundreds of point sources in the emissions inventory that are within 20 km of Kennedy Township, but one in particular emits a relatively large amount of VOCs, and is associated with the "Secondary Metal Production; Aluminum; Raw Material Charging" source category, which has very large PAH-to-TOG EFs in Galarneau et al. (2007) because aluminum smelter emissions are largely particulate; therefore, expressing EFs as a large fraction of TOG was somewhat artificial.However, our results indicate that the PAH EFs for that PAH speciation profile (1036b) should be reduced substantially compared to Galarneau et al. (2007).In order to ensure that this facility did not begin operation after 2009 -which is a risk when using a 2011 inventory to model the year 2009 and would result in a large overestimation -we have further verified that the facility existed and was emitting similar VOC amounts in the NEI2008 inventory as well.
PHEN (Fig. 7c) and ANTH (not shown) are also greatly overestimated in New York City, although none of the other PAHs are biased particularly high there.However, we note that the measurements for New York City appear erroneously low, as the reported PHEN concentrations there are around the same magnitude as those in Underhill, VT (Fig. 7c), which is a background site near a national park.
Most PAHs are also overestimated at the Gary, Indiana site (Fig. 6b), which may also have a nearby major point emissions source that is too high compared to reality.Furthermore.the heavier PAHs (BaA, CHRY, and BaP) are overestimated at the Toronto Gage Institute NAPS site, but are only slightly higher there than the average model/measurement ratio for those species (not shown).
-By site -underestimates: all PAHs are markedly underestimated at the Liberty, Pennsylvania site (e.g., Figs.6b, and 7c), implying that there may be industry emissions of PAHs here that are missing, misallocated, or misplaced in the NEI2011 inventory, or an improper PAH speciation profile applied.Similarly, PAHs are un-  Geosci.Model Dev., 11, 2609-2632, 2018 www.geosci-model-dev.net/11/2609/2018/derestimated in Buffalo, New York, and at Franklin Furnace, Ohio.As these are not large cities, there may be industrial emissions that are not reported in the NEI2011 emissions inventory (or are reported at too low levels) -perhaps because those facilities shut down in 2010 (or installed emission control technology), which would mean the problem simply lies in using a 2011 inventory to model 2009.
However, when we further investigated stacks near Buffalo, NY, we found that the facility with the largest CO and VOC emissions had zero PAH emissions.This facility is associated with the generic process of "Primary metal production; By-product Coke Manufacturing", which did not have an associated PAH-to-TOG profile in Galarneau et al. (2007), because the source category codes that follow it (such as flushing liquor circulation tank, excess-ammonia liquor tank, tar dehydrator, tar interceding sump, tar storage, and so on) are not expected to emit PAHs to air.However, our results imply that the PAH speciation profile for "By Product Coke Oven Stack Gas" (0011b) would have been more appropriate for this facility and its use might eliminate the model bias near Buffalo in future studies.
-By month: all PAH species have lower mod/meas ratios in the summer than in the winter (shown in Fig. 6a by season for all PAHs and in Fig. 6c for FLRT by month) -implying that modelled hydroxyl radical (OH) and/or PM biases (which have strong seasonal cycles) are impacting modelled PAHs.For example, if model OH is too high in the summer, or too low in the winter, this would cause the U-shaped pattern that we see when plotting model/measurement ratio vs. month (Fig. 6c) and it would be particularly pronounced for the lighter, gas-phase PAHs, which it is.Another possibility is seasonal bias in the representation of atmospheric vertical stability: if the modelled stability is too low in the summer and too high in the winter, then winter emissions will tend to be trapped in inversions more than observed, and summer emissions will be diluted by excessive vertical mixing.However, evidence in Makar et al. (2010) and Stroud et al. (2012) suggest that model stability is too high (not too low) for the summer time period in those studies.
-By season: for the four lightest PAHs, the model/measurement ratios are < 1 in the summer, and > 1 in the winter (Fig. 6a).As mentioned above, this is likely due to modelled OH being too high in the summer and too low in the winter.BaA and CHRY follow a similar seasonal difference but do not straddle the ratio = 1 mark.
BaP, in comparison, has a model/measurement ratio that is slightly higher in the summer than in the winter (Fig. 6a).For BaP, the OH bias could be offset by an opposite O 3 bias in the model.Indeed, it has been shown (Makar et al., 2010;Stroud et al., 2012) that the processes in GEM-MACH cause urban surface O 3 to be too low in the summertime, due to insufficient vertical mixing and excessive titration from NO x , and surface PM tends to be too high in the wintertime due to an overestimation of wintertime atmospheric stability (e.g., lack of an urban heat island parameterization in the driving meteorology).These factors, together with the BaP measurement bias due to on-filter reaction with O 3 , may explain the high model BaP bias.
Thus, the generally high bias of modelled BaP may to be due to additive OH, O 3 , and PM model biases (plus the missing O 3 denuder technology in the measurements), impacting BaP more than the other species because BaP has the highest O 3 reactivity and the highest particulate fraction of the seven PAHs examined here.
When the five measurement sites mentioned in "By Site", above (Kennedy Township, PA; Gary, IN; Liberty PA; Buffalo, NY; and Franklin Furnace, OH) are removed from the NATTS analysis (because errors in their nearby emissions were identified), model-measurement correlation (R) and slopes improve.For example, the model vs. measurement best-fit line slope for PHEN doubles from 0.3 to 0.6 when those sites are removed, and its R increases from 0.16 to 0.32.The slopes and R values of the four heaviest PAHs all move from negative to positive.PYR has the largest improvement, going from a slope of −0.049 and R of −0.028 to a slope of 0.26 and R of 0.35.To the extent that the model prediction errors at the other sites may reflect emissions inaccuracy, having an accurate major point emission inventory for the time period modelled, along with proper PAH speciation profiles are extremely important requirements for modelling PAHs well at high resolution.The cases with large discrepancies mentioned above highlight the need to be as specific as possible when assigning source category codes to facility processes (which is difficult given that there are tens of thousands of point sources in the inventories).
In addition to emissions errors (for which fine temporal information is lacking), transport errors can cause poorer agreement between model and measurements at high spatialand time-scales.For example, PAH concentrations have sharp gradients and large dynamic range.Thus, if plume transport is off by a few kilometres, the result will be a very large difference between modelled and measured concentrations downwind of point sources.
That said, when doing a paired t test on all data in the model domain to examine whether the summertime and wintertime modelled averages are the same as the measured averages, we found that the model was indistinguishable from the measurements for all PAH species (t < 1 and p > 0.05), except for wintertime BaP (which has t > 1, and p < 0.05, even with the O 3 -corrected measurements).At finer time scales (e.g., daily model-measurement pairs) only modelled

Sensitivity of model to mobile emission factors
As discussed in the previous section, ensuring the accuracy of major point source emissions is important for modelmeasurement agreement near industrial locations.However, those major point source emissions tend to be located far from large population centres where human exposure is concentrated.In our inventory, mobile emissions make up 44,45,19,32,14,21, and 30 % of total PAH emissions, for PHEN, ANTH, FLRT, PYR, BaA, CHRY, and BaP, respectively, in our continental model domain, and studies have shown that the bulk of emissions within population centres is likely to originate from on-road mobile sector emissions (Dunbar et al., 2001;Pachón et al., 2013;Kuoppamäki et al., 2014;Miao et al., 2015).Thus, in order to accurately model ambient PAHs in urban centres, the uncertainty in emission factors from on-road vehicles may play a more significant role than major point sources.Therefore, we carried out 2-week sensitivity simulations (9-24 May and 18 October-2 November 2009) wherein the mobile emissions of PAHs were scaled by factors of 0.5 (halved) and 2.0 (doubled).This is approximately equivalent to the 25th and 75th percentiles in the range of emission factors found in the recent literature.
In Fig. 7, we show the surface PHEN time series from the measurements, base model run, and 0.5 and 2.0 scaled model runs at the IADN, NAPS, and NATTS sites.It is clear that -while a relatively small PAH source overall -changes to mobile emissions cause a large change in ambient PAH concentrations at certain urban locations, such as Philadephia (PA), New York (NY), and Burlington, Etobicoke, and Windsor (ON).
On average, there is about a 20-30 % increase in PAH concentrations when mobile emissions are doubled, and a 5-10 % decrease in PAH concentrations when mobile emissions are halved (Fig. 8, PHEN and BaP shown as examples) -with a larger sensitivity in the summer than the winter, and slightly larger sensitivity at NATTS (US) sites than at NAPS (Canada) sites.The predicted ambient concentrations generally follow the increase or decrease in on-road mobile emissions monotonically.

Gas-particle partitioning of PAHs
The IADN network also allows us to assess model predictions of gas-particle partitioning of PAHs at six sites (24 h averages, every 6 days).Figure 9 shows a time series of pyrene particulate fraction (φ k ).Both model and measurements show higher φ k in the wintertime, when there are higher PM concentrations for PAH adsorption, and lower temperatures.Generally, the model seems to underestimate φ k at background sites (e.g., Burnt Island), and overestimate φ k at urban sites (e.g, Chicago), and this is true for all PAH species.Thus, in Fig. 10, which shows the results for all PAHs at all sites, the model (green) has a larger range of φ k than the measurements (orange).This is caused by the model overestimating and underestimating PM concentrations at urban and rural sites, respectively.For example, wind-blown dust is not included in the model; however, it is known to be a potentially significant contributor to total PM in rural areas.Also, due to an underestimate of vertical mixing in the model, PM tends to be biased high in urban areas, near emissions, due to a lack of a parameterization for urban heat islands (Stroud et al., 2012).Comparing Fig. 10 (blue boxes) to Fig. 2b (green boxes), we see significant improvement over the original AURAMS-PAH partitioning due to the improved K SW parameters described in Sect.2.2.1.
Generally, the gas-particle partitioning scheme in the model results in model/measurement ratios well within an order of magnitude, given by the gray lines in Fig. 10.φ k for BaA and CHRY are still underestimated, but this may be related to modelled PM errors as noted earlier.We note that the addition of C TSP or even PM 10 measurements at IADN sites (which existed in 2002, but not in 2009) would allow for the calculation of measured partitioning coefficients (logK p , Eq. 2), which could be used to validate the modelled logK p in future work.Since K p takes total suspended particle into account, it removes the dependency on modelled PM; thus, it would increase confidence that the modelled partitioning is working properly, despite model errors in PM.
The fact that the GEM-MACH-PAH model partitioning of BaA and CHRY (and BaP to a lesser extent) puts too much concentration in the gas phase, may help explain why these species in particular are overestimated in the model.While in the gas phase, these species are less likely to be removed from the atmosphere, so their concentrations would erroneously build up in the model.

Wet deposition of PAHs
When compared to the IADN 1-month wet deposition measurements, the model generally overestimates wet deposition for all PAHs, as is shown in Table 2 and Fig. 11.In Fig. 11, the blue lines show the ideal 1 : 1 model/measurement ratio, and most of the data lie well above these lines.By site (Fig. 11a), the modelled wet deposition was slightly better at urban locations (Toronto and Cleveland) than suburban and PYR particulate fraction at all IADN stations background sites (Burlington, Sturgeon Point, and Point Petre).By month (Fig. 11b), the wet deposition from the model is best represented in June and July, whereas, wet deposition is greatly overestimated in the winter, implying that the current snow adsorption parametrization may be too effective at removing PAHs in the model.This may be due to our simplification of a constant snow surface area, which may be set too high, or due to inaccuracies in the modelled or measured precipitation.
The IADN measurements, which report the concentration of PAHs in the collected precipitation (in pg L −1 ), were converted to pgPAH m −2 in order to compare to the wet deposition output of the model.However, this conversion assumes that the volume of precipitation reported by IADN was the total precipitation in the container's cross-sectional area, and in fact, it is not.The IADN wet deposition collectors are actually known to not sample all of the precipitation because the samplers are not in the correct configuration to get an accurate precipitation measurement (Dryfhout-Clark, personal communication, 2017).When we compared the actual precipitation amounts (from separate meteorological rain gauge data) to IADN precipitation volumes at the Point Petre location in January 2009, we found that only 68 % of the total precipitation was captured by the wet deposition sampler.Therefore, if that correction factor were applied to all IADN wet deposition measurements, they would increase by a factor of approximately 1.5, which would improve our comparison, but not eliminate the total bias.If IADN sites added separate, accurate rain gauges, then we could apply a "precipi- tation correction" to the IADN wet deposition measurements in a thorough, consistent way in future work.
Aside from the measurement bias, the modelled PAH wet deposition bias will also be dependent on the model's overall ability to predict accurate precipitation.We compared the modelled daily accumulated precipitation to the precipitation measured with the accurate gauges at Burnt Island and Point Petre, and found that, while the model's median precipitation bias was only about 0.2 mm, there was a large standard deviation, and there were some incidences where the model greatly over-predicted high precipitation events.Those incidences would result in greater modelled wet deposition of PAHs than was measured, and because we sum over a month, there is a significant likelihood of an over-prediction occurring in that long time frame.Indeed, the median ratios in Table 2, which are less sensitive to high outliers than the mean is, are substantially lower than the mean for most species.
Therefore, the model bias in wet deposition would appear to be caused by three additive factors: (1) measurements themselves having a negative bias relative to reality, due to insufficient capture of the net fluxes of precipitation; (2) modelled precipitation being biased high; and (3) a positive model bias in atmospheric PAH concentrations (which was highest for BaA and BaP in particular).Other PAH models have reported similar overestimates of PAH deposition (Matthias et al., 2009;Friedman and Selin, 2012).
The reverse reasoning can be applied, whereby we can see if high atmospheric concentrations of BaA and BaP were caused by too little wet deposition.In this case, since both of these species have correspondingly high wet deposition in the model (Fig. 11), it would appear that underestimation of wet deposition is probably not one of the causes.
Figure 12 shows results from a sample month (June 2009) for pyrene (PYR).The spatial distribution of wet deposition was not captured, with the model predicting lower PYR deposition in Toronto and Sturgeon Point than the measurements, higher at Point Petre, and about equal at Burlington.This spatial pattern is not the same for all PAHs, and even differs by month for the same PAH (e.g., PYR deposition in the next month, July, is low at Point Petre, high at Toronto, and highest in Burlington).The lack of spatial or temporal pattern in the sign and/or magnitude of wet deposition biases indicates that there is no major error with the PAH scavenging scheme itself.Given that wet deposition of PAHs relies on the correct simulation of many model factors (meteorology, scavenging parameters, atmospheric concentrations, etc.), our work suggests that more process studies aimed at quantifying wet deposition are needed.In fact, other PAH models also overestimate PAH deposition (Matthias et al., 2009;Friedman and Selin, 2012).

Conclusions
Through this work, a high-resolution chemical transport model for North American air toxics was created that allows us to see variations within a densely populated area.GEM-MACH-PAH was developed and run at a 2.5 km resolution for air quality forecasting and for simulating the impacts of emissions scenarios.Relative to AURAMS-PAH, on-road mobile emissions, gas-particle partitioning, and scavenging were all improved in this study.Mobile PAH emission factors from different sources were evaluated and the MOVES 2014 factors achieved the best model results compared to those in the recent literature and in the SPECIATE database.Parame- ters used in the gas-particle partitioning scheme (particularly K SW,k ) were improved based on the observed relationship between logK p and logp, resulting in much better agreement between model and observations than was achieved with AURAMS-PAH.This is an important improvement because the gas-particle partitioning determines deposition and inhalation -both pathways of exposure in humans and ecosys-tems.Finally, we added snow scavenging, which was not a process included in AURAMS-PAH, and updated wet scavenging parameters.However, the modelled wet deposition was biased high -particularly in the wintertime -thus, further improvements to these parametrizations are required if the model is to be used for deposition studies.

Figure 1 .
Figure 1.North American model domain with 10 km horizontal grid spacing (green), and the nested "Pan Am" model domain with 2.5 km horizontal grid spacing (orange).

Figure 2 .
Figure 2. All-site ensemble of modelled/measured ratios of logK p (a) and particulate fraction (b) for each PAH.Shown are the "adjusted model" (purple, from Eq. S2.2.4) and the original AURAMS-PAH model (green).Box and whiskers: thick line is the median, boxes extend to the 25th and 75th percentiles, and whiskers extend to the minimum and maximum.
Figure 3. NAPS, NATTS, IADN, and Hamilton measurement sites in the model domain.

Figure 4 .
Figure 4. (a) Map of 2-week summertime average fluoranthene concentrations in Hamilton, Ontario: (left) from measurements and GEM-MACH-PAH model, and (right) their differences and ratios.Note the grey dots are measurement sites that had missing data during this time period.(b) Spatial variability in the Hamilton data in (left) summer, and (right) winter.

Figure 5 .
Figure 5. (a) GEM-MACH-PAH model vs. measurement scatter plot of 2-week summertime fluoranthene concentrations at 40+ sites in Hamilton.(b) Frequency distributions of GEM-MACH-PAH model/measurement ratios of PAH concentrations for the Hamilton model-measurement pairs for all sites from both summer and winter.Darker-coloured boxes are results from O 3 -corrected BaP measurements.

Figure 6 .
Figure 6.(a) Frequency distributions of GEM-MACH-PAH (green) and measured (orange) benzene (gas) and PAH (gas+particle) seasonal average concentrations at all IADN, NAPS, and NATTS sites.Modelled/measured concentration ratios also shown for summer (red) and winter (blue), with grey lines indicating agreement within an order of magnitude.(b) Modelled/measured concentrations for each daily model-measurement pair, separated by site (FLRT given as example); (c) same as (b) but separated by month.

Figure 7 .
Figure 7. Phenanthrene time series for the summer 2009 period for the IADN (binational), NAPS (Canada), and NATTS (US) networks.Orange represents the measurements, dark green represents the base GEM-MACH-PAH model, light green represents GEM-MACH-PAH with 0.5× mobile emissions, and cyan represents GEM-MACH-PAH with 2.0× mobile emissions.

Figure 8 .
Figure 8.Average percent change in surface PHEN and BaP concentrations by season when PAH on-road mobile emissions are scaled up or down by factors of 2.0 and 0.5, respectively.

Figure 10 .
Figure10.GEM-MACH-PAH modelled (green) and measured (orange) particulate fraction (φ) of all PAHs at all IADN sites, and their model/measurement ratios (blue).The blue line indicates the one-to-one line, and the grey lines are for ratios of 10 and 0.1.

Figure 11 .
Figure 11.GEM-MACH-PAH model/measurement wet deposition ratios for all PAHs (a) for five sites (all months) and (b) for four months (all IADN sites).

Table 1 .
Original (from Galarneau et al., 2014, based on Jonker andKoelmans, 2002)and adjusted (based on AURAMS-PAH modelmeasurement analysis for North America) K SW values, along with other K SW cited in the literature.All values are × 10 5 , and unitless.

Table 2 .
Mean and median GEM-MACH-PAH model/measurement ratios for PAH wet deposition.
Time series of pyrene particulate fraction at six IADN network sites (BNT represents Burnt Island, CLV represents Cleveland, IIT represents Chicago, PPT represents Point Petre, SBD represents Sleeping Bear Dunes, and STP represents Sturgeon Point).GEM-MACH-PAH values are denoted by (green dots) and IADN measurements by orange dots.