Current status of the ability of the GEMS/MACC models to reproduce the tropospheric CO vertical distribution as measured by MOZAIC

Vertical profiles of CO taken from the MOZAIC aircraft database are used to globally evaluate the performance of the GEMS/MACC models, including the ECMWF-Integrated Forecasting System (IFS) model coupled to the CTM MOZART-3 with 4DVAR data assimilation for the year 2004. This study provides a unique opportunity to compare the performance of three offline CTMs (MOZART-3, MOCAGE and TM5) driven by the same meteorology as well as one coupled atmosphere/CTM model run with data assimilation, enabling us to assess the potential gain brought by the combination of online transport and the 4DVAR chemical satellite data assimilation. First we present a global analysis of observed CO seasonal averages and interannual variability for the years 2002-2007. Results show that despite the intense boreal forest fires that occurred during the summer in Alaska and Canada, the year 2004 had comparatively low tropospheric CO concentrations. Next we present a validation of CO estimates produced by the MACC models for 2004, including an assessment of their ability to transport pollutants originating from the Alaskan/Canadian wildfires. In general, all the models tend to underestimate CO. The coupled model and the CTMs per[...] Sensitivity tests reveal that deficiencies in the fire emissions inventory and injection height play a role.


N. Elguindi et al.: Validation of MACC models
fires might account for as much as 25% of the global CO emissions from all wildfires during anomalous years (Goode et al., 2000; Lavoué et al., 2000). Gases and aerosols emitted from large wildfires can be transported thousands of kilometers downwind. In addition, due to the strong convection enhanced by forest fire activity, emissions can be injected into the upper troposphere and lower stratosphere (Jost et al., 2004; Nédélec et al., 2005; Damoah et al., 2006; Cammas et al., 2009) where the residence time is long, thus having lasting effects on radiation and stratospheric chemistry.
Numerous studies have used chemistry transport models (CTMs) to simulate CO (Shindell et al., 2006; Kanakidou et al., 1999; Prather et al., 2001). Shindell et al. (2006) show that the variability among models is large and that significant underestimations are found, notably in the extratropical Northern Hemisphere. Sources of uncertainty are diverse and include emissions inventories, injection height estimates (which determine long-range transport), and chemistry. Data assimilation can compensate for these deficiencies and thus improve model forecasts. Reducing these uncertainties and improving CO long-range transport modelling was an important task of the GRG (Global Reactive Gases) subproject of the EU project GEMS (Global and regional Earth-system (Atmosphere) Monitoring using Satellite and in-situ data) (Hollingsworth et al., 2008). Within this framework, the ECMWF's (European Centre for Medium-Range Weather Forecasts) Integrated Forecasting System (IFS) model was coupled to three CTMs: MOCAGE (Bousserez et al., 2007), MOZART-3 (Horowitz et al., 2003; Kinnison et al., 2007), and TM5, with data assimilation capabilities. In the GRG subgroup, it was particularly important to evaluate the added value and robustness of the satellite 4DVAR chemical data assimilation procedure in reducing model uncertainties, and to provide specific suggestions for improvement.
Our main objective is to present a global evaluation of the GEMS-GRG models compared to observations for the reference year 2004, for which GRG simulations have been performed. Specifically, we compare modelled CO profiles to observations taken on board commercial aircraft as part of the MOZAIC (Measurements of ozone and water vapour by Airbus in-service aircraft) program (Marenco et al., 1998). This study is unique in that it allows us to evaluate and compare the performance of different types of models, namely three off-line CTMs driven by the same meteorology and one coupled atmosphere/CTM model run with data assimilation, enabling us to more definitively infer weaknesses in the CTMs and assess the potential gain brought by the 4DVAR chemical satellite data assimilation.
The year 2004 has also been chosen because of the occurrence of the large summer wildfires that burned in Alaska and Canada. Trace gases and aerosols emitted by these fires were transported as far away as Europe. A large number of observations were collected during this period as part of the International Consortium for Atmospheric Research on Transport and Transformation (ICARTT) program and have been a valuable source for many studies (Pfister et al., 2006, 2008; Bousserez et al., 2007; Real et al., 2007; Stohl et al., 2006; Damoah et al., 2006; Cammas et al., 2009; Warneke et al., 2006; Turquety et al., 2007). Additional analysis products provided by ICARTT include model simulations from the FLEXPART Lagrangian particle dispersion model, which includes full turbulence and convection parameterizations.
In addition to the global validation, we also present an assessment of the ability of the GEMS-GRG models to simulate and transport CO originating from the 2004 Alaskan/Canadian wildfires. To this end, we perform several case studies in which a CO plume originating from the Alaskan/Canadian wildfires was transported downwind as far as the eastern United States and across the Atlantic Ocean to Europe. Profiles of MOZAIC CO at several downwind locations are compared with model outputs. In order to attribute emission sources to the MOZAIC observations we utilize the backward FLEXPART model simulations performed by Stohl et al. (2005). Furthermore, sensitivity tests are performed using tracers to evaluate ways of improving the long-range transport in the models. In order to determine how sensitive the models are to the fire emissions used, a tracer simulation is performed using the daily bottom-up fire emissions inventory for North America in 2004 constructed by Turquety et al. (2007) and is compared to a similar tracer simulation using the GFEDv2 8-daily inventory (van der Werf et al., 2006). In addition, to test the sensitivity of the model to injection height, several tracers are injected at various model levels.
Because there is considerable interannual variability in global tropospheric CO largely due to variability in boreal forest fires, we begin our study by presenting mean seasonal vertical profiles of the MOZAIC CO averaged over the period 2002-2007, as well as profiles for the individual years, from several locations around the world. This allows us for the first time to present a climatology of the MOZAIC CO profiles, as well as to characterize the year 2004 which is the focus of this study.

Measurement data
CO measurements taken as part of the European-funded MOZAIC programme (Measurements of ozone and water vapour by Airbus in-service aircraft) are used for model validation in this study. For more information about the MOZAIC programme see Marenco et al. (1998) or the website found at http://mozaic.aero.obs-mip.fr. For this study, we use vertical profiles of CO taken during the ascent and descent of aircraft at various airports. The raw data are averaged over 150 m height intervals. The monthly statistical scores presented in this study are based on daily averaged profiles. The number of profiles per day varies among airports. For example, three aircraft equipped with MOZAIC instruments are based in Frankfurt, thus there can be as many as six profiles per day available for Frankfurt. However, only one aircraft flies from/to Paris and Vienna, so normally there are only two profiles per day available at these airports. Using daily averaged profiles, rather than individual profiles, in calculating the statistical scores gives the same weight to all days. The number of profiles per day at a given airport is also determined by factors such as instrumentation failure or the daily aircraft routing by the airlines. As a result, there may be no profiles available on some days at a given airport. The number of days with available profiles at each airport used in this study is indicated on each graph presented.
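The two averaging steps described above (150 m height bins, then one mean profile per day) can be sketched as follows. This is a minimal illustration, not the MOZAIC processing code: the function names, the `(altitude, CO)` tuple layout and the sample values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

BIN_M = 150.0  # height interval in metres, as stated in the text


def bin_profile(samples, bin_m=BIN_M):
    """Average raw (altitude_m, co_ppb) samples into fixed height bins.

    Returns {bin_index: mean CO}; bin k covers [k*bin_m, (k+1)*bin_m).
    """
    grouped = defaultdict(list)
    for alt, co in samples:
        grouped[int(alt // bin_m)].append(co)
    return {k: mean(v) for k, v in grouped.items()}


def daily_mean_profile(profiles):
    """Average several binned profiles from the same day, bin by bin."""
    grouped = defaultdict(list)
    for prof in profiles:
        for k, co in prof.items():
            grouped[k].append(co)
    return {k: mean(v) for k, v in sorted(grouped.items())}


# Hypothetical ascent and descent at one airport on one day.
ascent = bin_profile([(20, 300.0), (100, 280.0), (200, 200.0)])
descent = bin_profile([(50, 250.0), (160, 180.0)])
day = daily_mean_profile([ascent, descent])
```

Because each day contributes a single profile to the monthly scores, an airport with six profiles on one day and two on the next still weights both days equally.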

GEMS GRG model simulations
The IFS model is a state-of-the-art numerical weather prediction model with 4D-VAR data assimilation capabilities. In this study we analyse a simulation performed with the IFS model coupled to the CTM MOZART-3 for the year 2004, hereafter referred to as ASSIM (details of the coupling can be found in Flemming et al., 2009). MOPITT V4 total column CO data (Deeter et al., 2003) are assimilated using ECMWF's 4D-VAR data assimilation system. The data are thinned to a resolution of 0.5° × 0.5° and are only assimilated over land between 65° N and 65° S. Averaging kernel information from the MOPITT data is used in the observation operator to calculate the model equivalent of the observation. The background error statistics for the CO assimilation were determined with the NMC method (Parrish and Derber, 1992). For this, 150 days of 2-day forecasts were run with the coupled system initialized from fields produced by the free-running MOZART-3 CTM, and the differences between 24-h and 48-h forecasts valid at the same time were used as a proxy for the background errors. A control simulation with MOZART-3 which uses no data assimilation, hereafter referred to as CTRL, is also analysed in order to assess the impact of data assimilation in the ASSIM simulation. The two runs use the same model version and input data such as emissions. The main difference between ASSIM and CTRL is that in the ASSIM runs the CO and O3 fields are replaced every 24 h at 00:00 UTC by the respective analysis fields produced by the coupled system IFS-MOZART. Therefore the comparison of the runs will show the impact of the data assimilation.
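The idea behind the NMC method mentioned above, using differences between 48-h and 24-h forecasts valid at the same time as a proxy for background errors, can be illustrated with a deliberately simplified sketch. It estimates only a standard deviation at a single grid point; the operational method derives full covariance statistics, and the function name and sample numbers here are hypothetical.

```python
from math import sqrt


def nmc_background_std(fc48, fc24):
    """Sample standard deviation of (48-h minus 24-h) forecast differences.

    fc48[i] and fc24[i] must be forecasts valid at the same time i.
    Simplification: one grid point, no spatial correlations, no rescaling.
    """
    if len(fc48) != len(fc24) or len(fc48) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(fc48, fc24)]
    mean_diff = sum(diffs) / len(diffs)
    var = sum((d - mean_diff) ** 2 for d in diffs) / (len(diffs) - 1)
    return sqrt(var)


# Hypothetical CO values (ppb) at one grid point for three valid times.
sigma_b = nmc_background_std([102.0, 98.0, 101.0], [100.0, 100.0, 100.0])
```

In the study itself the statistics were built from 150 days of forecasts rather than the three pairs used here.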
In addition to the models mentioned above, we analyse simulations from the three stand-alone GEMS-GRG CTMs (MOZART-3 (MOZ), TM5-V10:version KNMI-cy3-GEMS (TM5) and MOCAGE (MOC)). A brief description of all models is given in Table 1. The version of MOZART-3 which was coupled to the IFS model and used for the ASSIM and CTRL simulations is more recent and has a slightly different configuration than the stand-alone version of MOZART referred to as MOZ in this study. The main upgrades include a higher horizontal resolution and different emission inventories (see Table 1 for details). It is worth noting that the total global anthropogenic CO emissions used in the ASSIM and CTRL simulations sum to 686 Tg/y, which is about 10% lower than the total of 755 Tg/y used in the CTMs (MOZ, TM5 and MOC). Furthermore, the global fire emissions are also lower, approximately 300 Tg/y in the ASSIM and CTRL simulations compared to 400 Tg/y in the CTMs. The impact of the differences in these emission inventories on the model biases is discussed later.
To perform the tracer transport simulations used for the sensitivity tests, we use the IFS model coupled to MOZART-3 with the same set-up as the CTRL run. A lifetime of 50 days, similar to the lifetime of CO, is imposed on the passive tracer. For the sensitivity test comparing the fire emissions inventories, the tracers are injected at the surface as in the CTRL and ASSIM simulations. For the injection height sensitivity test, tracers are injected at the surface and at 6 and 8 km, and the Turquety emissions inventory is used.
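To give a feel for what a 50-day lifetime implies during long-range transport, the sketch below assumes that "lifetime" means an e-folding time under first-order decay. That interpretation, the function name and the 10-day travel time are assumptions made for illustration; the text does not specify how the decay is implemented in the model.

```python
from math import exp

TAU_DAYS = 50.0  # lifetime imposed on the passive tracer, from the text


def tracer_remaining(c0, days, tau=TAU_DAYS):
    """Tracer amount left after `days` of first-order decay.

    Assumes the lifetime is an e-folding time: C(t) = C0 * exp(-t / tau).
    """
    return c0 * exp(-days / tau)


# Fraction of the tracer left after a hypothetical 10-day transport time.
fraction_after_10_days = tracer_remaining(1.0, 10.0)
```

Under this assumption the tracer loses less than a fifth of its mass over ten days, so differences between tracer simulations mainly reflect emissions and transport rather than decay.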

FLEXPART model simulations
In order to attribute emission sources to the MOZAIC observations we utilize the backward model simulations for the summer 2004 performed by the FLEXPART Lagrangian particle dispersion model at NOAA as part of the ICARTT program. For the simulations used in this analysis, the FLEXPART model was driven by meteorological fields from ECMWF on 60 model levels and with a spectral resolution of T511. The derived gridded data have 1° × 1° resolution globally, but a 0.36° × 0.36° nest is used in the region 108° W-18° E and 18° N-72° N. For emission input, the emission inventory of the EDGAR information system (version 3.2, Olivier and Berdowski, 2001) on a 1° × 1° grid is used outside North America. Over most of North America, the inventory of Frost et al. (2006) is used. This inventory has a resolution of 4 km and also includes a list of point sources. Previous experience has shown that Asian emissions of CO are underestimated (probably by as much as a factor of 2 or more) in the EDGAR inventory, while American CO emissions may be overestimated. For wildfire emissions of CO, the model uses a daily inventory which was compiled from daily burn areas provided by the Center for International Disaster Information and MODIS hot spot data (further details can be found at http://www.esrl.noaa.gov/csd/ICARTT/ analysis/DAILY FIRE EMISSIONS). Several simulations are performed using various injection heights in which the fire emissions are evenly distributed from the surface up to a certain model level (150 m, 1 km, 3 km, and 10 km).
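The injection scheme described above, in which fire emissions are spread evenly from the surface up to a chosen height, can be sketched as below. The sketch assumes "evenly" means uniform per unit height, which the text does not state explicitly; the function name, layer interfaces and emission total are hypothetical and this is not FLEXPART code.

```python
def distribute_emission(total, layer_edges_m, top_m):
    """Split `total` across layers in proportion to the layer depth that
    lies between the surface and the injection height `top_m`.

    layer_edges_m: increasing layer interface heights in metres, starting at 0.
    Assumes top_m does not exceed the highest interface.
    Returns one emission amount per layer.
    """
    if top_m <= 0:
        raise ValueError("injection height must be positive")
    amounts = []
    for lo, hi in zip(layer_edges_m[:-1], layer_edges_m[1:]):
        overlap = max(0.0, min(hi, top_m) - lo)  # depth of layer below top_m
        amounts.append(total * overlap / top_m)
    return amounts


# Hypothetical layer interfaces and a 3 km injection height.
edges = [0, 500, 1000, 3000, 10000]
per_layer = distribute_emission(6.0, edges, 3000)
```

With a 10 km injection height the same total would be spread over all four layers, which is why the simulated plume altitude can depend strongly on this choice.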

Evaluation statistics
Since a large part of the GEMS project was devoted to model validation, much consideration was given to determining the most appropriate definitions of bias and error. The concentrations of atmospheric species can vary by orders of magnitude, thus an important criterion for the metrics was the use of relative (normalized) definitions. In bias assessment, when the mean observation is used as the reference, there is an asymmetry between cases of under- and over-prediction. In order to avoid this asymmetry, the modified normalized mean bias (MNMB), which is a normalization based on the mean of the observed and forecast value, has been adopted as the most appropriate definition of bias within the GEMS/MACC project and is used in this study. The MNMB is calculated as follows,

\[ \mathrm{MNMB} = \frac{2}{N} \sum_{i=1}^{N} \frac{f_i - o_i}{f_i + o_i}, \]

where $f_i$ and $o_i$ represent the model forecast and observed values, respectively, and $N$ is the number of forecast-observation pairs. Expressed as a percentage, the MNMB is bounded by the values −200% and +200%.
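A minimal sketch of the MNMB, assuming the usual GEMS/MACC form MNMB = (2/N) Σ (f_i − o_i)/(f_i + o_i) expressed in percent, shows the symmetry property discussed above. The function name and the sample values are hypothetical.

```python
def mnmb(forecasts, observations):
    """Modified normalized mean bias in percent.

    Each pair is normalized by the mean of forecast and observation,
    so the result lies between -200 and +200 for positive inputs.
    """
    if len(forecasts) != len(observations) or not forecasts:
        raise ValueError("need two non-empty series of equal length")
    terms = [(f - o) / (f + o) for f, o in zip(forecasts, observations)]
    return 100.0 * 2.0 * sum(terms) / len(terms)


# A factor-of-two underestimate and overestimate give equal and opposite scores.
under = mnmb([50.0], [100.0])   # about -66.7
over = mnmb([200.0], [100.0])   # about +66.7
```

With a bias normalized by the observation alone, the same two cases would score −50% and +100%, which is the asymmetry the MNMB is designed to avoid.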

MOZAIC CO profiles
We begin this study by presenting the characteristics of seasonal vertical profiles of MOZAIC CO data averaged for the period 2002-2007 from several airports. Based on the availability of data, the following 10 airports were selected to represent different regions of the world: Frankfurt and Paris for Europe, Beijing and Tokyo for East Asia, Caracas and Delhi for low latitude regions, Atlanta and Dallas for the US, and Abu Zabi and Cairo for the Middle East. Seasonally averaged profiles of CO for the whole period, as well as the profiles for the individual years, are presented in Figs. 1-4 for selected airports (other airports are shown in the online supplementary material). It should be noted that there is a large discrepancy in the number of flights (as indicated on each graph) between the various airports as well as from year to year. Therefore, not all averages for each airport and year are statistically robust. Over Frankfurt (Fig. 1), CO concentrations are highest during DJF (approximately 300-350 ppb near the surface) and lowest during JJA (approximately 200-250 ppb near the surface) due to the seasonal variations of OH, which is the main sink for CO. The largest interannual variability occurs during JJA and SON. This is mainly due to fire emissions as well as photochemical activity, which is more favorable during these seasons. During the spring, some interannual variability is observed in the upper troposphere/lower stratosphere region (between 10 and 12 km), which is the period and location where extratropical stratosphere-to-troposphere transport is at its maximum. In JJA 2003, the anomalously high concentrations of CO due to the intense heatwave experienced in Europe, especially in August (Tressol et al., 2008; Ordóñez et al., 2010), are well represented in the data.
Likewise, the high concentrations seen in SON 2002 are due to exceptional circumstances, namely the intense boreal forest fires which occurred over western Russia (Yurganov et al., 2005; Kasischke et al., 2005). These examples demonstrate how well the MOZAIC database can be used to identify CO anomalies throughout the entire troposphere. Figure 2 shows the CO profiles over Beijing, which is one of the most polluted cities in the world. Note that the scale for Beijing ranges from 0-2500 ppb, unlike in the other plots where the scale ranges from 0-350 ppb. It is also worth noting that there are fewer flights available over Beijing (184) compared to Frankfurt (3801), thus the statistics are less robust. As over Frankfurt, the highest CO concentrations near the surface occur during DJF. The year 2004 was particularly bad, with surface concentrations reaching as high as 5725 ppb during DJF. During the other two years in which flights were available (2002 and 2003), CO surface concentrations range between 1000 and 1500 ppb. Unlike Frankfurt, there is significant interannual variability during all seasons in the lower troposphere. This is probably due to the various and intense local to regional sources; however, the small number of available flights for each year might also be a factor. Unlike the other cities, CO concentrations over Caracas (Fig. 3) are characterized by a very thick layer between 1 and 3 km throughout the year due to its particular location in a valley located 1000 m a.s.l. This layer is thickest during MAM, when the average concentration reaches 225 ppb near the 2 km level. The interannual variability is also greatest during MAM, which corresponds to the regional biomass burning period. Despite the biomass burning period being in MAM, the surface concentrations of CO are at their maximum during fall, reaching 350 ppb. The year 2003 shows particularly high concentrations in both the lower troposphere and the upper troposphere during MAM.
The year 2002 is also exceptional, with a multi-layer CO plume below 2 km and maximum CO concentrations of more than 350 ppb during JJA and SON. As noted over Frankfurt, and attributed to the intense forest fires over western Russia, the period SON 2002 is also characterized by maximum concentrations throughout the troposphere. Although Caracas lies a long way from this source, it is possible that the region was also influenced by these intense boreal fires. At this time, we are not aware of any other anomalies which could have caused such an increase in CO throughout the troposphere. Average CO concentrations near the surface over Dallas reach 225 ppb during the winter and 175 ppb during the summer (Fig. 4). Compared to Frankfurt, there seems to be significantly more interannual variability throughout the troposphere; however, this may simply be due to the smaller number of flights available over Dallas. The year 2003 stands out as having particularly high CO concentrations in the lower troposphere throughout the year (except in SON), with concentrations around one standard deviation above the climatological average. As found at other locations, very high concentrations throughout the troposphere are present during SON 2002, reflecting the global impact of the boreal fires at this time.
Despite the Alaskan/Canadian wildfires that occurred during the summer, globally the year 2004 had comparatively low CO concentrations. As we have selected this year to evaluate the models' performance, this is an important point to keep in mind. Seasonal mean concentrations and standard deviations over all ten of the selected airports are given in the supplementary online material. These tables provide not only a quantitative reference for the global evaluation presented in the next section, but also an available reference for the wider community (modelling, satellite, regional air quality, etc.) for validation purposes.

Global assessment of modelled CO with MOZAIC data
In this section we compare estimates of monthly averaged CO from the stand-alone CTMs (MOZ, TM5 and MOC) and the coupled IFS-MOZART system to the observed CO measured near several airports during the year 2004. To assess the impact of the data assimilation, we compare the coupled IFS-MOZART simulation with full data assimilation (ASSIM) to the control run with no data assimilation (CTRL).
It should be noted that the MOC CTM was only run for the months of January-September. The models are compared with the profiles over the 10 airports representing the different regions of the world presented in the previous section. An important point to keep in mind while interpreting the results is that we are comparing point data over cities to model grid boxes, thus we might expect some underestimation by the models, particularly near the surface. The modified normalized mean biases (MNMB) for CO are calculated for different atmospheric layers for each month using daily averaged profiles from the various airports (Figs. 5-9). The different atmospheric layers are defined as follows: surface layer (below the 950 hPa level), boundary layer (950-850 hPa), free troposphere (850 hPa up to 1 km below the tropopause) and upper troposphere (1 km below the tropopause up to the tropopause, where the tropopause is defined as the highest level with a lapse rate lower than 2 K/km). In order to conserve space, figures are shown for only one of the airports from each of the five regions of interest; however, we analyze and discuss results from all ten airports.
In Europe (Frankfurt and Paris), the models generally underestimate CO except in the upper troposphere, where it tends to be overestimated (Fig. 5). Despite the fact that a more recent version of the CTM MOZART-3 was used for the CTRL simulation, it still has a significantly larger bias than the stand-alone CTMs. This result can be largely explained by the differences in the emission inventories used by the models as discussed in Sect. 2. Nevertheless, the ASSIM model greatly improves the simulation and reduces the biases by up to 50%, indicating that the data assimilation is compensating for deficiencies in the model (i.e. the emission inventories).
Although the ASSIM model biases are similar to those of the CTMs (0 to −50% in the SL, 0 to −30% in the BL and FT), if the same emission inventories had been used the ASSIM model would certainly have smaller biases. In general, the largest biases occur during the winter months when CO concentrations are at their maximum, while the smallest biases occur during the fall months when concentrations are at their minimum. Among the CTMs, MOC has the highest biases during the summer months, but performs much better during the first part of the year (January-April). Biases in the upper troposphere are mainly between ±25%, except for the CTRL model which has a larger negative bias (−25 to −50%).
As in Europe, CO over East Asia is mostly underestimated in the free troposphere, boundary and surface layers, but with more negative biases that exceed 100% in magnitude near the surface in some months over Beijing (Fig. 6). Again we can see the large impact that the data assimilation in the ASSIM model has on reducing the biases in the CTRL model. In some months, the biases are reduced by as much as 75%. In the free troposphere the CTM biases range from −30 to 10%. Although biases are slightly smaller over Tokyo (shown in the online supplementary material), they are still quite large, with magnitudes above 50% near the surface during much of the year. The ASSIM model has much smaller biases than the CTRL model and, despite the lower emissions, has less of a negative bias than the CTMs during several months in the surface and boundary layers. Perhaps this is because the assimilation is able to capture much of the pollution originating upwind in Northern China. MOC performs quite well in the lower troposphere over Tokyo compared to the other CTMs.
Regions at low latitudes are represented by Delhi in South Asia (Fig. 7), and Caracas in tropical South America (shown in the online supplementary material). Biases here are less consistent than those for other regions. While Caracas shows a general underestimation of CO by the models in the free troposphere and surface and boundary layers, Delhi does not exhibit any consistent model behavior. The biases are quite high throughout the troposphere over Delhi, reaching over 100% near the surface in some months. The improvements brought about by the data assimilation in the ASSIM model are clear in the boundary layer and free troposphere, where biases are reduced by up to 50% compared to the CTRL model; in the surface layer, however, there is no evident impact.
Over the US (Atlanta, Dallas shown in supplementary materials), biases indicate that the models generally underestimate CO in the free troposphere and surface and boundary layers, as found over most of the other cities (Fig. 8). The data assimilation in the ASSIM model significantly reduces biases (up to 50%) compared to the CTRL model, especially during the winter and spring months when emissions are highest. For the CTMs, biases are mostly between −25 and −50% near the surface and between 0 and −25% in the free troposphere, over both Atlanta and Dallas. In this region during the spring, the biases of the ASSIM model are slightly smaller than the CTMs despite the lower emissions. This could be partly due to the better performance by the coupled model during the transitional spring season in the mid-latitude regions.
The Middle East region is represented by Abu Zabi (Fig. 9) and Cairo (shown in online supplementary material). The models again underestimate CO in the surface layer, resulting in a strong negative bias. The CTRL biases in the boundary layer and free troposphere are between −40 and −60%, while the ASSIM biases are much smaller, only between ±10%. The CTM biases range from 0 to 30% in these layers.
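The layer definitions used for the bias statistics in this section can be written as a small classifier. This is a sketch under stated assumptions: the surface layer is taken to mean pressures above 950 hPa, values exactly on a boundary are assigned to the layer above, and the function name and sample levels are hypothetical.

```python
def classify_layer(pressure_hpa, altitude_km, tropopause_km):
    """Assign a profile level to one of the four evaluation layers.

    Returns None for levels above the tropopause, which are not evaluated.
    """
    if altitude_km > tropopause_km:
        return None
    if pressure_hpa > 950.0:
        return "surface layer"
    if pressure_hpa > 850.0:
        return "boundary layer"
    if altitude_km < tropopause_km - 1.0:
        return "free troposphere"
    return "upper troposphere"


# Hypothetical levels from a profile with the tropopause at 11 km.
levels = [(1000.0, 0.1), (900.0, 1.0), (500.0, 5.5), (250.0, 10.5), (200.0, 12.0)]
labels = [classify_layer(p, z, 11.0) for p, z in levels]
```

Grouping the daily averaged profile values by these labels before computing the MNMB gives one monthly score per layer, model and airport.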

Biomass burning signature in MOZAIC data
In this section we examine how well the stand-alone CTMs and the IFS coupled system with assimilation (ASSIM) can simulate the long-range transport of CO plumes originating from biomass burning during the 2004 Alaskan/Canadian wildfires at three downwind locations: Washington, Paris and Frankfurt. As in the previous section, we include the control simulation with no data assimilation (CTRL) in our analysis in order to provide some insight into the sensitivity to the assimilation process. We select four case studies, based on the availability of FLEXPART model simulations, in which CO plumes have been transported from Alaska. First we present MOZAIC vertical profiles for each case study along with the FLEXPART diagnosis which supports the claim that the CO plumes did actually originate from the Alaskan/Canadian wildfires. Then we examine how well the stand-alone CTMs and IFS coupled system with assimilation are able to reproduce the CO plumes observed in the MOZAIC data.
In addition, following results from other studies which suggest that emissions from boreal forest fires can be injected as high into the atmosphere as the upper troposphere/lower stratosphere (Jost et al., 2004; Damoah et al., 2006; Leung et al., 2007), we investigate to what extent the injection height in the IFS model affects the long-range transport of fire emissions. Furthermore, to test how sensitive the model is to the fire emissions inventory, an additional simulation is performed using the inventory compiled by Turquety et al. (2007) for North America during the year 2004, rather than the GFED inventory.
-CASE 1: [...] (Fig. 10, top left). A CO plume is present in the MOZAIC data between approximately 3 and 6 km with maximum concentrations reaching 250 ppb. The FLEXPART backward model run shows CO due to fire emissions present in concentrations of 40-160 ppb at this level. There is also another FLEXPART plume between 8 and 10 km with concentrations up to 160 ppb, although in the MOZAIC data this plume is higher and much weaker (110 ppb). The CO plume between 3 and 6 km is present in the FLEXPART simulations regardless of the injection height used, therefore this case is considered rather insensitive to injection height. The contribution of European emissions ranges from 0-30 ppb from 500 m to 2 km.
-CASE 2: measurements were taken on ascent from the Frankfurt airport on 22 July 2004 at 08:48 UTC takeoff time (Fig. 10, top right). A very deep CO layer exists between 4.5 and 7 km, as well as a thin layer around 8.5 km, with concentrations up to 225 ppb. The FLEXPART backward model run with an injection height of 10 km simulates a CO plume whose altitude range is in good agreement with MOZAIC observations, and in this case the FLEXPART results are very sensitive to the assumed injection height. The peak FLEXPART CO from biomass fires is around 160 ppbv, which when added to a tropospheric background of about 100 ppbv gives CO in excess of 220 ppbv, in relatively good agreement with the MOZAIC profile. European emissions significantly contribute to CO below 2 km with a magnitude of 60 ppbv.
-CASE 3: measurements were taken on ascent from the Frankfurt airport on 23 July 2004 at 08:54 UTC takeoff time (Fig. 10, bottom left). The CO plume here lies between 3.5 and 5 km over Frankfurt with concentrations up to 275 ppbv. The FLEXPART backward model run leads to CO concentrations of up to 120 ppb between 4 and 5.5 km, in the upper part of the CO plume seen in the MOZAIC data. According to the FLEXPART simulations, CO concentrations are sensitive to the injection height. The contribution of European emissions below 4 km ranges from 0 to 90 ppb.
-CASE 4: measurements were taken on descent into the Washington airport on 30 June 2004 at 17:00 UTC landing time (Fig. 10, bottom right). This case was also examined by Cammas et al. (2009) in a study involving the injection of biomass fire emissions into the lower stratosphere and its long-range transport. There are 3 distinct CO plumes present: the first between 2.5 and 4 km, the second between 4 and 6 km, and the third between 6 and 8 km. The CO concentrations within the plumes are around 150-190 ppb. The FLEXPART backward model run with an injection height of 10 km indicates that the CO mixing ratios observed in the Washington area originated from the Alaskan wildfires. The altitudes of the 2 layers of the North American biomass burning tracer transported by FLEXPART agree well with 2 of the 3 layers observed by MOZAIC. When a 10 km injection height is specified, maximum CO concentrations in the 7 km and 3.5 km altitude layers are about 115 and 70 ppb, respectively. When an injection height of 3 km, 1 km or 150 m is used, none of the CO plumes appears except for a very weak one between 3 and 4 km, suggesting that this case is highly sensitive to injection height. CO resulting from American anthropogenic emissions is only present from the surface up to 3-4 km, with concentrations from 50 ppb to 80 ppb near the surface.
Fig. 10. Vertical profiles of MOZAIC CO for four case studies. The lines with filled circles represent MOZAIC data. The solid line represents CO from the FLEXPART simulation using an injection height of 10 km, while the dotted, dashed and dash-dot lines correspond to FLEXPART simulations using injection heights from the surface up to 3 km, 1 km, and 150 m, respectively. The long dashed line represents CO produced from regional anthropogenic emissions.

Model comparison
The modelled and observed CO vertical profiles for each of the case studies are presented in Fig. 11. In case study 1 over Paris, the only CTM which is able to capture a small hint of the CO plume is MOZ. Although the concentrations in the MOZ plume are very weak and the layer is too thick and not well placed in comparison to the MOZAIC data, it is encouraging that the model is able to transport the CO emissions over such long distances. One factor to keep in mind is the coarser horizontal resolution of MOC and TM5 (3° × 2°) compared to MOZ and the coupled models (1.875° × 1.875°), which inhibits their ability to represent small-scale plumes. The ASSIM model does a slightly better job than the stand-alone MOZ model in terms of concentration, but the plume is still too weak and not well vertically distributed. The profile from the CTRL model with no assimilation also has a weak plume, indicating that the better transport brought about by using the meteorology from the ASSIM simulation and the higher horizontal resolution are playing a role. In case 2 over Frankfurt, only the ASSIM and CTRL models are able to capture the CO plume. As in case 1, the ASSIM model does a slightly better job than CTRL, but the plumes in both models are still very weak in comparison to the MOZAIC data. In case 3 over Frankfurt, only the ASSIM model shows signs of a CO plume, although it is even weaker than in cases 1 or 2.
In case 4 over Washington, the CO plume is more complex, with 3 distinct layers. Of the 3 CTMs, only MOZ shows signs of 2 weak plumes which to some extent match the 4 km and 7 km layer plumes in the MOZAIC data. The ASSIM model also shows weak signs of the multi-layer CO plume found in the data, whereas the CTRL model does not, indicating that it is the assimilation that is improving the long-range transport of CO.
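The resolution argument made for case study 1 can be illustrated with a minimal sketch. It is not taken from any of the models: the transect, the background value and the plume width are hypothetical numbers chosen only to show how averaging a narrow plume over larger grid cells lowers its peak.

```python
def block_average(values, factor):
    """Average consecutive groups of `factor` cells (simple coarse-graining)."""
    if len(values) % factor != 0:
        raise ValueError("length must be a multiple of factor")
    return [sum(values[i:i + factor]) / factor
            for i in range(0, len(values), factor)]

# Hypothetical 1-D transect on a 0.125-degree grid (30 degrees long):
# 80 ppb background plus a 100 ppb enhancement that is 1 degree (8 cells) wide.
fine = [80.0] * 240
for i in range(108, 116):
    fine[i] += 100.0

peak_fine = max(fine)                       # plume fully resolved
peak_1875 = max(block_average(fine, 15))    # 15 * 0.125 = 1.875-degree cells
peak_3 = max(block_average(fine, 24))       # 24 * 0.125 = 3-degree cells

print(peak_fine, round(peak_1875, 1), round(peak_3, 1))
```

In this idealised setting the coarser grid always yields the weaker peak, because the same plume mass is spread over a longer cell.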

Sensitivity to fire emissions
In order to evaluate how sensitive the IFS-MOZART model is to the fire emissions inventory we perform two tracer simulations, one using the 8-daily GFEDv2 inventory and another using the daily inventory compiled by Turquety et al. (2007). The Turquety fire emissions inventory was constructed using a bottom-up approach which takes into account the burning of the ground-layer organic matter stored in the soils, notably peat, which is quite important in boreal regions. They estimate that a total of 30 Tg CO was emitted from the Alaskan and Canadian wildfires during the summer of 2004, of which 37% (11 Tg) was due to peat burning. The emissions from both inventories are shown in Fig. 6 in the on-line supplementary material. The Turquety data clearly show a much higher CO emission rate than the GFED data, in large part because they take peat burning into account. Both this and the daily resolution of the data have a significant impact on the long-range transport of CO. Tracer profiles from the two simulations along with the corresponding MOZAIC CO profile for the four case studies discussed in the previous section, and for four additional examples, are presented in Fig. 12. The solid lines represent tracers injected at the surface as in the CTRL and ASSIM simulations. The dashed and dotted lines are discussed in the following section regarding injection height. Although we cannot directly compare the tracer plumes, which only represent CO due to biomass burning, to the observed profiles (black solid lines), the MOZAIC CO data serve as a proxy for the location and depth of the transported plumes.

Fig. 11. Vertical profiles of modelled (colored) and observed (filled circles and black lines) CO for each case study.
In the first two cases over Washington, neither the GFED nor the Turquety tracer emitted at the surface shows the presence of a significant plume. However, for the cases at Paris and Frankfurt, the Turquety tracer plume is clearly in better agreement with the observed CO plumes than the GFED plume. Despite Washington being closer to the sources of CO, the plumes seem to be better represented over Europe. One explanation may be that the fire emissions preceding the Paris and Frankfurt cases in July (see Fig. 6 in the online supplementary material) were much more intense than those preceding the Washington case, and thus there was a greater quantity of CO transported downwind. In addition, the meteorological conditions and the intensity of the fires during late June may have been more favorable to higher injection heights (Damoah et al., 2006), and as a consequence, the model was unable to reproduce the observed plume at Washington when emissions were injected at the surface. This is supported by the fact that the FLEXPART simulation for case 4 (30 June over Washington) was also found to be highly sensitive to the injection height (see Fig. 10, bottom right). Note that the FLEXPART simulations also used fire emissions at daily resolution.
These results support findings from other studies which highlight deficiencies in current fire emission inventories for modelling purposes (French et al., 2004;van der Werf et al., 2006;Turquety et al., 2007). However, despite the clear improvement in using the Turquety data, the plumes in most of the cases are still notably weaker at the downwind locations over Europe than the observed CO plumes, except perhaps on 22 July when the plumes are quite deep.
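The role of temporal resolution in the inventories can be sketched as follows. The daily emission values are hypothetical and the 8-day averaging is only a stand-in for how an 8-daily product distributes emissions in time; it is not the GFEDv2 or Turquety et al. (2007) processing chain.

```python
def to_8day_means(daily):
    """Replace each 8-day window by its mean, mimicking an 8-daily inventory."""
    out = []
    for start in range(0, len(daily), 8):
        window = daily[start:start + 8]
        out.extend([sum(window) / len(window)] * len(window))
    return out

# Hypothetical daily CO emissions (Tg per day) with a 2-day burst of fire activity.
daily = [0.1, 0.1, 0.2, 1.6, 0.9, 0.3, 0.1, 0.1,
         0.2, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
smoothed = to_8day_means(daily)

print("total daily   :", round(sum(daily), 3))
print("total 8-daily :", round(sum(smoothed), 3))
print("peak daily    :", max(daily))
print("peak 8-daily  :", round(max(smoothed), 3))
```

The total emitted mass is unchanged, but the burst that would feed a concentrated plume is flattened, which is one way a coarser temporal resolution could weaken the simulated long-range transport.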

Sensitivity to injection height
For the models used in this study, emissions were injected at relatively low heights in the atmosphere (see Sect. 2 for details). We performed simulations in which a tracer is injected over the wildfire regions of Alaska/Canada during the  The profiles of the tracers at the various injection heights are represented by the purple lines in Fig. 12. The impact of the injection height on the long-range transport of the tracer is variable. In some of the cases, the tracers injected at 6 or 8 km produce plumes with higher concentrations than the tracer injected at the surface. For example, for the two cases over Washington during late June, a plume is not observed when the tracer is injected at the surface (as noted in the previous section). However, plumes are evident when the tracer is injected at 6 and 8 km, although the location and depth of the plumes do not exactly match those observed. In the 26 June case, both the 6 and 8 km tracer plumes are located near the same altitude as the observed plume but are not as deep. In the 30 June case, the 6 km tracer plume matches the observed lower plume quite well in location and depth, but the middle and upper plumes are not represented. In contrast, the 8 km tracer produces a multi-layered plume, but it is considerably weaker than the observed one. For the cases over Paris and Frankfurt on 22-23 July, the injection height does not seem to have an effect on the long-range tracer transport. The fact that the tracer concentration maximizes near the altitude of the CO plume at the downwind site regardless of the injection height could indicate that cloud convection and biomass fire emissions occur at the same time in the same grid mesh of the model, and that convection is contributing to the vertical transport.
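As a minimal sketch of what injecting a tracer at a prescribed height involves, the snippet below finds the model layer that contains a given injection altitude. The layer interface heights are hypothetical and do not correspond to the vertical grid of IFS-MOZART or any other model in this study.

```python
from bisect import bisect_right

def layer_index(interfaces_m, height_m):
    """Index i of the layer [interfaces_m[i], interfaces_m[i+1]) containing height_m."""
    if not interfaces_m[0] <= height_m < interfaces_m[-1]:
        raise ValueError("height outside the model column")
    return bisect_right(interfaces_m, height_m) - 1

# Hypothetical layer interfaces in metres above the surface (not a real model grid).
interfaces = [0, 150, 500, 1000, 2000, 3000, 4500, 6000, 7500, 9000, 11000]

surface_layer = layer_index(interfaces, 0)      # tracer injected at the surface
layer_6km = layer_index(interfaces, 6000)       # tracer injected at 6 km
layer_8km = layer_index(interfaces, 8000)       # tracer injected at 8 km

print(surface_layer, layer_6km, layer_8km)
```

A real emission scheme would typically also spread the mass over several layers; this sketch places it in a single layer for simplicity.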
In order to get a broader picture of the transport of tracers in the model we examine spatial maps and vertical cross-sections of the different tracers on select days (Figs. 13 and 14). In comparing the spatial maps of tracer burden (integrated from the surface to approximately 100 hPa) on 30 June, we see that although the concentrations vary somewhat among the different tracers, the spatial pattern over North America is quite similar, indicating that the surface tracer is getting transported downwind (Fig. 13). However, the concentrations for the surface tracer are considerably weaker than those of the 6 and 8 km tracers over the northeast US and Europe. The longitudinal vertical cross-sections show that the largest differences in tracer concentrations occur near the source region of Alaska and western Canada. This is expected since the tracers are injected at various heights here; thus we see the largest concentration of the surface tracer in the lower troposphere and the largest concentration of the 8 km tracer in the mid- to upper troposphere. Over the eastern US and Canada (50° W-90° W) and Europe (0-25° E), the 6 km and 8 km tracers have quite similar concentrations, while the downwind transport of the surface tracer is considerably weaker. Similar maps of tracer burdens and longitudinal vertical cross-sections for 22 July are presented in Fig. 14. As in the other case, the overall spatial pattern is quite similar but the concentration varies among the tracers. The surface tracer concentrations are higher near the source region and lower further downwind than the 6 and 8 km tracers. The 8 km tracer concentrations are higher than the 6 km tracer concentrations along the US Eastern seaboard but surprisingly lower in the plume extending to the northwest of Europe. Nonetheless, in the tracer profiles shown in Fig. 12 the surface and 8 km tracer plumes appear to be deeper than the 6 km tracer plume.
On a closer inspection of the 2-D spatial maps we see that the 8 km plume indeed extends farther into France and Germany, despite being less intense than the 6 km plume. Likewise, the surface tracer plume also extends farther into Europe.
While we clearly see enhanced long-range transport of the tracers with higher injection heights compared to the surface injection height, it is difficult to conclude whether the 8 km tracer is more representative of the transport of CO emitted from the biomass burning than the 6 km tracer. One factor not addressed in this study is the sensitivity of the plumes to the model's resolution. At about 5 km, the vertical extent of the layers is about 500 m, so the injected mass is already diluted into a rather large volume, and this dilution continues during transport. A higher resolution would produce plumes which are more defined in their extent and of higher concentrations. In reality, there is considerable uncertainty associated with the injection height of emissions from boreal fires, as the heights vary with the intensity of the fire and the prevailing synoptic conditions. Given the temporal and spatial variability of the injection height, a parameterization that mimics pyro-convective processes would be more accurate.
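The dilution argument can be made concrete with a rough back-of-the-envelope sketch. Everything beyond the 500 m layer thickness and the two grid spacings quoted in this paper is an assumption made for illustration: the injected mass (1 Gg CO), the latitude (60° N), the air density (about 0.74 kg m^-3, a typical value near 5 km), and the reading of 3° × 2° as longitude × latitude.

```python
import math

EARTH_RADIUS_M = 6.371e6
M_CO = 0.02801     # molar mass of CO, kg/mol
M_AIR = 0.02897    # molar mass of dry air, kg/mol

def cell_area_m2(lat_center_deg, dlat_deg, dlon_deg):
    """Area of a latitude-longitude grid cell on a sphere."""
    lat_s = math.radians(lat_center_deg - dlat_deg / 2.0)
    lat_n = math.radians(lat_center_deg + dlat_deg / 2.0)
    return (EARTH_RADIUS_M ** 2 * math.radians(dlon_deg)
            * (math.sin(lat_n) - math.sin(lat_s)))

def co_enhancement_ppb(co_mass_kg, lat_center_deg, dlat_deg, dlon_deg,
                       thickness_m, air_density_kg_m3):
    """Mixing-ratio increase if co_mass_kg is mixed evenly into one grid box."""
    volume_m3 = cell_area_m2(lat_center_deg, dlat_deg, dlon_deg) * thickness_m
    moles_air = air_density_kg_m3 * volume_m3 / M_AIR
    moles_co = co_mass_kg / M_CO
    return 1.0e9 * moles_co / moles_air

# Assumed values: 1 Gg of CO, box centred at 60 N, 500 m thick, density 0.74 kg/m3.
fine = co_enhancement_ppb(1.0e6, 60.0, 1.875, 1.875, 500.0, 0.74)
coarse = co_enhancement_ppb(1.0e6, 60.0, 2.0, 3.0, 500.0, 0.74)

print(f"1.875 x 1.875 degree box: {fine:.0f} ppb")
print(f"3 x 2 degree box:         {coarse:.0f} ppb")
```

Under these assumptions the same mass produces a markedly smaller enhancement in the larger box, before any further dilution by transport.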

Conclusions
In the first part of this study we have presented profiles of CO using measurements made by MOZAIC aircraft on ascent and descent at various airports around the world. Based on data spanning 2002-2007, we present the first seasonal climatologies of CO from MOZAIC, and investigate the interannual variability. At most locations, the highest concentrations, as well as the largest interannual variability, occur during the winter season (DJF). The quasi-global impact of the intense boreal fires during the fall of 2002 documented in other studies (Yurganov et al., 2005; Kasischke et al., 2005) is well captured by the MOZAIC data. Furthermore, the MOZAIC data show that the impact extends throughout the entire troposphere, illustrating the usefulness of the MOZAIC data in assessing the global impact of boreal forest fires and other events which have large-scale influences.
In the second part of this study we have presented a general global validation of CO estimates produced by the GEMS GRG models (3 stand-alone CTMs and the IFS-MOZART coupled model) using the MOZAIC data for the year 2004. Comparing the coupled model run with data assimilation to the control run has allowed us to quantify the potential gain brought about by using an online model with 4D-VAR data assimilation. We find that the CTMs tend to underestimate CO in the free troposphere and boundary and surface layers, while they overestimate CO in the upper troposphere. In general, the models perform best over Europe and the US where biases range from 0 to −25% in the free troposphere and from 0 to −50% in the surface and boundary layers. Compared to the CTRL model, the ASSIM simulation has significantly lower biases (up to 50%) in the free troposphere, surface and boundary layers, indicating that data assimilation is a very effective tool for compensating for model deficiencies such as biases in emission inventories.
The fact that the models tend to underestimate CO the most when and where emissions are highest (during the winter in the daytime and in the surface and boundary layers) suggests that the emission inventories are probably too low. Although part of the models' underestimation, particularly near the surface, might be due to the fact that we have compared point measurements to model grid boxes, improvements in the estimation of the emissions are still necessary in order to properly evaluate the model performances. Nonetheless, the results presented here clearly indicate that data assimilation greatly reduces the model biases. A more comprehensive multi-year validation planned for the future will be useful in further assessing the improvements due to data assimilation.
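The percentage biases quoted above can be read with the following minimal sketch, which assumes the bias is defined as the difference between the mean modelled and mean observed CO, relative to the mean observed CO. That definition and all the sample values are assumptions made for illustration and are not taken from the validation itself.

```python
def relative_bias_percent(model, obs):
    """Mean model-minus-observation difference, in percent of the mean observation."""
    if len(model) != len(obs) or not obs:
        raise ValueError("model and obs must be non-empty and of equal length")
    mean_obs = sum(obs) / len(obs)
    mean_model = sum(model) / len(model)
    return 100.0 * (mean_model - mean_obs) / mean_obs

# Hypothetical CO mixing ratios (ppb) in one layer at three profile times.
obs = [120.0, 110.0, 100.0]
ctrl = [80.0, 75.0, 70.0]       # hypothetical run without assimilation
assim = [105.0, 100.0, 95.0]    # hypothetical run with assimilation

bias_ctrl = relative_bias_percent(ctrl, obs)
bias_assim = relative_bias_percent(assim, obs)

print(f"CTRL bias:  {bias_ctrl:.1f}%")
print(f"ASSIM bias: {bias_assim:.1f}%")
```

A negative value indicates underestimation, so a run whose bias is closer to zero is in better agreement with the observations under this definition.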
Finally, in the last part of this study we assessed how well the GEMS GRG models were able to simulate and transport CO originating from the Alaskan/Canadian wildfires during the summer of 2004. Several case studies were analysed to see if the models could transport the CO plumes downwind to the eastern US and Europe. Overall the ASSIM model performed better than the other models; however, the CO plumes were still much too weak in terms of concentrations and not always at the correct altitude in comparison to the observed profiles, showing that the method used for assimilation does not provide enough information about the vertical profiles and is therefore not sufficient to compensate for other model inadequacies. A sensitivity test using the Turquety inventory showed that the emissions play a significant role in the model's performance. The Turquety inventory has a daily resolution and takes into account peat burning, which results in higher emissions. This led to an overall better representation of the downwind CO plume in most of the cases when compared to simulations using the GFEDv2 inventory. These results are in agreement with other studies which have reported deficiencies in current fire emissions inventories (French et al., 2004; van der Werf et al., 2006; Turquety et al., 2007).
Another factor contributing to the model's poor representation of the CO plumes is the low injection height. While results from the sensitivity test indicate that in some cases using a higher injection height can improve the transport of the CO plumes downwind, in other cases the impact is not evident. One possible explanation for this inconsistency is the fact that in reality there is considerable variability associated with the injection height of emissions from boreal fires, depending on the intensity of the fire and the prevailing synoptic conditions. Therefore a parameterization which is based on these factors would be most accurate. However, we cannot rule out the possibility that there are other factors in the model, such as mass conservation in the advection scheme and numerical diffusion, which inhibit the long-range transport. The models' horizontal and vertical resolution also affects their ability to represent small-scale plumes. It is likely that increasing the model's resolution would improve the simulation of these plumes.