Assimilation of OMI NO2 retrievals into the limited-area chemistry-transport model DEHM (V2009.0) with a 3-D OI algorithm

. Data assimilation is the process of combining real-world observations with a modelled geophysical ﬁeld. The increasing abundance of satellite retrievals of atmospheric trace gases makes chemical data assimilation an increasingly viable method for deriving more accurate analysed ﬁelds and initial conditions for air quality forecasts. We implemented a three-dimensional optimal interpolation (OI) scheme to assimilate retrievals of NO 2 tropospheric columns from the Ozone Monitoring Instrument into the Danish Eulerian Hemispheric Model (DEHM, version V2009.0), a three-dimensional, regional-scale, ofﬂine chemistry-transport model. The background error covariance matrix, B , was estimated based on differences in the NO 2 concentration ﬁeld between paired simulations using different meteorological inputs. Background error correlations were modelled as non-separable, horizontally homogeneous and isotropic. Parameters were estimated for each month and for each hour to allow for seasonal and diurnal patterns in NO 2 concentrations. Three experiments were run to compare the effects of observation thinning and the choice of observation errors. Model performance was assessed by comparing the analysed ﬁelds to an independent set of observations: ground-based measurements from European air-quality monitoring stations. The analysed NO 2 and O 3 concentrations were more accurate than those from a reference simulation without assimilation, with increased temporal correlation for both species. Thinning of satellite data the use of constant


Introduction
Chemistry-transport models (CTMs) are widely used for forecasting air pollution, evaluating proposed emission reductions, studying chemical or physical processes, and assessing climate-scale effects and forcings related to atmospheric components (Jacobson, 2005). Modelled concentrations are often highly uncertain, and can be improved in several ways, such as better parameterisation of sub-grid scale processes, more accurate estimates of forcings at the lateral and lower boundary conditions, higher spatial resolution, higher order numerical methods, and more accurate initial conditions. Data assimilation (DA) involves estimating initial conditions by combining previous forecasts with recent observations (Kalnay, 2003).
In recent decades, satellite retrievals of atmospheric constituents have complemented observations from groundbased monitoring stations (Martin, 2008). Satellite retrievals provide concentration estimates for the total vertical column, for a partial column (e.g. the troposphere) or at a range of vertical levels, and they cover a far greater geographical range across the planet compared to ground-based measurements of surface concentrations. They therefore present great potential for use in "chemical DA" (i.e. DA for CTMs). For a comprehensive review of chemical DA, see Carmichael et al. (2008).
Optimal interpolation (OI) is the one of simplest DA algorithms currently applied to CTMs; it is based on a leastsquares formulation of the DA problem. While the assumptions underpinning OI are relatively crude, this algorithm is simple to implement and may be computationally cheaper than other more sophisticated DA methods, provided that neither the number of observations nor the number of model variables is too large. In meteorology, OI has long been surpassed by variational or Kalman filtering methods (Kalnay, 2003), however it is still in use with chemistry-transport models. For example, Mok et al. (2008) used OI with a Gaussian puff model for sulfur dioxide over Lisbon, Portugal. Adhikary et al. (2008) and Matsui et al. (2004) applied OI to assimilate satellite retrievals of aerosol optical depth when modelling aerosol concentrations over Southeast Asia and the eastern United States, respectively.
In the case of off-line CTMs, a small perturbation in the initial conditions will typically decay as the simulation proceeds, mainly due to forcing from sources and sinks such as chemistry and emissions Wu et al., 2008). Thus the quality of the initial conditions is less critical in air quality modelling than for numerical weather prediction (NWP) models, where perturbations tend to grow with time. In the case of short-lived chemical species, the duration of the initial perturbation may be quite brief (e.g. one day) and this limits the extent to which better initial conditions can improve forecasts. Chemical DA can, nonetheless, be used for historical re-analysis.
The conceptual and practical simplicity of OI makes the algorithm a reasonable starting point for use of DA in CTMs. Wu et al. (2008) compared four different DA methods (OI, two types of Kalman filter, and four-dimensional variational assimilation) applied to ozone forecasting. They demonstrated that OI, although a relatively simple method, was comparable in performance to the more advanced and computationally intensive variational and Kalman filter methods.
This study concerns the assimilation of satellite-derived estimates of tropospheric concentrations of nitrogen dioxide (NO 2 ). NO 2 plays an important role in atmospheric chemistry. In the stratosphere, it is involved in catalytic cycles that destroy ozone (O 3 ); in the troposphere NO 2 is a key O 3 precursor, especially in polluted urban environments (Seinfeld and Pandis, 2006, Chapters 5 and 6). NO and NO 2 interconvert and atmospheric lifetime of NO x varies from hours to days at the surface to a couple of weeks in the upper troposphere (Seinfeld and Pandis, 2006, p. 224). There is also substantial seasonal variation; Schaub et al. (2007) estimated the lifetime of NO x to be around 3 h during summer and 13 h during winter. Thus NO 2 is a relatively "local" pollutant. It is also of interest from a human health perspective; for example, exposure to NO 2 has been linked to reduced lung function, asthma and increased mortality (reviewed by Searl, 2004).
In this study we make use of tropospheric NO 2 column concentrations, derived from measurements by the Ozone Monitoring Instrument (OMI) aboard the NASA satellite AURA (see Sect. 2.1). These NO 2 retrievals have been studied in a number of contexts. They have been used to reestimate NO x emission rates (Zhao and Wang, 2009). They have been validated against ground-based measurements (Lamsal et al., 2008), spectrometers (Ionov et al., 2006), aircraft campaigns (Boersma et al., 2008). They are a resource for validation of air quality models (Huijnen et al., 2010) or comparison with retrievals from other satellites (Boersma et al., 2008). Furthermore, they have be used to study particular pollution or emission reduction events (Wang et al., 2007).
In this article, we assimilated retrieved tropospheric NO 2 columns from OMI into a limited-area CTM with a threedimensional OI scheme. We describe how the background error covariance matrix was parameterised based using the difference between paired simulations (the "NMC method" of Parrish and Derber, 1992). We ran a number of simulations relating to different treatment of the observation error statistics. The analysed concentration fields are compared to ground-based observations of NO 2 concentrations (see Sect. 3.2). In Sect. 4, we discuss these results in the broader context of forecasting and chemical DA.

OMI retrievals
Tropospheric NO 2 concentrations were retrieved from radiances measured by the Dutch-Finnish Ozone Monitoring Instrument (OMI) aboard the NASA satellite Aura. Aura's orbit is sun-synchronous, crossing the equator between 13:30 and 14:00 local time, passing over Europe shortly after. The retrieval scheme is described in Boersma et al. (2002Boersma et al. ( , 2007. Retrievals from pixels with a cloud radiance fraction in excess of 50 % were excluded or with a surface albedo greater than 0.3, as recommended by Boersma et al. (2011). We used the DOMINO version 2.0 (produced September 2010) of the level 2 retrieved tropospheric column NO 2 concentrations. The retrieval process yielded a measure of the estimate's uncertainty. The retrieved total column was found to be highly correlated with the associated uncertainty measure (R 2 = 0.83 for the year 2005 for retrievals within the DEHM Geosci. Model Dev., 6, 1-16, 2013 www.geosci-model-dev.net/6/1/2013/ domain). Around 20 % of total column estimates were negative, and these were also included despite the non-physical nature of the quantity; excluding them would lead to a positive bias over regions with low NO x concentrations, and also because they represent uncertainty intrinsic in the retrieval process. The OMI data represent a different temporal and spatial resolution compared to that of the CTM used in this study (see Sect. 2.2). The model domain covered Europe, and satellite readings in this area were available several times a day, usually few hours before and after 12:00 UTC. Multiple images are produced per satellite overpass. The spatial resolution of the CTM used in this study is approximately 50 km × 50 km across the model domain. Resolution of the OMI images varies across the camera's swath. Nadir pixels are 13 km × 24 km, while pixels furthest from nadir are 13 km × 128 km.
Finally, OMI data from the outermost (i.e. first and last) row and column of the CTM's 96 × 96 grid were not assimilated, thus providing a buffer for potential boundary effects (e.g. see Fig. 13).

Chemistry-transport model: DEHM
The Danish Eulerian Hemispheric Model (DEHM) is an offline, Eulerian, three-dimensional, long-range CTM (Christensen, 1997;Frohn et al., 2002;Brandt et al., 2012). The model simulates atmospheric transport and diffusion, chemical transformations, wet and dry deposition, and emissions from a range of biogenic and anthropogenic sources.
Version V2009.0 of DEHM was used in this study: this is the version developed for the 2009 annual report for the Danish Air Quality Monitoring Programme (NOVANA; Ellermann et al., 2010). In the present configuration of the model, the horizontal domain was spatially discretised with a 96 × 96 grid using a polar stereographic projection. In the vertical, the model extends from the surface to 100 hPa in 20 vertical layers using terrain-following σ -coordinates. This version of the model describes a total of 58 gaseous chemical species and 9 classes of particulate matter. The chemistry scheme is similar to that used in the European Monitoring and Evaluation Programme (EMEP) model (Simpson et al., 2003).
For all but one of the CTM simulations presented here, meteorological parameters (e.g. wind speed, temperature, pressure) were calculated by the Eta mesoscale NWP model (Janjic, 1994). In the remaining simulation, the MM5 (V3.7) NWP model (Grell et al., 1995) was run to provide meteorological inputs. In both cases, the meteorological initial conditions were taken from the NCEP FNL (Final) Operational Global Analysis dataset (available at http://dss.ucar.edu). The simulations with MM5 meteorology involved a hemispheric domain, with two-way nesting over Europe; the horizontal grid-spacing for the hemispheric and European domains were 150 km and 50 km, respectively, at 60 • N (Fig. 1a). The simulations with Eta meteorology did not include nesting, and the domain was rotated with respect to the European MM5 nest. The horizontal grid-spacing for the Eta-based simulations was also 50 km at 60 • N.
Over Europe anthropogenic emissions were based on the EMEP emission inventory (Vestreng and Klein, 2002), and elsewhere RCP2.6 emissions (Lamarque et al., 2010) were assumed. Natural emissions were based on the GEIA inventory (Benkovitz et al., 1996), including NO x emissions from lightning and soil. For wildfire emissions, the RETRO database was used (Schultz et al., 2007). Aircraft emissions are not accounted for in DEHM.
The extended continuity equation is split into several subequations, which are in turn solved sequentially (Lanser and Verwer, 1999). Horizontal advection is solved via "accurate space derivatives" (Dardub and Seinfeld, 1994), and by applying Forester and Bartnicki filters to resolve, respectively, spurious oscillations and negative mass (Forester, 1977;Bartnicki, 1989). Finite elements with linear shape functions are applied to vertical advection. Time integration for the advection is solved using a third-order Taylor series expansion. Diffusion is solved using a combination of the finite elements method and the θ-method (e.g. Morton and Mayers, 2005, Sect. 2.10). The chemistry solver involved a combination of a second-order, two-step, variable step-size backwards differentiation formula (Verwer et al., 1996) and the Euler backward iterative method (Hertel et al., 1993). Lateral boundary conditions are either free or fixed, depending on the wind direction at the boundaries -see Frohn et al. (2002) for further references and details.

Assimilation scheme
The algorithm presented here will be referred to as V1.0 of the three-dimensional OI scheme for DEHM (as distinct from the two-dimensional scheme described in Frydendall et al., 2009).
The assimilation was performed once an hour, if any retrievals were available in the model domain in the previous hour. No special treatment was used to account for the time discrepancy between the model and OMI data, due to short interval between assimilation cycles. Let x b be the NO 2 concentration field (expressed in units of 10 15 molecules per cm 2 per model level) and y be the retrieved OMI NO 2 tropospheric column concentrations (expressed in units of 10 15 molecules per cm 2 ). Let B and R be the error covariance matrices for x b and y, respectively. Let H be the linear transformation from the model space to the observation space. Then the analysed field x a is estimated (Kalnay, 2003, pp.150-156) by (1) Given available computational resources it was not possible to solve Eq. (1) as posed. Indeed, the background covariance B has dimension n x n y n z = 184 320, and would thus require 126 GB of memory at single precision. The following steps were taken to obtain analysis increments. First, B was implemented algorithmically (i.e. calculating only values required for a given operation). Second, H was represented as a sparse matrix, and multiplication by H exploited this. As described below, H involved interpolation to the nearest grid point, and this increased the sparsity the matrix (compared to bi-linear interpolation, for example). Third, R was treated as diagonal, and thus it was only necessary to add to the diagonal elements of HBH . Fourth, the system HBH + R was symmetric positive definite, thus the Cholesky decomposition could be used. Fifth, the analysis increment (x a − x b ) was calculated by multiplying each term by the corresponding vector on the right. In other words: The most computationally demanding step was Eq. (4), which would require (n x n y n z )(n z n obs ) operations if we only exploited the sparsity of r 2 . The product Br 2 was approximated by truncating covariances to zero if the corresponding correlation was below a threshold of 10 −4 , which greatly reduced the work required for this calculation. A final approximation was to process observations in batches of no more than 1000 at a time, and the analysis from assimilating one batch of observations was used as the background for the next batch (as in Houtekamer and Mitchell, 2001). This is an approximation since sequential assimilation of observations is only strictly valid when data with correlated observation errors are processed in the same batch (Houtekamer and Mitchell, 2001) -this point is discussed further in Sect. 4. Each step of the assimilation scheme was parallelised where possible, and a year-long simulation with assimilation and observation thinning (discussed later) took about 8 % longer than the corresponding reference run without assimilation.
The assimilation step involves an additive increment to the concentration field, and in a large fraction of assimilation cycles, negative values were present in the analysis. Such values are not only unphysical (representing negative concentrations), but may cause further problems in other components of the CTM. Consequently, after the assimilation step any negative values in a given layer were (somewhat arbitrarily) set to the lowest non-zero concentration present in the background field at this layer. This problem is distinct from the negative concentrations due to numerical artifacts of certain higher-order advection schemes, where mass-conservation is required (Bartnicki, 1989).
The background error covariance matrix is parameterised as where subscripts i and j refer to two grid points in the threedimensional model domain, and subscripts m and h are the indices for the hour of day and month of the year. All parameters were estimated separately for each month of the year and hour of the day, to allow for diurnal and seasonal Geosci. Model Dev., 6, 1-16, 2013 www.geosci-model-dev.net/6/1/2013/ patterns in the concentration field, and the associated error statistics. The background error standard deviation for grid point i is termed σ i;m,h . The subscript l i denotes the model level of grid point i, and r l,l ;m,h the correlation between pairs of grid points at levels (l, l ) with zero horizontal separation. The term ψ(d i,j , L l i ,l j ;m,h ) represents the horizontal correlation between two grid points, and is modelled based on the assumption of horizontally homogeneous and isotropic correlations. This models correlations with a second-order autoregressive function (Balgovind et al., 1983), where d i,j is the horizontal separation between points i and j , and L l,l ;m,h is the correlation length scale for pairs of grid points at levels (l, l ). The estimation of the parameters σ j ;m,h , r l i ,l j ;m,h and L l i ,l j ;m,h is presented in Sect. 2.4. For the decomposition of B into its variance and correlation components, see Kalnay (2003, p. 151).
The linear observation operator, H, calculates the modelequivalent of the OMI tropospheric retrievals. For simplicity only the DEHM column closest to the OMI pixel was used, thus each column of H is non-zero for at most n z = 20 values. The DOMINO product includes averaging kernels, which provide the estimated sensitivity of the retrieval to NO 2 at different layers of the atmosphere (Eskes and Boersma, 2003). The averaging kernels tend to increase with altitude; thus while NO 2 concentrations are highest at the surface (over populated regions, at least), the tropospheric retrievals are far more informative for the free troposphere. The ratio of the tropospheric and total-column air-mass factors was used to convert from total-column to tropospheric averaging kernels (Boersma et al., 2011). The DOMINO averaging kernels are valid at model levels of the chemistrytransport model TM4 (Meirink et al., 2006), which provides the a priori profiles for the retrieval, and thus it was necessary to calculate a weighted average, for each DEHM level, of the averaging kernels (weighted by the proportion of the corresponding TM4 layers overlapping the DEHM level).
The observation matrix, R, was treated as a diagonal (i.e. assuming that observation errors are uncorrelated). The choice of observation error variances is discussed in Sect. 3.1.

Parameter estimation
Parameters for the background error covariance matrix were estimated using differences between paired simulations, also known as the NMC method (Parrish and Derber, 1992). Two DEHM simulations were run, forced with meteorology from the Eta (Janjic, 1994) and MM5 (Grell et al., 1995), respectively; these two simulations will be respectively referred to as Ref. Eta and Ref. MM5. The simulation ran for 2005 and the three-dimensional modelled NO 2 concentration field was stored each hour. We will denote the concentration fields from the two simulations as C andC. Estimation of parameters was handled separately for each month and each hour of the day, due to the large annual and diurnal cycles in atmospheric concentrations and lifetimes of NO 2 . For each time point, t, differences between the paired simulations, were centred at zero and used to calculate the standard deviations (Eqs. 7-9). The · denotes the averaging operator, and T (t) the set of time points in for the same month and hour as time point t (i.e. the number of elements of T (t) is equal to the number of days in the month). The correlation between points i and j can then be calculated by Eq. 10.
The above is based on part of the error covariance estimation procedure described in Kahnert (2008).
Estimation of the remaining parameters required in Eq. (6) (the horizontal length scale and inter-level correlation) was based on a sample of 5000 randomly chosen pairs of grid points in the horizontal domain with separation distance range from 0 km to 2500 km. Pairs of grid points were chosen to ensure a roughly even distribution of horizontal separation distances. For each pair of points, sample correlations were calculated for each pair of vertical levels (l, l ). The interlevel correlation r l,l ;m,h was estimated as the mean correlation between the points with zero separation distance; we note that r l,l ;m,h = 1 by definition if l = l .
For each pair of vertical levels (l, l ), the length scale was estimated by fitting Eq. (6) to the sample of the estimated correlations. The fitting involved minimising the objective function where subscript k refers to the k-th the pair of grid points i k , j k (with points i k , j k positioned at levels l, l , respectively), and K = 5000 is the total number of sample correlations (illustrated in Fig. 3c). In many cases, the apparent correlation decay length scale was quite short (e.g. less than 100 km), due to the short life-time and consequent high-resolution spatial heterogeneity in NO 2 concentrations. In these cases, correlations corresponding to short length scales were of most interest, and hence pairs were down-weighted by distance (using the (d i k ,j k + 1) 2 normalisation factor). In most cases (i.e. different months, hours, pairs of levels), the correlation model fitted the available data well and the estimation procedure yielded consistent results, in the sense that realistic patterns were observed in the parameter values. These patterns are discussed below and illustrated in Figs. 4-6.
However in some cases, the parameter estimation gave apparently unrealistic results. In particular, it was difficult to estimate correlation decay length scales for pairs of levels if there was also very low vertical correlation between the layers. Also, the upper-most model layer (and to some extent, the layer below this) appeared to show artificially low standard deviations and negative correlations with other layers, which was due to strong damping from the upper boundary condition. Three steps were taken to correct for such artifacts. First, background standard deviations for the uppermost layer were replaced with a copy of those from the layer below. Second, if r l,l ;m,h < 0 and l < l then set r l,l ;m,h := r l−1,l ;m,h . Third, correlation length scales between two levels (which showed the highest number of unrealistic artifacts) were modelled as a weighted sum of the original estimated length scale and the sum of the length scales for the individual levels, weighted by the square of the inter-level correlation.
While the fitted correlation captures the overall behaviour of the sample correlations, there is evidence that the covariance model described by Eqs. (5) and (6) was not optimal. In particular, the sample correlations showed evidence of horizontal heterogeneity and anisotropy. These features can be seen in sample correlations between a particular grid point and all other grid points ( Fig. 3a and b). More sophisticated covariance modelling or use of a stochastic assimilation technique, such as the ensemble Kalman filter, would be required to properly account for this. The use of time-varying length scales and background covariances appears to be further justified by evidence of diurnal and seasonal variation in these parameters. Figure 4 illustrates variation in the parameter r l,l ;m,h . Vertical correlations are strongest in the lowest model layers (i.e. within the boundary layer). During the summer, the diurnal cycle is far more prominent, due to strong daytime vertical mixing from surface convection and to the effects of the photochemistry. In winter, the boundary layer is more often stably stratified, resulting in weaker correlations with the upper model levels and stronger vertical correlations between the lower model levels.
In general, estimated correlation length scales (L l,l ;m,h ) tend to increase with altitude and also with the separation between vertical layers (Fig. 5); while this latter point was to some extent enforced by the correction to the estimated length scales, the trend was also seen in the uncorrected estimates. This increase in length scale with height reflects the longer atmospheric lifetime of NO x and the increase in wind speed. During the summer months, horizontal length scales can be seen to decrease somewhat around layer 10, due to substantial diurnal variation in the boundary layer height, which is diagnosed differently by the Eta and MM5 models (discussed in Sect. 3.2). Longer length scales at the surface are seen during the winter months. A diurnal pattern in L 1,1;m,h is most pronounced during the summer months, as NO 2 can accumulate at night whereas it is dissipated during the daylight hours due to O 3 formation and an increasing boundary layer. The background error standard deviations (σ i;m,h ) are partly determined by the magnitude of the concentrations observed in each grid-cell (Fig. 6). This can be seen in higher estimated error variances in areas of high NO x emission density (e.g. the Benelux region). During the winter months, NO 2 concentrations exhibit substantial temporal variation (Fig. 7), and this is reflected in the background errors (as they were estimated by the standard deviation of a time series of values with one value per day). The vertical profile of σ i;m,h tends to increase over the first 6-10 model levels and then decay, following the trend seen in the modelled concentrations (in terms of mass NO 2 per model level) at each level; this is in spite of the general decay of modelled NO 2 mixing ratio with altitude (Fig. 12a). This is due to the fact that the mass of NO 2 per unit area per model level is the product of the mixing ratio (which decays with height), the density of air (which decays exponentially with altitude) and the thickness The estimated background errors exhibit artifacts (appearing as ripples in Fig. 6a, c and d) from the interpolation from the MM5 grid to the Eta grid (Fig. 1a). These are most clearly visible when shown on a logarithmic scale (very little spatial structure is visible when the same data are plotted with a linear scale). No filtering was applied to correct for this.

Experiments
Along with the two simulations used for the background covariance estimation, which are denoted Ref. Eta and Ref. MM5, three simulations were run using the assimilation procedure described above (summarised in Table 1). They were designed to explore the treatment of observation errors and the problem of correlated observation errors.
Satellite data can be classified as level-1 products (calibrated and collocated radiances) and level-2 products (derived geophysical quantities, such as temperature, humidity, concentration of trace gases). The process of retrieving level-2 information from level-1 data involves an inversemodelling framework similar to that used in data assimilation, requiring an a priori estimate of the state of the atmosphere. This results in correlations between observation errors; the assumption of independent errors in level-1 data fits much better than for level-2 products (Kunzi et al., 2011). The best treatment of correlated observation errors for data assimilation is still an open research question. The most common approaches used in operational meteorology are the assumption of uncorrelated errors in conjunction with "thinning" (discarding observations to reduce the data density), "superobbing" (assimilating the average of a group of nearby observations) or observation error variance inflation (Stewart et al., 2008). If observation errors are indeed correlated but are treated as independent in the assimilation scheme, then to achieve an appropriate balance between observation and background errors the observation error variances must be artificially inflated. A final and promising approach is to model the observation error correlations explicitly (Stewart et al., 2008), however this is beyond the scope of the present study.
The assimilation experiments addressed two different aspects of the problem of correlated errors. First, the OMI observations were at a much higher spatial resolution than the model's resolution: a DEHM grid-cell represents a roughly 50 km × 50 km region, whereas the OMI pixels were as small  Figure 13 illustrates the spatial distribution of the total column concentration, as well as the effect of the assimilation. Panels a and b present, respectively, the background and analysis (for simulation Exp. 2) projected into observation space, while panel c shows the OMI retrievals. The grid-averages of these data were calculated for the month of July 2005, averaging over all assimilation cycles in this period. Only observations used in the assimilation (i.e. passing the quality control and thinning) were included in the averages presented, thus no data are available over areas of persistent ice cover or in the outermost rows and columns of the model domain.
The OMI field shows a greater contrast between "clean" and "polluted" regions (n.b. retrievals in some remote regions, such as over oceans, may be negative even after averaging). The grid-averaged retrievals also show greater spatial heterogeneity than either the averaged Hx b or Hx a fields. As we would expect, the analysis appears as a merger of the background and the observations. However one can also see the results of the ripple-like interpolation artifacts from the background error correlations (see Sect. 2.4); these are most apparent in relatively unpolluted areas.
The observation errors used in the assimilation experiments were based on the reported errors in the DOMINO product. As stated in Sect. 2.1, the retrieved tropospheric column concentration was highly correlated with the associated error. This is may be an appropriate description from a measurement perspective, however it may be problematic in an assimilation context. If the higher retrieved values are assigned higher observation errors, then these data will have less influence on the analysis, resulting in negatively biased analysis increments. In simulations Exp. 1 and Exp. 2, the Geosci. Model Dev., 6, 1-16  observation errors used were those reported in the DOMINO product, whereas in Exp. 3, the observation error standard deviation was fixed at a constant value, chosen to be the median of the tropospheric column errors reported in the DOMINO product for the given month.
To assess the balance of the background and observation errors, we conducted an a posteriori validation of the assimilation using theχ 2 metric (Ménard and Chang, 2000), defined in Eq. (12), where p is the number of observations available at the assimilation step.
If the standard assumptions hold (unbiased background and observations, Gaussian errors) and if the background and observation error covariances adequately describe the residuals (i.e. the difference y −Hx b ), then the statisticχ 2 should follow a χ 2 1 distribution with 1 degree of freedom. It then follows from the properties of the χ 2 1 distribution that χ 2 ≈ 1.0. This provides an a posteriori check of whether the observation and background errors are appropriately specified.  Theχ 2 metric was calculated for simulations Exp. 1-3 for each month, and are shown alongside the theoretical distribution statistic (Fig. 2). The background errors were determined, as described above, by the difference between paired forecasts. Assuming that these have been appropriately specified, then theχ 2 results reflect on the observation errors. Exp. 1 and 2 yieldχ 2 values consistently less than 1.0, indicating that the magnitude of observed residuals was smaller than that prescribed by the error variances. The observation errors would need to be inflated substantially (as to compensate for correlated observation errors) to correct this. Thê χ 2 values for Exp. 1 were lower than for Exp. 2, suggesting that the impact of observation error correlations was less severe for the Exp. 2, where observations were thinned. For Exp. 3, which used fixed observation error variances as well as thinned observations, theχ 2 values show a much closer fit to the χ 2 1 distribution compared to Exp. 1 and 2. This is partly attributable to the fact that none of the prescribed observation errors for this experiment were relatively small (i.e. from the lower tail of the reported DOMINO errors).  Figs. 7,8,9). Results for the different simulations and regions are denoted by different point types and colors. Note that the ranges of the xand y-axes differ between the three panels. Thinning observations clearly results in a loss of information, which is not the aim of this exercise. In order to assess the impact of the loss of information, we now turn our attention to verification with ground-based monitoring data.

Verification with EMEP data
The DEHM describes a total of 67 atmospheric components, however we examine results for only two: NO 2 and O 3 . Our main interest is in how the accuracy of NO 2 estimates varies when assimilating this species. We also chose to examine O 3 , as it has a close chemical relationship with NO 2 and has important consequences for human health.
Observations of NO 2 and O 3 concentrations for the year 2005 were obtained for European air-quality monitoring stations in the EMEP network (Aas, 2008). Stations were selected for inclusion based on four quality-control criteria: at most 50 % data were missing, the station altitude was below 2500 m above sea level, the difference between the station altitude and the land-surface height represented by DEHM differed was below 200 m, and (for NO 2 only) the measurement technique was reported. Four areas within the Eta domain were defined: Scandinavia and the Baltic region (SB), central Europe (CE), the Iberian Peninsula (IP) and the British Isles (BI), as shown in Fig. 1b. The numbers of monitoring stations for each species in each region are reported in Table 2. Hourly O 3 measurements were available for all stations, and daily-average NO 2 data were provided for all stations, with 25 of those passing the quality-control criteria also reporting NO 2 at hourly frequency. The in situ NO 2 data were measured with either chemiluminescence or aqueous phase techniques (variants on the Griess-Saltzman reaction). The molybdenum oxide catalysts used in chemiluminescence monitors have been shown to reduce a range of nitrates, and thus the resulting measurements are subject to interference from such species (Dunlea et al., 2007). It was thus necessary to compare such measurements with NO 2 plus the sum of modelled nitrates (2N 2 O 5 + HNO 3 + HO 2 NO 2 + CH 3 COOONO 2 + NO − 3 ). For stations that used an aqueous-phase measurement, it was possible to compare directly to the modelled NO 2 . In both cases, modelled and measured values are expressed in units of µg N m −3 . The details of which measurement technique was used at the individual monitoring sites was obtained from AirBase station configuration files (Mol and de Leeuw, 2005).
The EMEP monitoring sites are located to measure regional background concentrations in Europe. Yet the concentrations measured at these sites are necessarily subject to local factors, and the sites differ in their proximity to emission sources. The model's 50 km × 50 km grid resolution will not capture such local variation, yet by averaging geographically, the influence of these sub-grid scale effects is diminished. Thus for each day, the average over all station (in the Eta domain as well for the four sub-regions) was calculated for the observed and calculated concentrations (Figs. 7-9). Given the availability of hourly O 3 measurements, it was also possible to assess how well the model captures diurnal variation as expressed in the daily maximum O 3 concentration (Fig. 9). Based on the observed and calculated time series, we computed the correlation coefficient (R 2 ) and bias (modelled minus observed). Figure 10 presents the verification statistics for the 5 simulations.
These time series show only very small differences between the five simulations. The largest differences being due to the choice of meteorological input, and Ref. MM5 was clearly distinct from the other simulations (which used meteorology from the Eta NWP model). For NO 2 (Fig. 7), the differences were most apparent during the winter months, which may be related to differences in the mixing heights diagnosed from the two meteorological datasets. The Eta  (a) (b) Fig. 12. Modelled concentration profiles, averaged over region CE (Fig. 1b), for July and December and for each of the five simulations.
35 Fig. 11. Verification statistics for forecasts of NO 2 (compared to Exp. 2, which was used to provide the initial conditions, and to Ref . Eta) in terms of correlation (a) and bias (b). The line colour indicates the simulation, the line type denotes the month in question.
model uses the η (step-mountain) vertical coordinate and was configured with 32 vertical levels, whereas both MM5 and DEHM use the same terrain-following σ vertical coordinates with 20 vertical levels. The σ vertical coordinate allows for a higher vertical resolution in the planetary boundary layer (PBL), yet the η vertical coordinate places a limit to the vertical resolution in the PBL and thus on the minimum height of the mixed-layer height that can be resolved. During the winter months, the height of the mixed layer can be quite low, leading to higher concentrations of directly emitted compounds (such as NO x ), particularly in cases of nocturnal inversion layers. This results in higher variability in the observed NO 2 concentrations, and the wintertime variability is resolved better by Ref. MM5 than the Eta-driven simulations. This is most apparent in region CE, which has a high density of NO x emissions and is subject to low inversion layers during the winter months. Low mixed layers heights are also common in the region SB, especially in northern areas, but there is much lower wintertime variability in NO 2 due to the lower NO x emission density.
The region BI showed high-amplitude temporal fluctuations in NO 2 concentrations, partly due to the small number of monitoring stations to average over. Differences between Ref. Eta and Exp. 1-3 can be seen for this region during the highly variable concentrations in November and December of the study period. In region BI, the model does not capture seasonal variation very well, tending to under-predict NO 2 concentrations during the winter months and over-predict during the summer months (leading to a relatively low overall bias).
Of all the areas considered, the DEHM shows least skill for the region IP, where NO 2 levels are underestimated throughout the year and the model does not appear to capture the observed temporal fluctuations; these comments apply both to the MM5-and Eta-driven simulations. The region IP is at a lower latitude to the other regions considered, thus receiving more solar radiation and a shorter atmospheric lifetime of NO x (due to photolysis). Hence local emission sources are conditions, and to Ref . Eta) in terms of correlation (panel a) and bias (panel b). The line color indicates the simulation, the line type denotes the month in question. (a) (b) Fig. 12. Modelled concentration profiles, averaged over region CE (Fig. 1b), for July and December and for each of the five simulations.
35 Fig. 12. Modelled concentration profiles, averaged over region CE (Fig. 1b), for July and December and for each of the five simulations.
relatively more important, and the coarseness of the DEHM grid-cells poorly describe local emission patterns. The verification statistics for NO 2 (Fig. 10a) show that for all regions considered, simulations Exp. 1-3 had a higher correlation with observations than Ref. Eta, although there was little change to the bias. The Ref. MM5 simulation had the highest correlation of the five simulations for regions BI, SB and the entire Eta domain, and the lowest correlation for the other regions. In region BI, the Ref. MM5 run was also far more biased than the other simulations.
For O 3 , the Ref. MM5 simulation again has the most distinct results (Fig. 8). Ozone has a much longer atmospheric lifetime than NO 2 and thus the nesting within the hemispheric domain (Fig. 1a) ensures that the European nest is affected by episodes of long-range transport (unlike for the Eta domain, which uses fixed inflow concentrations based on monthly averages from hemispheric simulations). The different meteorological inputs will also influence the photochemistry (e.g. due to cloud cover), affecting O 3 production during the summer months. Differences between Ref. Eta and Exp. 1-3 are most prominent during the summer months (e.g. during peak episodes in regions CE and SB in June and July) due to the higher photochemical production and to the changes to O 3 precursors by the assimilation procedure. In the region IP, the model tends to underestimate O 3 concentrations during the summer months. This may be partly due to the general underestimation of NO 2 in this region (Fig. 7), and possibly related to poor specification of emissions of biogenic volatile organic compounds .
Verification statistics for daily average O 3 (top-right panel of Fig. 10b) show little change in bias in the simulations using assimilation (with respect to Ref. Eta), but an increase in correlation for these simulations in all regions except IP. The performance of Ref. MM5 stands out, and has the highest correlation of the five simulations for region SB, BI and CE, and the lowest for regions IP and for the Eta domain.
Many of the same comments apply to the daily maximum O 3 concentration (Fig. 9)   information provided by the assimilation of tropospheric NO 2 retrievals should be most prominent around this time of day. However, there were only small differences in the verification scores for simulations using Eta meteorology. Exp. 1-3 were slightly more biased than Ref. Eta, and there was little difference in the correlation (apart from region SB, where Exp. 1-3 had slightly higher correlation). The Ref. MM5 simulation showed much lower correlations with observed daily maximum O 3 concentrations in regions IP, SB and the Eta domain. Finally, we note that on average the model tends to under-predict daily maximum O 3 concentrations in all regions, particularly from March to June.

Forecasts
The model results in Sect. 3.2 are based on daily or hourly averages, and the assimilation procedure was performed once an hour for each hour when observations were available. Thus these are effectively a mixture of background (input for an assimilation), analysis (result of an assimilation) and forecast (the time-integration using an analysis as input), and can be thought of as reanalyses. It is also of interest to assess the impact of the initial conditions on CTM forecasts.
From Exp. 2, instantaneous concentration fields were stored for every hour for July and December, and these were used as initial conditions for 48-h forecasts. It was necessary to start forecasts at each hour of the 24-h cycle in order to smooth out diurnal effects (which dominate the results otherwise). Hourly NO 2 observations were available at 25 EMEP stations. For these stations, for each hour of the 48-h forecast outlook we calculated the bias and correlation between modelled and observed values. The bias and correlation were calculated separately for each station and these were summarised as a weighted average over the per-station values, with weights based on the number of non-missing observations for the station in the period considered. For comparison, the time-matched verification scores for simulations Ref. Eta and Exp. 2 are shown together with the results of the forecasts. The first 24 h are shown in Fig. 11. The forecasts were run with the same meteorological inputs as for the other Etadriven simulations.
The bias in NO 2 shows a rapid adjustment in July (over 3-4 h), and somewhat slower in December (over 10-15 h), reflecting the seasonal variation in NO x lifetimes. While in July the forecasts adjust towards the Ref. Eta as expected, the December forecasts tend away from the corresponding results from the free run. The reasons for this are unclear, and it may suggest that either the free run (Ref. Eta) is not the only equilibrium solution when the assimilation is turned off, or that other factors in the initial conditions (influenced indirectly by the changes from the assimilation of NO 2 ) have a longer relaxation time. In July, the correlation was relatively constant over the forecast outlook considered, whereas in December the improvement in correlation appears to decay after around 15 h.

Modelled profiles
Due to vertical variation in both H (due to the averaging kernels) and B, the increment is not expected to be evenly distributed throughout the vertical profile. When averaged temporally over the months of July and December and spatially over the region CE (Fig. 1b), the differences between the modelled NO 2 profiles for the different experiments are very small and only discernable in the free troposphere and lower stratosphere (Fig. 12a). In December, the NO 2 mixing ratios in Exp. 1-3 were slightly higher than for Ref. Eta, whereas in July they were marginally lower. For Ref. MM5, NO 2 mixing ratios were higher at the surface in December than was seen for the simulations with Eta meteorology.
The differences in modelled O 3 concentrations between Ref. Eta and Exp. 1-3 were only visible for the simulations in July, where the assimilation tended to result in slightly lower O 3 concentrations in the boundary layer and free troposphere. The differences between the Ref. MM5 and the Eta-driven simulations are more pronounced for O 3 than for NO 2 , due to combination of the longer atmospheric lifetime of O 3 and the dynamic boundary conditions provided by the hemispheric nest.

Discussion
The assimilation of the OMI retrieved tropospheric NO 2 column concentrations led to increased temporal correlation with surface measurements of NO 2 , and somewhat surprisingly with surface O 3 . However the differences between the free run (Ref. Eta) and the simulations with assimilation (Exp. 1-3) were rather small and were difficult to see in the time series of the modelled concentrations. The short forecasts showed only minor differences from the other simulations once diurnal effects were averaged out (by starting forecasts at each hour of the day), and did not necessarily relax toward the free run as expected.
While the impact of assimilation of OMI retrievals may be positive, at the surface at least, it appears rather small; there may be several factors at play. First, due to the short atmospheric lifetime of NO 2 , any improvement due to extra information from an observation has a relatively short duration and is rather localised. Second, in a given day only a fraction of the model domain benefits from observations. This is partly due to the nature of the satellite overpasses, and partly due to the quality control procedure, which excluded a large fraction of the retrievals due to high cloud fraction or surface albedo.
In spite of the preceding considerations, Wang et al. (2011) report a substantial improvement in surface NO 2 concentrations due to assimilation of OMI NO 2 column concentrations. This suggests that the relatively small improvement found here was due to a non-optimal usage of the data rather than, for example, high errors in the retrievals themselves. Yet Wang et al. (2011) estimated parameters for their B in order to optimise performance at surface monitoring sites. This is likely to have contributed to the greater performance gains than may be achieved without such tuning; the NMC method used here to parameterise B did not allow for "tuning" (by its nature, and also due to the large number of parameters involved).
The assimilation experiments were all run with meteorology from the Eta NWP model, however we also examined results from the free run with meteorology from MM5, which was used in the construction of the background covariances. It was seen that for three of the regions considered, Ref. MM5 showed much higher correlation with surface NO 2 measurements than any of the Eta-driven simulations. This highlights the importance of considering the influence of model inputs in chemical data assimilation research.
The three-dimensional OI scheme presented here is a development from an earlier two-dimensional scheme (Frydendall et al., 2009). The three-dimensional version makes appropriate use of the averaging kernels, thus providing extra vertical structure in the analysis increment. Despite the extra dimension involved, a number of algorithmic speed-ups and approximations were implemented and the OI scheme to limit the extra computational burden.
Background covariances were estimated from the difference between paired CTM simulations, run with different meteorology and a different domain structure. The background covariances were calibrated for each month and for each hour, to allow for seasonal and diurnal patterns in the modelled NO 2 field. The resulting parameter estimates illustrated not only these temporal patterns but also vertical structure in the NO 2 error correlations.
This construction of the background covariances incorporates uncertainties due to meteorological inputs only. While the meteorology is an important source of uncertainty in the modelled concentration field (Vautard et al., 2012), other key factors (such as emission rates, the choice of physical and chemical parameterisations, grid resolution) should also be accounted for (Mallet and Sportisse, 2006).
Observation errors for the OMI retrievals can be assumed to be correlated due to the use of level-2 satellite data. The effects were investigated by thinning observations to a spatial resolution comparable to the model grid, as well as assuming fixed error variances for all observations (during a given month). The combination of these yielded a better balance between the magnitude of the observed residuals and the specified error variances. The loss of information resulting from the observation thinning did not appear to systematically degrade the quality of the modelled NO 2 or O 3 surface concentrations (based on the verification statistics).
The assumption of independent observation errors was not only applied in specification of the observation error matrix, R, but also in the use of sequential processing of observations. Such batch processing is only strictly valid if data with correlated observation errors are assimilated in the same batch (Houtekamer and Mitchell, 2001). This could have been avoided if a shorter assimilation cycle were used (as in Wang et al., 2011), thus reducing the number of observations per cycle. However such a treatment ignores correlations in observation errors between successive assimilation cycles (Dee, 2005). Appropriate treatment of observation error correlations requires further investigation.
The year-long span of the simulations allowed us to consider seasonal variation. For example, the assimilation showed the most pronounced effects on modelled NO 2 concentrations in the winter, and had the biggest impact on O 3 during the summer months. Barbu (2010, pp. 65-78) assimilated the OMI tropospheric NO 2 retrievals into the LOTOS-EUROS model, using a biasaware ensemble Kalman filter. Ensemble spread was generated by perturbing the emissions of NO x and VOC emissions. While the bias-aware assimilation led to much more accurate estimates of surface NO 2 , the simulation period covered only one month and verification statistics for O 3 were not reported. This significance of model bias is emphasised by Dee and da Silva (1998), who showed that a biased background field will result in a biased increment. Biases in the background or observations were not accounted for in the work presented here, and could be addressed as an extension of this system, such as with the sequential bias correction scheme given in Eqs. (17)-(18) of Dee (2005).
The OI assimilation scheme is both conceptually simple and relatively straightforward to implement. If both the length of control vector (i.e. those modelled variables adjusted by the assimilation) and the number of observations are sufficiently small, then it may be less computationally demanding than other assimilation schemes. However the framework of OI is, in several ways, rather limited (Kalnay, 2003). First, it does not scale well as the number of observations or the size of state increases. Second, it requires the preparation of a climatological B matrix, which must be recalibrated if extra variables are added to the state vector. Third, unlike variational techniques it cannot be used for parameter estimation or quality control of observations. Far more flexible approaches to chemical DA are offered by variational or Kalman filter-based schemes (see Lahoz et al., 2007, and references therein).
For historical re-analysis and for single-component assimilation, then OI may be a sufficiently competent scheme, beyond which any gains are marginal and require a significantly more complicated and computationally demanding DA procedure (Wu et al., 2008). In a forecasting context, however, the strong forcings from chemistry and emissions limit the potential benefit from more accurate initial conditions, although the rate depends on the atmospheric lifetime of the species in question and whether they are emitted directly or formed chemically. The framework of DA can be used to re-estimate highly uncertain model parameters (e.g. emission rates, as were examined by Elbern et al., 2007), and this appears to be a promising means of addressing forecast accuracy for directly emitted, short-lived atmospheric components such as NO x ; however, this is not possible with OI.
The DA scheme described here could be developed in several ways. For example, the observations were originally provided as pixels, thereby corresponding to a two-dimensional region. However for simplicity they were summarised at the pixel's midpoint. A more accurate version of the observation operator would account for the fact that pixels may span multiple grid-cells. Furthermore, the observation operator was constructed by considering only the vertical column above each OMI pixel, thus not accounting for horizontal variation in NO 2 concentrations resulting from non-nadir scan angles.

Conclusions
In this study, we have presented an algorithm used to assimilate remotely-sensed tropospheric NO 2 column concentrations. Results were compared to observations from regional background monitoring stations. The analysed concentration fields showed increased temporal correlation for both NO 2 and O 3 across the different sub-regions considered.
Non-separable, horizontally homogeneous and isotropic background covariances were estimated from the difference between paired simulations, effectively attributing all model uncertainties as arising from the meteorological inputs. Despite the incomplete survey of sources of model uncertainty, the parameters of the three-dimensional background covariances reflect temporal patterns and vertical structures that fit with the modelled variation in NO 2 concentrations.
Despite its limitations, the optimal interpolation algorithm is conceptually simple and relatively straightforward to implement, and proved it to be useful in this context. We have demonstrated that the effects of chemical DA are not limited to the assimilated species, and can be seen in chemically related compounds. Finally, the effectiveness of chemical DA in a forecasting context must be considered in conjunction with the atmospheric lifetime of the species in question.