Background error covariance with balance constraints for aerosol species and applications in data assimilation

1 Introduction
Data assimilation in meteorology-chemistry models has received increasing attention in recent years as a basic methodology for improving aerosol analysis and forecasting. In a data assimilation system, the background error covariance (BEC) plays a crucial role in the success of the assimilation process. It determines the analysis increments of each variable and the balance relationships between different variables (Derber and Bouttier, 1999; Chen et al., 2013). However, accurate estimation of the BEC is difficult because the true atmospheric state is unknown, and handling it is computationally demanding because of its large dimension (typically $10^6 \times 10^6$). Different methods have been developed to estimate and simplify the BEC, such as the analysis of innovations, the NMC method and ensemble (Monte Carlo) methods. A common choice is the NMC method, which assumes that the forecast errors can be approximated by differences between pairs of forecasts that are valid at the same time (Parrish and Derber, 1992).
The NMC method is extensively used in operational atmospheric and meteorology-chemistry data assimilation systems. Pagowski et al. (2010) estimated the BEC of PM2.5 from the differences between 24 and 48 h forecasts to develop the Grid-point Statistical Interpolation (GSI) three-dimensional variational assimilation system. Benedetti et al. (2007) calculated the BEC of the sum of the mixing ratios of all aerosol species to develop the operational forecast and analysis system of ECMWF.
BECs with multiple species and size bins of aerosols have also been calculated and employed in data assimilation. Liu et al. (2011) calculated the BEC with 14 aerosol species from the Goddard Chemistry Aerosol Radiation and Transport scheme of the Weather Research and Forecasting/Chemistry (WRF/Chem) model and applied it to the GSI system. Schwartz et al. (2012) increased the number of species to 15 based on the study of Liu et al. (2011). Li et al. (2013) estimated the BEC for five species derived from the MOSAIC scheme. These studies showed that data assimilation with a practical BEC can spread the observation information to nearby model grid points and improve analysis fields and aerosol forecasts.
One role of the BEC in data assimilation is to spread information between different variables to produce balanced analysis fields; balance constraints are employed to convert the original variables into new independent variables. Balance constraints are crucial and are widely employed in atmospheric and oceanic data assimilation, for example geostrophic balance or temperature-salinity balance (Bannister, 2008a, b). To incorporate balance constraints, the model variables are usually split into balanced and unbalanced parts. The unbalanced parts, used as control variables, are independent in data assimilation, and the balanced parts are determined by the balance constraints (Derber and Bouttier, 1999). Instead of using an empirical function as a balance constraint, constraint relationships can also be derived using regression techniques (Ricci and Weaver, 2005). Even where no distinct empirical relation between variables exists (such as between temperature and humidity), a regression equation can still be estimated and used as a balance constraint (Chen et al., 2013).
In current aerosol data assimilation with multiple variables, balance constraints are not incorporated in the BEC. The state variables are assumed to be independent, without cross-correlation. However, aerosol species are frequently highly correlated because of their common emission sources and diffusion processes. For example, the correlations in terms of $R^2$ between elemental carbon and black carbon exceed 0.6 at many urban and suburban locations across Asia and the South Pacific (Salako et al., 2012), and the correlations between different size bins, such as PM10, PM2.5 and PM1, are also significant (Hoek et al., 2002; Gomišček et al., 2004). Thus, cross-correlations between different variables are necessary to produce balanced analysis fields. Cross-correlations spread the observation information from one variable to other variables, which enhances the impact of observations of an individual species or size bin.
Recently, several researchers have suggested that a BEC with balanced cross-correlations should be introduced into aerosol data assimilation (Kahnert, 2008; Liu et al., 2011; Li et al., 2013; Saide et al., 2013). Kahnert (2008) presented cross-correlations of the 17 aerosol variables from the Multiple-scale Atmospheric Transport and Chemistry (MATCH) model. He found that the statistical cross-correlations between aerosol components are primarily influenced by the interrelations between emissions and, to a much lesser degree, by interrelations due to chemical reactions. However, he did not detail the effects of the BEC with cross-correlations on data assimilation experiments. Saide et al. (2012, 2013) considered cross-correlations between aerosol size bins in GSI for assimilating AOD data. The cross-correlations between the two nearest size bins of each species were considered using recursive filters, similar to the way information is spread horizontally over distance. For variables that are not adjacent, applying this method to account for their cross-correlations is challenging.
In this paper, we explore incorporating cross-correlations in the BEC by means of balance constraints. The balance constraints are established using statistical regression. We apply the BEC to a data assimilation and forecasting system for the Model for Simulating Aerosol Interactions and Chemistry (MOSAIC) scheme in WRF/Chem. The MOSAIC scheme includes a large number of variables, with eight species and eight or four size bins. A three-dimensional variational data assimilation (3DVAR) method for the MOSAIC scheme has been developed by Li et al. (2013). For comparison, we employ the same model configuration as Li et al. (2013) to perform data assimilation experiments, with a focus on the impact of the cross-correlations of the BEC on analyses and forecasts.
The paper is organized as follows: Sect. 2 describes the data assimilation system and the formulation of the BEC. Section 3 describes the WRF/Chem configuration and estimates the correlations among the emissions. The statistical characteristics of the BEC, including the regression coefficients of the cross-correlations, are discussed in Sect. 4. Using the BEC, experiments assimilating surface PM2.5 observations and aircraft observations are discussed in Sect. 5. Shortcomings, conclusions and future perspectives are presented in Sect. 6.

Data assimilation system and BEC
In this section, we present the formulation of the BEC with cross-correlation using a regression technique based on the data assimilation system developed by Li et al. (2013).
Then, the cost function with the new BEC is derived and the factorization used to compute the BEC is described.

The control variables of the data assimilation are obtained from the MOSAIC (4-bin) aerosol scheme in the WRF/Chem model (Zaveri et al., 2008). The MOSAIC scheme includes eight aerosol species: elemental carbon or black carbon (EC/BC), organic carbon (OC), nitrate (NO3), sulfate (SO4), chloride (Cl), sodium (Na), ammonium (NH4), and other inorganic mass (OIN). Each species is separated into four size bins: 0.039-0.1, 0.1-1.0, 1.0-2.5 and 2.5-10 µm. The scheme therefore involves 32 aerosol variables (eight species in four size bins). For reasons of computational efficiency, these variables cannot all be introduced directly as control variables in an assimilation system, so the number of variables must be reduced prior to assimilation. Li et al. (2013) lumped these variables into five species as control variables in the data assimilation system. We employ the variables of Li et al. (2013) to perform the data assimilation experiments. The five species are EC, OC, NO3, SO4 and OTR, where OTR is the sum of Cl, Na, NH4 and OIN. Note that because the data assimilation system aims to assimilate observations of PM2.5, only the first three of the four size bins are lumped into one control variable for each species.
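The lumping described above can be sketched as follows. This is a minimal illustration, not the code of the original system: the array layout (species, bin, grid), the species ordering and the function name `lump_control_variables` are assumptions made for the example.

```python
import numpy as np

# Species order assumed for this sketch (hypothetical array layout).
SPECIES = ["EC", "OC", "NO3", "SO4", "Cl", "Na", "NH4", "OIN"]

def lump_control_variables(conc):
    """Lump MOSAIC 4-bin concentrations into five PM2.5 control variables.

    conc: array of shape (8 species, 4 bins, ...grid dims).
    Returns a dict with EC, OC, NO3, SO4 and OTR, each summed over the
    first three size bins (diameters below 2.5 um).
    """
    conc = np.asarray(conc, dtype=float)
    pm25 = conc[:, :3].sum(axis=1)            # drop the 2.5-10 um bin
    by_name = dict(zip(SPECIES, pm25))
    lumped = {name: by_name[name] for name in ("EC", "OC", "NO3", "SO4")}
    # OTR is the sum of chloride, sodium, ammonium and other inorganic mass.
    lumped["OTR"] = sum(by_name[name] for name in ("Cl", "Na", "NH4", "OIN"))
    return lumped
```

With this layout, the 32 model variables reduce to five fields of the grid shape, which is the size of control vector the assimilation then works with.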
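For a three-dimensional variational data assimilation system, the traditional cost function ($J$), which measures the distance of the state vector to the background and to the observations, is written as follows:

$$J(\mathbf{x}) = \frac{1}{2}(\mathbf{x}-\mathbf{x}^b)^{\mathrm{T}}\mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}^b) + \frac{1}{2}(\mathbf{H}\mathbf{x}-\mathbf{y})^{\mathrm{T}}\mathbf{R}^{-1}(\mathbf{H}\mathbf{x}-\mathbf{y}). \tag{1}$$

Here, $\mathbf{x}$ is the vector of the state variables, including EC, OC, NO3, SO4 and OTR; $\mathbf{x}^b$ is the background vector of these five species, which is generated by the MOSAIC scheme; $\mathbf{y}$ is the observation vector; $\mathbf{H}$ is the observation operator that maps the model space to the observation space; $\mathbf{R}$ is the observation error covariance associated with $\mathbf{y}$; and $\mathbf{B}$ is the background error covariance associated with $\mathbf{x}^b$. Equation (1) is usually written in the incremental form

$$J(\delta\mathbf{x}) = \frac{1}{2}\delta\mathbf{x}^{\mathrm{T}}\mathbf{B}^{-1}\delta\mathbf{x} + \frac{1}{2}(\mathbf{H}\delta\mathbf{x}-\mathbf{d})^{\mathrm{T}}\mathbf{R}^{-1}(\mathbf{H}\delta\mathbf{x}-\mathbf{d}), \tag{2}$$

where $\delta\mathbf{x} = \mathbf{x} - \mathbf{x}^b$ is the incremental state variable and $\mathbf{d} = \mathbf{y} - \mathbf{H}\mathbf{x}^b$ is the observation innovation vector. The minimization solution is the analysis increment $\delta\mathbf{x}$, and the final analysis is $\mathbf{x}^a = \mathbf{x}^b + \delta\mathbf{x}$. This analysis is statistically optimal as a minimum error variance estimate (e.g., Jazwinski, 1970; Cohn, 1997).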
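To make the role of B in the incremental cost function concrete, the sketch below evaluates the cost for a tiny dense problem and computes its minimizer with the standard closed form for a linear observation operator, $\delta\mathbf{x} = \mathbf{B}\mathbf{H}^{\mathrm{T}}(\mathbf{H}\mathbf{B}\mathbf{H}^{\mathrm{T}}+\mathbf{R})^{-1}\mathbf{d}$. All matrices and numbers are illustrative assumptions; an operational system minimizes the cost iteratively and never forms B explicitly.

```python
import numpy as np

def incremental_cost(dx, B, H, R, d):
    """Incremental 3DVAR cost: 0.5 dx^T B^-1 dx + 0.5 (H dx - d)^T R^-1 (H dx - d)."""
    resid = H @ dx - d
    return 0.5 * dx @ np.linalg.solve(B, dx) + 0.5 * resid @ np.linalg.solve(R, resid)

def analysis_increment(B, H, R, d):
    """Closed-form minimizer for a linear H: dx = B H^T (H B H^T + R)^-1 d."""
    return B @ H.T @ np.linalg.solve(H @ B @ H.T + R, d)

# Toy setup (illustrative numbers): 3 state values, 1 observation of the first one.
B = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.5],
              [0.0, 0.5, 1.0]])
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[0.5]])
d = np.array([1.5])            # innovation y - H x_b

dx = analysis_increment(B, H, R, d)
```

In this toy case the second state element receives an increment although it is not observed, purely through the off-diagonal element of B. This is the mechanism by which cross-correlations would carry information from one aerosol species to another.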
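In Eqs. (1) and (2), $\mathbf{B}$ is a symmetric matrix of size $N \times N$, where $N$ is the size of the vector $\mathbf{x}^b$. For a high-resolution model, the number of model grid points is on the order of $10^6$. Therefore, the number of elements in $\mathbf{B}$ is approximately $10^{12}$. With this size, $\mathbf{B}$ cannot be explicitly manipulated. To simplify $\mathbf{B}$, we employ the following factorization:

$$\mathbf{B} = \mathbf{D}\mathbf{C}\mathbf{D}^{\mathrm{T}}, \tag{3}$$

where $\mathbf{D}$ and $\mathbf{C}$ are the standard deviation matrix and the correlation matrix, respectively. After the factorization, $\mathbf{D}$ and $\mathbf{C}$ can be described and prescribed separately. $\mathbf{D}$ is a diagonal matrix whose elements are the standard deviations of all state variables on the three-dimensional grid; it is commonly simplified to depend only on the vertical level. $\mathbf{C}$ is a symmetric matrix that, for the five species, consists of auto-correlation blocks on the diagonal and cross-correlation blocks off the diagonal:

$$\mathbf{C} = \begin{pmatrix}
\mathbf{C}_{\mathrm{EC}} & \mathbf{C}_{\mathrm{EC,OC}} & \mathbf{C}_{\mathrm{EC,NO3}} & \mathbf{C}_{\mathrm{EC,SO4}} & \mathbf{C}_{\mathrm{EC,OTR}} \\
\mathbf{C}_{\mathrm{OC,EC}} & \mathbf{C}_{\mathrm{OC}} & \mathbf{C}_{\mathrm{OC,NO3}} & \mathbf{C}_{\mathrm{OC,SO4}} & \mathbf{C}_{\mathrm{OC,OTR}} \\
\mathbf{C}_{\mathrm{NO3,EC}} & \mathbf{C}_{\mathrm{NO3,OC}} & \mathbf{C}_{\mathrm{NO3}} & \mathbf{C}_{\mathrm{NO3,SO4}} & \mathbf{C}_{\mathrm{NO3,OTR}} \\
\mathbf{C}_{\mathrm{SO4,EC}} & \mathbf{C}_{\mathrm{SO4,OC}} & \mathbf{C}_{\mathrm{SO4,NO3}} & \mathbf{C}_{\mathrm{SO4}} & \mathbf{C}_{\mathrm{SO4,OTR}} \\
\mathbf{C}_{\mathrm{OTR,EC}} & \mathbf{C}_{\mathrm{OTR,OC}} & \mathbf{C}_{\mathrm{OTR,NO3}} & \mathbf{C}_{\mathrm{OTR,SO4}} & \mathbf{C}_{\mathrm{OTR}}
\end{pmatrix}. \tag{4}$$

In Li et al. (2013), these cross-correlations were disregarded; that is, the five species were considered independently and Eq. (4) reduced to a block diagonal matrix containing only the auto-correlations.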
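The split of a covariance matrix into standard deviations and correlations can be illustrated on a small dense matrix. This is only a sketch with made-up numbers; the function name `factorize_covariance` is hypothetical, and in practice D and C are prescribed separately rather than extracted from an explicit B.

```python
import numpy as np

def factorize_covariance(B):
    """Split a covariance matrix into standard deviations and correlations, B = D C D^T."""
    std = np.sqrt(np.diag(B))
    D = np.diag(std)                   # diagonal standard deviation matrix
    C = B / np.outer(std, std)         # correlation matrix with unit diagonal
    return D, C

# Illustrative 2 x 2 covariance: variances 4 and 1, covariance 1.2.
B = np.array([[4.0, 1.2],
              [1.2, 1.0]])
D, C = factorize_covariance(B)
```

Setting the off-diagonal entries of C to zero corresponds to treating the variables as independent, which is the block diagonal simplification mentioned above.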
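In this study, the cross-correlations are considered by introducing control variable transforms (Derber and Bouttier, 1999; Barker, 2004; Huang, 2009). We divide the model aerosol variables into balanced components ($\delta\mathbf{x}_b$) and unbalanced components ($\delta\mathbf{x}_u$):

$$\delta\mathbf{x} = \delta\mathbf{x}_b + \delta\mathbf{x}_u. \tag{5}$$

Note that the first variable, EC, does not need to be divided. It plays a role similar to that of the vorticity in the data assimilation of ECMWF (Derber and Bouttier, 1999) or the stream function in the data assimilation of MM5 (Barker, 2004). The transformation from the unbalanced variables ($\delta\mathbf{x}_u$) to the full variables ($\delta\mathbf{x}$) by the balance operator $\mathbf{K}$ is given by

$$\delta\mathbf{x} = \mathbf{K}\,\delta\mathbf{x}_u. \tag{6}$$

Equation (6) can be written as

$$\begin{pmatrix} \delta\mathrm{EC} \\ \delta\mathrm{OC} \\ \delta\mathrm{NO3} \\ \delta\mathrm{SO4} \\ \delta\mathrm{OTR} \end{pmatrix} =
\begin{pmatrix}
\mathbf{I} & 0 & 0 & 0 & 0 \\
\boldsymbol{\rho}_{21} & \mathbf{I} & 0 & 0 & 0 \\
\boldsymbol{\rho}_{31} & \boldsymbol{\rho}_{32} & \mathbf{I} & 0 & 0 \\
\boldsymbol{\rho}_{41} & \boldsymbol{\rho}_{42} & \boldsymbol{\rho}_{43} & \mathbf{I} & 0 \\
\boldsymbol{\rho}_{51} & \boldsymbol{\rho}_{52} & \boldsymbol{\rho}_{53} & \boldsymbol{\rho}_{54} & \mathbf{I}
\end{pmatrix}
\begin{pmatrix} \delta\mathrm{EC} \\ \delta\mathrm{OC}_u \\ \delta\mathrm{NO3}_u \\ \delta\mathrm{SO4}_u \\ \delta\mathrm{OTR}_u \end{pmatrix}, \tag{7}$$

where $\boldsymbol{\rho}_{ij}$ is the submatrix of $\mathbf{K}$ that represents the statistical regression coefficients between the variables $i$ and $j$ (Chen et al., 2013). Note that $\boldsymbol{\rho}_{ij}$ is a diagonal matrix whose dimension is the number of model grid points, so each grid point has one regression coefficient. For convenience, we assume that the elements of $\boldsymbol{\rho}_{ij}$ take a constant value for all grid points, denoted $\rho_{ij}$, which is calculated by linear regression over all grid points. For example, $\rho_{21}$ can be deduced from the regression equation of OC on EC:

$$\delta\mathrm{OC} = \rho_{21}\,\delta\mathrm{EC} + \varepsilon, \tag{8}$$

where $\varepsilon$ is the residual. Equation (8) contains a slope but no intercept. The intercept is nearly zero because $\delta\mathrm{OC}$ and $\delta\mathrm{EC}$ are forecast differences, which can be considered to have zero mean. After obtaining $\rho_{21}$, the balanced part of $\delta\mathrm{OC}$ (i.e., the value predicted by the regression) is obtained by

$$\delta\mathrm{OC}_b = \rho_{21}\,\delta\mathrm{EC}. \tag{9}$$

Removing the balanced part from the full variable gives the unbalanced part ($\delta\mathrm{OC}_u$), that is, $\varepsilon$ in Eq. (8). Thus, the calculation of $\delta\mathrm{OC}_u$ can be written as

$$\delta\mathrm{OC}_u = \delta\mathrm{OC} - \rho_{21}\,\delta\mathrm{EC}. \tag{10}$$

Here, $\delta\mathrm{OC}_u$ and $\delta\mathrm{EC}$ are employed as predictors in the next regression equation, for $\delta\mathrm{NO3}$. In the same way, we can obtain the unbalanced parts of the remaining variables, which are defined as follows:

$$\delta\mathrm{NO3}_u = \delta\mathrm{NO3} - \rho_{31}\,\delta\mathrm{EC} - \rho_{32}\,\delta\mathrm{OC}_u, \tag{11}$$

$$\delta\mathrm{SO4}_u = \delta\mathrm{SO4} - \rho_{41}\,\delta\mathrm{EC} - \rho_{42}\,\delta\mathrm{OC}_u - \rho_{43}\,\delta\mathrm{NO3}_u, \tag{12}$$

$$\delta\mathrm{OTR}_u = \delta\mathrm{OTR} - \rho_{51}\,\delta\mathrm{EC} - \rho_{52}\,\delta\mathrm{OC}_u - \rho_{53}\,\delta\mathrm{NO3}_u - \rho_{54}\,\delta\mathrm{SO4}_u. \tag{13}$$

The coefficient of determination ($R^2$) can be employed to measure the fit of these regressions. It can be expressed as

$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}}, \tag{14}$$

where SSR and SST are the regression sum of squares and the total sum of squares, respectively.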
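The sequential regression described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function name `unbalanced_parts` is hypothetical, the coefficients are fitted by ordinary least squares without an intercept, and the sums of squares in $R^2$ are taken about zero, consistent with the zero-mean assumption for forecast differences.

```python
import numpy as np

def unbalanced_parts(fields):
    """Sequentially regress each variable on the preceding unbalanced parts.

    fields: list of 1-D arrays (flattened forecast differences), ordered as
            EC, OC, NO3, SO4, OTR. No intercept is fitted.
    Returns (rho, unbalanced, r2): rho[i] holds the coefficients of variable i
    on predictors 0..i-1, unbalanced[i] is the residual, r2[i] = SSR / SST.
    """
    unbalanced = [np.asarray(fields[0], dtype=float)]    # EC is kept as is
    rho, r2 = [np.array([])], [np.nan]
    for y in fields[1:]:
        y = np.asarray(y, dtype=float)
        X = np.column_stack(unbalanced)                  # earlier unbalanced parts
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        balanced = X @ coef                              # regression prediction
        rho.append(coef)
        unbalanced.append(y - balanced)                  # residual = unbalanced part
        r2.append(np.sum(balanced**2) / np.sum(y**2))    # sums of squares about zero
    return rho, unbalanced, r2
```

Because each least-squares residual is orthogonal to its predictors, the unbalanced parts produced this way are mutually uncorrelated in the sample, which is the property that allows them to serve as independent control variables.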

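These unbalanced parts can be considered independent because they are residuals and random. $\mathbf{B}_u$ denotes the BEC of the unbalanced variables and can be factorized as

$$\mathbf{B}_u = \mathbf{D}_u\mathbf{C}_u\mathbf{D}_u^{\mathrm{T}}, \tag{15}$$

where $\mathbf{D}_u$ and $\mathbf{C}_u$ are the standard deviation matrix and the correlation matrix, respectively. $\mathbf{C}_u$ should be block diagonal, without cross-correlations:

$$\mathbf{C}_u = \mathrm{diag}\left(\mathbf{C}_{\mathrm{EC}},\ \mathbf{C}_{\mathrm{OC}_u},\ \mathbf{C}_{\mathrm{NO3}_u},\ \mathbf{C}_{\mathrm{SO4}_u},\ \mathbf{C}_{\mathrm{OTR}_u}\right). \tag{16}$$

Using Eq. (6), the relationship between $\mathbf{B}$ and $\mathbf{B}_u$ is

$$\mathbf{B} = \mathbf{K}\mathbf{B}_u\mathbf{K}^{\mathrm{T}}. \tag{17}$$

$\mathbf{B}^{1/2}$ and $\mathbf{B}_u^{1/2}$ are defined as the square roots of $\mathbf{B}$ and $\mathbf{B}_u$, respectively. Their transformation is

$$\mathbf{B}^{1/2} = \mathbf{K}\mathbf{B}_u^{1/2}. \tag{18}$$

Using Eq. (15), Eq. (18) can be written as follows:

$$\mathbf{B}^{1/2} = \mathbf{K}\mathbf{D}_u\mathbf{C}_u^{1/2}. \tag{19}$$

Generally, a transformed cost function of Eq. (2) is expressed as a function of a preconditioned state variable:

$$J(\delta\mathbf{z}) = \frac{1}{2}\delta\mathbf{z}^{\mathrm{T}}\delta\mathbf{z} + \frac{1}{2}\left(\mathbf{H}\mathbf{B}^{1/2}\delta\mathbf{z}-\mathbf{d}\right)^{\mathrm{T}}\mathbf{R}^{-1}\left(\mathbf{H}\mathbf{B}^{1/2}\delta\mathbf{z}-\mathbf{d}\right). \tag{20}$$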

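Here, $\delta\mathbf{z} = \mathbf{B}^{-1/2}\delta\mathbf{x}$. Using Eq. (19), Eq. (20) can be written as

$$J(\delta\mathbf{z}) = \frac{1}{2}\delta\mathbf{z}^{\mathrm{T}}\delta\mathbf{z} + \frac{1}{2}\left(\mathbf{H}\mathbf{K}\mathbf{D}_u\mathbf{C}_u^{1/2}\delta\mathbf{z}-\mathbf{d}\right)^{\mathrm{T}}\mathbf{R}^{-1}\left(\mathbf{H}\mathbf{K}\mathbf{D}_u\mathbf{C}_u^{1/2}\delta\mathbf{z}-\mathbf{d}\right). \tag{21}$$

Equation (21) is the final form of the cost function with the cross-correlations of $\mathbf{B}$.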
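The way the balance operator reintroduces cross-correlations can be checked numerically on a toy problem. The sketch below assumes two variables (EC and OC) on three grid points, an illustrative regression coefficient, and a Gaussian auto-correlation with unit length scale; none of these values come from the paper. It builds the square root $\mathbf{K}\mathbf{D}_u\mathbf{C}_u^{1/2}$ and confirms that it reproduces $\mathbf{K}\mathbf{B}_u\mathbf{K}^{\mathrm{T}}$.

```python
import numpy as np

n = 3                                    # grid points per variable (toy size)
rho21 = 0.9                              # illustrative regression coefficient
I, Z = np.eye(n), np.zeros((n, n))

# Balance operator for two variables (EC, OC): dx = K dx_u.
K = np.block([[I, Z],
              [rho21 * I, I]])

# Unbalanced statistics: block diagonal correlation, diagonal standard deviations.
dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
C_blk = np.exp(-dist**2 / 2.0)           # Gaussian auto-correlation, unit length scale
C_u = np.block([[C_blk, Z], [Z, C_blk]])
D_u = np.diag(np.r_[np.full(n, 1.0), np.full(n, 0.4)])

# Symmetric square root of C_u via its eigendecomposition.
w, V = np.linalg.eigh(C_u)
C_u_half = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

B_half = K @ D_u @ C_u_half              # square root of B, as in Eq. (19)
B = B_half @ B_half.T                    # implied full covariance
B_u = D_u @ C_u @ D_u.T
```

Although `C_u` has no cross-correlation blocks, the implied `B` does: its OC-EC block equals `rho21` times the EC block of `B_u`. The minimization therefore only ever needs the block diagonal statistics plus the regression coefficients.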
According to Li et al. (2013), the correlation matrix of the unbalanced parts ($\mathbf{C}_u$) is factorized as

$$\mathbf{C}_u = \mathbf{C}_{ux} \otimes \mathbf{C}_{uy} \otimes \mathbf{C}_{uz}. \tag{22}$$

Here, $\otimes$ denotes the Kronecker product, and $\mathbf{C}_{ux}$, $\mathbf{C}_{uy}$ and $\mathbf{C}_{uz}$ represent the correlation matrices between grid points in the x, y and z directions, with sizes $n_x \times n_x$, $n_y \times n_y$ and $n_z \times n_z$, respectively, where $n_x$, $n_y$ and $n_z$ are the numbers of grid points in the x, y and z directions. This factorization greatly reduces the size of the matrices that must be handled for $\mathbf{C}_u$. Another desirable property of Eq. (22) is that $\mathbf{C}_{ux}$ and $\mathbf{C}_{uy}$ can be expressed by Gaussian functions, while $\mathbf{C}_{uz}$ is directly computed from the proxy data. They will be discussed in Sect. 4.2.
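The practical benefit of the Kronecker factorization is that the full correlation matrix never has to be formed. The following sketch applies the three one-dimensional factors directly to a state vector. It assumes a row-major (x, y, z) ordering of the vector and a Gaussian of the form $\exp(-r^2/(2L^2))$ for the horizontal factors; the function names are hypothetical.

```python
import numpy as np

def gaussian_corr(n, spacing, length):
    """1-D Gaussian correlation matrix on a regular grid (spacing and length in the same units)."""
    r = spacing * np.arange(n)
    return np.exp(-(r[:, None] - r[None, :])**2 / (2.0 * length**2))

def apply_kron_corr(Cx, Cy, Cz, v):
    """Compute (Cx kron Cy kron Cz) @ v without forming the full matrix.

    v is a flattened field stored in row-major order with axes (x, y, z).
    """
    V = v.reshape(Cx.shape[0], Cy.shape[0], Cz.shape[0])
    out = np.einsum("ia,jb,kc,abc->ijk", Cx, Cy, Cz, V)
    return out.ravel()
```

For a grid of $n_x \times n_y \times n_z$ points, this stores three small matrices instead of one matrix with $(n_x n_y n_z)^2$ elements.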

WRF/Chem configuration and cross-correlations between emissions
In this section, we describe the configuration of WRF/Chem, whose forecasting products will be employed in the following BEC statistics and data assimilation experiments.
In addition, the cross-correlations of aerosol emissions from the WRF/Chem emission data are investigated to understand the cross-correlation of the BEC.

The WRF/Chem model is employed in our study. This is a fully coupled online model in which a regional meteorological model is coupled to aerosol and chemistry components (Grell et al., 2005). The three nested model domains are shown in Fig. 1. Their resolutions are 36, 12, and 4 km, respectively. The outer domain spans southern California and the innermost domain encompasses Los Angeles. All domains have 30 vertical levels. The discussion of the BEC and the emissions presented in this paper is confined to the innermost domain. The initial meteorological conditions for WRF/Chem are prepared using the North American Regional Reanalysis (NARR) (Mesinger et al., 2006). The meteorological boundary conditions and sea surface temperatures are updated at each initialization. The initial aerosol conditions are taken from the previous forecast without updating. The emissions are derived from the National Emission Inventory 2005 (NEI'05) for both aerosols and trace gases (Guenther et al., 2006). For more details, readers are referred to Li et al. (2013).

Cross-correlations of emission species
Emission files are a necessary input for running the WRF/Chem model, and emissions are a primary factor controlling the distribution of the aerosol forecasts. Analyzing the correlations among the emission species can therefore help us to understand the BEC statistics. The emission species are derived from the emission file that is produced from the NEI'05 data for each model domain. Only the emission file for the innermost domain is used to calculate the correlations among the emission species. The emission file contains 37 variables, including gas species and aerosol species. Each aerosol species also comprises nuclei mode and accumulation mode species (Peckham et al., 2013). From these aerosol emission species, five lumped aerosol species are calculated, consistent with the variables in the data assimilation. These five lumped species are E_EC (sum of the nuclei mode and the accumulation mode of elemental carbon PM2.5), E_ORG (sum of the nuclei mode and the accumulation mode of organic PM2.5), E_NO3 (sum of the nuclei mode and the accumulation mode of nitrate PM2.5), E_SO4 (sum of the nuclei mode and the accumulation mode of sulfate PM2.5), and E_PM25 (sum of the nuclei mode and the accumulation mode of unspeciated primary PM2.5).
Figure 2 shows the cross-correlations of the five lumped aerosol emission species. With the exception of the auto-correlations on the diagonal, all cross-correlations exceed 0.5. This result reveals that the emission species are correlated, which may be attributed to common emission sources and to diffusion processes controlled by the same atmospheric circulation. The most significant cross-correlation, approximately 0.8, is between E_EC and E_ORG. This close correlation demonstrates that the emission distributions of these two species are very similar. Their emissions occur primarily in urban and suburban areas, with small emissions in rural areas and along roadways (not shown). As shown in Fig. 2, the lowest cross-correlation is between E_ORG and E_SO4; the emissions of the latter occur primarily in urban and suburban areas, with few emissions in rural areas and along roadways (not shown).
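A cross-correlation matrix of this kind can be computed in a few lines. The sketch below assumes that the correlation is a Pearson correlation over all grid points of each two-dimensional emission field; the text does not state the exact procedure, so this is one plausible reading, and the function name is hypothetical.

```python
import numpy as np

def cross_correlation_matrix(fields):
    """Pearson correlations between flattened fields; fields maps name -> array."""
    names = list(fields)
    data = np.vstack([np.asarray(fields[k], dtype=float).ravel() for k in names])
    return names, np.corrcoef(data)     # rows of `data` are treated as variables
```

The same function applies unchanged to the forecast differences of the five control variables, before and after the balanced parts are removed.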

Balance constraints and BEC statistics
With the configuration of the WRF/Chem model described in Sect. 3.1, forecasts for one month (00:00 UTC on 15 May to 00:00 UTC on 14 June 2010) were performed for the balance constraints and the BEC statistics. Forecast differences between 24 h forecasts and 48 h forecasts are available at 00:00 UTC. Thirty forecast differences are employed as inputs to the NMC method. For this method, 30 forecast differences are sufficient; however, a longer time series may be more beneficial for the BEC statistics (Parrish and Derber, 1992).
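The NMC inputs can be sketched as follows, assuming the 24 h and 48 h forecasts valid at the same times are stored as arrays of shape (times, state). The function names are hypothetical, and removing the time mean is an assumption consistent with treating forecast differences as zero-mean. The explicit sample covariance is shown only for illustration, since it is feasible only for very small state sizes.

```python
import numpy as np

def nmc_differences(fcst48, fcst24):
    """Forecast differences valid at the same time, used as proxies for background errors.

    fcst48, fcst24: arrays of shape (n_times, n_state) holding 48 h and 24 h
    forecasts valid at the same times. Returns differences with the time mean removed.
    """
    diff = np.asarray(fcst48, dtype=float) - np.asarray(fcst24, dtype=float)
    return diff - diff.mean(axis=0)

def sample_covariance(diff):
    """Sample covariance of zero-mean differences (small state sizes only)."""
    return diff.T @ diff / (diff.shape[0] - 1)
```

With 30 samples, the rank of such a sample covariance is at most 29, which is one reason the BEC is modelled through standard deviations, correlation scales and regression coefficients rather than used directly.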

Balance regression statistics
Using these 30 forecast differences, we can estimate the regression equations of EC, OC, NO3, SO4 and OTR and calculate the unbalanced parts of these variables according to Eqs. (6)-(13). Table 1 shows the regression coefficients, whose rows and columns correspond to $\rho_{ij}$ in Eq. (7). The last column in Table 1 is the coefficient of determination ($R^2$) of the regression equations. For the regression equation of OC, the regression coefficient is 0.90 and the coefficient of determination of Eq. (8) is 0.86, which indicates that EC and OC are highly correlated and that their mass concentrations are of similar magnitude. Their correlation is similar to that of the stream function and the velocity potential; thus, we set them as the first and second variables in the regression statistics. For the regression equation of NO3, the regression coefficients of EC and OC$_u$ are 4.01 and 3.76, respectively, because the mass concentration of NO3 is larger than those of EC and OC$_u$. The coefficient of determination is only 0.32, which indicates that the correlations between NO3 and EC and between NO3 and OC$_u$ are weak. This result reveals that the forecast errors of NO3 differ from the forecast errors of EC and OC$_u$. A possible reason is that NO3 is a secondary particle that is primarily derived from the transformation of NO$_x$, whereas EC and OC$_u$ are derived from direct emissions. Similar to NO3, SO4 is also primarily derived from a gas-phase precursor (SO2), and the coefficient of determination for SO4 is also low. For the last variable, OTR, the coefficient of determination reaches its maximum of 0.96 because OTR includes several different components that are correlated with the first four variables. For this reason, we set OTR as the last variable in the regression statistics.
Figure 3 shows the cross-correlations of the five full variables and of the unbalanced variables. In Fig. 3a, the cross-correlations of the full variables exceed 0.3 and most of them exceed 0.5. In Fig. 3b, however, the cross-correlations of the unbalanced variables are less than 0.2. Some are close to zero, which indicates that these unbalanced variables are approximately independent and can be employed as control variables in the DA system.


BEC statistics
The BEC statistics are computed using both the original full variables and the unbalanced variables obtained from the regression equations. Figure 4 shows the vertical profiles of the standard deviations of the original $\mathbf{D}$ and the unbalanced $\mathbf{D}_u$. In Fig. 4a, the original standard deviation of NO3 is the largest, whereas that of OC is the smallest, with a profile close to that of EC. All profiles show a significant decrease at approximately 800 m because aerosol particles are usually confined below the top of the boundary layer. In Fig. 4b, all standard deviations decrease significantly, with the exception of EC, which remains unchanged as a control variable in the unbalanced BEC statistics. Note that the standard deviation of OTR decreases by approximately 80 %, whereas that of NO3 decreases by only approximately 10 %. This result is attributed to the small coefficient of determination for the regression of NO3 (Table 1), which indicates that only a small portion of NO3 can be predicted by the regression and a large portion is an unbalanced component. In contrast with NO3, only a small portion of OTR is the unbalanced component.
The correlation matrices $\mathbf{C}$ and $\mathbf{C}_u$ are each factorized into three independent one-dimensional correlation matrices, as in Eq. (22). The horizontal correlation $\mathbf{C}_x$ or $\mathbf{C}_y$ is approximated by a Gaussian function. The correlation between two points $r_1$ and $r_2$ can be written as $e^{-(r_1-r_2)^2/(2L_s^2)}$, where $L_s$ is the horizontal correlation scale, which takes the same constant value for $\mathbf{C}_x$ and $\mathbf{C}_y$ because the correlations are considered to be isotropic (Li et al., 2013). This scale can be estimated from the curve of the horizontal correlation as a function of distance. Figure 5 shows the curves of the horizontal correlations for the five control variables. For the full variables (Fig. 5a), the sharpest decrease is observed for NO3 and the slowest decrease is observed for SO4. The horizontal correlation scales of EC, OC, NO3, SO4 and OTR are 25, 27, 20, 30 and 28 km, respectively.
For the unbalanced variables (Fig. 5b), the curves are closer to each other than the curves of the full variables. The correlation scales of EC, OC, NO3, SO4 and OTR are 25, 23, 24, 28 and 25 km, respectively. These results suggest that the unbalanced variables are expressed by common factors in the regression equations, which produces consistent horizontal correlation scales.
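One possible way to estimate the horizontal correlation scale from a correlation-distance curve is a least-squares fit in logarithmic space. The sketch below assumes the Gaussian form $\exp(-r^2/(2L_s^2))$ and fits a straight line through the origin to $\ln(\text{correlation})$ against $r^2$. The paper does not specify its fitting procedure, so both the method and the function name are assumptions.

```python
import numpy as np

def fit_gaussian_scale(dist, corr):
    """Least-squares fit of L in corr = exp(-dist^2 / (2 L^2)), using ln(corr) vs dist^2."""
    dist = np.asarray(dist, dtype=float)
    corr = np.asarray(corr, dtype=float)
    keep = corr > 0.0                        # the logarithm needs positive correlations
    x, y = dist[keep]**2, np.log(corr[keep])
    slope = np.sum(x * y) / np.sum(x * x)    # line through the origin
    return np.sqrt(-1.0 / (2.0 * slope))

# Synthetic check: a curve generated with L = 25 km on a 4 km grid is recovered.
r = 4.0 * np.arange(1, 20)
L_est = fit_gaussian_scale(r, np.exp(-r**2 / (2.0 * 25.0**2)))
```

Real correlation curves are noisy at large distances, so in practice the fit would usually be restricted to the range where the correlation is still clearly positive.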
The vertical correlation matrices $\mathbf{C}_z$ and $\mathbf{C}_{uz}$ are estimated directly from the forecast differences because each is only an $n_z \times n_z$ matrix. Figure 6 shows the vertical correlation matrices $\mathbf{C}_z$ and $\mathbf{C}_{uz}$ for the full variables (left column) and the unbalanced variables (right column), respectively. A common feature of both the full and the unbalanced variables is the significant correlation among the levels within the boundary layer, which is consistent with the profiles of the standard deviation in Fig. 4. The correlations differ slightly between the full and the unbalanced variables. For example, the correlation of NO3$_u$ within the boundary layer is stronger than that of NO3; that is, the vertical correlation scale of NO3$_u$ is larger than that of NO3. Conversely, the vertical correlation scale of OTR$_u$ is smaller than that of OTR. These results demonstrate that the vertical correlations of the unbalanced variables are more consistent with each other than those of the full variables, which is similar to the adjustments to the horizontal correlation scales.

Application to data assimilation and prediction
To demonstrate the effect of the balance constraints of the BEC, data assimilation experiments and 24 h forecasts are run using the WRF/Chem model from 12:00 UTC on 3 June 2010 to 12:00 UTC on 4 June 2010. The surface PM2.5 and aircraft-speciated observations are assimilated using different BECs, and evaluations are presented for the data assimilation and the subsequent forecasts. Three basic statistical measures, mean bias (BIAS), root mean square error (RMSE) and correlation coefficient (CORR), are utilized for the evaluations.
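The three evaluation measures can be computed as in the sketch below. The sign convention for BIAS (model minus observation) is an assumption, as the text does not state it, and the function name is hypothetical.

```python
import numpy as np

def verification_scores(model, obs):
    """Return (BIAS, RMSE, CORR) of model values against observations."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    err = model - obs                         # positive when the model overestimates
    bias = err.mean()
    rmse = np.sqrt(np.mean(err**2))
    corr = np.corrcoef(model, obs)[0, 1]      # Pearson correlation coefficient
    return bias, rmse, corr
```

BIAS and RMSE measure the size of the errors, while CORR measures how well the spatial or temporal pattern is reproduced, so an experiment can improve on one measure while slightly degrading another.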


Observation data and experiment scheme
Two types of observation data are employed in our experiments. The first type consists of hourly surface PM2.5 concentrations, which are obtained from the California Air Resources Board. A total of 42 surface PM2.5 monitoring sites are located in the innermost domain of the WRF/Chem model (Fig. 7). The second type is the speciated concentration along an aircraft flight track. The aircraft observations were obtained during the California Research at the Nexus of Air Quality and Climate Change (CalNex) field campaign. The flight track is around Los Angeles, from approximately 08:00 UTC to 14:00 UTC on 3 June 2010 (Fig. 7). The species of the aircraft observations include OC, NO3, SO4 and NH4. Note that NH4 is not a control variable; thus, the aircraft observations of NH4 are disregarded in the data assimilation. Because the aircraft observations cover only particles smaller than 1.0 µm, they are adjusted using, for each species, the ratio between the model concentration below 2.5 µm and the model concentration below 1.0 µm. Multiplying the aircraft observed concentrations by these ratios gives the speciated concentrations below 2.5 µm. Three parallel experiments are performed. The first is the control experiment without aerosol data assimilation, which is frequently known as a free run and is denoted as the control. The second is a data assimilation experiment that assimilates surface PM2.5 and aircraft observations using the full variables without balance constraints; it is denoted as DA-full. The third is also a data assimilation experiment that assimilates the same observations but employs the unbalanced variables as control variables, related to the full variables by the balance constraints; it is denoted as DA-balance.
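The size adjustment applied to the aircraft observations, described earlier in this section, can be sketched as follows. The sketch assumes that the model concentrations of a species at the observation location are available for the four MOSAIC size bins. The function name and the small `eps` guard against division by zero are additions for the example, not part of the described method.

```python
import numpy as np

def scale_to_pm25(obs_pm1, model_bins, eps=1e-12):
    """Scale sub-1.0 um aircraft observations to sub-2.5 um using model size bins.

    obs_pm1:    observed concentration of one species (particles < 1.0 um).
    model_bins: model concentrations of the same species at the observation
                location, shape (..., 4), for the four MOSAIC size bins.
    """
    model_bins = np.asarray(model_bins, dtype=float)
    below_1 = model_bins[..., :2].sum(axis=-1)      # 0.039-0.1 and 0.1-1.0 um
    below_25 = model_bins[..., :3].sum(axis=-1)     # adds the 1.0-2.5 um bin
    ratio = below_25 / np.maximum(below_1, eps)     # eps avoids division by zero
    return np.asarray(obs_pm1, dtype=float) * ratio
```

For example, with model bin values of 1, 3, 1 and 5 the ratio is 5/4, so an observed value of 2.0 becomes 2.5.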
In each experiment, a 24 h forecast is run using the WRF/Chem model with the same configuration as described in Sect. 3.1. These experiments begin at 12:00 UTC on 3 June 2010 and end at 12:00 UTC on 4 June 2010. For the DA-full and DA-balance experiments, the surface PM2.5 observations at the initial time are assimilated. The aircraft-speciated observations from 10:30 to 13:30 UTC are also assimilated in order to make use of more observation information.

Increments of data assimilation
Figure 8 shows the horizontal increments of EC, OC, NO3, SO4 and OTR at the first model level for the DA-full (left column) and DA-balance (right column) experiments. In the DA-full experiment, the increments of EC and OTR (Fig. 8a and i) are similar. They are obtained from the surface PM2.5 observations because no aircraft observations correspond directly to these two variables. In the DA-balance experiment, significant adjustments are made to the increment of EC (Fig. 8b) under the action of the balance constraints. The same applies to the increment of OC (Fig. 8d) because of the high cross-correlation between the two. Similarly, significant adjustments are made to the increment of OTR (Fig. 8j), which reflects a mixture of features from the first four variables, all of which are correlated with OTR. The increments of OC, NO3 and SO4 are affected by both surface PM2.5 observations and aircraft observations. Some adjustments are made to the values and horizontal scales of the increments. These results demonstrate that the observation information can spread across variables through the balance constraints.
Figure 9 shows the vertical increments along 35.0° N for the DA-full and DA-balance experiments. Similar to Fig. 8, the increments of EC and OTR (Fig. 9a and i) in the DA-full experiment spread upward from the surface because they are obtained from the surface PM2.5 observations. In the DA-balance experiment, the increments of EC and OTR (Fig. 9b and j) contain observation information from the aircraft height at approximately 500 m, and the values of the increments increase significantly. The distributions of the increments of the five variables in DA-balance (Fig. 9, right column) are generally more similar to each other than those in DA-full (Fig. 9, left column).

Evaluation of data assimilation and forecasts
Figure 10 shows the scatter plots of the modelled vs. the observed surface PM2.5 mass concentrations at 12:00 UTC on 3 June 2010, which is the time of initialization. Compared with the control experiment, significant improvements are observed for DA-full. The CORR increases by approximately 0.3, and the RMSE and the BIAS decrease by approximately 50 % (Fig. 10b). The evaluation of the DA-balance experiment (Fig. 10c) is similar to that of DA-full. The RMSE and BIAS of DA-balance are slightly better than those of DA-full, but the CORR of DA-balance is slightly lower. The main reason is probably that the aircraft observations are independent of the surface observations and the adjustments made by the balance constraints arise primarily from the speciated aircraft observations. These adjustments may not be consistent with the distribution of the surface observations. However, the differences in the statistical measures are minor, which implies that the balance constraints in DA-balance are reasonable and do not destroy the primary distribution of the increments in DA-full.
Figure 11 shows the scatter plots of the model species vs. the aircraft observed species. The CORR of DA-balance is the highest of the three experiments during the entire forecasting period (Fig. 11a). Note that the CORR values of DA-balance and DA-full are similar during the first 3 h; however, the former is significantly higher than the latter from the 3rd hour to the 18th hour. Similar improvements in the RMSE and the BIAS of DA-balance are observed in Fig. 11b and c, respectively. These improvements indicate that the balance constraint benefits the subsequent forecasts, which derives from the balanced initial distribution among species.

Summary and discussion
A set of balance constraints was established using a regression technique, which was incorporated in the BEC of a data assimilation system that is associated with five control variables (EC, OC, NO 3 , SO 4 and OTR) and is derived from the MOSAIC aerosol scheme of the WRF/Chem model.Based on the NMC method, differences within a month-long period between 24 and 48 h forecasts that are valid at the same time were employed in the estimation and analyses.For the original variables, these five control variables are highly correlative.Especially between EC and OC, their correlation is near 0.9.These original variables need to be transformed to satisfy the hypothesis of the independent control variables in the data assimilation system.We employ the method of the balance constraint to divide the original full variables into balanced and unbalanced parts.The regression technique is used to express the balanced parts by the unbalanced parts.Then, the independent unbalanced parts are employed as control variables in the BEC statics.Accordingly, the standard deviations of these unbalanced variables are less than the standard deviations of the full variables.
The horizontal and vertical correlation scales of these unbalanced variables tend to be uniform owing to the effect of the common factors in the regression equations.
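The split into balanced and unbalanced parts can be illustrated with a minimal NumPy sketch. It assumes a sequential least-squares regression in which each full variable is regressed on the unbalanced parts of the variables that precede it; the function name `balance_decomposition`, the variable ordering and the synthetic forecast differences are hypothetical and are not taken from the study.

```python
import numpy as np

def balance_decomposition(x):
    """Split forecast differences x (n_samples, n_vars) into unbalanced parts.

    Returns (u, rho): u holds the unbalanced parts, and rho[i, j] is the
    regression coefficient of variable i on the unbalanced part of variable j.
    """
    x = x - x.mean(axis=0)                 # work with centered differences
    n_vars = x.shape[1]
    u = np.zeros_like(x)
    rho = np.zeros((n_vars, n_vars))
    u[:, 0] = x[:, 0]                      # first variable is its own unbalanced part
    for i in range(1, n_vars):
        # Least-squares fit of variable i on the earlier unbalanced parts.
        coef, *_ = np.linalg.lstsq(u[:, :i], x[:, i], rcond=None)
        rho[i, :i] = coef
        u[:, i] = x[:, i] - u[:, :i] @ coef   # residual = unbalanced part
    return u, rho

# Synthetic, strongly cross-correlated "forecast differences" for five variables.
rng = np.random.default_rng(0)
common = rng.standard_normal(500)
x = np.column_stack([common + 0.3 * rng.standard_normal(500) for _ in range(5)])

u, rho = balance_decomposition(x)
K = np.eye(5) + rho                        # balance operator: full = K applied to unbalanced
cross_corr = np.corrcoef(u, rowvar=False)  # close to the identity matrix
```

In this sketch the unbalanced parts are mutually uncorrelated by construction, and their standard deviations cannot exceed those of the full variables, which mirrors the behaviour summarized above.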
To evaluate the impact of the balance constraints on the analyses and forecasts, three parallel experiments were performed: a control experiment without data assimilation and two data assimilation experiments without and with balance constraints (DA-full and DA-balance, respectively). In the data assimilation experiments, the same observations of surface PM 2.5 concentration and aircraft-speciated concentrations of OC, NO 3 and SO 4 were assimilated. In DA-balance, the observations of these three variables can spread to the two remaining variables in the increments, which results in a more complicated distribution with more local centers. Even in areas with only surface PM 2.5 observations, the increments of DA-balance are adjusted relative to those of DA-full by the mutual spread across variables. Consequently, a few differences are observed between the two data assimilation analysis fields when they are evaluated against the surface PM 2.5 observations with the statistical measures CORR, RMSE and BIAS. However, these differences are minor because the surface PM 2.5 observations are independent of the aircraft observations and the balance constraints cannot break the primary balance of the species.
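The spread of observation information across variables can be illustrated with a two-variable toy problem. The sketch below uses the closed-form linear analysis increment, B Hᵀ (H B Hᵀ + R)⁻¹ d, as a stand-in for the variational minimization; the state [EC, OC] at a single grid point, the regression coefficient, the variances and the innovation are all hypothetical values chosen for illustration.

```python
import numpy as np

def analysis_increment(B, H, R, innovation):
    """Linear analysis increment dx = B H^T (H B H^T + R)^-1 d."""
    gain = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return gain @ innovation

# Hypothetical two-variable state [EC, OC] at one grid point.
B_u = np.diag([1.0, 0.5])                 # variances of the unbalanced control variables
rho = 0.8                                 # assumed regression coefficient of OC on EC
K = np.array([[1.0, 0.0],
              [rho, 1.0]])                # balance operator for this toy case

B_balance = K @ B_u @ K.T                 # implied covariance with cross terms
B_full = np.diag(np.diag(B_balance))      # same variances, no cross-covariance

H = np.array([[0.0, 1.0]])                # only OC is observed
R = np.array([[0.2]])                     # observation error variance
d = np.array([2.0])                       # innovation: observation minus background

dx_full = analysis_increment(B_full, H, R, d)        # EC increment stays zero
dx_balance = analysis_increment(B_balance, H, R, d)  # EC is adjusted as well
```

With the same variances, both configurations produce the same increment for the observed variable; only the configuration with the balance operator also adjusts the unobserved one.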
The incorporation of the balance constraints improves the initial DA analysis fields. During the subsequent forecasts out to 24 h, the improvements in the DA-balance experiment relative to the DA-full experiment are more significant, especially from the 3rd hour to the 18th hour. These results suggest that the balance constraint can optimize the initial distribution of variables. Although the optimization is slight for the initial analysis fields, it can play an important role in improving the skill of subsequent forecasts.
The method for incorporating balance constraints in aerosol data assimilation can be employed in other regions or in other applications with different aerosol models. For the aerosol variables in different models, some cross-correlations should exist because their common emissions and diffusion processes are controlled by the same atmospheric circulation. Although these cross-correlations may be stronger than the cross-correlations of atmospheric or oceanic model variables, theoretical balance constraints, such as geostrophic balance or temperature-salinity balance, do not exist. We expect to discover a universal balance constraint among the aerosol variables and utilize it in the data assimilation system. In addition, we expect to expand the balance constraint to include gaseous pollutants, such as nitrogen dioxide (NO 2 ), sulfur dioxide (SO 2 ) and carbon monoxide (CO). These gaseous pollutants are correlated with some aerosol species, such as NO 3 , SO 4 and EC, so assimilating gaseous observations can improve the data assimilation analysis fields of aerosols. Conversely, the assimilation of aerosol observations may improve the analysis fields of gaseous pollutants.
where $C_{\mathrm{EC}}$, $C_{\mathrm{OC}}$, $C_{\mathrm{NO_3}}$, $C_{\mathrm{SO_4}}$ and $C_{\mathrm{OTR}}$ at diagonal locations are the background error auto-correlation matrices that are associated with each species, which represent the correlations between the spatial grid points for one species. Other submatrices represent the cross-correlations between different species. For example, $C_{\mathrm{EC\,OC}}$ represents cross-correlations between EC and OC, which is equivalent to $C_{\mathrm{OC\,EC}}$. In Li et al. (

, left column). The results of the DA-balance are reasonable due to the influence of each other across the balance constraints.

Sciences Division (http://esrl.noaa.gov/csd/groups/csd7/measurements/2010calnex/), for providing the download of surface and aircraft aerosol observations.

Figure 1. Geographical display of the three-nested model domains. The innermost domain covers the Los Angeles basin; the black point denotes the location of Los Angeles.
incorporated the capacity to add

Table 1. Regression coefficients of the balance operator K and the coefficients of determination (the regression coefficients correspond to ρ ij in Eq. 6).