Development and application of the WRFPLUS-Chem online chemistry adjoint and WRFDA-Chem assimilation system

Here we present the online meteorology and chemistry adjoint and tangent linear model, WRFPLUSChem (Weather Research and Forecasting plus chemistry), which incorporates modules to treat boundary layer mixing, emission, aging, dry deposition, and advection of black carbon aerosol. We also develop land surface and surface layer adjoints to account for coupling between radiation and vertical mixing. Model performance is verified against finite difference derivative approximations. A second-order checkpointing scheme is created to reduce computational costs and enable simulations longer than 6 h. The adjoint is coupled to WRFDA-Chem, in order to conduct a sensitivity study of anthropogenic and biomass burning sources throughout California during the 2008 Arctic Research of the Composition of the Troposphere from Aircraft and Satellites (ARCTAS) field campaign. A cost-function weighting scheme was devised to reduce the impact of statistically insignificant residual errors in future inverse modeling studies. Results of the sensitivity study show that, for this domain and time period, anthropogenic emissions are overpredicted, while wildfire emission error signs vary spatially. We consider the diurnal variation in emission sensitivities to determine at what time sources should be scaled up or down. Also, adjoint sensitivities for two choices of land surface model (LSM) indicate that emission inversion results would be sensitive to forward model configuration. The tools described here are the first step in conducting four-dimensional variational data assimilation in a coupled meteorology–chemistry model, which will potentially provide new constraints on aerosol precursor emissions and their distributions. Such analyses will be invaluable to assessments of particulate matter health and climate impacts.


Introduction
Fine particulate matter impacts human health (Schwartz et al., 2007;Krewski et al., 2009) and climate (Myhre et al., 2013).Atmospheric climate forcing from aerosols is not only potentially large, but also highly uncertain owing to a complex spatial-temporal distribution of concentration, mixing state, and particle size for multiple species, each emitted from varying precursor sources, both anthropogenic and natural (Textor et al., 2006;Schulz et al., 2006).Depending on the species and quality of records, a nation's annual aerosol precursor and primary emissions have uncertainties anywhere between 7 % and a factor of 4, with larger variation on seasonal to diurnal scales for particular sectors (Streets et al., 2003;Suutari et al., 2001).Over these shorter timescales, aerosols impact meteorology through the semidirect (Hansen et al., 1997;Koch and Del Genio, 2010) and indirect (Twomey, 1977;Lohmann and Feichter, 2005) cloud effects, which are both dependent on aerosol vertical profiles (e.g., Samset et al., 2013) governed by mixing.
Atmospheric models are used to improve our understanding of aerosol sources, distributions, and processes.Online numerical weather prediction and chemistry (NWPchemistry) models integrate dynamic and chemical equations simultaneously, whereas offline chemical transport models (CTMs) interpolate meteorological fields from 3 to 6 h reanalyses.Grell et al. (2004) used the Weather Research and Forecasting Model with chemistry (WRF-Chem) (Skamarock et al., 2008;Grell et al., 2005) to show that vertical mass transport of chemical tracers is highly sensitive to the choice of online vs. offline modeling methodologies due to variations in boundary layer mixing strength.Additionally, NWP-chemistry models account for moisture and temperature perturbations to dynamics due to aerosol microphysics Published by Copernicus Publications on behalf of the European Geosciences Union.and radiative forcing, while CTMs cannot account for these feedbacks.
There are numerous online models with aerosolmeteorology feedbacks (e.g., WRF-Chem, COSMO-ART - Vogel et al., 2009;GEM-AQ -Kaminski et al., 2008;and IFS-MOZART -Kinnison et al., 2007;Flemming et al., 2009;Morcrette et al., 2009).Better descriptions of sources, loss mechanisms, and vertical transport in coupled models are needed to increase accuracies in short-term climate modeling (Baklanov et al., 2014).To address this, chemical data assimilation can be used to improve short-term forecasts.Bocquet et al. (2015) review methods and applications of chemical data assimilation in CTMs and NWP-chemistry models.In WRF, three-dimensional variational data assimilation (3D-Var) (Pagowski et al., 2010;Liu et al., 2011;Schwartz et al., 2012;Saide et al., 2012Saide et al., , 2013)), ensemble Kalman filter (EnKF) (Pagowski and Grell, 2012), and hybrid approaches (Schwartz et al., 2014) have all been used to improve chemical initial conditions.The limitation of these studies, using sequential methods, has been the decay of chemical concentrations back to the emissions-driven values following the characteristic loss rate of each species, necessitating periodic re-initialization with new observations.Using data assimilation solely to perturb initial conditions leaves behind underlying deficiencies in model description, emissions, or other input parameters.
In contrast to 3-D approaches, 4-D data assimilation attempts to minimize the discrepancy between model predicted values and observations at the same time observations are acquired.Variational 4-D data assimilation (4D-Var) requires an adjoint, which calculates the sensitivity of a model metric to all input parameters, such as resolved aerosol precursor emissions.Several offline CTMs already have adjoints for constraining aerosol and aerosol precursor emissions, including GEOS-Chem (Henze et al., 2007), Sulfur Transport dEposition Model (STEM) (Sandu et al., 2005;Hakami et al., 2005), Community Multi-scale Air Quality Model (CMAQ) (Turner et al., 2015), Goddard Chemistry Aerosol Radiation and Transport model (GOCART) (Dubovik et al., 2008), and Laboratoire de Météorologie Dynamique (LMDz) (Huneeus et al., 2009).Inverse modeling has been used to constrain aerosol emissions with 4D-Var, but only in offline models (e.g., Hakami et al., 2005;Dubovik et al., 2008;Henze et al., 2009;Wang et al., 2012).In addition to inverse modeling, derivatives calculated from CTM adjoints have been used to analyze sensitivities of model estimates to emissions (e.g., Turner et al., 2012).Online chemical 4-D variational data assimilation (4D-Var) has been performed with the global Integrated Forecast System coupled to the Model for OZone and Related chemical Tracers (IFS-MOZART) model, although without two-way coupling, to improve aerosol (Benedetti et al., 2009) and gas-phase (Inness et al., 2013) initial conditions.To our knowledge, 4D-Var still has not been used in a regional NWP-chemistry model with online coupling to constrain aerosol precursor emissions or other important model parameters, such as vertical mixing coefficients.
Here we present the first such system, building on existing capabilities of the WRF data assimilation (WRFDA) framework.WRFDA includes both 3D-Var (Barker et al., 2004) and incremental 4D-Var (Barker et al., 2005;Huang et al., 2009) algorithms, which are designed for constraining meteorological initial conditions (e.g., wind fields, temperature, moisture).For WRFDA v3.2 and later, WRF-4DVar requires calling the WRFPLUS (Weather Research and Forecasting plus) forward (FWM), tangent linear (TLM), and adjoint (ADM) models.These models include adiabatic WRF dynamics, along with simplified surface friction (i.e., boundary layer), cumulus, and microphysics packages (Zhang et al., 2013).Here we integrate aerosol chemistry and vertical mixing from WRF-Chem into WRFPLUS, including complementary TLM and ADM components.While existing CTMs are capable of aerosol emission inversions, this development promises to introduce new insights into meteorologychemistry couplings.We apply this system to black carbon (BC) aerosol, because of its important implications for climate (Bond et al., 2013) and health (Grahame et al., 2014).Additionally, the widespread use and development of WRF furthers the potential for continued model improvement and a community of future users.
BC is emitted from incomplete combustion of fuels.Major anthropogenic sources include residential cookstoves in developing countries, open crop burning, diesel transportation, and coal power plants with poor emission controls.Wildfires, or biomass burning, are the largest natural source.The major limitations to devising accurate bottom-up emissions inventories are poor activity data in developing countries and difficulty parameterizing complex biomass burning sources.Even in developed countries, changing economic landscapes affect real year-to-year emissions.BC is unique among atmospheric aerosols as being radiatively absorptive, relatively inert, primary emitted, and having potentially complex cloud interactions.BC is possibly the second most important human emitted pollutant in terms of climate forcing in the present-day atmosphere, with a net forcing of +1.1 W m −2 , but with 90 % uncertainty (+0.17 to +2.1 W m −2 ) (Bond et al., 2013).Also, reductions in BC emissions have been shown to reduce fine particulate health impacts (e.g., Anenberg et al., 2011).
The new TLM and ADM -referred to collectively herein as "AD/TL models" -aerosol treatments lay the groundwork for constraining aerosol precursor emissions using 4D-Var in a NWP-chemistry model.In Sect.2, we describe the WRFPLUS-Chem and WRFDA-Chem model architectures.In Sect.3, we describe the construction and verification of the AD/TL models of specific WRF-Chem forward model components.In Sect.4, we describe a special checkpointing scheme that enables adjoint and tangent linear simulations longer than 6 h which are required for accumulating sensitivities of sparse chemical observations with respect to emis-sions.In Sect.5, we demonstrate the capability of the adjoint model to calculate sensitivities of BC observation errors in WRFDA-Chem.Finally, we discuss future developments for WRFPLUS-Chem and WRFDA-Chem.

Methods
Creating the foundation for WRFDA-Chem required managing relationships between five related but separate models.These include (1) WRF, (2) its "-Chem" variant, and the (3, 4) WRFPLUS AD/TL models.Finally, (5) WRFDA 4D-Var requires communication of critical name list and state variables to the FWM, TLM, and ADM.Table 1 lists the WRF-Chem components that previously had AD/TL descriptions in WRFPLUS, those with new code developed for WRFPLUS-Chem, and those that need future development to enable fully coupled chemical 4D-Var.

Forward model
For this work, we use WRF version 3.6.The WRFPLUS-Chem code repository (https://svn-wrf-model.cgd.ucar.edu/branches/WRFPLUSV3-Chem) contains the most current version.Interested users can contact National Center for Atmospheric Research (NCAR) or the authors for access to the code.WRF contains multiple non-hydrostatic dynamic cores and parameterization options for modeling unresolved physical processes.The FWM is identical in WRF and WRF-PLUS.The simplified treatments for unresolved physics are typically only used in the AD/TL models.In addition, WRF-Chem simulates the emission, deposition, transport, turbulent and cumulus mixing, wet scavenging, cloud interactions, and chemical transformation of trace gasses and aerosols.All of these processes are modeled at the same spatial and temporal resolution, which enables coupling WRF radiation and microphysics calculations directly with chemical processes.
The forward model configuration for which we have developed the corresponding TLM and ADM will be referred to as the "adjoint model configuration," because we use the same settings when running the adjoint.We use GO-CART aerosols (chem_opt = 300), wherein the chem array has 19 aerosol (e.g., SO 2 , sulfate, black carbon, dust, sea salt) and zero gas-phase members.This option includes bulk mass sulfate chemistry and black carbon oxidative aging.We employ combined local and non-local Asymmetric Convective Model (ACM2) planetary boundary layer (PBL) mixing (Pleim, 2007b, a), with surface interactions handled by the Pleim-Xiu (PX) land surface model (LSM) (Xiu and Pleim, 2001;Pleim and Xiu, 2003;Pleim and Gilliam, 2009) and surface layer (Pleim, 2006) mechanisms (all options seven).Soil moisture and temperature nudging are not used within the PXLSM.Prior to version 3.6, the WRF-Chem vertical mixing scheme solely carried out PBL mixing and dry deposition for chemical species.That vertical mixing depended on a (local) turbulent-eddy mixing coefficient from a userselected PBL scheme and a dry deposition velocity.There is new capability to calculate tracer turbulent mixing and dry deposition within the ACM2 subroutine itself, enabling nonlocal mixing.Trace gas and particle deposition velocities are calculated using characteristic resistances found using methods from Wesely (1989).Microphysics and radiation AD/TL models with aerosol feedbacks have not been incorporated into WRFPLUS-Chem yet.These crucial components will be partially adapted from previous work (e.g., Saide et al., 2012Saide et al., , 2013)), while others still need to be developed.Both microphysics and radiation are turned off for Sect.3.3 verification simulations.In order to ensure appropriate radiative fluxes at the land-air boundary, the GSFCSW and Goddard LW radiation compute ground-incident radiation for the Sect. 5 adjoint sensitivity demonstration.However, online coupling between radiation and chemical species is deactivated.

Incremental 4D-Var
WRFDA uses an incremental 4D-Var method (Courtier et al., 1994) for finding the minimum of the cost function, J , by adjusting control variables (CVs), x.As described by Huang et al. (2009), the WRFDA cost function has three terms where J b , J o , and J c are the background, observation, and balancing cost functions, respectively.J c is not relevant to the current work.The background and observation cost functions are and The background cost function is a penalty term, which ensures the departure of the posterior, x n , from the prior, x 0 = x b , remains within the bounds justified by the background error covariance, B. The observation cost function measures the distance between the 4D-Var model solution, x n , and the observations, y.M and H are the nonlinear model and observation operators, while M and H are their linearized forms, or tangent linear operators, used to propagate analysis increments δx = x n − x n−1 from earlier simulation times to the kth observation.R is the observation error covariance matrix.The innovation, is the residual error between the real and modeled observations k at the end of 4D-Var iteration n − 1.
For each iteration of incremental 4D-Var, the model is linearized about a trajectory, which is a collection of stored values of all model state variables at all time steps within the assimilation window.This trajectory enables propagation of sensitivities forward and backward in time within the TLM and ADM.Each of these models are called in an inner loop to calculate the gradient of the observation cost function, ∇ x J o .An optimization algorithm uses the gradients to calculate optimal analysis increments to the CVs, which minimize the observation cost function.If the CVs, x n , depart too much from the initial guess for the current outer loop iteration, x n−1 , the model must be re-linearized about the new state, x n , using M. The purpose of the two-level optimization is that approximating M with M simplifies the full problem to a quadratic problem, and guarantees a unique solution to the minimization (Courtier et al., 1994).Refer to Huang et al. (2009) for more details on the WRFDA incremental method, including a full expression for ∇ x J given by Eq. ( 7) of that article.The main purpose of this work is to introduce the AD/TL model components of the WRFPLUS-Chem.

Tangent linear and adjoint model construction and verification
We have developed and tested adjoint and tangent linear code to represent aerosol-relevant processes in WRFPLUS-Chem.This development required a four step process: 1. automatically differentiate specific WRF-Chem modules using TAPENADE (Hascoët and Pascual, 2013) version 3.6; 2. verify stand alone TLM and ADM derivatives against finite difference approximations and debug as necessary; 3. incorporate code manually into WRFPLUS; 4. repeat step 2 for fully integrated WRFPLUS-Chem model.
TAPENADE takes discrete Fortran or C source code as input, then generates either TLM or ADM code using a usergenerated list of independent and dependent parameters.In addition to creating the differential code, TAPENADE reduces adjoint computational cost by eliminating unnecessary lines of code.Similar to Xiao et al. (2008) and Zhang et al. (2013), integrating the automatically differentiated adjoint code into WRFPLUS required significant manual intervention and debugging.Methods for constructing discrete adjoints are well-documented (Giering and Kaminski, 1998;Hascoët and Pascual, 2013).For the remainder of this section, we discuss the particular mechanisms for which we have created AD/TL models, and then we provide verification results for WRFPLUS-Chem.

Transport mechanisms
PBL physics and dry deposition in a column are handled by ACM2.The simple surface friction previously developed for WRFPLUS does not perform vertical mixing of tracers, which is a minimum requirement of any PBL scheme used in WRFPLUS-Chem.The ACM2 PBL depends on groundatmosphere interactions that necessitate additional surface layer and LSM AD/TL code.For example, the ACM2 PBL scheme depends on the friction velocity, U * , calculated in a surface layer scheme, which itself depends on wind speed, and the state variables u and v. ACM2 also depends on surface heat (HFX) and moisture (QFX) fluxes, which can be calculated within the surface layer or LSM code, but also depend on U * .The dependence of HFX and QFX on ground-incident shortwave radiation (GSW) is calculated in the LSM.GSW is calculated in the radiation scheme, and depends on the aerosol composition and atmospheric moisture phase and distribution.Because we have not developed radiation AD/TL code, this coupling is not represented in WRFPLUS-Chem yet.The dependencies themselves are illustrative of how ACM2, and indeed most any other PBL scheme available in WRF, is appropriate for representing chemistry-meteorology interactions critical to understanding short-term climate impacts from aerosols.ACM2 is compatible with the Monin-Obukhov and PX (options 91 and 7) surface layer options, as well as the SLAB and PX (options 1 and 7) LSM options.TLM and ADM code is developed for all of these choices, and have been tested in stand alone verification tests.In the interest of brevity, complete model verification in Sect.3.3 has been limited to the two PX options.Advection of inert tracers was added to WRFPLUS by X. Zhang (2012, personal communication).The same treatment has been applied to the "chem" array, with additional checkpointing and parallel communications.We generated stand alone TLM and ADM code for deep cumulus convection as handled by the Grell-Freitas cumulus scheme (Grell and Freitas, 2014).One of the major benefits of this cumulus scheme is the ability to use online-calculated cloud condensation nuclei (CCN) to account for the effect of aerosols on liquid and vapor water mass fractions.These parameters directly impact convection, including tracer transport.The ability of the stand alone AD/TL codes to produce the relevant members of the Jacobian has been verified for a single set of column conditions using similar methods as described in Sect.3.3.However, the FWM, TLM, and ADM do not yet account for vertical transport of chemical tracers, and thus have not been integrated into WRFPLUS-Chem.

Aerosol-specific components
GOCART is a bulk aerosol scheme that treats reactive species (BC, OC, sulfate) using a total mass approach and divides non-reactive species (dust, sea salt) into multiple size bins (Chin et al., 2000).Oxidative aging for both BC and OC is handled by a first-order decay from hydrophobic to hydrophilic forms using a time constant of 2.5 days.Sulfate (SO 2− 4 ) is produced from SO 2 and dimethyl sulfide precursor gasses in GOCART.Sulfate chemistry also requires offline-calculated values for nitrate and OH radical, which are taken from climatologies available from the PREP-CHEM-SRC preprocessor (Freitas et al., 2011).WRFPLUS-Chem includes both the carbon and sulfate chemistry AD/TL codes, but only the BC component is tested and applied here.
Emissions of aerosol precursors in WRF-Chem is a linear process corresponding to specific chemistry and emission inventory options.Emission magnitudes are calculated, then distributed spatially and temporally, in offline preprocessors.Typically, emissions are read in hourly following some diurnal pattern.In order to make the emissions code easily differentiable, scaling factors are added to the emissions such that At any simulation time, E c are the emissions most recently read in from file for chemical species c. α c,i sc and E c,i sc are the emission scaling factors and effective emissions, respectively, during scaling period i sc .For emission inversions, the CVs, x, are spatial-temporal resolved emission scaling factors.At the beginning of 4D-Var or during an adjoint sensitivity study, the scaling factors are set to unity.The scaling factors are applied in the FWM if the environment variable WRPLUS is set equal to 1 during compilation.Dry deposition velocities are calculated in WRF-Chem within the dry deposition driver.In order to ease adjoint code construction and reduce checkpointing requirements, the dry deposition velocity calculation is moved to immediately precede the PBL driver as depicted in Table 1.The new source code is similar to the dry deposition driver, except that only code corresponding to the GOCART aerosol option remains.The dry deposition AD/TL code accounts for dependencies of the dry deposition velocity on physical parameters (e.g., temperature, water vapor, U * ).As mentioned previously, the chemical concentrations are sensitive to dry deposition velocity within the PBL scheme.

Verification and linearity test
WRFPLUS FWM, TLM, and ADM performance were previously verified by Zhang et al. (2013).Here we use an alternative approach based on Taylor series derivative approximations, and similar to that used by, e.g., Henze et al. (2007), to verify WRFPLUS-Chem.We define a new cost function equal to a single predicted state variable, locally defined in grid cell p and at the end of time step f , J = SV p,f .We use the TLM, ADM, and a centered finite difference approximation from the FWM to evaluate derivatives with respect to some CV at location q and the initial time, 0. The finite difference derivatives are calculated from where each evaluation of J results from a FWM simulation with some perturbed value of x q,0 .δx varies between 0.1 and 10 % of the value of x q,0 .The adjoint and tangent linear derivatives are found by forcing the model gradient fields, λ * and λ, at J and x q , respectively.The tangent linear gradient and adjoint gradient variables are analogous to state variables in the FWM.We force gradients of 1.0, indicating a 100 % perturbation of the variable, and the resulting derivatives are retrieved from the model output gradient fields, such that and where M is the adjoint operator.
In order to evaluate our additions to WRFPLUS-Chem, we test cost functions equal to hydrophobic (BC 1 ) and hydrophilic (BC 2 ) black carbon concentrations in 100 different grid cells.We evaluate derivatives with respect to five CVs at three initial locations for each of those 200 cost functions.The CVs include BC emission scaling factors (α BC ) and initial conditions for BC 1 , zonal wind (U ), perturbation potential temperature (δ ), and water vapor mixing ratio (Q v ).All sensitivities apply over a 3 h duration for a domain covering the southwest USA.For a full domain and model setup description refer to Sect.5.1.1.Figure 1 shows that the maximum relative error between the TLM and ADM is in the 8th significant digit.Thus, we only need to compare the nonlinear model to the TLM to verify both the TLM and ADM.Those results are given in Fig. 2. The slope and R 2 statistic for a linear fit of those comparisons are very nearly unity for all CVs tested.Each of the plots in Figs. 1 and 2 depicts 600 derivative evaluations.A range of finite difference perturbations δx is used for U , δ , and Q v control variables in order to find a value of χ NL with the best compromise between truncation and roundoff error.We test derivatives with respect to meteorological variables in order to show the AD/TL models will be functional in a setting with coupled chemistry and physics.In such a system, the emissions will impact meteorology, which in turn impacts concentrations.These results illustrate the capability of the AD/TL models to represent the latter part of that relationship.All of the verification results apply to a 3 h simulation period, but longer simulations are needed to calculate the average influence of emissions on the modeled state space.

Second-order checkpointing
As discussed in Sect.2.2, the nonlinear model trajectory is an integral component for propagating gradients in the AD/TL models.As one might imagine, the trajectory contains a large amount of information.WRFPLUS stores the entire double precision trajectory in memory in order to eliminate expensive I/O time.This is very helpful with regards to storage, but presents a challenge in terms of memory.The system is designed for 6 h operational assimilation windows.In a typical WRFPLUS-Chem simulation there are at least 28 threedimensional state variables (8 physical, 1 to 3 moisture, and 19 GOCART species), and numerous other two-and onedimensional state variables that must be included in the trajectory.For an illustrative domain, simulating 3 h with a 90 s time step (18 km resolution), 79 × 79 columns, 42 levels, a 5 cell boundary width, and 28 3-D state variables, the trajec-tory would require 1.46 GB per core on 64 cores.This final cost per core includes a 50 % storage growth per doubling of the number of cores.Because the trajectory is stored for all time steps, required memory scales linearly with simulation duration and the number of simulated chemical species.For multi-day and multi-week inversions, as is typical in nonoperational chemical data assimilation, the memory requirements become impractical for most cluster computing systems.
To solve this problem we implement a second-order checkpointing scheme that shares the storage burden between the hard disks and memory.In a standard WRFPLUS adjoint simulation, the FWM is called first in order to calculate the trajectory.The FWM integrates the nonlinear equations from the initial to the final time, and stores the model trajectory at each time step.The ADM integrates the transpose of the linearized model equations backward in time, and at each time step reads the trajectory previously stored by the FWM.This process is depicted as "first-order checkpoint" in Fig. 3. Since the storage limitation is driven by the duration of a simulation, we break the simulation into smaller segments, while maintaining continuity in the adjoint derivatives.The checkpointed adjoint simulation begins with a full FWM simulation beginning at the initial time, t 0 , and ending at the final time, t f .WRF restart files are written at time intervals equal to the checkpoint interval, t c .Once the simulation is completed, the FWM is restarted at initial time equal to t f − t c .During that simulation, the trajectory is stored in memory.The trajectory is then recalled in an adjoint simulation that proceeds backward toward the current initial time.The checkpoint system alternately calls the FWM and ADM until returning to t 0 .The major hurdle to integrating this second-order checkpointing system into WRF-4-DVar is that the trajectory is no longer readily available to WRFDA for calculating modeled observations, H k M k (x n ) , between the calls to the forward and adjoint models.Instead, these values must be calculated during either the full FWM (step 1) or checkpoint FWM (steps 2, 4, 6, etc.) simulations.We take the former approach.A similar checkpointing system is also implemented for the TLM in order to enable long duration incremental 4D-Var.
In order to ensure the checkpointing method delivers consistent derivatives to the non-checkpointed version, we again compare AD/TL derivatives to finite difference approximations.Because of the wall time required to calculate derivatives across extended time periods, we limit our tests to 14 pairs of initial and final locations, q and p.For all of the J and x pairs tested in Sect.3.3, the ADM and TLM agree to 13 or more digits over a 9 h test.The improved performance relative to the previous 3 h test came about after increasing the precision of several variables in the TLM dry deposition subroutine.Because of this machine precision AD/TL agreement, we only compare the finite difference approximations to the TLM.For these checkpointed simulations, we analyze the derivative of a time variant cost function with respect to multiple control variables χ p,q (t) = ∂J p (t) ∂x q,0 .
Doing so ensures that the derivatives are continuous across multiple checkpoint intervals and we are able to see the transient behavior of multiple finite difference perturbation sizes at times when there are large discrepancies with the TLM.
The finite difference approximations of derivatives of BC with respect to the physical variables grow more unstable with time.Thus, we calculate those derivatives only for a 6 h period, while we test derivatives with respect to emissions for 48 h.Here we also include derivatives of U , δ , and Q v with respect to U and Q v to ensure that those relationships are represented properly in the surface layer, LSM, and PBL AD/TL schemes, so that they may be used in a meteorological 4D-Var setting.
Figure 4 shows the resulting derivatives for nine different pairs of J and x for a single pair of q and p.Most importantly for multi-day 4D-Var emissions inversions, and as would be expected, BC concentrations respond linearly to a 1 % perturbation of emissions for at least 48 h.Next, it becomes apparent why derivatives with respect to U and Q v require multiple finite difference perturbation sizes to ensure one of them matches the TLM at a particular cost function evaluation time.There are times when either the smallest, largest, or no value for δx agrees with the TLM.However, the TLM has inflection points at the same times as the finite difference approximations, including during fast transient periods, such as for ∂U ∂U and ∂U ∂Q v .The duration over which the tangent linear assumption is valid for chemical responses to U and Q v depends on the size of the perturbation and on the local meteorological regime.For instance, the test results shown here are for a response location very near the California coast, but better agreement was found farther inland.Further testing of the coupled derivatives will be necessary to determine over what time period they are suitable for inverse modeling, and under what conditions the model nonlinearities cease to be a limiting factor.Future inversions with coupled physics and Fully normalized time variant sensitivities calculated with the TLM with second-order checkpointing and with multiple finite difference perturbation sizes.Each plot is for a single pair of source and receptor locations, q and p.
chemistry will need to verify that ∂J ∂α has a near-linear response over the time frame considered.The behaviors noted here are similar or improved across the other 13 pairs of q and p.

Sensitivities to BC emissions in California
Here we demonstrate the new WRFPLUS-Chem capabilities in an adjoint sensitivity study.For the present example, the 4D-Var cost function is the model response metric and the biomass burning, and weekday and weekend anthropogenic emissions are the model parameters of interest.This framework is used to analyze where and when these parameters most impact the model performance and are thus in need of improvement.

Approach
For this demonstration, we calculate the sensitivity of the 4D-Var cost function in the first iteration.The background term is zero and there has been no prior CV increment (i.e., δx = 0).Therefore, the cost function, Eq. ( 1), simplifies to All off-diagonal covariances in R are assumed to be zero in order to enable timely matrix inversion.

Model configuration
The model domain encompasses California and other southwest US states from 20 June 2008, 00:00:00 UTC to 27 June 2008, 09:00:00 UTC.We generated chemical initial conditions by running WRF-Chem for 5 days prior to the adjoint time period.We used the default WRF-Chem boundary condition for BC concentration of 0.02 µg kg −1 .This is consistent with a single upwind Pacific Ocean transect taken during the 22 June flight.Meteorological initial and boundary conditions are interpolated from 3 h, 32 km North American Regional Reanalysis (NARR) fields.The horizontal resolution is 18 km throughout, and there are 42 vertical levels between the surface and model top at 100 hPa.The η levels are 1.000, 0.997, 0.993, 0.987, 0.977, 0.967, 0.957, 0.946, 0.934, 0.921, 0.908, 0.894, 0.880, 0.860, 0.840, 0.820, 0.800, 0.780, 0.750, 0.720, 0.690, 0.660, 0.620, 0.570, 0.520, 0.470, 0.430, 0.390, 0.350, 0.310, 0.270, 0.230, 0.190, 0.150, 0.115, 0.090 , 0.07 , 0.052, 0.035, 0.020, 0.010, and 0.000.For a column where the ground is at sea level, there are 13 levels below 1 km and an additional 5 levels below 2 km.The subgrid physics options used are described in Sect.2.1.Anthropogenic emissions are taken from the US EPA (Environmental Protection Agency) 2005 National Emissions Inventory (NEI2005).Fire emissions are provided by the Fire INventory from NCAR (FINN Version 1) (Wiedinmyer et al., 2011(Wiedinmyer et al., , 2006)).FINN uses Moderate Resolution Imaging Spectroradiometer (MODIS) active fire locations and radiative power from NASA Terra and Aqua satellites, as well as speciated emission factors for four vegetation types, to calculate daily total 1 km resolution emissions.Burned areas are scaled to the combined fractional coverage of each 1 km 2 fire pixel by tree and herbaceous vegetation types assigned by the MODIS vegetation continuous fields product (Hansen et al., 2003).Repeated fire detections in a single fire pixel are removed according to Al-Saadi et al. (2008).Plume rise injection heights are calculated in WRF-Chem by an embedded one-dimensional cloud-resolving model (Freitas et al., 2007(Freitas et al., , 2010;;Grell et al., 2011).

Model-observation comparison
We compare the model to observations in individual time steps, which differs from previous data assimilation approaches with WRF.In the standard WRFDA 4D-Var architecture, observations are binned over intervals, or windows, typically of 1 h or longer duration.Whereas WRFDA typically has k observation windows, here WRFDA-Chem and WRFPLUS-Chem handle k observation time steps, each of which might have multiple measurements available.In order to reduce memory requirements, the adjoint forcing is stored in a column array, instead of the 2-D and 3-D arrays that were required for each state variable for each window, k in WRFDA.Also, while WRFDA includes meteorological observation operators to be called offline, the new fine temporal resolution observation operator is called directly within WRFPLUS.The traditional approach not only made communication between WRFDA and WRFPLUS less cumbersome, but also limited the ability to use dynamic observations recorded across broad temporal scales in an inversion.
In situ observations were collected throughout California during the June 2008 portion of the Arctic Research of the Composition of the Troposphere from Aircraft and Satellites field campaign in collaboration with the Califor-nia Air Resources Board (ARCTAS-CARB) (Jacob et al., 2009).Instruments aboard the DC-8 aircraft measured trace gas and aerosol concentrations over 4 days, including absorbing carbonaceous aerosol from the single particle soot photometer (SP2) at 10 s intervals (Sahu et al., 2012).Additionally, 41 Interagency Monitoring of Protected Visual Environments (IMPROVE) sites measured daily average surface light absorbing carbon (LAC) on 20, 23, and 26 June by thermal/optical reflectance (TOR) analysis of quartz filters (Malm et al., 1994).Surface and aircraft observation locations during the campaign are indicated in Figs. 5 and 6.The aircraft trajectories are overlaid on MODIS Aqua true color images (Gumley, 2008), and locations of MODIS active fires (NASA , 2014).While we use IMPROVE elemental carbon (EC) and SP2 absorbing carbon as equivalents herein, Yelverton et al. (2014) found that the former is approximately 7 % higher than the latter, but that their error bars overlap.For the qualitative analysis performed in this demonstration, bias correction would not change any of the final conclusions.
The observation operators for aircraft and surface observations require temporal averaging.The 10 s resolution ARC-TAS observations of BC concentration, pressure, latitude, and longitude are averaged to the 90 s model time step, which is approximately the time the DC-8 would take to traverse a single 18 km × 18 km column.However, the 10 s resolution ARCTAS BC concentrations are revision 2 (R2), while a later revision 3 (R3) product was released at 60 s resolution only.The later revision includes additional mass in the 50-900 nm size range as a result of applying a lognormal fit.In order to utilize this improved product, as well as leverage the finer resolution observations, the 10 s BC mass is scaled by the mass ratio between the 60 s R3 and the 60 s average R2 data sets.The scaled 90 s average observations are compared directly with the nearest model grid cell so that the model values are not interpolated.The pressure measurements are compared to online model pressures to determine the model level of each observation.For 24 h average surface measurements from IMPROVE, the observation operator averages the nearest model surface grid cell concentration over all time steps within the observation period.For the few surface sites that have two air samplers measuring simultaneously, they are averaged together to prevent non-zero correlation in the cost function (i.e., off-diagonal terms in R).After all averaging, there are 995 aircraft observations and 107 surface observations.
As depicted in Fig. 7, the WRF-Chem simulation is, on average, biased low for both the surface and aircraft observations.The lowest biased aircraft observations tend to be at higher altitudes, although this is not true in all cases.There are many high-biased observations, and they tend to be at lower altitudes and to occur earlier in the simulation period when anthropogenic emissions dominate.Both surface and aircraft model predictions exhibit a wide spread of positive and negative errors.In order to determine potential causes for bias in specific locations, we consider the model residual errors, or simply "residuals," for each aircraft observation k. Figure 6 shows the statistically significant (p < 0.32) residuals for observations above and below the top of the model PBL.Section 5.1.3describes relevant measures of observation variance and statistical significance.
Negative residuals, and hence low model bias, are most prevalent in northern California on 22 and 26 June, most likely due to underprediction of biomass burning sources.There is also low bias above the PBL in the southern San Joaquin Valley on 20 June and below the PBL inland from San Diego on 24 June.Although neither case has visual smoke in the MODIS images, there were fires detected within 300 km.The largest positive residual occurs in Palmdale, CA, close to landing on 24 June.It could be indicative of either an emission error or the coarse horizontal resolution that collocates the airport with other significant nearby sources.Other notable high model biases aloft occur near cities during the flights on 20, 22, and 24 June.Similarly, surface site biases are higher near cities, and along the coast.As might be expected, proximity to sources is a strong indicator of error magnitude, as that is where the highest concentrations occur.The error sign and magnitude on 24 June differs in the PBL and free troposphere.That and the spatial error pattern could reflect some combination of meteorology and emissions deficiencies.For the positive residuals off the coast of Los Angeles on 22 and 24 June, there could be errors in predicting vertical mixing associated with the land-sea circulation or predominant near-surface wind direction.Discerning errors caused by emissions from those caused by meteorological mechanisms would require a separate in-depth study.) and above ground level (a.g.l.).

Variance and residual error significance
When R is assumed to be diagonal, each residual in the 4D-Var cost function is weighted inversely proportional to the observation error variance.The form of the cost function is based in Bayesian statistics, with an aim of converging on posterior control variables in a maximum-likelihood sense.However, using the variance alone to weight the residuals may result in very large cost function terms for relatively small residual errors.As our interest in this study is to determine how errors in emission estimates may lead to model bias, we wish to ensure the largest residuals have the greatest weight, while also accounting for differences in statistical significance of particular errors.Thus, we define the diagonal terms of R as where w k is an additional weighting term and σ 2 k,k is the variance.
The variance is comprised of components due to both observation and model uncertainty as The model variance at each observation location is found from an ensemble of N c = 156 WRF-Chem configurations during the modeling period.Each ensemble member, c, uses a different combination of PBL, surface layer, LSM, and long-wave and shortwave radiation options.Also, there are configurations both with and without microphysics and cumulus convection.From the ensemble, we use the population of residuals at each observation, k, to calculate the model variance where MML is the minimum model limit.The minimum possible modeled BC concentration is limited by the boundary condition, which fills the entire model domain during the 5 day warm-up simulation.The MML is simply taken as the minimum model concentration for all observation locations and all model configurations, and is found to be 0.01 and 0.02 µg m −3 for aircraft and surface measurements, respectively, after rounding to the observation precision.
The IMPROVE instrument variance combines both relative and absolute uncertainties, the latter of which arises due to the minimum detection limit (MDL) (UC-Davis, 2002).For a single filter analysis, the variance (in µg 2 m −6 ) is The sub-observation index l k is useful at sites with more than one air sampler.When a site has data from multiple instruments in a single day, we take their average and combine their instrument variances as where L k is the observation count.We assume each IM-PROVE measurement fully represent the encompassing grid cell, since all sites are in remote locations and the samples are averaged over a 24 h period.
In contrast, the aircraft variance must capture the representativeness uncertainty associated with comparing the average of an entire model grid cell with an average of multiple short duration segments of a sparse aircraft transect.According to commercial literature for the SP2 device, it has an MDL of 0.01 µg m −3 , which we assume applies over the 10 s observation interval used during the ARCTAS campaign.The observations available through the NASA ARCTAS data archive www.geosci-model-dev.net/8/1857/2015/have a BC mass concentration uncertainty of ±30 %.Although Sahu et al. (2012) report ±10 % BC mass uncertainty, that range is given by Kondo et al. (2011), who state their results are applicable in regions not impacted by refractory organic compounds, such as from biomass burning sources.Because there are significant burning sources in this domain, we adopt the more conservative range.We utilize the instrument uncertainties in a definition for total observation variance with components due to both averaging and representativeness, such that for each average aircraft measurement, the total variance is Adding the minimum variance associated with the MDL prevents the total variance from trending toward zero for any particular observation.This is important when using the variance in the cost function to ensure that near-zero observations -which have low variances -with small residuals do not dominate the inversion.The representative variance is the variance of the y l k 's that makeup ȳk , which is an attempt to capture the spread of true concentrations in a model grid cell.
In the case that there is only a single observation, the representative uncertainty is taken as double the instrument uncertainty.Thus, For any time step where L k < L max = 9, there is an additional variance penalty proportional to the sum of the individual instrument variances, where The square root term in Eq. ( 20) inflates the instrument error in cases when there are fewer than L max samples in the mean.
In order to motivate the weight, w k , applied to each residual model error, let us consider the primary inputs to the adjoint simulation, which are the adjoint forcings c k is any state variable on which H k depends and which M k (x b ) predicts.For our purposes, the state variables are modeled BC concentrations.The adjoint of the observation operator, H k transforms the forcing from observation space (λ * k,o ) back to model space (λ * k,m ).Thus, the forcing in observation space is Observations with significant model bias would require the largest perturbation in control variables to alleviate, and would seem to inform the inversion process the greatest.However, they must also have low total variance to contribute to an inversion.Figure 8 shows the surface and aircraft SD (standard deviation) plotted vs. residual error.Also plotted in that figure are 1 and 2 SD zones, as well as lines of constant λ * k,o for all w k = 1.Any residual falling outside the 2σ zone has a combined model and observation SD that is small enough to determine with 95 % confidence (p < 0.05) that the residual error deviates from zero (i.e., the model and observation disagree).These statistically significant model errors indicate that some kind of inversion is worthwhile.In their multi-cycle 4D-Var approach, Bergamaschi et al. (2009) eliminated observations outside 3 SDs after an initial 4D-Var cycle, with the thought that incorrect model physics prevents those residual errors from being fixed with 4D-Var.Thus, while statistically significant residuals are important to driving a 4D-Var inversion, that they remain afterward is a strong indication of errors in the model description that cannot be fixed through adjustments to emissions.Figure 8 shows that the relative contributions of observation and model variances is in general proportional to the relative magnitudes of observed and modeled concentration.Specifically, model (observation) variation contributes to a large fraction of uncertainty in positive (negative) residuals.
When both the observed and modeled concentrations are small, the total variance decreases to the minimum possible value, governed by the MML and MDL.This generally happens in remote regions, where small concentrations result from some combination of small nearby sources and transport from many distant sources.If the total variance is small enough relative to the residual error, λ * k,o will be very large, often larger than in cases with larger residual errors (see Fig. 10a).The adjoint model propagates a relatively large forcing from a small residual backward, resulting in large sensitivities to emission scaling factors.These sensitivities then translate to large emission perturbations in the optimization process.
The residual errors in remote locations are likely within combined model and observation uncertainty, but the model variance at these locations is unrealistically small.The ensemble will underestimate variance at observations near lowbiased prior sources due to the absence of tracer mass.The opposite may be true for a high-biased prior.The challenge then is to define the concentration uncertainty introduced by the model physics, independent of the magnitude of emissions, which we attempt to do with a weighting scheme.The weights are used only to inflate variance, which when very low is thought to misinform the adjoint about concentration errors.Variance reduction may be necessary for observations near high-biased sources.Also, while we apply the weights to the total variance, they could be applied to only the model portion.Here we are developing a philosophy for scaling the variances, of which the following description is but one example.
Because our goal in an emission inversion is to reduce model bias by perturbing emissions, model bias is itself an important characteristic.We use the ensemble of model configurations to calculate the variance in all residual errors; that is, The residual SDs, σ r , are 0.69 and 0.29 µg m −3 for surface and aircraft observation populations, respectively.After confirming that the residual errors are approximately normally distributed, the significance of the bias of a single observation relative to the entire population is In statistics, the ratio of | r k | σ r is called the z value, and denotes the number of SDs between r k and the expected value of zero.The variable r k indicates the user must select a specific form of residual error.Two examples are the mean or median of r k .A third approach, and the one taken here, is to use the residual found in the first 4D-Var iteration, r k,n=0 .f POP,k is a continuously variable p value, or the percentage of the population of all r k,c that is less significant than r k .Another measure of significance is visualized in the σ zones of Fig. 8, and was discussed previously.That is, for an individual residual error and variance, what is the probability that there will always be a mismatch between the model and observation?The individual error significance is The population and individual error significance are combined to derive the adjoint-forcing weight, The weighting scheme can be tuned for a specific application using the γ and β parameters to reshape the adjointforcing contours.However, care must be taken when selecting γ , β, and r k to ensure convergence in 4D-Var.Use of these weights may imply that residual errors do not fit a Gaussian distribution.Here we only introduce the weighting scheme and use it in a demonstration, but do not verify its validity.We use γ = 0.5 to provide some balance between the two measures of significance and β = 2 to ensure the weighting has a large impact.After calculating the w k 's according to Eq. ( 27), the new effective adjoint forcings are compared to the original values in Fig. 9.The weighting scheme is successful at reducing the impact of observation errors with low significance on the cost function.
After applying the new weighting scheme, the λ * k,o contours no longer converge on the y axis as depicted in Fig. 8. Instead, they exit radially from the origin in all directions.As both the population and individual z values approach zero, the adjoint-forcing converges toward For our specific values of σ r , all residual errors within the 2σ zone satisfy |λ * k,o | 5 µg −1 m 3 for surface, and  ) of the 4D-Var cost function (for surface and aircraft observations) with respect to anthropogenic and burning emission scaling factors overlaid on MODIS Aqua true color images for 6 days during the simulation.Anthropogenic sensitivities with magnitudes less than 1 % of the maximum anthropogenic sensitivity magnitude are removed.There is a marker for all grid cells with non-zero burning emissions.

|λ *
k,o | 10 µg −1 m 3 for aircraft observations.This is a considerable change from the unity weights where |λ * k,o | was as large as 200 µg −1 m 3 in the region between the 1σ and 2σ zones.

Results and discussion
With the weighting function applied, we calculate sensitivities of the 4D-Var cost function with respect to emissions for determining potential sources of model bias.The weights reduce the cost function from 5374 to 3784, which increases the normalized cost-function sensitivity to emission pertur-bations.Figure 10 shows fully normalized sensitivities, for 6 days of the simulation.The sensitivity in a particular grid cell is summed over the local diurnal cycle for hours n = [1, . .., 24] on day d.For anthropogenic emissions, the local time is calculated for discrete 15 • time zones, whereas for biomass burning emissions, local time corresponds to the continuous sun cycle.Undoubtedly, there are locations with positive and negative sensitivities at different times of day that will cancel, but this temporally aggregated sensitivity is an attempt to obtain average daily relationships across the domain.Although the color bar has been saturated at ±5 × 10 −3 , the full range of sensitivities are from −2.3×10 −3 to +7.1×10 −3 and −4.9×10 −3 to +11.3×10 −3 for anthropogenic and biomass burning emissions, respectively.
The magnitude of a normalized sensitivity corresponds to the fractional response in the cost function given a 100 % perturbation of emissions in a grid cell.If the model were perfect, the sensitivity magnitudes would be proportional to the difference between the background emission estimate and the true value.Thus, a negative sensitivity indicates a location where estimated sources are too low, and vice versa.Because the sensitivities themselves depend on the emission magnitudes, they will change in each 4D-Var iteration, eventually converging on a minimum of the cost function where the sensitivities are zero.We use the sensitivities here as a qualitative indicator of emission errors, and not a quantitative conclusion as might be provided with a complete inversion.
The sensitivities exhibit a similar spatial-temporal pattern as the residual errors in Figs. 5 and 6, with the exception of sources far upwind of observations.Near observations, estimated anthropogenic emissions are too high, and estimated fire emissions are too low.Indeed, most of the non-negligible (not black) burning sensitivities are negative on 24 June and 25 June, likely due to the high-altitude negative residuals on 26 June.The positive coastal fire sensitivities on 22 June and 23 June are attributable to positive forcing at the Point Reyes National Seashore IMPROVE site on 23 June and along the DC-8 flight track on 24 June.The influence of those fires on Los Angeles BC concentrations 24 h or more after the emission was determined through a sensitivity test where a perturbed residual error and adjoint forcing on 24 June were propagated through the adjoint.
The spatial variations in sensitivities reveal two phenomena.First, appreciable sensitivities will only arise in emissions that influence the particular observations available.Thus, full observation coverage is imperative to a successful inversion.Second, emission errors are heterogeneous in space and time.For the FINN inventory, heterogeneity arises due to missed detections in the MODIS active fire product, as well as potential errors in vegetation classification or attribution of a particular vegetation class to one of four land cover types.Anthropogenic source error heterogeneity could be due to a static inventory from 2005 being used to describe emissions in 2008, or to spatial variations in BC emission factors for a particular source sector.
All of the conclusions that might be drawn from the sensitivity maps about emission errors are subject to the assumption that the transport is correct in this model configuration.The SLAB LSM scheme (option 1) is used in place of the PX option to calculate comparative adjoint sensitivities.Relative to the PX option, these results exhibited non-negligible negative sensitivities to fires in the Shasta-Trinity National Forest on 23 June, but much larger positive sensitivities to those same fires on 22 June.Sensitivities with respect to coastal fires in the Los Padres National Forest also increased on 22 June.The spatial sensitivity patterns between SLAB and PX options are consistent on 25 June.The differences are presumably due to changes in the residual error between the two configurations.The differing spatial sensitivity patterns indicate that the surface heat and moisture fluxes calculated by each LSM scheme contributes non-negligibly to the vertical mixing of BC to aircraft measurement altitudes.
We also consider temporal sensitivity patterns to compare the two LSM schemes.Figure 11 shows the diurnal distribution of biomass burning, and weekday and weekend anthropogenic BC emission sensitivities for both of the LSM configurations, and for unity weights, w k = 1 and w k from Eq. ( 27).Each bar in that plot represents a summation of sensitivities across the whole domain from 20 June, 00:00:00 UTC to 26 June, 23:00:00 UTC (d = [1, . .., n d ]) within a particular local hour, n, such that The signs and magnitudes of sensitivities fit the previous description for the spatially distributed temporal aggregation.
The time period of emissions to which an observation is most sensitive depends on the altitude of that observation and the flow mechanisms that transport emitted aerosol mass to that observation.Thus, any conclusions drawn could be biased if observations do not have full temporal coverage, especially near sources.Since normalized sensitivities are proportional to emissions, it is to be expected that sensitivities at peak emission hours are magnified.Also, each hour of sensitivity is a sum of many diverse source locations.So while the net sensitivity in a particular hour may be positive, the spatial distribution of sensitivities is much more varied, as was shown in Fig. 10.The FINN biomass burning inventory applies an identical diurnal emission apportionment for all fires, regardless of vegetation, shading due to slopes, wind speed, or relative humidity.This scaling is applied in preprocessing.Both the PX and SLAB LSM setups seem to agree that fire emissions between 10:00 and 18:00 local time are overpredicted.Without the weighting scheme, the PX configuration indicates that the peak should be smoothed out, while the SLAB configuration concludes that the peak should be made sharper by reducing off-peak emissions.With the weighting scheme applied, both configurations agree that the peak is timed correctly.While the two setups disagree over the magnitude of fire emission correction required, their differences are small in comparison to the implied anthropogenic correction.The relative disagreement in burning sensitivity magnitude between the two LSM configurations is attributable to differences in residual errors, r k , and the resulting adjoint forcings, λ * k,o .In contrast to FINN, NEI applies a variety of diurnal patterns to point, area, and traffic sources.The weekend and  27).Also plotted are diurnal emission fractions.Sensitivities were calculated for two different WRF LSM options and are shown separately for biomass burning, and weekend and weekday anthropogenic emissions.
weekday emission profiles shown in Fig. 11 are the emission weighted averages for the entire domain.Individual sources may have a profile closer to flat, or alternatively zero overnight, and flat during daylight hours.The weighted average profile shown is close to the one used for commercial diesel traffic, since that is the largest BC source within the domain.Attributing sensitivities, or errors, to specific sectors is not straight forward and doing so may require a smaller horizontal grid spacing to reduce the number of sectors per grid cell.Results for the weighted and unweighted cost functions are very similar.In general, anthropogenic emissions are too high throughout all times of the day on both weekdays and weekends.Both LSM configurations indicate not only that the weekday profile peak should be sharper near 14:00 LT, and not at 16:00 LT, but also that emissions from 06:00 to 16:00 LT should be closer to the late evening and early morning magnitudes.The weekend sensitivities indicate the evening and morning emissions are too high, and that the daytime peak is timed about right, with the exception of the 18:00 LT spike.However, the relatively small magnitude of weekend sensitivities could also indicate there were not enough observations of anthropogenic sources on 21 June (SAT) and 22 (SUN) to draw definitive conclusions about emission timing.
Results for the two LSM options reveal the potential for model configuration to introduce bias in a 4D-Var inversion.For these particular observations, the posterior emissions from the PX option would likely be higher than those from the SLAB option, because of their relative sensitivity values.Model variability must be taken into consideration in 4D-Var sensitivity studies of high-resolution emissions, because model variation represents a large fractional contribution to observation error variance for positive residuals, as shown in Fig. 8.

Conclusions
We have implemented, verified, and demonstrated the WRFPLUS-Chem coupled meteorology and chemical adjoint and tangent linear models for PBL mixing, emission, aging, dry deposition, and advection of BC aerosol.A second-order checkpointing scheme enables tangent linear and adjoint model runs longer than 6 h.The adjoint was used in the first iteration of a 4D-Var inversion within WRFDA-Chem, where model-observation residual errors are compared for low-and high-temporal resolution IMPROVE surface and ARCTAS-CARB aircraft observations during 1 week of June 2008.A novel cost function weighting scheme was devised to reduce the impact of low-significance observations in future 4D-Var inversions.The adjoint sensitivities also indicate that anthropogenic emissions are overpredicted and biases in burning emissions are spatially and temporally heterogenous.The diurnal sensitivities would seem to indicate that burning emission profiles should be steeper midday, while anthropogenic emission profiles should be flattened on weekdays and sharpened on weekends.A full inversion is necessary to quantify the magnitude of the errors in the emissions.Additionally, adjoint sensitivities found using two different LSM options indicate that the results of such inversions will be sensitive to the choice of model configuration.
The next steps are as follows.We intend to incorporate tangent linear and adjoint observation operators for useful remote-sensing products (e.g., aerosol optical depth (Saide et al., 2013) and absorbing aerosol optical depth).This addition will enable WRFDA-Chem to be applied to a wider range of domains and time periods and in operational forecasting.The WRFDA-Chem optimization algorithm still needs to be applied to control variables for chemical species initial conditions and emission scaling factors.Future de-velopment and incorporation of radiation and microphysics adjoints (e.g., Saide et al., 2012) will provide coupling between aerosols and meteorology, and provide new insights into sensitivities of direct, indirect, and semi-direct radiative forcing to emission sectors and locations.In addition to the aerosol applications discussed, WRFDA-Chem 4D-Var will also be suited to emission inversions for green house gasses and other chemical tracers.

Code availability
Although an annual code release may be available in the future, WRFPLUS-Chem and WRFDA-Chem are continually being developed.A static version would not include the most recent bug fixes.Interested users can obtain the code as it is by contacting the authors: J. J. Guerrette (jonathan.guerrette@colorado.edu) and D. K. Henze (daven.henze@colorado.edu).Potential developers may also contact NCAR scientist H. C. Lin (hclin@ucar.edu)for access to the WRFPLUS-Chem repository.Any questions should be directed to the authors.

Figure 5 .
Figure 5. Surface site residual model error, r k , overlaid on MODIS Aqua true color images and active fire retrievals.Observations with a bias less than 1 SD are also indicated.

Figure 6 .
Figure 6.Aircraft residual model error, r k , with indication for the observation height relative to the model PBL height overlaid on MODIS Aqua true color images and active fire retrievals.Observations with a bias less than 1 SD are also indicated.

Figure 7 .
Figure 7. Linear fits between model BC concentrations with slope (m) and coefficient of determination R 2 for (a) IMPROVE surface and (b) ARCTAS-CARB aircraft observations colored by model height above mean sea level (a.m.s.l.) and above ground level (a.g.l.).

Figure 8 .
Figure 8. Model and total observation error SD (σ k,m , σ k ) vs. model residual error (r k ) with adjoint-forcing (λ * k,o ) contours corresponding to w k = 1 for (a) surface and (b) aircraft observations.1σ and 2σ zones reflect regions of increasing statistical significance.

Figure 11 .
Figure 11.Diurnal normalized sensitivities ( ∂ln J ∂ln E n ) of the 4D-Var cost function with respect to emissions scaling factors for (a, b, and c) w k = 1 and (d, e, and f) w k from Eq. (27).Also plotted are diurnal emission fractions.Sensitivities were calculated for two different WRF LSM options and are shown separately for biomass burning, and weekend and weekday anthropogenic emissions.

Table 1 .
Status of AD/TL development for WRF-Chem processes.