Implementation and evaluation of an array of chemical solvers in a global chemical transport model

This paper discusses the implementation and performance of an array of gas-phase chemistry solvers for the state-of-the-science GEOS-Chem global chemical transport model. The implementation is based on the Kinetic PreProcessor (KPP). Two perl parsers automatically generate the needed interfaces between GEOS-Chem and KPP, 5 and allow access to the chemical simulation code without any additional programming e ﬀ ort. This work illustrates the potential of KPP to positively impact global chemical transport modeling by providing additional functionality as follows. (1) The user can select a highly e ﬃ cient numerical integration method from an array of solvers available in the KPP library. (2) KPP o ﬀ ers extreme ﬂexibility for studies that involve changing the 10 chemical mechanism (e.g., a set of additional reactions is automatically translated into e ﬃ cient code and incorporated into a modiﬁed global model). (3) This work provides immediate access to tangent linear, continuous adjoint, and discrete adjoint chemical models, with applications to sensitivity analysis and data assimilation.

The GEOS-Chem native chemistry solver is the Sparse Matrix Vectorized GEAR II (SMVGEARII) code which implements backward differentiation formulas and efficiently solves first-order ordinary differential equations with initial value boundary conditions in large grid-domains (Jacobson and Turco, 1994;Jacobson, 1998). These sparse 5 matrix operations reduce the CPU time associated with LU-decomposition and make the solver very efficient. This code vectorizes all loops about the grid-cell dimension and divides the domain into blocks of grid-cells and vectorizes around these blocks to achieve at least 90% vectorization potential. SMVGEARII uses reordering of grid cells prior to each time interval to group cells with stiff equation together and non-stiff 10 equations together. This allows SMVGEARII to solve equations both quickly and with a high order of accuracy in multiple grid-cell models.
Calculation of tropospheric chemistry constitutes a significant fraction of the computational expense of GEOS-Chem. Given the push for running GEOS-Chem at progressively finer resolutions, which will invariably involve solving increasingly stiff chemical 15 systems, there is a continual need for efficient implementation of sophisticated numerical methods. This requires a seamless, automated implementation for the wide range of users in the GEOS-Chem community.
The present paper presents an implementation of GEOS-Chem gas-phase chemistry using the Kinetic PreProcessor KPP. This work extends the set of chemical solvers 20 available to GEOS-Chem users, and provides implementations of tangent linear and adjoint chemical models which are needed in sensitivity analyses and data assimilation studies.
The paper is organized as follows. Section 2 briefly reviews KPP capabilities. Section 3 describes the integration of the KPP tools in the GEOS-Chem framework. We

The Kinetic PreProcessor KPP
The Kinetic PreProcessor KPP (http://people.cs.vt.edu/ ∼ asandu/Software/Kpp) is a software tool to assist computer simulations of chemical kinetic systems (Damian et al., 2002;Sandu and Sander, 2006). The concentrations of a chemical system evolve in time according to the differential law of mass action kinetics. A computer 5 simulation requires an implementation of the differential law and a numerical integration scheme. KPP provides a simple natural language to describe the chemical mechanism and uses a preprocessing step to parse the chemical equations and generate efficient output code in a high level language such as Fortran90, C, or Matlab. A modular design allows rapid prototyping of new chemical kinetic schemes 10 and new numerical integration methods. KPP is being widely used in atmospheric chemistry modeling (Kerkweg and Sander, 2007;Hakami et al., 2007;Carmichael et al., 2008;Errera and Ronteyn, 2001;Errera et al., 2008;Henze et al., 2007Henze et al., , 2008. The simulation code can be easily re-generated following changes to the chemical mechanism. KPP-2.2 is distributed under the provisions of the GNU li-15 cense (http://www.gnu.org/copyleft/gpl.html) and the source code and documentation are available at http://people.cs.vt.edu/ ∼ asandu/Software/Kpp.
KPP incorporates a comprehensive library of stiff numerical integrators that includes Rosenbrock, fully implicit Runge-Kutta, and singly diagonally implicit Runge-Kutta methods (Sandu et al., 1997). These numerical schemes are selected to be very 20 efficient in the low to medium accuracy regime. Users can select the method of choice and tune integration parameters (e.g., relative and absolute tolerances, maximal step size, etc.) via the optional input parameter vectors ICNTRL U and RCNTRL U. High computational efficiency is attained through fully exploiting sparsity. KPP computes analytically the sparsity patterns of the chemical Jacobian and Hessian, and outputs Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version Interactive Discussion which are needed in sensitivity analysis and data assimilation studies Sandu et al., 2003). Flexible direct-decoupled and adjoint sensitivity code implementations are achieved with minimal user intervention. The resulting sensitivity code is highly efficient, taking full advantage of the sparsity of the chemical Jacobians and Hessians. Adjoint modeling is an efficient tool to evaluate the sensitivity of a scalar 5 response function with respect to the initial conditions and model parameters. 4D-Var data assimilation allows the optimal combination of a background estimate of the state of the atmosphere, knowledge of the interrelationships among the chemical fields simulated by the model, and observations of some of the state variables. An optimal state estimate is obtained by minimizing an objective function to provide the best fit to all 10 observational data (Carmichael et al., 2008). Three-dimensional chemical transport models whose adjoints have been implemented using KPP include STEM (Carmichael et al., 2008), CMAQ (Hakami et al., 2008), BASCOE (Errera et al., 2008(Errera et al., , 2001, and an earlier version of GEOS-Chem (Henze et al., , 2008.

15
In this section we discuss the incorporation of the KPP generated gas-phase chemistry simulation code in GEOS-Chem. A simple three step process modifies GEOS-Chem version 7-04-10 to use KPP. This process is summarized in Fig. 1. First, the geos2kpp parser.pl reads the chemical mechanism description file globchem.dat and creates three KPP input files -also con-20 taining information on the chemical mechanism, but in KPP format. Second, KPP processes these input files and generates Fortran90 chemical simulation code; the model files are then copied into the GEOS-Chem v7-04-10 directory. A special directive has been implemented in KPP to generate code that interfaces with GEOS-Chem. The third step involves running the gckpp parser.pl to modify GEOS-Chem source code to GMDD 2, 2009 Chemical solvers in global atmospheric models P. Eller et al. KPP requires a description of the chemical mechanism in terms of chemical species (specified in the *.spc file), chemical equations (specified in the *.eqn file), and mechanism definition (specified in the *.def file). The syntax of these files is specific, but 5 simple and intuitive for the user. Based on this input KPP generates all the Fortran90 (or C/Matlab) files required to carry out the numerical integration of the mechanism with the numerical solver of choice. GEOS-Chem SMVGEARII uses the mechanism information specified in the "globchem.dat" file; this file contains a chemical species list and a chemical reactions 10 list. The perl parser geos2kpp parser.pl translates the chemical mechanism information from "globchem.dat" into the KPP syntax and outputs it in the files "globchem.def", "globchem.eqn", and "globchem.spc".
The globchem.dat chemical species list describes each species by their name, state, default background concentration, etc. For example, consider the description of the 15 following two species in the SMVGEARII syntax globchem.dat: A A3O2 1.00 1.000E-20 1.000E-20 1.000E-20 1.000E-20 I ACET 1.00 1.000E-20 1.000E-20 1.000E-20 1. 000E-20 This information is parsed by geos2kpp parser.pl and translated automatically into the 20 KPP syntax. Specifically, the species are defined in the "globchem.spc" file and are declared as variable (active) or fixed (inactive). Variable refers to medium or short lived species whose concentrations vary in time and fixed refers to species whose concentrations do not vary too much. Species are set equal to the predefined atom IGNORE to avoid doing a species mass-balance checking for the GEOS-Chem implementation Introduction The globchem.dat chemical reactions list describes each reaction by their reactants, products, rate coefficients, etc. This information is processed by geos2kpp parser.pl to create the "globchem.eqn" file, which contains the chemical equation information in KPP format. KPP uses GEOS-Chem's calculated rate constants instead of using 15 the kinetic parameters listed next to each chemical equation. A comparison of "globchem.dat" and "globchem.eqn" is shown below.

globchem.dat:
A 21 3.00E-12 0.0E+00 -1500 0 0.00 0 0 20 O3 + NO =1.000NO2 + 1.000O2 globchem.eqn: The perl parser geos2kpp parser.pl makes the translation of the chemical mechanism information a completely automatic process. The user has to invoke the parser with the "globchem.dat" input file and obtains the KPP input files. The use of the perl parser is illustrated below: 3.3 Automatic modification of GEOS-Chem source code to interface with KPP chemistry routines using gckpp parser.pl The last step involves running the gckpp parser.pl to modify the GEOS-Chem source code to interface with KPP. This parser is run once in the GEOS-Chem code directory, modifies existing files to use KPP instead of SMVGEARII for the chemistry step, and 5 adds new files with subroutines for adjoint calculations. The GEOS-Chem code modifications are necessary for a correct data transfer to and from KPP. The chemical species concentrations are mapped between GEOS-Chem and KPP at the beginning of each timestep. This is done through copying data from the three-dimensional GEOS-Chem data structures to the KPP data structures for each grid cell. At the end of the time step the time-evolved chemical concentrations are copied from KPP back into GEOS-Chem. Since KPP performs a reordering of the chemical species to enhance sparsity gains, the species indices are different in KPP than in GEOS-Chem. The KPP-generated "kpp Util.f90" file provides the subroutines Shuffle user2kpp and Shuffle kpp2user which maps a user provided array to a KPP 15 array and viceversa. These subroutines are used to initialize the VAR and FIX species of KPP with the CSPEC array values from SMVGEARII (for each grid cell JLOOP).
GEOS-Chem calculates the reaction rates for each equation in each grid cell at the beginning of each chemistry step. These rate coefficients are mapped from the SMVGEARII reaction ordering to the KPP ordering via a reshuffling, and are saved.

20
Prior to each call to the KPP integrator, the reaction rates for the particular grid cell are copied from the storage matrix into the local KPP integration data structures.
The use of the gckpp parser.pl perl parser is illustrated below. The parser outputs a message for each modified or created file.

Done modifying GEOS-Chem files
The resulting GEOS-Chem code now uses KPP for gas-phase chemistry. Makefiles are included for the finite-difference driver, sensitivity driver, and 4D-Var driver.

10
In order to illustrate the correctness of GEOS-Chem simulations using KPP for gas-phase chemistry we compare the results against the original simulation with SMVGEARII. GEOS-Chem is run for 48 h from 00:00 GMT on 1 July 2001 to 23:00 GMT on 2 July 2001. One simulation uses SMVGEARII chemistry and the second uses the KPP chemical code; the simulations are identical otherwise. Concentrations are check- and absolute tolerances ATOL=10 4 ×RTOL molecules cm −3 . Reference solutions were calculated using RTOL=10 −8 and ATOL=10 −3 molecules cm −3 .

5
To demonstrate the accuracy of each method, following Henze et al. (2007) and Sandu et al. (1997), we define significant digits of accuracy(SDA) as where E R k is a modified root mean square norm of the relative error of the solution ( c k,j ) with respect to the reference solution (c ref k,j ) for species k in grid cell j .
For θ total grid cells, θ k is the set of all locations of significant concentrations of species k (c k,j ≥a). A threshold value of a=10 6 molecules cm −3 is chosen to avoid inclusion of errors from species concentrations of less than chemically meaningful values. Figure 4 presents a work-precision diagram where the number of accurate digits in 15 the solution is plotted against the time needed by each solver to integrate the chemical mechanism. A seven day chemistry-only simulation from 00:00 GMT on 1 July 2001 to 00:00 GMT on 8 July 2001 is completed using the KPP Rosenbrock Rodas3, Rosenbrock Rodas4, Runge-Kutta, and Sdirk integrators and with the SMVGEARII integrator for each tolerance level. The results indicate that, for the same computa-20 tional time, the KPP Rosenbrock integrators produce a more accurate solution than the SMVGEARII integrator. The Sdirk and Runge-Kutta integrators are slower at lower tolerances as shown in Fig. 4  Interactive Discussion take fewer steps at higher tolerances. Since we normally use tolerances of 10 −3 or lower for GEOS-Chem, we do not get these advantages. These results are similar to the results produced by Henze et al. (2007) through demonstrating that the KPP Rosenbrock solvers are about twice as efficient as SMVGEARII for a moderate level of accuracy. Note that none of the KPP solvers uses vectorization. In addition to the solvers discussed above we have also tested LSODE (Radakrishan and Hindmarsh, 1993), modified to use the KPP generated sparse linear algebra routines. LSODE is based on the same underlying mathematical formulas as SMVGEARII. The LSODE solution provides 3-4 accurate digits for a computational time of about 300 s (this would place LSODE in the upper left corner of Fig. 4). Since LSODE does not reach the 10 asymptotic regime for the low accuracy tolerances used here we do not report its results further. Table 1 compares the reference solutions obtained with SMVGEARII, Rodas3, Ro-das4, Runge-Kutta, and Sdirk integrators through showing the number of significant digits of accuracy produced when comparing two integrators. The results show that Ro-15 das3, Rodas4, and Sdirk produce very similar references, while the SMVGEARII and Runge-Kutta solutions have between 0.5 and 4.0 digits of accuracy for each chemical species when compared to other integrators.
The GEOS-Chem/KPP parallel code is based on OpenMP and uses THREADPRI-VATE variables to allow multiple threads to concurrently execute the KPP chemistry 20 for different grid cells. To illustrate the impact of the new chemistry code on parallel performance we consider a seven-day chemistry-only simulation from 00:00 GMT on 1 July 2001 to 00:00 GMT on 8 July 2001 using a relative tolerance of 3×10 −2 and an absolute tolerance of 10 −2 molecules/cm 3 and employ 1, 2, 4, and 8 processors. Figure 5 illustrates the parallel speedups for the KPP Rodas3 and Rodas4 integrators and 25 for SMVGEARII; the code has close to linear speedups for all integrators. Table 2 compares the average number of function calls made and average number of steps taken by each integrator per advection time step per grid cell. These results correlate well with those in Fig. 4 A novel capability offered by the KPP chemistry is a readily available adjoint chemical model. Adjoint models are the essential computational ingredient for sensitivity analyses and for 4D-Var data assimilation. As an illustration of adjoint sensitivity analysis with GEOS-Chem we consider a possible TES trajectory (illustrated in Fig. 6) and 5 assume that a measurement of the total ozone column underneath the trajectory is taken at 1 April 2001. We use adjoint sensitivity analysis to assess how this measurement carries information about total NO x emissions in the week preceding 1 April 2001. Specifically, we calculate scaled adjoint sensitivities of the ozone column with respect to total NO x emissions; the distribution of these sensitivities illustrated in Fig. 6 demon-10 strates the sensitivity of O 3 over the TES trajectory with respect to the NO x emissions. See Henze et al. (2007) for a more detailed discussion of the adjoint of GEOS-Chem.

Conclusions
The widely used state of the art GEOS-Chem model has been added the capability of using KPP to build the gas-phase chemistry simulation code. The GEOS-

15
Chem/KPP interface is automatic and requires no additional programming effort. The geos2kpp parser.pl converts the GEOS-Chem input files to KPP input files, while gckpp parser.pl modifies GEOS-Chem to interface with KPP. Updates to KPP assist the model integration. Our results demonstrate that the KPP Rodas3 and Rodas4 solvers achieve a similar level of accuracy at a lower computational expense than the native 20 GEOS-Chem solver (SMVGEARII), with comparable scalability for parallel processing.
These tools provide the global chemical transport modeling community with additional options as follows. (1) The user can select a highly efficient numerical integration method from an array of solvers available in the KPP library. (2) KPP offers extreme flexibility for studies that involve changing the chemical mechanism (e.g., a set of additional reactions is automatically translated into efficient code and incorporated into a modified global model).
(3) This work provides immediate access to tangent linear, GMDD GMDD 2, 2009 Chemical solvers in global atmospheric models P. Eller et al.