Interactive comment on “ Vorticity-Divergence semi-Lagrangian Global Atmospheric Model SLAV 20 : Dynamical Core

This paper describes the new version of the hydrostatic SL-AV20 hydrostatic dynamical core used by the Russian Hydrometeorological Centre for global weather forecasting. It is based on a vorticity-divergence formulation and semi-Lagrangian semi-implicit timestepping using a reduced lat/lon unstaggered grid to reduce computational cost. The model formulation and the numerical discretizations are described in the paper and the reduced grid version of the model is compared against a standard (unreduced) lat/lon grid version on two well known test cases for dynamical cores.

Abstract.SL-AV (semi-Lagrangian, based on the absolute vorticity equation) is a global hydrostatic atmospheric model.Its latest version, SL-AV20, provides global operational medium-range weather forecast with 20 km resolution over Russia.The lower-resolution configurations of SL-AV20 are being tested for seasonal prediction and climate modeling.
The article presents the model dynamical core.Its main features are a vorticity-divergence formulation at the unstaggered grid, high-order finite-difference approximations, semi-Lagrangian semi-implicit discretization and the reduced latitude-longitude grid with variable resolution in latitude.
The accuracy of SL-AV20 numerical solutions using a reduced lat-lon grid and the variable resolution in latitude is tested with two idealized test cases.Accuracy and stability of SL-AV20 in the presence of the orography forcing are tested using the mountain-induced Rossby wave test case.The results of all three tests are in good agreement with other published model solutions.It is shown that the use of the reduced grid does not significantly affect the accuracy up to the 25 % reduction in the number of grid points with respect to the regular grid.Variable resolution in latitude allows us to improve the accuracy of a solution in the region of interest.

Introduction
Atmospheric general circulation models (AGCMs) are basic tools for weather forecasting from several days to seasons.Also, such models are essential building blocks of Earth sys-tem models used for climate simulations.AGCMs consist of a dynamical core responsible for solution of dynamics equations of a resolvable flow component and the package representing sub-grid-scale processes (often in a parameterized way).
SL-AV20 is the latest version of the hydrostatic AGCM developed at the Institute of Numerical Mathematics, Russian Academy of Sciences (INM RAS) in cooperation with the Hydrometeorological Centre of Russia (HMCR).SL-AV is the model acronym (semi-Lagrangian, based on the absolutevorticity equation), and 20 indicates the targeted horizontal resolution over the territory of Russia.The SL-AV20 dynamical core is developed by the coauthors, while the greater part of subgrid-scale parameterizations is adopted from the ALADIN/LACE model (Geleyn et al., 1994;Gerard et al., 2009).
SL-AV20 is accepted as a basic method for operational medium-range weather forecast in the HMCR in 2015.The lower-resolution version of this model is expected to be applied in the HMCR long-range forecasting system.Also, SL-AV20 is considered a starting point for developing the new atmospheric component of the INM climate model (IN-MCM, Volodin et al., 2017).Among the papers that discuss SL-AV model numerics, e.g., vorticity-divergence formulation and basic horizontal discretizations (Tolstykh, 2002), the variable-resolution version (Tolstykh, 2003), the new solver for reconstruction of the wind from vorticity and divergence, the mass-conservative shallow water formulation (Tolstykh and Shashkin, 2012), and the inherently mass-conserving version (Shashkin and Tolstykh, 2014), there is no one that would put all the developments together.This article describes the present state of the SL-AV20 dynamical core.
While developing a dynamical core, we want it to have a given accuracy combined with a minimum wall-clock time for a given number of processors.We call this the computational efficiency.Furthermore, as the dynamical core can be used at a range of horizontal and vertical resolutions typical for numerical weather prediction and climate simulations, it is desirable to maintain dynamical core computational efficiency at a maximum possible range of resolutions.This is a somewhat contradictory requirement.Indeed, using a global dynamical core at the maximum possible resolution of a few kilometers ultimately requires the dynamical core to use efficiently up to tens of thousands of cores.This is not so easy to achieve if a semi-implicit time-integration scheme is used.
Our approach to the dynamical core design is based on the following choices.We use a semi-implicit (SI) timestepping scheme (Robert et al., 1985) and a semi-Lagrangian (SL) treatment of advection (review - Staniforth and Côté, 1991).This combination allows us to circumvent Courant-Friedrichs-Lewy stability limitation for both wind speed and inertia-gravity wave phase speed.Practically, this means that the time step is much larger than with the Eulerian treatment of advection and/or the explicit time-stepping scheme, however, at the cost of solving the Helmholtz equation at each time step and having larger communication costs in parallel implementation.The unstaggered grid is used; i.e., scalar and vector variables are stored at the same grid points.Therefore, only one set of upstream trajectories needs to be computed for the SL advection scheme (the C-grid requires three sets).Also, this improves trajectory computation accuracy, since the velocity components are stored at arrival points and no spatial interpolation or averaging is needed to compute the displacement terms at the arrival point.
Following the results of Randall (1994), the vertical component of relative vorticity and the horizontal divergence are used as prognostic variables to achieve good inertia-gravity and Rossby wave dispersion properties at the unstaggered grid.However, the reconstruction of wind velocity from vorticity and divergence is needed at each time step.We use the direct inversion of relative vorticity and divergence finitedifference formulae, avoiding solution of Poisson problems for stream function and velocity potential.This leads to the accurate and efficient solver Tolstykh and Shashkin (2012).
Fourth-order finite-difference formulae are used to compute gradient, divergence, and vorticity operators.Compact finite differences were used in early versions of the SL-AV model (Tolstykh, 2002) for their smaller truncation errors.To improve parallel efficiency, compact finite differences are abandoned now, except Helmholtz and velocity reconstruction solvers.
The important question for the global atmospheric model is the choice of the horizontal grid on the sphere.The regular latitude-longitude grid that served in the atmospheric modeling for decades is now ending its era.The grid points clus-tering near the poles force one to use either a very small time step for stability or special numerical techniques (polar filters, SL advection) that need excessive data transfer in the polar regions and lead to scalability problems.Moreover, physical grid spacing in latitude in the polar regions can be 1 order of magnitude (or even more) larger than the grid spacing in longitude.This presents a difficulty for parameterizations of sub-grid-scale processes.A significant effort is made to develop models based on icosahedral (e.g., Zangl et al., 2015), Voronoy-Delaunay (e.g., Skamarock et al., 2012), cubedsphere (e.g., Fournier et al., 2004), Yin-Yang (Quaddouri and Lee, 2011) or some less popular grids on the sphere.
Unfortunately, all quasi-uniformly spaced grids proposed to date suffer from one or more of the following problems (as discussed in Staniforth and Thuburn, 2012): disbalance between vector and scalar degrees of freedom (grids with triangular or hexagonal/pentagonal cells), non-orthogonality of the underlying coordinate system (cubed-sphere), and overset regions or grid transition (Yin-Yang, cubed-sphere).These issues can degrade the accuracy of atmospheric circulation simulation, and cause grid imprinting and/or the occurrence of unphysical wave modes and some other problems.
SL-AV20 is formulated using the reduced latitudelongitude grid suggested in Kurihara (1965) but with a different design.We believe that using a reduced grid can alleviate most of the polar problems of the regular latitude-longitude grid.Moreover, the reduced latitude-longitude grid is free of problems specific to more complex quasi-uniformly spaced grids, and is relatively easy to implement.As shown in Tolstykh and Shashkin (2012), the accuracy problems of reduced grid computations reported by Williamson (2007) and Staniforth and Thuburn (2012) can be overcome with proper construction of this grid (Fadeev, 2013) and using high-order discretizations.
Another feature of SL-AV20 is the possibility of using variable resolution in latitude.This approach is especially suitable for Russian territory, which stretches for almost 180 • in longitude.Variable resolution in latitude with the ability to use non-equatorially symmetric grid reduction allows us to refine resolution in the region of interest (mid-latitudes of the Northern Hemisphere) and to coarsen it in other regions (e.g., Southern Hemisphere).
The article is organized as follows.The governing equations are listed in Sect. 2. Section 3 briefly presents the SL-AV20 semi-Lagrangian advection scheme.The details of semi-implicit temporal discretization are given in Sect.4; the numerical techniques for horizontal and vertical discretization are presented in Sect. 5.The solvers for the Helmholtz problem and wind velocity reconstruction are presented in Sect.6; Sect.7 discusses the dissipation mechanisms used in the model.Section 8 describes SL-AV20 parallel implementation.Numerical experiments are described in Sect.9.

Governing equations
Governing equations of the SL-AV model are derived from the set of hydrostatic shallow-atmosphere primitive equations on the sphere (Holton, 2004, chap. 2).The hybrid coordinate η of Simmons and Burridge (1981) is used in the vertical.
The spherical longitude and latitude are (λ, ϕ), r is the vector pointing from the center of the sphere to the point on its surface, and a is the radius of the sphere (notations used in the article are summarized in Appendix A).Standard definitions of horizontal ∇ operator and Lagrangian derivative d dt are used.The pressure is p(η) = A(η)p 0 + B(η)p s , p 0 is a constant reference pressure and p s is the surface pressure.
The horizontal wind velocity is V , with u and v being its zonal and meridional components, respectively.D = ∇ • V and ζ = k • ∇ × V are the horizontal divergence and vertical component of relative vorticity; k = r/a is the vertical unit vector.The vertical velocity in the hybrid coordinate system is η.The Coriolis parameter f = 2| | sin ϕ, is the Earth's angular velocity.
is the geopotential, T is the temperature, and s is the surface geopotential.We also use virtual temperature , where R v is the gas constant of water vapor, q is water vapor specific concentration, and q i are liquid and solid water species specific concentrations.The heat capacity of moist and dry air at constant pressure are c p and c pd ; c p includes contributions from all water species presented.The sources/sinks of an arbitrary quantity ψ due to subgrid/diabatic processes are denoted by F ψ .
We begin with the momentum (horizontal wind) equation in the vector form of Bates et al. (1993): Subscript H denotes projection onto the surface of the sphere.The Coriolis term is cast in the advective form (inside d/dt) following Rochas (1990) (see also Temperton, 1997).
Using the vector form of momentum equations, we avoid the problems related to the treatment of metric terms proportional to tan ϕ near the poles.The p(η) = A(η)p 0 + B(η)p s is substituted into the term ∇p p and the latter is rewritten as We recast the horizontal wind Eq. (1) using the vertical component of absolute vorticity ζ and horizontal divergence D. The equation for ζ is obtained by applying the k • ∇× operator: The equation for the horizontal divergence D is derived from the time-discrete form of Eq. (1) (see Sect. 4 for details).Also, there is an option to use the divergence equation obtained in an analytic way applying the ∇• operator to momentum equations (Eq. 1 can be used; however, the derivation is simpler with the component form - Holton, 2004): Using the analytic divergence Eq. ( 4) gives a possibility of using the same spatial approximation for ∇ 2 in both explicit and implicit parts of time-discrete equations.Therefore, we can expect better dispersion properties for the shortest inertia-gravity waves (according to Caluwaerts et al., 2015), however, at the cost of computing additional nonlinear terms.It is interesting to note that, unlike Heikes and Randall (1995), who formulated vorticity and divergence equations in terms of true scalars (ζ , D, , stream function and velocity potential), we use components of vector quantities V and ∇ .
The thermodynamic equation is readily formulated for virtual temperature T v .This can be done considering dT and then expanding dT dt and dR moist dt on the right-hand side using the thermodynamic equation (McDonald and Haugen, 1993) and water species transport equations (see below).
where d H dt is the horizontal Lagrangian derivative with neglected vertical displacement.The energy conversion term 1 p dp dt is rewritten in terms of ln p s and an analog of vertical velocity ṡ = 1 p s ∂p ∂η η following McDonald and Haugen (1993).The term γ (η) s was proposed by Ritchie and Tanguay (1996) to suppress spurious orographic resonance; also, it smooths the temperature field near orographic variations www.geosci-model-dev.net/10/1961/2017/Geosci.Model Dev., 10,2017 and contributes positively to the accuracy of temperature advection.
The continuity equation is written in terms of ln p s and ṡ, similar to Eq. ( 5) of McDonald and Haugen (1993): The term s R d T const was also proposed by Ritchie and Tanguay (1996).The continuity equation can also be written in the form of mass conservation in an arbitrary Lagrangian cell V(t) (Lauritzen et al., 2008): Recall that g −1 ∂p/∂η is the density in the η-coordinate.
We use this form of continuity equation for locally massconserving SL discretization.
The hydrostatic balance equation is The equations for transport of water vapor and other water species are all written in the same form: d dt The finite-volume form ( 12) is used for locally massconserving discretization.Boundary conditions for the above Eqs.(2)-( 12) are η = 0 at the lower η = 1 and upper η = η top boundaries.Also, it is assumed that B(η) = 1, A(η) = 0 when η = 1, normally B = 0 above some η p > η top ; however, the model can work in the pure σ -coordinate mode, when B(η) = η, A(η) = 0 everywhere.

Conventional semi-Lagrangian advection
Lagrangian time derivatives in the prognostic equations given in Sect. 2 are approximated in time as dψ/dt = (ψ n+1 − ψ n * )/ t, where the superscript indicates time level t n = n t; subscript * indicates that quantity ψ is evaluated at the departure position of the Lagrangian parcel (at t n ).Under the SL approach, each point of a fixed computational grid is the arrival position of some Lagrangian particle.
The advection equation dψ/dt = 0 is discretized in time as ψ n+1 = ψ n * .SL approximation is not subject to the Courant-Friedrichs-Lewy stability limitation, so t can be chosen from accuracy considerations that greatly contribute to the computational efficiency.
The departure position of the Lagrangian parcel arriving at some grid point can be approximately found by integrating the equation dr dt = V one step back in time.Following the SETTLS scheme (Hortal, 2002), we approximate this equation as where The iterative approach is used to solve Eq. ( 13): where * m indicates the departure point position found at mth iteration.Additional geometric approximations are needed iterating Eq. ( 14) to account for spherical geometry and shallow atmosphere approximation (see Temperton et al., 2001).The quantities at the departure positions of Lagrangian particles are evaluated using interpolation.We use 3-D tensor-product cubic Hermite interpolation for advective terms of equations (the terms that arise from SL approximation of d/dt).Trilinear interpolation is used for components of V in iterative process ( 14) and non-advective terms of equations.The change in the direction of coordinate system unit vectors between departure and arrival positions should be taken into account when interpolating the vector components: where R is the rotation matrix (see Temperton et al., 2001, detailed derivation in Staniforth et al., 2010).

Mass-conservative semi-Lagrangian advection
The disadvantage of a semi-Lagrangian approach as formulated in Sect.3.1 is the lack of tracer mass conservation.To address this problem, SL-AV20 incorporates the finite-volume SL Conservative Cascade Scheme (CCS-3D, Shashkin et al., 2016).Finite-volume SL discretization starts with the advection equation in form (12) where the arrival volume V(t n+1 ) coincides with grid cell V ij l (l is the vertical index): (.) V ij l is quantity averaged over V ij l ; S ij and S ij η l are the cell horizontal square and volume, respectively.Cellaveraged tracer density ∂p ∂η q is chosen as a prognostic variable.
The departure cell V(t n ) = V * ij l is defined by its vertices, i.e., the departure positions of Lagrangian parcels arriving at the vertices of V ij l at t n+1 .Departure cell vertices are found with interpolation of known coordinates of departure cell centers.The latter are departure positions of the Lagrangian parcels arriving at the grid points computed using Eq. ( 13).
To approximate the integrals over the departure cells in Eq. ( 16), we extend the 2-D Conservative Cascade Scheme by Nair et al. (2002) to 3-D space.Following the ideas of Nair et al. (2002), we split the 3-D integration problem into the sequence of three 1-D integrations that greatly contributes to the computational efficiency.The resulting locally mass-conserving SL advection scheme Shashkin et al. (2016) uses a piecewise parabolic subgrid tracer density distribution (Colella and Woodward, 1984) and incorporates several monotonicity and positivity preserving options.

Basic semi-implicit formulation
The non-advective terms of prognostic equations (from Sect.2) are approximated using the combination of the Crank-Nicolson scheme with the pseudo-second-order decentering (Temperton et al., 2001) for linear terms and the SETTLS scheme (Hortal, 2002) for nonlinear terms.For the equation where ψ is an arbitrary scalar or vector variable the scheme writes as The absolute vorticity Eq. ( 2) is discretized in time as follows: where R ζ is a combination of time level n and extrapolated in time ((n + 1) e ) quantities (see Appendix B for details on the right-hand-side terms of the SI system of equations); no decentering is applied to term f D. We use the background state with constant reference temperature T and reference pressure p = A(η)p 0 + B(η) ps ( ps is a constant) to extract the linear terms in Eqs. ( 1), ( 4), (6), and (8).First, the pressure gradient term of momentum equations ( 1) is split into linear and nonlinear contributions: Here we use the fact that pressure in the analytic divergence Eq. ( 4).The approximation of vertical integrals is described in Sect.5.5.
The time-discretized divergence equation is written as Equation ( 22a) is the divergence equation derived from the momentum equation ( 1) discretized in time via Eq.( 18); R V is the right-hand side (RHS) of the corresponding vector equation.The RHS term ∇•R V is calculated with the discrete divergence operator from Sect.5.2.Equation ( 22b) is based on the analytic divergence Eq. ( 4).This equation allows us to use the same Crank-Nicolson discretization in time for −f ζ and ∇ 2 G, the two terms representing geostrophic balance.
In the thermodynamic equation ( 6), the Lagrangian time derivative of ln p s is discretized in the SL manner in the linear part of the energy conversion term, whereas it is replaced by the RHS of continuity Eq. ( 8) in the nonlinear part.The resulting time-discrete equation is The continuity equation ( 8) is discretized as follows: Integration of Eq. ( 24) from model top to the ground provides the expression for ln p n+1 s independent of ṡn+1 , whereas the integration to the arbitrary level η gives the ṡn+1 : The RHS of Eq. ( 25) can easily be substituted for the ln p n+1 s term in Eq. ( 26), providing an equation for ṡn+1 which depends only on the D n+1 unknown quantity.Equations ( 19), ( 22), ( 23), ( 25), and ( 26) and the definition of linear geopotential G (21) form the system of linear equations for time-level n+1 variables (ζ, D, T v , ln p s , ṡ, G).The solution procedure for this system is as follows.First, if an analytically derived divergence equation ( 22b) is used, the relative vorticity ζ n+1 is excluded by substitution from Eq. ( 19).Then, Eqs. ( 25) and ( 26) are substituted in Eq. ( 23) to obtain an expression for T n+1 v that depends only on D n+1 .This expression and Eq. ( 25) are used to eliminate T n+1 v and ln p n+1 s in the G n+1 definition Eq. ( 21).Therefore, we obtain the pair of equations that contains only G n+1 and D n+1 unknowns.With the aid of vertical discretization (see Sect. 5.5), this pair can be written as where G, D and R ψ are columns of Nlev (number of model levels) components, with the lth component representing the corresponding horizontal field at the level l; H = s + R d AR T + M R P , M, M , and A are the matrices of discrete vertical operators (resulting from vertical integration; see details in Appendix C).The coefficient α = 0 is used for the standard divergence equation ( 22a) and α = 1 for the analytically derived divergence equation (22b); R D corresponds to the RHS term of the relevant divergence equation.Substituting D from Eq. ( 28) in Eq. ( 27) and using eigenvalue decomposition M = P P −1 results in Nlev 2-D Helmholtz problems for components of P −1 G vector.The solution procedure for Helmholtz problems is described in Sect.6.2.Once G is found, we can find the divergence D with Eq. ( 27) and use it to calculate ln p n+1 s , ṡn+1 , T n+1 v and ζ n+1 (see Eqs. 25,26,23,19).The n + 1th time-step calculations are finalized by reconstruction of the horizontal wind V n+1 as described in Sect.6.1 and calculation of η and recalculation of ṡ (Sect.5.5).
We have also implemented an iterative time-integration scheme (Goyman, 2015) in addition to the time discretization described above.The basic discretization is used at first iteration to obtain first-guess n + 1 time-step fields (ζ, D, T , ln p s , V ).At the second iteration, we replace the time-extrapolated nonlinear terms N (n+1) e of the equations and time-extrapolated wind velocity (u, v, η) (n+1) e involved in the upstream trajectory computations with those obtained using n + 1 time-step first-guess values.Using this iterative approach allows us to improve stability and use larger time steps without degrading accuracy.

Inherently mass-conserving model semi-implicit formulation
The inherently mass-conserving (IMC) version of the model (Shashkin and Tolstykh, 2014) uses a continuity equation in the finite-volume form (9). The nonlinearity of this equation is hidden in the trajectory computation or equivalently in the evolution of the Lagrangian cell V(t).The equation is linearized using the orography-dependent reference pressure profile where p = p − p ref .
The equation for mass-conservative update of p s is obtained with discretization of Eq. ( 29) using the scheme ( 18) and its integration from model top to bottom: where (p s ) S ij is the surface pressure averaged over S ij .
Using an orography-dependent reference pressure profile is crucial for stability in mountainous regions.The term , however, cannot be expanded in terms of model prognostic variables, so it is very difficult to solve the system of equations similar to those formulated in Sect.4.1, but with a mass-conservative equation for p n+1 s (30) instead of Eq. ( 25).Therefore, the following procedure is implemented.First, we run a time step of the non-mass-conserving model version, solve the semi-implicit system as it is formulated in Sect.4.1, and obtain the n + 1 time-step horizontal wind V n+1 .Then the mass-conservative update of p s , Eq. ( 30), is calculated.
We achieve inherent mass conservation, however, at the cost of introducing some inconsistency between the surface pressure and horizontal wind fields.This results in a slight noise in the regions with steep orography, but does not significantly affect the model accuracy.The greater problem not solved yet is the consistency between dry air and tracer mass, so the IMC model version now is rather a research option.

Horizontal grid
The SL-AV20 model incorporates a reduced latitudelongitude grid with variable resolution in latitude.The grid consists of points located on the latitudes ϕ = ϕ j , j ∈ [0, Nlat] with longitude spacing λ j , (λ, ϕ) ij = (i λ j , ϕ j ).The number of points on a grid latitude is reduced polewards of the Equator.The pole points are assumed to be the grid latitudes.The regular latitude-longitude grid is the special case of the grid described above with constant latitude spacing and the same number of points on each grid latitude.
Formally, the model algorithms can work with any set of ϕ j and λ j .However, the choice of these crucially affects accuracy.The grids are constructed with the algorithm of Fadeev (2013).This algorithm generates the set of ϕ j that satisfy grid smoothness constraints (to avoid spurious wave reflection at the regions of abrupt spacing change) and match the desired grid spacing latitude dependence as close as possible.Then, given ϕ j and the total number of grid points, the algorithm optimizes λ j to minimize the error of interpolation of an analytic test function to some finer grid.

Discretization of horizontal gradient, divergence and vorticity operators
The following fourth-order accurate formula is applied to evaluate the first derivative: where x can be either ϕ or λ; ψ is a scalar quantity or vector component.However, we use an unstaggered grid and need the values of ∂ψ ∂x to be located at the same points as ψ; thus, we interpolate ∂ψ ∂x i+1/2 to the integer nodes of the grid using fourth-order Lagrangian interpolation.To calculate divergence and vorticity, fourth-order Lagrangian interpolation is used to obtain vector components at the grid half-nodes.Then a second-order locally conservative formula is applied to calculate the meridional derivative: The longitudinal derivative is evaluated with Eq. ( 31) as in the gradient operator.
To account for the variable resolution in latitude, we introduce pseudo-latitude ϕ , such that the grid is equally spaced in ϕ (following Tolstykh, 2003).Then the meridional derivative ∂ψ ∂ϕ = ∂ψ ∂ϕ ∂ϕ ∂ϕ .The derivative ∂ψ ∂ϕ and inverse mapping factor M = ∂ϕ ∂ϕ are calculated using Eq. ( 31).The longitudinal derivatives are calculated in the gridpoint space, whereas Fourier representation in longitude is used to calculate meridional derivatives on the reduced lat-lon grid.The meridional derivatives of Âk , Bk are calculated and then the inverse Fourier transform is applied to obtain the values of derivatives in grid-point space.If N j is the number of points at the grid latitude ϕ j , the wavenumbers k > N j /2 − 1 can not be represented, so it is natural to set Âk , Bk and their derivatives to zero at this latitude.
The disadvantage of Fourier transform is the necessity of global communications in parallel implementation, which could lead to poor scalability.The calculation of meridional derivatives in the grid point space is implemented as an option.The points in the reduced grid are not aligned along meridian lines.Therefore, the interpolation along longitude is carried out first to obtain the values of ψ at the needed longitude and then Eq. ( 31) is applied to obtain half-integer values of derivatives.The similar approach was applied in (Jablonowski et al., 2009).We have tested this grid point algorithm and there is no impact on the results for the baroclinic test case presented in Sect.9 as compared with the computations in Fourier space as described above.
To evaluate divergence and vorticity at the pole point, one uses the fact that the scalar quantities can have only Â0 nonzero Fourier coefficients at the pole point to be uniquely defined.The fourth-order accurate Lagrangian interpolation is applied to obtain Â0 polar values from known Â0 at adjacent latitudes (including virtual latitudes).Similarly, vector components can have only Â1 , B1 non-zero Fourier coefficients at the pole point (this corresponds to the unique value of the vector quantity at the pole point and basis vectors dependent on longitude).Fourier coefficients of gradient components are also interpolated to the pole point.
The IMC version of the model (Shashkin and Tolstykh, 2014) also requires calculation of a flux-divergence operator ∇ • (ψV ), where ψ is some scalar function.For mass conservation i,j ∇ • (ψV ) ij S ij = 0 over the whole sphere must hold up to machine precision.This operator is discretized (as described in Tolstykh and Shashkin, 2012) in the flux form with fourth-order accuracy.In the IMC version, this operator with ψ = 1 replaces the divergence operator described above.

Discretization of the horizontal Laplace operator
The Laplace operator arises in the time-discrete divergence equation ( 22).In the case of Eq. ( 22a) derived in a discrete way from momentum equations (1) (which is considered standard in SL-AV20) ∇ 2 appears in the implicit part, whereas in the explicit part it stands as the product of discrete divergence and gradient operators (see Sect. 4 for details).If analytical divergence equation ( 22b) is used, ∇ 2 is present in both implicit and explicit parts of equation.
If the discrete-derived divergence equation is used, the compact finite-difference (CompFD) approximation of the meridional derivatives in the Laplace operator is applied, for their smaller truncation errors.CompFDs discretization require matrix inversion that generaly causes the parallel scalability problems.However, using CompFDs to discretize Laplace operator in the implicit part of divergence equation is not a computational burden, since it is already involved in the matrix inversion problem (see Sects. 4, 6.2).
The Laplace operator is discretized using Fourier representation in longitude.The meridional part of Laplacian is discretized with double application of compact formula for the first derivative: 1 24 The longitudinal part of ∇ 2 is the Fourier image of product of two longitudinal derivatives defined by Eq. ( 31).
Further details about compact finite-difference approximation of ∇ 2 operator are in Sect.6.2.
The main objective of using the analytically derived divergence equation ( 22b) is time-symmetric discretization, which theoretically leads to better inertia-gravity wave dispersion properties (Caluwaerts et al., 2015).Temporal symmetry means that the same discretization of ∇ 2 is used in implicit and explicit parts of the divergence equation.We do not use CompFDs for approximation of the Laplace operator to avoid matrix inversion in the explicit part of the equation.The following approximation to ∇ 2 is used: At the pole points, Eq. ( 36) is transformed to which preserves the formal order of accuracy and local conservation of the whole ∇ 2 discretization.(∂ψ/∂ϕ) i,j ±1/2 in the above equations are evaluated using Eq. ( 31); the overbar indicates the zonal average.

Nonlinear terms
The calculation of nonlinear terms can give rise to nonlinear instability, when the interaction of the shortest scales leads to the exaggerated contribution to the largest scales.We use spatial averaging to suppress this kind of instability.The following formulae do not have a strict theoretical basis, but they are rather a result of experience in optimizing model performance.
The first type of nonlinear term is a scalar-by-scalar product, namely ζ D, D 2 in vorticity and analytic divergence equations ( 2) and ( 4): The term D 2 is computed in a similar way.Surprisingly, this leads to better medium-range forecast scores as compared to the formula with the increased weight of the central term ζ i,j D i,j .Before computing nonlinear terms involving first-order derivatives in λ or ϕ (components of ∇ ln p s , ∂ η/∂x, ∂T v /∂x), these derivatives are averaged in the direction perpendicular to the direction of differentiation; e.g., ∂T v /∂λ is averaged in latitude.The averaging formula is where j is index in the direction perpendicular to x.The constant c = 3 for T v and η derivatives, c = 4 for ∇ ln p s components.The horizontal wind components and temperature are averaged in 2-D for computation of J ζ Eq. (3), J D Eq. ( 5) and terms involving T v in momentum (1) and thermodynamic (6) equations: Equations ( 39)-( 41) are valid in the case of constant latitudinal resolution.To account for the variable resolution, we mention that ψdϕ = ψMdϕ and multiply all (ζ i,j + ζ i,j ±1 )(D i,j + D i,j ±1 )-type combinations in Eq. ( 39) and ψ i,j ±1 terms of Eqs. ( 40) and ( 41) by 2M j ±1/2 /(M j +1/2 + M j −1/2 ).
No averaging is applied for computations of terms involving the coefficients of hybrid coordinate system A, B, and their derivatives.Also, no averaging is applied for computation of R moist /c p term in thermodynamic equation ( 6) and hydrostatic equation (10) terms.

Vertical discretization
We use a Lorenz-staggered grid (Lorenz, 1960) in the vertical where all variables, except vertical velocity η, are located at integer levels η l (cell centers) and η is located at halflevels η l+1/2 (cell interfaces).Integer grid levels are set to η l = (η l+1/2 + η l−1/2 )/2, η 1/2 corresponds to the model top, and η Nlev+1/2 is at the Earth's surface.Similarly to η, the hybrid coordinate system coefficient A l = (A l+1/2 +A l−1/2 )/2, and the same holds for B l , η l = η l+1/2 − η l−1/2 and the same is implied for A l , B l and other variables.The vertical derivatives are approximated with a second order of accuracy ∂ψ ∂η l = ψ l η l + O( η 2 l ).It is almost standard for hydrostatic atmospheric models to use Simmons and Burridge (1981) vertical discretization, where the geopotential is calculated at half-levels η l+1/2 by integrating the hydrostatic Eq. ( 10) with the mid-point quadrature rule and then interpolated to the integer levels.It was found that the trapezoidal rule allowing us to calculate geopotential directly at the integer levels is more accurate.Equation ( 10) is integrated in the following way: The vertical integrals in the time-discrete continuity equation (24) and also Eqs. ( 25) and ( 26) are calculated using the mid-point rule.At the end of each time step, the vertical velocity is recalculated via the diagnostic relation derived from the Eulerian form of the continuity equation ( 8): Integration of Eq. ( 44) from η 1/2 to η L+1/2 using mid-point rule and vertical boundary conditions gives The expression for ∂p s ∂t can be obtained by setting L = Nlev in Eq. ( 45) and using ηNlev+1/2 = 0.After substituting this expression back to Eq. ( 45) for arbitrary L, (∂p/∂η • η) L+1/2 can be evaluated.The term ∇ • (p s V ) is calculated using standard divergence discretization (see Sect. 5.2), secondorder averaging is applied to p s in latitude (longitude) before computing p s u (p s v).To retrieve the vertical velocity η for calculation of upstream trajectories (see Sect. 3) and terms of Eq. ( 2), (∂p/∂η • η) is interpolated to integer levels with second-order accuracy and divided by (p 0 A l + p s B l )/ η l which is proxy for ∂p/∂η.
6 Elliptic problem solution

Wind velocity reconstruction
Following Tolstykh and Shashkin (2012), the definitions of ζ = k • ∇ × V and D = ∇ • V are applied to reconstruct the horizontal wind from the known vertical component of relative vorticity and horizontal divergence.Fourier representation in longitude Eq. ( 33) is used.The relations for the u and v 0th Fourier components are These relations are integrated in latitude with the compact formula (Lele, 1992): 1 24 where Âζ 0 cos ϕ ( ÂD 0 cos ϕ) is substituted for ∂ψ/∂ϕ on the left-hand side and Âu o cos ϕ ( Âv o cos ϕ) is substituted for ψ on the right-hand side.The resulting Fourier coefficients for horizontal wind components at half-grid latitudes are interpolated to the integer grid latitudes using sixth-order compact interpolation (Lele, 1992).
The kth Fourier coefficients of horizontal wind components are determined solving where ∂ψ/∂ϕ is approximated using Numerov's scheme: 1 6 If Âψ k , Bψ k are Nlat + 1-element columns, with the j th element representing Âψ k , Bψ k at the j th grid latitude, and then system (49) can be written as where C is the diagonal matrix with C jj = cos ϕ j , j th component of column δ Â is Âj+1 − Âj−1 , M is matrix with diagonals (1/6, 2/3, 1/6).We multiply the system of Eqs. ( 52) by matrix M from the left and rewrite it for ( Âv k , Bu k ) T j pairs, j ∈ [0, Nlat] that results in 2 × 2 block tri-diagonal system of equations.The same operations repeated for system (50) yields 2 × 2 block tri-diagonal system of equations for ( Âu k , Bv k ) T j .The details of this solver are given in Tolstykh and Shashkin (2012).As shown in Caluwaerts et al. (2015), this solver alleviates the problems associated with non-symmetric discretization in time.Also, it avoids the solution of Poisson problem on the sphere which requires certain care due to non-trivial kernel of Laplace operator.

Helmholtz problem solution
The discrete Helmholtz problem for ψ -the kth Fourier harmonic of arbitrary quantity can be written in the columnmatrix notation of Sect.6.1 as Lψ + µ 2 ψ = R, where L is a discrete Laplace operator, µ 2 is some scalar, R is kth Fourier harmonic of RHS.When divergence equation (22a) derived from time-discrete momentum equation ( 1) is used, the meridional part of ∇ 2 operator is approximated by double application of compact finite-difference formula for the first derivative Eqs. ( 34) and ( 35) is used for the approximation of longitudinal part of Laplacian.The resulting 4th order accurate Helmholtz problem discretization is where operator δ 1/2 acts on quantities defined at half-integer grid points, the j th component of δ 1/2 ψ is ψ j +1/2 −ψ j −1/2 , δ acts on quantities defined at integer grid points, the j th component of δψ is ψ j +1 − ψ j , and M and M 1/2 are tri-diagonal matrices with diagonals (1/24, 11/12, 1/24) acting on quantities defined at integer and half-integer points, respectively.Matrix C is the same as in Eq. ( 52), and matrix C 1/2 is diagonal with a j th diagonal element cos ϕ j +1/2 .Following Tolstykh (2002), Eq. ( 53) is multiplied by MC from the left side and reformulated using auxiliary variable z = M −1 1/2 δψ: Then Eq. ( 54) is rewritten for the pairs (ψ j , z j +1/2 ) T , j = [0, Nlat] that results in 2 × 2 block tri-diagonal system as in Sect.6.1.Approximation Eq. ( 36) to the ∇ 2 operator is used when the analytically derived divergence equation ( 22b) is employed in the system of Eqs. ( 27) and ( 28).This approximation leads to a five-diagonal system of equations for ψ components which is solved by five-diagonal Gaussian elimination.
7 Dissipation mechanisms 7.1 Fourth-order hyper-diffusion Nonlinear interactions in the complex large-scale atmospheric flows result in generation of progressively smaller eddies until these eddies are converted to heat by molecular viscosity at the scales on the order of 1 cm.Such scales are far beyond the affordable resolution of atmospheric models.Therefore, the parameterization of unresolved scale interactions and dissipation is needed to avoid accumulation of energy at smaller scales.Such parameterizations are often considered the integral part of the atmospheric model dynamical core (Williamson, 2007).The SL-AV20 model includes the implicit in time fourth-order diffusion implemented in Fourier space with the finite volume representation in latitude (Tolstykh, 1997).This algorithm, allowing variable resolution in latitude, is a generalization of that one presented in (Li and Bates, 1994).We start with the equation where ψ n+1 is one of D n+1 , T n+1 v , ζ n+1 , ηn+1 and ψ n+1 f is the filtered quantity.Equation ( 56) uses implicit timestepping that allows us to circumvent severe limitation on K near the grid poles, t is the same as in the rest of the model dynamics.Also, as shown in Li and Bates (1994), the ∇ 4 diffusion with implicit time-stepping effectively dumps the shortest zonal waves in the vicinity of poles that helps to avoid the instability associated with finite-difference discretization of spherical differential operators containing cos ϕ in denominator.SL-AV20 uses variable-resolution grids, so the filter should not excessively dissipate smaller-scale features in high-resolution regions.However, these features should be damped out before they spread to the low-resolution regions where they cannot be resolved.We use an anisotropic diffusion coefficient that varies with latitude; the ∇ 4 is substituted for ∇ • (K∇ 3 ψ) to preserve local conservation, K = diag(K λ , K ϕ ).To facilitate numerical solution, Eq. ( 56) is reformulated as We use a second-order finite-volume formula to approximate the latitudinal part of ∇ • K∇ and ∇ 2 operators: where ϕ j +1/2 = 1/2(ϕ j + ϕ j +1 ), ϕ j +1/2 = ϕ j +1 − ϕ j , (sin ϕ) j = sin ϕ j +1/2 − sin ϕ j −1/2 .Longitudinal part of ∇ • K∇ and ∇ 2 is approximated in Fourier space as the Fourier image of second-order discretization.
The system Eqs.( 57) using ( 58) can be written for zonal wavenumber k as where − k 2 = −(1 − cos k λ)/ λ 2 is the Fourier image of longitudinal part of discrete Laplace operator.This system is solved using 2 × 2 block tri-diagonal version of Gauss elimination.

Sponge layer
The divergence damping at the vertical levels close to the model top is used to avoid spurious reflection of vertically propagating waves from the upper rigid lid ( η = 0 at the η 1/2 boundary condition).The term −ϑ(η)D n+1 is included in the RHS of the discrete divergence equation ( 22).Implicit time-stepping allows a wide admissible range of values for damping coefficient ϑ without complicating the solution of system ( 27), ( 28).

Parallel implementation
The SL-AV20 model uses hybrid distributed-shared memory parallelism based on a combination of MPI and OpenMP technologies.Each MPI process performs computations in the band of grid latitudes during the first phase of the time step that includes upstream trajectory computations, interpolation and combination of known terms into the RHS of an SI system of equations (see Sect. 4).OpenMP threads are used to parallelize longitude loops, thus dividing the latitude belt into a number of parts.
The second phase of the SL-AV20 time step consists in solving the SI system of equations, application of fourthorder hyper-diffusion (Sect.7) and reconstruction of wind velocity from vorticity and divergence (Sect.6.1).This phase is mostly the solution of elliptic problems.To apply the direct solvers described in Sects.6 and 7, it is convenient to gather kth Fourier coefficients from all grid latitudes in the memory of a specific MPI process.Therefore, the second phase of the SL-AV20 time step is preceded by data transposition.Each MPI process performs computations within the set of longitude wavenumbers from pole to pole.OpenMP parallelization of loops in the vertical is applied.
To balance the workload of MPI processes, the widths of corresponding latitude bands are calculated accounting for the grid reduction, so each process treats approximately the same number of grid points.There is an additional constraint of current implementation that all grid points for a given latitude should be located at the same processor.There is no load balance problem in Fourier-space computations, as the processor partition is the same as for a regular grid.One can additionally adjust load balance by taking into account that some parameterizations only work in some latitude bands.
The data transpositions before and after the solution of elliptical problems require global communications between the processors.This will become a problem on future massively parallel computers and so the work has started to implement scalable iterative grid-point solvers for elliptic problems into SL-AV model.It is known that iterative solvers for problems described in Sects.4, 6, 7 can scale up to tens of thousands processors (Müller and Scheichl, 2014).The semi-Lagrangian advection code is also known to scale at 10 4 processors (White III and Dongarra, 2011).
Currently, the full SL-AV20 code with parameterizations runs at 3024 cores with 70 % efficiency, at 4536 cores with 63 % efficiency, and at 9072 cores with 45 % efficiency when using the grid of 3024 by 1513 points in longitude and latitude, respectively (see Fig. 1).This grid corresponds to 13 km resolution at the Equator and has 51 levels in the vertical.The parallel efficiency is defined here as e N c = 100 %a N c 504 N c , where N c is the number of computational cores, and a N c is the parallel acceleration at N c cores relative to the run at 504 cores (a N c is equal to the ratio of wall times at 504 cores and N c cores).Only 288 cores are needed to meet HMCR operational NWP requirements (20 min per 24 model hours) at the current operational horizontal grid of 1600 by 865 points.
The parallel efficiency of the current SL-AV20 implementation is limited by use of 1-D MPI decomposition and need for global communications (transpositions).The global communications due to data transpositions are also needed in spectral hydrostatic semi-implicit semi-Lagrangian dynamical cores.However, SL-AV20 can compute all the horizontal derivatives in the grid-point space, while they are computed in spectral space and so included in data transposition procedures in spectral models.So the number of variables that undergo data transposition in SL-AV20 code is significantly smaller.

Numerical experiments
Verification of atmospheric models' dynamical cores with idealized adiabatic (moist-adiabatic, quasi-adiabatic) test cases such as Held and Suarez (1994), Jablonowski and Williamson (2006b), Reed and Jablonowski (2012), and Thatcher and Jablonowski (2016) has gained in popularity in the last few years.The reason is that various aspects of model numerics can be evaluated in an easier way using an idealized setup than real atmospheric flow with all its com-G L Figure 2. Latitude grid spacing (solid lines) of 400 × 250 (red), 800 × 500 (green) and 1600 × 865 (blue) variable-resolution latitude-longitude grids.Dotted lines indicate longitude grid spacing of the corresponding reduced grids.Dashed lines show latitude grid spacing in the constant-resolution grids with the same number of latitudes.plexity.Also, the numerical solutions to idealized tests are independent of sub-grid-scale process parameterizations.
The complete verification of the SL-AV20 dynamical core with all its options would require a separate article.Here we concentrate on the impact of the horizontal grid aspects on the numerical solution, namely, the grid reduction and resolution refinement (see Sect. 5.1).The results presented below are obtained with the baseline NWP setup of the SL-AV20 dynamical core with no inherent mass conservation and divergence equation (22a) derived from the discrete form of the momentum equation (1).
The grids with 400×250, 800×500, and 1600×865 points in longitude and latitude, respectively, and 28 levels in the vertical are used in this study.The first and last horizontal grids correspond to the old and new SL-AV operational NWP configurations at HMCR.The algorithm proposed by Fadeev (2013) is used to obtain variable resolution and reduced variants of these grids.The latitude grid spacing is refined in the Northern Hemisphere mid-latitudes.The reduced grids used in this study have about 25 % less points than their regular counterparts.Following our experience with shallow water model (Tolstykh and Shashkin, 2012) and advection (Shashkin et al., 2016) experiments, 25 % reduction is almost the greatest reduction that does not significantly deteriorate the solution accuracy.The characteristics of grids used in this study are summarized in Fig. 2.
The time step for all variants of the 400×250 grid is 1200 s and is reduced by the factor c = 400 Nlon for finer grids (Nlon is the number of grid points along the Equator).These time-step values are consistent with the SL-AV20 operational NWP setup.The basic ∇ 4 hyper-diffusion coefficients in 400×250 grid experiments are 1.5 × 10 15 m 4 s −1 for relative vorticity and temperature and 1.9 × 10 15 m 4 s −1 for divergence; the hyper-diffusion coefficients for finer grids are reduced by the factor c 3 .The latitudinal component of the hyper-diffusion coefficient (see Sect. 7) varies with latitudinal grid spacing K ϕ = ( ϕ/ ϕ ) 3 K 0 , where ϕ = 2π Nlat and K 0 is the basic coefficient value for the given grid.

Baroclinic instability test case
In the baroclinic instability test case by Jablonowski and Williamson (2006b), the small zonal wind perturbation of the geostrophically balanced background state is placed in the Northern Hemisphere mid-latitudes.After initial adjustment, the perturbation causes exponential growth of the baroclinic wave (days 1-7) until it breaks down (days 7-10) and transforms to the fully chaotic solution in the course of another 20 days.
As can be seen from figures in Jablonowski and Williamson (2006b) and Wan et al. (2013), the surface pressure and temperature fields rapidly converge to the smooth large-scale pattern with the resolution increase.Contrarily, the relative vorticity field of the breaking wave consists of filaments that become more and more twisted and thinner with greater amplitude when the grid resolution is increased.Each step of grid refinement reveals new smaller-scale features in this field.
Our particular interest is the net effect of increasing latitudinal and reducing longitudinal resolution in the zone of interest, i.e., Northern Hemisphere mid-latitudes. Figure 3 presents the comparison of a day 9 850 hPa relative vorticity field between numerical solutions on regular lat-lon grids with constant grid spacing in latitude and reduced lat-lon grids with variable latitudinal resolution.The absolute values of vorticity and its gradients are greater and the inner structure of the vortices is more developed in the solutions at 400 × 250 and 800 × 500 variable-resolution reduced grids as compared to the solutions obtained at the corresponding regular grids.The improvement in the 1600 × 865 grid solution is less obvious, which can be partly attributed to the convergence.Slightly stronger development of the leftmost and middle vortices can be noticed in the variable-resolution solution after closer consideration.However, some subtle details of the rightmost vortex are lost due to the grid reduction (see discussion below).
To isolate the effect of the grid reduction, we compare the day 9 850 hPa relative vorticity snapshots from the variableresolution grid solutions with and without reduction of longitudinal resolution (Fig. 4).Except for the leading vortex on the 1600 × 865 grid, the difference between the solutions using the reduced and non-reduced grids can hardly be found.Therefore, the reduction of longitudinal resolution does not lead to the degradation of a solution.At the same time, one can hope that reproduction of mid-latitude pressure system dynamics can be improved using variable resolution in latitude.Nevertheless, it should be noted that the effect of latitudinal resolution refinement is weaker than the effect of increasing the resolution of the basic grid.
One can see in Figs. 3 and 4 that SL-AV20 solutions are smooth and no evidence of grid-scale noise or any other type of noise can be found.This does not compromise the reproduction of smaller-scale details, gradient strength and amplitude of the relative vorticity field, which are in good agreement with the solutions of comparable resolution: ICON (Zangl et al., 2015), ECHAM (presented in Wan et al., 2013), CAM-FV, CAM-EUL, CAM-SLD, and GME (compared in Jablonowski and Williamson, 2006a).The surface pressure error norms with respect to the CAM-SLD reference solution (not shown) are below the uncertainty level (Jablonowski and Williamson, 2006b), similar to the previous version of the SL-AV model (Shashkin and Tolstykh, 2014).
The time-step values consistent with our operational NWP practice lead to only modest advective CFL numbers of about 1 at day 9 and 2 at day 20 in this test case.So the model stability is tested with larger time steps.There are no problems with stability of solution and no accuracy degradation (in terms of comparison with the reference solution), with the time step 2 times larger than the basic one.As the time step is increased by a factor of 3, we face a small-scale noise in the horizontal divergence field that contaminates the physical solution after day 9 of simulation.The issue can be circumvented by increasing the ∇ 4 hyper-diffusion coefficient 10 times only for divergence.Increasing the diffusion coefficient for divergence does not significantly affect the accuracy (in terms of the difference to the reference solution) and reproduction of smaller-scale relative vorticity features.
We believe that the stability challenge in this test is not the advection, but the treatment of nonlinear terms of equations.Therefore, it is difficult to obtain a stable solution with larger time steps even if a semi-implicit SL formulation is used.Also, it should be noted that the semi-implicit SL reference solution from Jablonowski and Williamson (2006b) is obtained using a time-step value even smaller than our basic one (at a similar horizontal resolution).

Held-Suarez test
The goal of the Held and Suarez (1994) experiment is to verify statistical properties of the atmospheric model circulation driven by highly idealized forcing -the relaxation of temperature to the equilibrium profile and Rayleigh friction near the surface.The resulting atmospheric regime is a balance between non-uniform heating, polewards transport of heat and momentum by baroclinic eddies and destruction of the momentum in the boundary layer.Thus, the accurate representation of baroclinic motions in the mid-latitudes plays a crucial role for the test results.Both lat-lon grid reduction and local increase in latitudinal resolution can modify the intensity of eddy heat and momentum fluxes and thus alter the net circulation characteristics.Figure 5a-f presents the 1000-day time-zonal averaged zonal wind speed, temperature, eddy heat and momentum fluxes, eddy kinetic energy and temperature variance.Shading shows the solution at a 400 × 250 reduced grid with variable resolution in latitude; a regular grid solution with constant latitudinal resolution is shown by contours.One can note a slight increase of Northern Hemisphere eddy momentum flux (near 300 hPa) and temperature vari-ance (near 900 hPa) maximum values.The eddy heat flux in the Southern Hemisphere at 300 hPa is slightly degraded in the variable-resolution solution that can be a co-effect of grid reduction and increased latitudinal grid spacing.The color levels of eddy kinetic energy to the north of 45 • N in Fig. 5e are extended polewards as compared to the corresponding contours of a regular grid solution.That means more intensive eddy motions at high latitudes in the variable-resolution reduced grid solution.
The standalone effect of grid reduction can be evaluated from panels (g) to (l) in Fig. 5.The shading shows the dif-   ference between constant latitudinal resolution regular and reduced grid solutions (regular minus reduced); the contours of regular grid solution are overlaid for reference.The majority of features depicted in these panels are explained by the shift in jet-stream cores which is rather the result of internal low-frequency variability of test solutions (Wan et al., 2008) than a consequence of grid reduction.We can find no significant decrease in polewards eddy fluxes due to suppression of the shortest-scale motions in the regions with reduced longitudinal resolution.Figure 6 presents 500 hPa kinetic energy spectra from three numerical solutions (regular, reduced, reduced with variable resolution in latitude) in Northern Hemisphere midlatitudes (left panel) and high latitudes (right panel).To obtain the corresponding spectra, u and v are multiplied by the window function equal to 1 in a 30-60 • N (to the north of 70 • N in the case of a high-latitude spectrum) and 10 • width smooth transition to the zero value outside.The lines corresponding to the regular and reduced grid solutions are almost identical at mid-latitudes for all wavenumbers, whereas the variable-resolution solution shows more intensive smallerscale motions.The effect of variable resolution at high latitudes also outweighs the effect of grid reduction.
We also repeated a numerical experiment with the time step of 2700 s (400 × 250 grid) that yields the maximum advective CFL of about 4 after day 200.The results are generally the same as for the basic time-step value of 1200 s.

Mountain-induced Rossby wave test case
The accuracy and stability of orography treatment in SL-AV20 is tested with the mountain-induced Rossby wave case (Sect.1.5 of Jablonowski et al., 2008).In this test case, the initially balanced zonally symmetric flow interferes with the idealized mountain located at 30 • N that results in generating a Rossby wave train type solution.
It is well known that semi-implicit semi-Lagrangian models prone to resonantly amplifying the stationary gravity wave feed back to the orography forcing when the advective CFL number is greater than 1 (the so-called spurious orographic resonance).Figure 7 compares two SL-AV20 solutions at a 1600 × 865 regular grid with constant latitudinal resolution and t equal to 300 and 900 s that result in a maximum zonal CFL of about 0.66 and 2, respectively.The two solutions in Fig. 7, however, look very similar and closely agree with the reference solution from Jablonowski et al. (2008) and the solution from the spectral non-hydrostatic dynamical core Simarro et al. (2013).The geopotential height field is smooth in both SL-AV20 solutions.The Rossby wave structure of wind component fields is slightly distorted by small-scale noise downstream of the mountain, which we attribute to the presence of physical inertia-gravity waves.The u and v fields with t = 900 s look even less noisy than in the t = 300 s solution.This can be explained by more intensive gravity wave damping by the off-centered Crank-Nicolson scheme at t = 900 s.The experiments using reduced grids with constant and variable resolution in latitude lead to very similar test results.

Conclusions
We have described the dynamical core of the SL-AV20 (semi-Lagrangian, based on the absolute vorticity equation) atmospheric model, thus summarizing all recent developments in the model numerics (Fadeev, 2013;Tolstykh and Shashkin, 2012;Shashkin and Tolstykh, 2014;Shashkin et al., 2016).The model is applied to operational global numerical weather prediction with 20 km resolution over Russia.Also, its lower-resolution configuration is expected to be applied for probabilistic long-range forecasts and is the basis for the development of the new atmospheric component of the INMCM Earth system model (Volodin et al., 2017).
The main features of the model dynamical core are the vorticity-divergence formulation at the unstaggered grid, a reduced latitude-longitude grid with variable resolution in latitude, using high-order finite-difference formulae, and semi-Lagrangian semi-implicit discretization.The number of grid points in longitude at near-polar latitudes for the reduced grids constructed with the algorithm by Fadeev ( 2013) is an order of magnitude smaller than the number of points at the Equator.
The effects of the grid reduction and variable latitudinal resolution on numerical solutions are investigated with two idealized tests -the Jablonowski and Williamson (2006b) deterministic case and the Held and Suarez (1994) long-run experiment.The results agree with other published model solutions.It is shown that the reduction in the number of grid points by 25 % leads to only moderate damping of the shortest longitudinal waves and does not degrade the solution.At the same time, variable resolution in latitude allows us to refine the grid in the region of interest and thus improve the solution accuracy.Particularly, increasing latitudinal resolution in the Northern Hemisphere results in sharper gradients and stronger development in the relative vorticity field during unstable baroclinic wave breaking in the Jablonowski and Williamson (2006b) test and more intensive mid-latitude eddy motions in the Held and Suarez (1994) test.In all tests, the evident positive effect of increase in latitudinal resolution outweighs the slight negative effect of grid reduction.
The stability and accuracy of SL-AV20 in the presence of orography forcing is tested using the mountain-induced Rossby wave test from Jablonowski et al. (2008).The solutions at various grids using different time-step values (CFL from 0.6 to 2) are very similar and are in good agreement with the results from other models.We find no sign of resonant amplification of inertia-gravity wave response to the orography.
We hope that the dynamical core presented in the article will be used in numerical weather prediction setups with a www.geosci-model-dev.net/10/1961/2017/Geosci.Model Dev., 10,2017 horizontal resolution of about 9 km.Also, the ongoing application of SL-AV20 is foreseen in seasonal forecasting and climate simulations.Currently, the experimental SL-AV20 configuration with approximately 13 km horizontal resolution can efficiently use up to 9000 cores.This is sufficient in a short-term perspective, but will become a bottleneck after some years once exascale computer architectures have been implemented.Therefore, our plans for future development are mainly concerned with improvement of parallel efficiency.We work on complete abandoning of Fourier-space computations to reduce global communications (to date, we have successfully tested computation of meridional derivatives at the reduced grid in the grid-point space).Furthermore, we plan to implement 2-D domain decomposition and scalable iterative grid-point solvers for elliptic problems which are known to scale up to tens of thousands of processors (Müller and Scheichl, 2014).Also, we test the SL-AV20 model with the Intel MIC architecture.
Appendix A: List of notations a mean Earth radius A(η), B(η) coefficients of Simmons and Burridge (1981)  orography-dependent reference surface pressure q, q i specific concentrations of water vapor and liquid (solid) water species r vector pointing from the center of the sphere to the given point on its surface R d , R v , R moist gas constants of dry air, water vapor and moist air, respectively R ζ , R V , R D , R T , R P RHS of time-discrete absolute vorticity (19), momentum, divergence (22b), thermodynamic (23) and continuity (24) equations, respectively R the rotation matrix ṡ analog of vertical velocity used in thermodynamic and continuity equations (8,6) S ij horizontal square of i, j th vertical column of computational cells T , T v temperature, virtual temperature T constant reference temperature V = (u, v) horizontal wind V ij l computational cell V(t) Lagrangian volume γ (η) s Ritchie and Tanguay (1996) , 10, 1961-1983, 2017 One can see that elimination of T , P , S from Eq. (C4) using Eqs.(C1)-(C3) results in Eq. ( 27) with

Figure 1 .
Figure 1.Parallel acceleration of the SL-AV20 experimental configuration on the grid of 3024 by 1513 points with respect to 504 cores (red line); linear acceleration -black line.

Figure 5 .
Figure 5.The 1000-day time-zonal averaged zonal wind speed (U ), temperature (T ), momentum flux (U V ), heat flux (V T ), eddy kinetic energy and temperature variance in the Held-Suarez test.Panels (a)-(f) -shading shows a reduced grid solution with variable resolution in latitude; contours show a regular grid solution with constant latitudinal resolution.Panels (g)-(l) -shading shows the difference between regular and reduced grid solutions with constant resolution in latitude (regular minus reduced); overlaid contours depict the regular grid solution.

Figure 6 .
Figure 6.The 500 hPa kinetic energy spectra at middle and high latitudes in the Held-Suarez test.Reduced grid with variable resolution in latitude (red line), reduced grid with constant latitudinal resolution (green line), and regular grid with constant latitudinal resolution (blue line).

Figure 7 .
Figure 7. Day 15 u and v wind components and geopotential height at 700 hPa in the mountain-induced Rossby wave test.Left column -SL-AV20 solution at 1600 × 865 with t = 300 s.Right column -the same as the left column, but with t = 900 s.
hybrid vertical coordinate Âψ k , Bψ k kth zonal wavenumber coefficients of ψ Fourier representation in longitude c pd , c p heat capacity of dry and moist air under constant pressure Nlev number of grid points in latitude, longitude and the vertical, respectively p, p 0 , p s pressure, constant reference pressure, surface pressure, respectively p, ps reference pressure profile, reference surface pressure p ref s ψ is evaluated at the departure point of upstream trajectory (ψ) V ij l , (ψ) S ij average of ψ over the V ij l cell or the S ij -horizontal square *www.geosci-model-dev.net/10/1961/2017/Geosci.Model Dev.