Interactive comment on “ A High-order Staggered Finite-Element Vertical Discretization for Non-Hydrostatic Atmospheric Models ” by

Here are the results you requested. These are 2 hour simulations using the Schär mountain profile with zero background wind. Here we are again looking at vertical orders 2, 4, and 10 in the same manner as the original manuscript. The fields are plotted relative to the initial state of the atmosphere, but the reference state is not being used in the computations. We observe that there is an improvement in the error with increasing vertical order, however that is limited by the constant 4th order horizontal terms. We note the presence of the Lorenz vertical mode which does vanish for Charney-Phillips.


Introduction
The accurate representation of vertical wave motion is essential for models of the atmosphere. The vertical coordinate for the non-hydrostatic fluid equations has traditionally been discretized in the Eulerian frame via a second-order Charney-Phillips (Charney and Phillips, 1953) or Lorenz grid (Arakawa and Moorthi, 1988), or via Lagrangian layers, such as in Lin (2004).
staggering of prognostic variables for maintaining correct behavior for wave motions relevant to the atmosphere. Third, unstaggered discretizations (that is, discretizations where all prognostic variables are stored on model levels) possess stationary computational modes which represent gross errors in the dispersion properties of the solution (Melvin et al., 2012;Ullrich, 2014b). As in the horizontal, unstaggered FEM leads to waves with zero phase speed in the limit as the 5 wavelength tends to 2∆x, where ∆x is the average grid spacing between degrees of freedom. However, unlike the horizontal, these wave modes can be dramatically enhanced by an implicit treatment of the vertical at high Courant number.
This paper describes a new discretization for the vertical that combines the accuracy of finite element methods with the desirable wave propagation properties of staggered methods.
This approach is referred to as the Staggered Nodal Finite Element Method (SNFEM). Notably, this formulation is sufficiently general to be compatible with essentially any form of the fluid equations. The SNFEM discretization can be easily composed in differential form using interpolation and differentiation operators built in accordance with the Discontinuous Galerkin and Spectral Element discretizations that arise from the Flux Reconstruction method of Huynh 15 (2007) (see Table 1). The use of SNFEM is natural for vertical discretizations, as no-flux conditions are easily imposed on top and bottom boundaries in the general finite element framework (Zienkiewicz et al., 2005). Also, the structure of individual elements (namely, the fact that Gauss and Gauss-Lobatto-Legendre nodes are optimally clustered near element edges) lends itself readily to improved resolution in the atmospheric boundary layer. Further, SNFEM inherits 20 the mimetic properties of the spectral element method so the vertical operator will automatically conserve both mass and discrete linear energy.
To assess the performance of SNFEM, this discretization has been implemented in the spectral element Tempest model (Ullrich, 2014a) and run through a suite of mesoscale test cases. The test cases are as follows: A rising thermal convective bubble (Giraldo and Restelli, 2008), transport, and near boundary dynamics corresponding to a high-order vertical coordinate with and without the influence of topography. Therefore, the objectives of this paper are as follows: 1. To introduce our approach for the construction of a generalized, staggered, variable orderof-accuracy, finite element vertical discretization. We emphasize discretization of the nonconservative differential form of the Navier-Stokes equations (in vector invariant or so-5 called Clark form), which is independent of coordinate system.

2.
To validate the implementation of this discretization within the Tempest framework using a selection of test cases in Cartesian geometry through a range of horizontal scales from 1 to 1000 km.
3. To determine the qualitative and quantitative effect of vertical order of accuracy on solu-10 tions by conducting validation experiments at coarse resolutions relative to finer reference solutions. We consider the effects of Lorenz (LOR) and Charney-Phillips (CPH) staggering both in the interior flow and at the lower boundary.
4. To determine whether a high-order vertical discretization greatly improves the simulation quality, and consequently to recommend whether there is an optimal order-of-accuracy 15 that provides the best tradeoff between accuracy and computational cost.
We will show that a high-order vertical discretization at coarse resolution more accurately approximates the reference solution relative to the low vertical order alternative. Since the interpolation and derivative operators in the finite element approach are easily expressed as linear matrix operators, there is minimal cost in adjusting the order-of-accuracy. We will present con- 20 trol experiments 4 where only the resolution and vertical order-of-accuracy vary. We leave the rigorous analysis of staggered wave modes and discrete energy conservation using the interpolation/differentiation operators for a subsequent work.
The remainder of this manuscript is as follows: Section 2 describes the non-hydrostatic equations of fluid motion on an arbitrary coordinate frame. Section 3 describes the discrete form by the SNFEM vertical discretization and the time-stepping scheme employed. In section 4, we describe the test case suite and discuss the corresponding model results. The summary and conclusions follow in section 5.

The non-hydrostatic equations of fluid motion
In an arbitrary coordinate frame (α, β, ξ), the vector velocity can be written as where g i (i ∈ {α, β, ξ}) are the local coordinate basis vectors and u i are the contravariant velocity components. The associated covariant components are (2) 10 Covariant components can be obtained in terms of contravariant components via contraction with the covariant metric g ij = g i · g j , The reverse operation uses the contravariant metric g ij , defined as the matrix inverse of the 15 covariant metric. Contraction of the covariant components with the contravariant metric returns the contravariant vector components, The volume element J is computed in terms of the covariant metric as Using covariant horizontal velocity components, vertical velocity, potential temperature θ and dry air density ρ as prognostic variables, the Euler equations with shallow-atmosphere 5 approximation can be written an arbitrary coordinate frame as ∂r ∂ξ The vertical velocity w is closely related to u ξ via The specific Kinetic energy is while the geopotential function given the gravitational acceleration g c is and the Exner pressure is 15 Here p 0 denotes the constant reference pressure, R d is the ideal gas constant and c v and c p refer to the specific heat capacity at constant volume and pressure, respectively. The absolute vorticity vector is given by where the relative vorticity vector is and, under the shallow-atmosphere approximation, the planetary vorticity vector is Consequently, the rotational terms in the equation of motion take the form 10 Note that this formulation does not specify a coordinate system. Consequently, these equations can be used for either Cartesian or spherical geometry. To account for topography, terrainfollowing σ-coordinates are imposed by defining the radius r = r(α, β, ξ) so that r(α, β, 0) is coincident with the surface. For example, Gal-Chen and Somerville (1975) coordinates arise from the choice where r e is the radius of the Earth, z top denotes the model height and z s (α, β) denotes the surface elevation.

20
Spherical shells are discretized using an equiangular cubed-sphere grid (Sadourny, 1972;Ronchi et al., 1996), which consists of six Cartesian patches arranged along the faces of a cube which 7 is then inflated onto a spherical shell. More information on this choice of grid can be found in Ullrich (2014a). On the equiangular cubed-sphere grid, coordinates are given as (α, β, p), with central angles α, β ∈ [− π 4 , π 4 ] and panel index p ∈ {1, 2, 3, 4, 5, 6}. By convention, we choose panels 1-4 to be along the equator and panels 5 and 6 to be centered on the northern and southern pole, respectively. With uniform grid spacing, each panel consists of a square array of 5 n e × n e finite elements.

Vertical Discretization
Each vertical column consists of n ve nodal finite elements, indexed a ∈ {0, . . . , n ve −1}. Throughout this manuscript, all vertical indices are assumed to increase with altitude. Within each ele-15 ment, levels are placed at the n vp Gaussian quadrature nodes and interfaces at n vp + 1 Gauss-Lobatto quadrature nodes, leading to a staggering of levels and interfaces. With vertical coordinate ξ, the location of model levels denoted ξ a,k with k ∈ {0, . . . , n vp − 1} and model interfaces denotedξ a,k with k ∈ {0, . . . , n vp }. Each finite element is then bounded within the interval [ξ a,0 ,ξ a,n vp ] with two associated sets of basis functions -one over model levels, denoted by the 20 set φ a = {φ a,j |j = 0, . . . , n vp − 1} that includes characteristic polynomials of degree n vp − 1, and one over model interfaces, denoted by the setφ a = {φ a,j |j = 0, . . . , n vp − 1} that includes characteristic polynomials of degree n vp . A depiction of the vertical staggering associated with levels and interfaces is given in Fig. 1, along with basis functions in each case. A scalar field q(ξ, t) can then be written approximately, either as a linear combination of basis functions on levels, or on interfaces, For the remainder of this manuscript we will use script n to denote variables stored on model 5 levels and script i to denote variables stored on interfaces.

Interpolation Operators
Note that (22) and (23) are not equivalent discretizations since (22) cannot represent polynomials of degree n vp and (23) cannot represent fields that are discontinuous at element interfaces. Nonetheless, we can define interpolation operators between these fields via I n i , representing in-10 terpolation from levels to interfaces, and I i n , representing interpolation from interfaces to nodes. First, interpolation from interfaces to levels is defined as To define the interpolant from levels to interfaces, a two-step procedure is employed. Since 15 basis functions on levels are discontinuous, we define the left and right interpolants at element boundaries as (I n L q) a,0 = n vp −1 j=0 q a,j φ a,j (ξ a,0 ), (I n R q) a,n vp −1 = n vp −1 j=0 q a,j φ a,j (ξ a,n vp −1 ) and then define the total interpolant as These interpolation operators can also be obtained from equivalence via the variational (weak) form. At model interfaces, the accuracy of (26) degrades for unequally spaced finite elements.

5
For the case of stacked finite elements with unequal thickness ∆ξ a =ξ a,n vp −ξ a,0 , a more accurate formula can be obtained from which arises on noting that the one-sided interpolant has error O(∆ξ n vp a ).

Differentiation Operators
Differentiation is required for all combinations of model levels and interfaces: D i i represents differentiation from interfaces to interfaces, D i n represents differentiation from interfaces to levels, D n n denotes differentiation from levels to levels and D n i denotes differentiation from levels to interfaces. A depiction of the behavior of these derivative operators is shown in Fig. 2. 15 Differentiation from interfaces to levels is obtained by simply differentiating (24), Differentiation from levels to levels is computed via the composed operator where boundary conditions are enforced after application of the interpolation operator.
Differentiation from interfaces to interfaces requires averaging the one-sided derivatives at element interfaces, but is otherwise simply the derivative of (24) on the element interior, Differentiation from levels to interfaces (D n i ) should not be defined via the composition D i i I n i 5 since this procedure would introduce a non-zero null space that can trigger an unphysical computational mode in the discrete equations. Instead we define D n i using the robust differentiation technique discussed in Ullrich (2014a), based on the flux reconstruction methods of Huynh (2007). This strategy leads to the discrete operator where and g L and g R are the local flux correction functions, which are chosen to satisfy 5 and otherwise approximate zero throughout [ξ a,0 , ξ a,n vp −1 ].
There is some flexibility in the discretization that depends on the specific choice of flux correction functions. Huynh (2007) gives a family of flux correction functions on the interval [−1, 1] denoted by g k for k = 1, 2, . . .. In particular, we are interested in g 1 (the Radau poly-10 nomials) and g 2 , which have the special property that dg 2 /dx = 0 at all Gauss-Lobatto points. Although either choice of flux correction function leads to a valid discretization for n vp > 1, when n vp = 1 a consistent differential operator is recovered only with g 2 . Hence, for the remainder of this text we will adopt the flux correction function g 2 . For this choice, the flux correction function satisfies where P N (x) is the Legendre polynomial of order N . In the limit as x approaches the boundaries of the reference element, a simplified expression emerges:

Second Derivative Operators in the Vertical
The second derivative operators are used in viscosity and hyperviscosity calculations. They are 5 obtained as approximations to the equation subject to Neumann (no-flux) boundary condition ∂q ∂ξ = 0 at ξ = 0 and ξ = 1.
For the viscous operator from interfaces to interfaces, the discretization is obtained from the 10 variational (weak) formulation. Specifically, from (37) and integration by parts, Then using (23), (38) and the assumption of orthogonality of basis functionsφ under quadrature,

15
For model interfaces on Gauss-Lobatto nodes, the integral is discretized via Gauss-Lobatto quadrature. 13 The viscous operator from levels to levels is derived in a similar manner, although the nondifferentiability of q at interfaces in the discontinuous basis means that we must rely on differentiation via (31). Consequently, the weak form 5 then leads to discrete operator where 10 For model levels on Gauss nodes, the integral is discretized directly via Gaussian quadrature. Note that the boundary condition implies that we must impose

Flow-dependent vertical hyperviscosity 15
The basic spectral element method is an energy conservative scheme (Taylor and Fournier, 2010) that allows for the accumulation of energy at the shortest wavelengths. Following Ullrich (2014a) and Dennis et al. (2011), we impose explicit dissipation in the horizontal using a constant coefficient hyperviscosity. In the vertical, a constant coefficient hyperviscosity would have 14 a rapid and adverse affect on hydrostatic balance in the absence of a hydrostatic reference state (Giraldo and Restelli, 2008). Consequently, in this paper we apply a localized hyperviscosity in the vertical column that is weighted by the contravariant vertical flow velocity u ξ , where q ∈ {u α , u β , w, θ, ρ} and k is a positive integer. The motivation for using u ξ stems from the observation that advective transport in the vertical occurs with speed u ξ , and so this would be the corresponding wave speed that would enter into, for example, the Rusanov Riemann solver in the context of discontinuous Galerkin or finite volume methods. In this sense, the flow-dependent hyperviscosity is a generalization of advective up-winding if applied simulta-10 neously with the vertical advective operator. The Riemann solver interpretation also yields an appropriate estimate for the value of ν z , k = 2 : where ∆ξ = 1/(an vp ) is the average spacing between nodes in the vertical direction.

The Staggered Nodal Finite Element Method (SNFEM)
The interpolation and differentiation operators given in the previous sections provide a framework for constructing staggered vertical grids in the context of the nonlinear system (6)- (10). 20 Furthermore, the SNFEM allows for discretizations of arbitrary order-of-accuracy via adjustments in n vp . For the present work, we investigate unstaggered (on interfaces), Lorenz (LOR) (u, v, ρ, θ on levels, w on interfaces), and Charney-Phillips (CPH) (u, v, ρ on levels, w, θ on interfaces) configurations. The two key diagnosed variables, Π and u ξ are co-located with ρ and w respectively. Table 1 provides a reference nomenclature for the various discrete derivative 25 operators that arise in the SNFEM. 15 When needed, the contravariant ξ velocity is computed on interfaces via where all contravariant metric terms are evaluated on interfaces.

Temporal Discretization
The equations (6)- (10) are written in the form where f (x, ψ) denotes terms associated with non-stiff modes, i.e. horizontally-propagating modes and vertical advection of horizontal velocity. The function g(x, ψ) denotes geometrically stiff terms associated with all vertical derivatives except for vertical advection of horizontal velocity. The model follows the approach of Ullrich and Jablonowski (2012) by treating non-stiff 10 terms using an explicit temporal operator and stiff terms using an implicit operator. For the first time step, an implicit update is applied, where G(ψ n ) represents the discretization described in section 3.2 and DG(ψ n ) = ∂G/∂ψ n . For later time steps, the implicit update is instead obtained from a stored tendency, Explicit terms are evolved using a Runge-Kutta method which supports a large stability bound for spatial discretizations with purely imaginary eigenvalues. This particular scheme is based 16 Geosci. Model Dev. Discuss., doi:10.5194/gmd-2015-275, 2016 Manuscript under review for journal Geosci. Model Dev. Published: 19 January 2016 c Author(s) 2016. CC-BY 3.0 License. on Kinnmark and Gray (1984a,b) and takes the form . Hyperviscosity is then applied in accordance with Ullrich (2014a), with scalar hyperviscosity used for all scalar quantities and vector hyperviscosity used for the horizontal velocity field. Mathematically, the update takes the form, where ψ s ∈ {θ, w, ρ}. When active, Rayleigh friction is applied via backward Euler to relax all prognostic variables 15 to a specified reference state, where γ = [1 + ν r (x)∆t] −1 is in terms of the Rayleigh friction strength ν r (x).
In accordance with Strang splitting, a final implicit update is applied,

Validation
In this section we present a set of test cases with the purpose of investigating the performance of 25 the SNFEM for mesoscale atmospheric modeling. Our emphasis is on a wide range of resolu-17 tions from the global scale (200 km) to the large eddy scale (5 m). These scales transition from hydrostatic to scales where all non-linear terms in the equations (6) -(10) become significant. For our experiments we will hold the following components of the computations constant: (1) The horizontal discretization is kept as a standard 4 th order spectral element formulation for all simulations, as outlined in section 3.1.

5
(2) The time integration scheme is based on Strang-split IMplicit EXplicit (IMEX) outlined in section 3.3.
(3) Vertical terms ∂ ∂z are integrated implicitly using the generalized minimal residual method (GMRES) with no preconditioner.
(4) Reference solutions employ consistent 4 th order vertical and horizontal discretizations at For these tests, we investigate the effect of a relatively high-order n vp = 10 vertical coordinate on flow results at resolutions coarser than the reference solutions. Our hypothesis is that flow structures and measures of interest will be better approximated using the high-order discretization. We consider the properties of our arbitrary order methods in the context of meshes 20 with mixed grid resolutions such as static and adaptive variable resolution experiments. A primary benefit of using the higher order SNFEM is improved accuracy even with a coarser vertical grid.
Reference results are computed with a consistent spatial (horizontal and vertical) discretization or order "O4". Experiments done at coarser resolutions with varying vertical order of ac- respectively, where f 0 = 2Ω sin ϕ 0 and β 0 = 2a −1 Ω cos ϕ 0 at latitude ϕ 0 = 45 • N. Here, the radius of the Earth is a = 6371.229 × 10 3 m, its angular velocity is Ω = 7.292 × 10 −5 s −1 and 15 y 0 = L y /2 is the center point of the domain in the y-direction. The simulation is performed for the original β-plane configuration outlined in Ullrich et al. (2015) where the jet is perturbed directly by a "bump" in the zonal wind that is vertically uniform where u p = 1.0 m s −1 centered at x c = 2000 km and y c = 2500 km.
where ∆x is the element length in the x direction and L ref = 11.0 × 10 5 m is the reference length used for this test case. For this test, vertical flow-dependent viscosity is disabled since it did not have a clear impact on the solution. The baroclinic instability is a primary mechanism for the development of mid-latitude storm systems and so it is important that an atmospheric modeling platform reproduce these phenom-10 ena accurately. We present a reference solution of the baroclinic wave shown in Fig. 3 that is approaching the transition into the non-hydrostatic regime. We are interested in estimates of vertical motion where the reference solution shows maxima on the order of 2 cm s −1 . Regions of strong vertical motion correspond to strong horizontal gradients in the vorticity and temperature fields and we expect that non-hydrostatic effects will be locally significant.

15
The reference solution for temperature and vorticity at 500m elevation shown here can be compared at day 10 with the original results from Ullrich et al. (2015) produced with MCore Ullrich and Jablonowski (2012) to verify that Tempest is computing a consistent solution. In particular we expect that vertical motion will be under-predicted in coarser models at a given order of accuracy. 20 The vorticity field at coarse resolution (Fig. 4) is largely unaffected by changes in vertical order. However, the vertical velocitiy (Fig. 5), and by association the horizontal divergence (not shown), shows a substantial increase in magnitude as order increases. This increase aligns the vertical velocity more closely with the reference solution magnitiude (greater than 1 cm s −1 ) using the 10 th order vertical coordinate as shown in Fig. 5. We conclude that although the higher 25 20 Geosci. Model Dev. Discuss., doi:10.5194/gmd-2015-275, 2016 Manuscript under review for journal Geosci. Model Dev. Published: 19 January 2016 c Author(s) 2016. CC-BY 3.0 License. order vertical coordinate does not substantially impact the horizontal character of the solution, it does better capture the magnitude of vertical velocity, particularly in frontal regions. We note that the coarse resolution chosen here is nearly double that of current operational climate modeling systems and well within the hydrostatic regime.

5
Atmospheric flows are strongly influenced by the lower boundary, where topographically-induced waves transport momentum and energy vertically. Schär et al. (2002) describes a uniform zonal flow field over orography that leads to the generation of a stationary mountain response, consisting of a linear combination of hydrostatic and non-hydrostatic waves. The atmosphere is initially under uniform stratification with constant Brunt-Väisälä frequency N = 0.01 s −1 . The temperature and pressure are p 0 = 1000 hPa and T 0 = 280 K at z = 0 m. To trigger the standing waves, an initial uniform mean flow of u = 10 m s −1 is prescribed over the topographic profile given by state. For these simulations, no explicit dissipation is applied in either the horizontal or vertical and Coriolis forcing is set to zero throughout. To validate that Tempest produces the correct mountain wave response, the Schär mountain test was performed until t = 10h with a relatively fine resolution of ∆x = 100 m, ∆z = 100 m and ∆t = 0.2 s. For the Schär test case results, we present the LOR configuration only as CPH shows no apparent difference for these flow conditions. As shown in Fig. 6 (left) Tempest accurately reproduces the vertical velocity field at the reference resolution (for comparison with another numerically derived solution, see Giraldo and Restelli (2008)). We also show the analytical solution based on linear mountain theory following Klemp et al. (2003); Smith (1979) overlaid in dotted contours. As pointed out by Klemp et al. (2003), an inconsistent treatment of 5 the topographic metric terms in this formulation can lead to the generation of spurious waves which is not observed in this case.
As discussed in Thuburn and Woollings (2005) and Thuburn (2006), staggering is necessary for eliminating stationary computational modes that arise in collocated discretizations. To better understand the impact of staggering, Fig. 7 demonstrates the use of the collocated or unstag-10 gered configuration which shows a highly-oscillatory stationary mode that pollutes the solution relative to the Lorenz configuration at the same resolution. The plots show errors in the vertical velocity near the bottom boundary condition and errors throughout the flow field due to the vertical mode. This artifact is conspicuously absent from both LOR and CPH runs.
Experiments are conducted at vertical order 2, 4, 10, and 40 (in the limit where the polyno-15 mial order is equal to the total number of levels, denoted ST) at a relatively coarse resolution of ∆x = 500 m, ∆z = 500 m and ∆t = 0.4 s. Results are depicted in Fig. 8 and the difference against the reference solution in Fig. 9. The 2 nd order results show substantial disagreement with the reference solution that is enhanced at altitude. This result appears to be associated with an overestimation of the vertical wavelength of the mountain response that arises from 20 the lower order discretization. At 4 th order the upper atmosphere does not show substantial errors, and most differences are instead constrained to the near-surface. These near-surface errors generally show consistent improvement as the vertical order-of-accuracy is increased. The discrepancy that appears at the highest peak of the Schär mountain (x = 0) is associated with slight differences in resolving the topography at coarser horizontal resolution than the reference 25 solution.
We further compare the resulting profiles of momentum flux for all experiments in the Lorenz configuration (Fig. 10). We observe that the flux profile for the 2 nd -order method has the greatest error, as expected from dispersion errors typical of low order centered schemes (particularly 22 in the upper atmosphere and near the surface). The higher-order methods show improvements in the structure (especially near the surface) and magnitude of the profiles, but again appear to be influenced by the lower order horizontal discretization. Furthermore, the results are strongly influenced by the Rayleigh layer showing a pronounced deviation in the flux profiles throughout the domain. The Rayleigh layer approximation to a free-flow boundary condition clearly 5 introduces deficiencies that are exacerbated in the flux provides.

Straka density current
The density current test case of Straka et al. (1993)  The initial state consists of a hydrostatically balanced state with a uniform potential temperature of θ = 300 K. A standard pressure of p 0 = 1000 hPa is assumed. The cold bubble 15 perturbation is applied to the θ field and is given by where θ c = −15 K and The 4 th order horizontal hyperdiffusion coefficients for all fields are given by where ∆x is the element length in the x direction and L ref = 51200.0 m is the reference length used for this test case.
For the experiments with vertical flow-dependent hyperviscosity, the viscous coefficients are given by (46). The uniform Laplacian diffusion requires further stabilization via the addition of 4 th order scalar hyperviscosity in the horizontal and 4 th order vertical flow-dependent diffusion 10 on all variables. This added diffusivity is necessary to control a horizontal stationary mode in the scalar fields and fast moving vertical modes that are a consequence of sound waves accumulating energy at the grid scale. However, the highly scale-selective nature of the high degree operators does not significantly affect the structure of the reference solution as shown in Fig. 11. 15 The grid spacing for the reference solution is ∆x = 25 m and ∆z = 25 m with ∆t = 0.01 s. Experiments are further conducted at vertical order 2 and 10 at a resolution of ∆x = 200 m and ∆z = 200 m with ∆t = 0.01 s.
For the density current, we emphasize results from the Lorenz (LOR) staggering. Since only w is specified at the lower boundary, we can implement the correct kinematic condition with-20 out triggering a further constraint on the vertical transport of potential temperature (computed on levels) near the boundary. In contrast, since vertical transport of θ emerges from the term u ξ ∂θ/∂ξ, the Charney-Phillips (CPH) configuration effectively blocks transport into the interfacial layer at the bottom boundary and so generates strong vertical gradients of θ near the surface. These gradients then enhance vertical heat fluxes above the surface, slowing the prop- 25 agating cold pool as momentum is transported vertically. This inconsistency is counteracted by 24 the application of uniform diffusion, which provides a mechanism by which θ can be exchanged with the bottom interface. However, flow-dependent vertical diffusion, which is weighted by |u ξ |, does not permit exchange with the interface and so leads to inconsistency between the LOR and CPH staggerings. In Fig. 11, the CPH staggering with flow-dependent diffusion leads to a relatively slow density current that is more convective near the boundary. Nonetheless, a 5 better choice of flow-dependent coefficient could be made to mitigate this issue.
In practice, we often desire diffusion to be as weak as possible while still preserving the stability of the underlying method. However, as can be seen here, the structure of the density current is also strongly dependent on the dissipation mechanisms employed in the simulation.
Here we present the reference solution equivalent to Straka et al. (1993) at the converged resolution. We also compare solutions with different diffusion mechanisms in Fig. 11. In Table 2 it is apparent the reference solutions are sensitive to diffusion and differ significantly in structure, but the wave front positions compare with good precision to the solution given by Straka et al. (1993). This would indicate that momentum fluxes are comparable, but close inspection of the eddy structure suggests significant differences exist throughout, as noted above, and with 15 the appearance of detached eddies when the high-order flow-dependent viscosity is used exclusively.
From Table 2 it is apparent our coarse-resolution experimental solutions are slow with reference damping and 2 nd order flow-dependent viscosity, but are closer to the reference solution with 4 th order diffusion. Both low-and high-order simulations show wave front positions that 20 accurately approximate the reference results. However, the structure of the Kelvin-Helmholtz rotors changes significantly with vertical order-of-accuracy and dissipation method shown in Fig. 12. The more scale-selective 4 th order flow-dependent viscosity shows greater detail in the structure of the rotors. In general, it is not recommended to use hyperdiffusion with a higher order than the dynamical discretization (bottom left) since the impact of the hyperdiffusion will 25 be in the truncation order of the method.
Curiously, the 10 th order vertical discretization with 4 th order flow-dependent viscosity produces a flow that more closely approximates the reference solutions at a resolution that would otherwise be considered too poor for the dynamical features considered. However, the authors 25 have not found a dynamical reason for correlation involving high-order dissipation schemes and the reference solution with uniform damping. Wave front position at the −1.0 • C contour further given in 2 confirm that momentum fluxes are also captured more accurately as these are associated to the propagation speed of the wave front.

Rising thermal bubble 5
Thermal bubble experiments have become a standard in the development of non-hydrostatic mesoscale modeling systems. At very fine resolutions (< 10 m) we test the ability to reproduce the simplest form of convection. This is a precursor to simulations of real atmospheric phenomena such as thunderstorms and other convective systems. A positive, symmetric perturbation to the potential temperature (buoyancy imbalance) causes a vertical acceleration that moves the bubble upward. Subsequently, shearing and compensating subsidence leads to two primary symmetrical eddies that further break down as the simulation progresses. We are interested in the evolution of the flow in terms of structure and conservative properties on θ.
We present two flow scenarios: a) the bubble rises and is allowed to interact with the top and lateral boundaries and b) the so-called Robert smooth bubble experiment (as outlined in Giraldo 15 and Restelli (2008)) that are a variation of the experiments of Robert (1993). In the former, the bubble will meet the boundaries and develop shear instabilities and in the Robert bubble, shear instabilities develop in the interior of the flow. For these experiments, 4 th order viscosity is applied in the horizontal and vertical primarily to the potential temperature field. Furthermore, at finer resolutions we observe more fine-scale features of the thermal bubble, including tighter 20 winding of the trailing edges at later times and sharper spatial gradients. Nonetheless, our comparisons for this test case are purely qualitative but remain consistent with previous results.
The background consists of a constant potential temperature field θ = 300 K, with a small perturbation of the form θ = 0 for r > r c , Here we choose the amplitude and radius of the perturbation to be θ c = 0 No sponge layer is used, and Coriolis forces are set to zero. The reference grid spacing is ∆x = 5 m and ∆z = 5 m respectively. This is considered the reference resolution following Giraldo and Restelli (2008). Experiments are conducted at a  20 where ∆x is the element length in the x direction and L ref = 1000.0 m is the reference length used for this test case.
Rising bubble experiments show the non-linear dynamics of dry 2D convection. The classic thermal bubble test shown in Fig. 13 shows potential temperature being advected conservatively 27 throughout the domain at the reference resolution. These results use a dissipation mechanism that combines 4 th order hyperdiffusion of θ for horizontal modes and scale-adaptive 4 th order flow-dependent hyperviscosity of θ for vertical modes. In this case, no diffusion is needed in the velocity or density fields to obtain a stable simulation. The rising thermal bubble experiment is typically carried out and compared at 700 seconds 5 precisely before the convective bubble interacts with the top boundary of the domain. We present this result for comparison with previous results in Fig. 15. However, it is also important to evaluate the conservative properties of the method and we carry out the simulation to 1200 seconds.
Since (9) is a strict statement of constant potential temperature following fluid parcels, the results of Fig. 15 compared to Fig. 13 demonstrate that our method is stable and approximates High-order vertical discretizations are typically associated with strong oscillations that can induce perturbations that grow into unstable eddies. The net effect is that a high-order vertical discretization, given the same horizontal discretization, changes the local mixing characteristics 20 of the flow. This effect is seen clearly in Fig. 16. The 10 th order simulation has a structure that more closely approximates the reference result in Fig. 14. In the context of studies that seek to represent convective processes, we would expect entrainment fluxes to be improved at a coarser resolution with the higher-order vertical discretizations.

Conclusions
The idea of separating the vertical and horizontal dynamics in atmospheric modeling systems has roots in the scale differences that characterize atmospheric flows. This principle has been fully exploited in the development of global and mesoscale models, along with the application of the hydrostatic approximation. This paper adds to the modern literature on modeling 5 atmospheric dynamics by analyzing a novel discretization technique for achieving high-order accuracy in the vertical while maintaining the desirable properties of staggered methods. We refer to this technique as the Staggered Nodal Finite Element Method (SNFEM).
The test suite we present in this work is not exhaustive, but it is intended to evaluate the performance of the numerical schemes under conditions of near hydrostatic synoptic scale flow in section 4.1, linear, mesoscale, non-hydrostatic flow with topography in section 4.2, and fully nonlinear, non-hydrostatic, Large Eddy Simulation (LES) scale, flow in section 4.3 and section 4.4. As global models progress into into the regime of non-hydrostatic flows, real flow cases will be characterized by one or more of the properties mentioned, and likely in combination when variable or adaptive meshing methods are used. More importantly, we expect that uniform or 15 mixed grids being prepared in research will begin to span the scale range that includes the transition to non-hydrostatic dynamics and on to large-eddy flows.
In general, we postulate that a higher-order method based on finite elements will be more accurate at a given resolution with minimal computational cost relative to a low order method. Our results demonstrate that a high-order vertical coordinate approximates well resolved, refer-20 ence results at coarser resolutions that would be otherwise considered poorly represented. Our experiments nonetheless are constrained by the order of horizontal and temporal discretizations. Therefore, we restrict our recommendation to the use of 4 th order SNFEM as optimum for the tests given here. In general the combined spatial order of accuracy should be consistent to maximize the effect of increased accuracy. The high-order approximation provides an improvement 25 to the vertical dynamics and so reduces the need for higher vertical resolution. This benefit would prove effective when variable-grid methods are considered and nesting mesh levels can be saved by employing the SNFEM at high-order. The use of staggering in conjunction with 29 high-order has further benefits, in particular the avoidance of stationary computational modes that are known to persist with co-located methods.
However, there are some trade offs when increasing the vertical order: 1) for a vertically implicit method, fewer high-order elements lead to a dense matrix structure that is more expensive to invert, 2) the oscillatory nature of the polynomial functions that make up the interpolants 5 within an element have physical consequences (involving nonlinear processes) at the smallest scales, and 3) higher-order spatial discretizations often require smaller time steps or higher order temporal discretizations. Fig. 3 shows the times required for computations of varying vertical order and processor scaling. The results confirm that the relative cost in moving to 4 th order is indeed modest relative to the use of higher orders. The first point can be addressed in the 10 construction of the software where parallelization and correct use of hardware resources minimizes the dense operations that high-order elements imply. We saw in Fig. 16 that oscillations associated with high-order interpolants helped to approximate fine scale structures, but these oscillations can also be harmful depending on the flow condition. While vertical order of accuracy can be increased up to the total number of vertical levels, e.g. results from the Schär cases 15 in Fig. 8, increasing computational expense indicates that intermediate orders of accuracy will generally be most effective. In this study, many of the results at 4 th order sufficiently improve solutions relative to low order alternatives.
Furthermore, when physical instabilities arise, a consistent, high-order, and scale selective dissipation strategy is necessary. In this regard, finite element methods allow for the construc-20 tion of diffusion operators for this purpose e.g. section 3.2.3.We can experiment with different combinations of diffusion operators including coefficients that are variable in space. While scale-selective 4 th order operators with some grid resoltuion dependence are sufficient for this work, we intend to explore a wider range of strategies based on polynomial filtering, variational multiscale methods, etc. with the goal of eliminating the tuning procedure associated with user-25 provided coefficients.
The numerical dissipation strategy implemented here serves two primary goals: 1) stabilization of the computations and 2) as a form of closure for the Euler equations solved on a truncated grid. The methods we employ allow for the construction of derivative operators of various or-30 ders in a consistent manner. Tempest features a system that allows for diffusion to be applied in a selective manner on variables that is split according to the time integration scheme.
Further experiments are necessary to test the extent of the third point above. For this work, we used a 2 nd order Strang time integration scheme (section 3.3) that was sufficiently robust to carry out all of the experiments up to 10 th order without overly restricting time step size relative 5 to the 2 nd order simulations.
The authors conclude the following based on the experiments conducted and properties of the SNFEM: 1. Staggering has been generalized to finite element methods combining continuous and discontinuous element formalisms. The result is a method that closely parallels the behavior 10 of staggered finite differences eliminating stationary modes.
2. Variable order of accuracy is an effective strategy that can compensate for limitations in grid scale resolution. However, the effects at very high order must be understood and controlled with appropriate stabilization methods. In general, "intermediate" orders (about 4 th order) are recommended with consideration for consistency in overall spatial order 15 given an IMEX partitioned architecture We note that, while the equations are formulated in a coordinate-free manner, the results given all correspond to regular Cartesian coordinates. Experiments corresponding to small planet and global domains are left for a subsequent work. However, any curved geometry with a terrainfollowing surface topography can be applied to the equations since all grid information is held 20 in the metric terms described in section 2. As such, the effects of curved geometry and variable vertical order-of-accuracy are only addressed here in the Schär and Baroclinic wave cases (using the β plane approximation). From a design perspective, metric terms are precomputed and derivative operators are built in the natural, local coordinate frame when any grid is used. 31 Geosci. Model Dev. Discuss., doi:10.5194/gmd-2015-275, 2016 Manuscript under review for journal Geosci. Model Dev. Published: 19 January 2016 c Author(s) 2016. CC-BY 3.0 License.
The Tempest codebase used to generate the results in this publication are available through the following Git repository: https://github.com/paullric/tempestmodel.

Choice of Staggering Variable
Term Table 1. Composition of interpolation I and differentiation D operators for several choices of staggering, including co-located spectral elements (SE), SNFEM with Lorenz staggering (SNFEM-LOR) and SNFEM with Charney-Phillips staggering (SNFEM-ChP). Script i denotes variables defined on interfaces (Gauss-Lobatto nodes) and n represents variables defined on model levels (Gauss nodes). For operator I and D, the subscript denotes the target (i or n) and the superscript denotes the origin.

Fig. 2.
A depiction of the derivative operators D i n and D n i , which remap from interfaces to levels and levels to interfaces, respectively. The gray line depicts a typical field variable within element a that emerges from the expansion (left) (23) or (center) (22).