Trajectory errors of different numerical integration schemes diagnosed with the MPTRAC advection module driven by ECMWF operational analyses

. The accuracy of trajectory calculations performed by Lagrangian particle dispersion models (LPDMs) depends on various factors. The optimization of numerical integration schemes used to solve the trajectory equation helps to maximize the computational efﬁciency of large-scale LPDM simulations. We analyzed global truncation errors of six explicit integration schemes of the Runge–Kutta family, which we implemented in the Massive-Parallel Trajectory Calculations (MPTRAC) advection module. The simulations were driven by wind ﬁelds from operational analysis and forecasts of the European Centre for Medium-Range Weather Forecasts (ECMWF) at T1279L137 spatial resolution and 3 h temporal sampling. We deﬁned separate test cases for 15 distinct regions of the atmosphere, covering the polar regions, the midlatitudes, and the tropics in the free troposphere, in the upper troposphere and lower stratosphere (UT/LS) region, and in the middle stratosphere. In total, more than 5000 different transport simulations were performed, covering the months of January, April, July, and October for the years 2014 and 2015. We quantiﬁed


Introduction
Lagrangian particle dispersion models (LPDMs) have proven to be useful for understanding the properties of atmospheric flows, particularly for problems related to transport, dispersion, and mixing of tracers and other atmospheric properties (e.g., Lin et al., 2012;Bowman et al., 2013).Commonly used LPDMs include the Flexible Particle (FLEXPART) model (Stohl et al., 2005), the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model (Draxler and Hess, 1998), the Numerical Atmospheric-dispersion Modelling Environment (NAME) (Jones et al., 2007), and the Stochas-Published by Copernicus Publications on behalf of the European Geosciences Union.
tic Time-Inverted Lagrangian Transport (STILT) model (Lin et al., 2003).While all these models are applied to solve similar tasks, they differ in specific choices such as the numerical methods or vertical coordinates that are used.In this study, we apply the rather new Massive-Parallel Trajectory Calculations (MPTRAC) model (Hoffmann et al., 2016), which was recently developed at the Jülich Supercomputing Centre, Germany.MPTRAC was primarily designed to conduct trajectory calculations for large-scale simulations on massiveparallel computing architectures.Computational efficiency is an important aspect during the development of such a model.
LPDMs simulate transport and diffusion of atmospheric tracers based on trajectory calculations for many air parcels that move with the fluid flow in the atmosphere.The accuracy of these calculations has been the subject of numerous studies (e.g., Kuo et al., 1985;Rolph and Draxler, 1990;Seibert, 1993;Stohl et al., 1995;Stohl and Seibert, 1998;Stohl et al., 2001;Davis and Dacre, 2009).According to reviews of Stohl (1998) and Bowman et al. (2013), trajectory calculations have errors that arise from three sources: (i) errors in the gridded winds themselves, which could result from measurement errors that enter the analyzed fields through the data assimilation process or from Eulerian model approximations, such as subgrid-scale parameterizations; (ii) sampling errors that follow from the fact that velocity fields are available only at finite spatial and temporal resolution and must be interpolated to particle locations; and (iii) local and global truncation errors that originate from the use of an approximate numerical scheme to integrate the trajectory equation in time at a single time step or integrated over multiple time steps, respectively.Bowman et al. (2013) point out that (i) and (ii) are usually the limiting factors for the accuracy of trajectory calculations, whereas high numerical accuracy and significant reduction of local truncation errors can be achieved by reducing the size of the time step of the numerical integration scheme, which will also reduce the global truncation errors.However, note that the final transport deviation is not just an accumulation of local truncation errors, because local errors will be amplified in a way that is flow-dependent.The dependency on the flow may cause significant variability in the global truncation errors, in particular if the integration covers time periods of several days.The size of the time step is usually the most important factor that controls the trade-off between numerical accuracy and computation time.An appropriate selection of the numerical scheme and the size of the time step is mandatory to maximize computational efficiency.This is particularly important for large-scale simulations, like Lagrangian transport simulations aiming at emission estimation by means of inverse modeling (e.g., Stohl et al., 2011;Heng et al., 2016) or long-term simulations coupled to chemistry climate models (e.g., Hoppe et al., 2014).
In the following, we present an assessment of six numerical integration schemes, all belonging to the class of explicit Runge-Kutta methods (e.g., Press et al., 2002;Butcher, 2008), for atmospheric trajectory calculations.Seibert (1993) studied the truncation errors of some of these schemes based on analytic flow types such as purely rotational flow, purely deformational flow, wave flow, and accelerated deformational flow.Here, we decided to focus on tests with realistic wind fields obtained from high-resolution operational analyses and forecasts provided by the European Centre for Medium-Range Weather Forecasts (ECMWF).The T1279L137 ECMWF operational analysis data used here have 16 km effective horizontal resolution and 180-750 m vertical resolution at 2-32 km altitude.We estimated the global truncation errors of the numerical methods for five latitude bands and three altitude ranges of the atmosphere, covering the free troposphere, the upper troposphere and lower stratosphere (UT/LS) region, and the middle stratosphere.The simulations were used to study the seasonal error variability for the years 2014 and 2015.We systematically assessed trade-offs between accuracy and computation time to infer the computational efficiency of the integration methods.Using the most recent meteorological data, the results will be of interest for many current and future LPDM studies using ECMWF operational data or other meteorological data sets with comparable resolution.
In Sect.2, we present the advection module of the Lagrangian particle dispersion model MPTRAC together with an overview on the meteorological data.The selected numerical integration schemes and the diagnostic variables are introduced and the experimental setup is described.Section 3 shows transport deviations from case studies followed by a general analysis of the error behavior in terms of error growth rates and region-specific characteristics.Scalability and performance on a high-performance computing system are discussed.In Sect.4, we conclude with suggestions for the bestsuited integration schemes and optimal time step choice in order to achieve the most effective simulations of large-scale problems on current high-performance computing systems.
2 Methods and data

Lagrangian particle dispersion model
In this study, we apply the Lagrangian particle dispersion model MPTRAC (Hoffmann et al., 2016) to conduct trajectory calculations.MPTRAC has been developed to support the analysis of atmospheric transport processes in the free troposphere and stratosphere.In recent studies, it has been used to perform transport simulations for volcanic eruptions and to reconstruct time-and height-resolved emission rates for these events (Heng et al., 2016;Hoffmann et al., 2016;Wu et al., 2017).An intercomparison of meteorological analyses and MPTRAC trajectory calculations with superpressure balloon observations in the Antarctic lower stratosphere was presented by Hoffmann et al. (2017).The primary task of MPTRAC is to solve the trajectory equation for atmo- In another module, turbulent diffusion and subgrid-scale wind fluctuations are simulated by adding stochastic perturbations to the trajectories, following the approach of the FLEXPART model (Stohl et al., 2005).Additional modules can simulate the sedimentation of air parcels and the decay of particle mass.For this study, only the advection module was activated.MPTRAC is particularly suited for ensemble simulations on supercomputers due to its efficient Message Passing Interface (MPI)/Open Multi-Processing (OpenMP) hybrid parallelization.

ECMWF operational analysis
Air parcel transport in MPTRAC is driven by given wind fields.In principle, any gridded data produced by general circulation models, atmospheric reanalyses, or operational analyses and forecasts can be used for this purpose.Reanalyses and forecasts benefit from well-established meteorological data assimilation methods (Rabier et al., 2000;Buizza et al., 2005) which help to better constrain the modeled circulation fields to reality.While atmospheric reanalyses (e.g., Kalnay et al., 1996;Dee et al., 2011;Rienecker et al., 2011) typically have a horizontal resolution of ≈ 100 km or less, the resolution of operational forecast products has been continuously improving during the last decades.In this study, we use horizontal and vertical winds from ECMWF operational analyses and forecasts (ECMWF, 2013(ECMWF, , 2015) ) for the years 2014 and 2015 produced in spectral truncation T1279, which corresponds to a horizontal resolution of about 16 km.Vertically, the data consist of 137 levels reaching from the surface up to 0.01 hPa.For usage with MPTRAC, the wind fields were retrieved on model levels with a longitude-latitude grid with 0.125 • × 0.125 • resolution and have been interpolated vertically to 114 pressure levels in the troposphere and stratosphere up to 5 hPa with the help of the Climate Data Operators (CDO, 2015).The 12-hourly analyses were combined with short-term forecasts in between to obtain data with a 3-hourly time step.Hoffmann et al. (2016) showed that ECMWF operational analyses and forecasts outperform current reanalysis data products in terms of transport deviations for simulations of volcanic sulfur dioxide emissions in the upper troposphere and stratosphere.Hoffmann et al. (2017) showed the same for the trajectory calculations for superpressure balloon observations in the Antarctic lower stratosphere.
Example wind fields from the operational data are presented in Fig. 1.Horizontal and vertical wind velocities from the ECMWF operational analysis for 1 January 2015 (00:00 UTC) are shown for three pressure levels in the stratosphere, in the UT/LS region, and in the free troposphere.At about 24 km altitude, the global wind fields are dominated by a meandering band of high horizontal wind speed at high northern latitudes indicating the wintertime polar vortex, together with weaker tropical easterlies.Stratospheric wind speeds in the extratropical summer hemisphere are generally slow compared to the winter hemisphere.Enhanced horizontal wind speeds at about 12 km altitude are connected with UT/LS jet streams over both hemispheres and are highest for the subtropical jet stream situated at around 30 • N with maxima over the western Pacific reaching more than 100 m s −1 locally.In the free troposphere, typical weather patterns from the moving high-and low-pressure systems over the midlatitudes exhibit the highest horizontal wind speeds but with stronger spatial variability than in the stratosphere.The vertical wind velocities mostly vary on short spatial scales of several 100 km or less, often associated with atmospheric gravity waves (e.g., Preusse et al., 2009;Hoffmann et al., 2013).In the troposphere, also contiguous areas of high vertical velocities with extensions of 1000 km or more occur close to strong pressure systems.Other high vertical wind speeds are connected with the polar vortex and the jet streams.Strong vertical winds are also observed at the Intertropical Convergence Zone (ITCZ) which is located around 10 • N-20 • S for January.Note that many of the small-scale features identified here cannot be found in lower-resolution data sets such as global meteorological reanalyses.

Numerical methods for trajectory calculations
Lagrangian particle dispersion models calculate the trajectories of individual particles or infinitesimally small air parcels over time.The trajectory of each air parcel is defined by the trajectory equation dx dt = v (x(t), t) . (1) Here, x = (x, y, z) denotes the position and v = (u, v, ω) the velocity of the air parcel at time t.In MPTRAC, the horizontal position (x, y) of the air parcel is defined by longitude and latitude, which requires spherical coordinate transformations to relate it to the horizontal wind (u, v).The vertical coordinate z is related to pressure p by the hydrostatic equation, and the vertical velocity is given by ω = dp/dt.The wind vector v at any position x is obtained by means of a 4-D linear interpolation of the meteorological data, which is a common approach in many LPDMs (Bowman et al., 2013).
The analytic solution of the trajectory equation is given by with initial position x 0 at start time t 0 and end time t 1 .In this study, the performance of six numerical schemes to solve the trajectory equation is analyzed.All schemes belong to the class of explicit Runge-Kutta methods; for an overview of these methods, see, e.g., (Butcher, 2008).The explicit Euler method likely poses the most simple way to solve the trajectory equation.The numerical solution is obtained from Eq. ( 2) by means of a first-order Taylor series approximation.Hence, it is also referred to as the "zero acceleration" scheme.The iteration scheme of the explicit Euler method (referred to as the Euler method below) is given by where t = t n+1 − t n refers to the time step.The Euler method is a first-order Runge-Kutta method; i.e., the local truncation error for each time step is on the order of O( t 2 ), whereas the total accumulated error at any given time is on the order of O( t).
MPTRAC currently uses the explicit midpoint method as its default numerical integration scheme: First, the "midpoint" is calculated using an Euler step with half the time step, t/2.The final step is calculated using the wind vector at the midpoint of the Euler step.The midpoint method is a second-order Runge-Kutta method.The local truncation error is on the order of O( t 3 ), giving a global error on the order of O( t 2 ).The method is computationally more expensive than the Euler method, but errors generally decrease faster in the limit t → 0.
The scheme of Petterssen (1940) is popular in many LPDMs (e.g., Stohl, 1998;Bowman et al., 2013).It is defined by (5) with l being an index counting the number of inner iterations carried out as part of each time step.If no inner iterations are performed, the scheme is equivalent to the Euler method.If one inner iteration is carried out, the method is also known as Heun's method, another type of a second-order explicit Runge-Kutta method.An increasing number of inner iterations can help to improve the accuracy of the solution in situations with rather complex wind fields.If the local wind field is smooth, it results in fewer iterations and less computing time.We applied the Petterssen scheme with up to seven inner iterations and did not tune the convergence limit for the inner iterations for efficiency, as we were mostly interested in good accuracy of the solutions.An additional test was conducted with a fixed number of two iterations to assess possible improvements of the Petterssen scheme with respect to Heun's method.
In this study, we also evaluated specific third-and fourthorder explicit Runge-Kutta methods (RK3 and RK4).The third-order method used here is defined by with wind vectors k i at the specific quadrature nodes: The classical fourth-order Runge-Kutta method is defined by with wind vectors For these methods, the local truncation error is on the order of O( t p+1 ), while the total accumulated error is on the order of O( t p ), with p referring to the order of the method.
The classical fourth-order Runge-Kutta method is the highest order Runge-Kutta method for which the number of function calls matches its order.It typically provides a good ratio of accuracy and computation time.Any fifth-order method requires at least six function calls, which causes more overhead.

Evaluation of trajectory calculations
A common way to compare sets of test and reference trajectories is to calculate transport deviations (Kuo et al., 1985;Stohl et al., 1995;Stohl, 1998).Transport deviations are calculated by averaging the individual distances of corresponding air parcels from the test and reference data sets at a given time.The reference data set could be the known analytical solution for an idealized test case, it could be based on observations like balloon trajectories, or it could be obtained by using a numerical integration method known to be highly accurate for real wind data.Absolute horizontal and vertical transport deviations at time t are calculated according to with X i (t), Y i (t), and Z i (t) as well as x i (t), y i (t), and z i (t) referring to the air parcel coordinates of the test and reference data sets, respectively.Each data set contains N air parcels.
To calculate the horizontal distances, we first converted the spherical coordinates of the air parcels to Cartesian coordinates and then calculated the Euclidean distance of the Cartesian coordinates.This approach approximates spherical distances with ≥ 99 % accuracy for distances up to 3000 km.Vertical distances are calculated based on pressure and the hydrostatic equation.Relative horizontal transport deviations (RHTDs) and relative vertical transport deviations (RVTDs) are calculated by dividing the absolute transport deviations by the horizontal or vertical path lengths of the trajectories, respectively.
According to the definition, the transport deviations are calculated as mean absolute deviations of the air parcel distances.Although the mean absolute deviation is a rather intuitive approach to measure statistical dispersion, we note that it is not necessarily the most robust measure, as it can be influenced significantly by outliers.Such outliers of rather large individual transport deviations exist in some of our simulations.Strong error growth of individual trajectories can occur once the test and reference trajectories are significantly separated from each other, meaning that the air parcels are located in completely different wind regimes.To mitigate this issue, we decided to report also the median of the absolute and relative transport deviations of the individual air parcels as an additional statistical measure.The median absolute deviation is a much more robust statistical measure.In all cases 580 T. Rößler et al.: Truncation errors of trajectory calculations considered here, we found that the median absolute deviation is smaller than the mean absolute deviation.This indicates that the distributions of transport deviations are skewed towards larger outliers.Note that skewed distributions of transport deviations have also been reported in other LPDM intercomparison and validation studies (e.g., Stohl et al., 2001).

Considerations on time steps and error limits
Since our test cases are based on real meteorological data, we obtained the reference data to calculate the transport deviations using the most accurate integration method available to us with a sufficiently short time step.Sensitivity tests using variable time steps down to 30 s showed that the numerical solution from the RK4 method converges for time steps of 60 s or less, in the sense that transport deviations relative to simulations with a time step of 120 s do not change significantly.In particular, comparing simulations with time steps of 120 s and 60 s, the median horizontal deviation is less than 7 km and the median vertical deviation is less than 10 m for up to 10 days of simulation time.Alternatively, following Seibert (1993), we may also evaluate the Courant-Friedrichs-Lewy (CFL) criterion, t ≤ x/u max , to establish a time step estimate for the reference simulations.Based on an effective horizontal resolution of x ≈ 16 km and a maximum horizontal wind speed of u max ≈ 120 m s −1 , we find that t ≤ 130 s is needed to ensure sufficiently fine sampling of the ECMWF data.Therefore, we selected a time step of 60 s to calculate the reference trajectories.
The maximum tolerable error limits for trajectory calculations depend on the individual application of course.However, as a guideline, we here provide physically motivated error limits that are of particular interest regarding LPDM simulations.LPDMs consider both advection and diffusion to simulate dispersion.Clearly, the numerical errors of the trajectory calculations representing the advective part should be smaller than the particle spread caused by diffusion.Considering a simple model of Gaussian diffusion, the standard deviations of the horizontal and vertical particle distributions are given by σ x = √ 2D x t and σ z = √ 2D z t, respectively.Typical vertical diffusivity coefficients are D z ≈ 1 m 2 s −1 in the free troposphere (Pisso et al., 2009) and D z ≈ 0.1 m 2 s −1 in the lower stratosphere (Legras et al., 2003).Assuming a typical scale ratio of horizontal to vertical wind velocity of ≈ 200 (Pisso et al., 2009), corresponding horizontal diffusivity coefficients are D x ≈ 40 000 m 2 s −1 in the troposphere and D x ≈ 4000 m 2 s −1 in the stratosphere.The corresponding horizontal spread after 10 days is σ x ≈ 260 km in the troposphere and σ x ≈ 85 km in the stratosphere.The vertical spread is σ z ≈ 1300 m in the troposphere and σ z ≈ 415 m in the stratosphere.However, note that these values should only be considered as a guideline.Local diffusivities may be an order of magnitude smaller or larger than these values, depending on the individual atmospheric conditions.

Experiment configuration
In this study, we analyzed the errors of trajectory calculations in 15 regions of the atmosphere, covering rather distinct conditions in terms of pressure, temperature, and winds.The globe was divided into five latitude bands: polar latitudes (90 to 65 • S and 65 to 90 • N; 23.9 × 10 6 km 2 surface area in each hemisphere), midlatitudes (65 to 20 • S and 20 to 65 • N; 143.9 × 10 6 km 2 surface area in each hemisphere), and tropical latitudes (20 • S to 20 • N; 174.2 × 10 6 km 2 total surface area).The selected three altitude layers cover the free troposphere (2 to 8 km; 24 ECMWF model levels), the UT/LS region (8 to 16 km; 24 levels), and the lower and middle stratosphere (16 to 32 km; 31 levels).These regions are of major interest regarding various applications of transport simulations using MPTRAC and other LPDMs.The planetary boundary layer was not considered here, because MPTRAC lacks a more sophisticated parameterization scheme for diffusion needed for simulations in this layer.As the atmospheric conditions depend on the season and vary from year to year, we selected 1 January, 1 April, 1 July, and 1 October of the years 2014 and 2015 as start times for the simulations.All simulations cover a time period of 10 days.In each region, 500 000 trajectory seeds were uniformly distributed.Although this is already a large number of trajectory seeds, this is still undersampling of the effective resolution of the ECMWF data by as much as a factor of 4.5 in the polar troposphere and up to a factor of 42 in the tropical stratosphere.Nevertheless, initial tests with different numbers of trajectory seeds showed that our results are statistically significant.In all regions, we tested time steps of 60, 120, 240, 480, 900, 1800, and 3600 s for each of the six integration schemes.In total, more than 5000 individual transport simulations were performed, consisting of more than 2.5 × 10 9 individual trajectories.
Here, the atmospheric regions have been defined by means of fixed latitude and altitude boundaries.This is arguably a rather simple approach compared to physically motivated separation criteria based on equivalent latitudes or the dynamical tropopause.However, the simple approach may still reflect how the model is initialized and used in different applications in practice.An important consequence of our approach is that part of the air parcels leave their initial region during the course of simulation.Table 1 provides the fraction of air parcels that remain in their initial region after 5 and 10 days of simulation time.In the stratosphere, we found fractions of 48-88 % after 5 days and 36-78 % after 10 days in the different latitude bands.In the UT/LS region, the fractions are lower, i.e., 32-55 % after 5 days and 14-40 % after 10 days.In the troposphere, the fractions are even lower, i.e., 32-48 % after 5 days and 10-24 % after 10 days.The lowest fractions are found for the polar latitudes for all altitude layers, being the smallest regions in terms of surface area.The horizontal wind maps shown in Fig. 1 suggest that planetary wave activity and meandering of the westerly jets between midlatitudes and high latitudes are responsible for the low fractions at polar latitudes.We also found that the fractions decrease from the stratosphere to the troposphere.This may be attributed to stronger fluctuations in the wind field associated with deep convection and eddy diffusivity in the troposphere.Although a substantial fraction of air parcels may leave their initial region during the simulations, we decided to not filter and exclude those trajectories in our analyses.
The trajectories that leave the regions are more likely related to higher wind velocities.Excluding those trajectories would cause a strong bias towards short trajectories, representing only the lower wind velocities in the statistical analysis.

Case studies of trajectory calculations
First, we present two case studies that illustrate some of the common features related to trajectory calculations using different numerical integration schemes.Figure 2 shows maps of trajectories that were calculated using the six numerical schemes introduced in Sect.2.3 with a time step of 120 s. Figure 3 provides the corresponding absolute transport deviations with respect to the reference calculations (RK4 method with 60 s time step).Both case studies show trajectories that were launched on 1 January 2014 at about 10 km altitude.In the example for the Northern Hemisphere, the trajectories calculated using the different schemes agree well (AHTD ≤ 200 km and AVTD ≤ 600 m) for the first 6 days.After this point, the Euler solution shows rapidly growing errors, with an AHTD up to 3900 km and an AVTD up to 4800 m after 8 days.The Petterssen scheme and Heun's method yield AHTDs ≤ 200 km and AVTDs ≤ 800 m for about 8 days, before they diverge from the reference calculation.The midpoint and RK3 method provide AHTDs ≤ 200 km and AVTDs ≤ 800 m until the end of the simulation (after 10 days).The example for the Southern Hemisphere illustrates that the onset of rapid er-ror growth may occur much earlier in time.Here, an AHTD of 200 km and an AVTD of 800 m are already exceeded after 3 days by the Euler solution and after 4-6 days by the solutions from Heun's method and the Petterssen scheme.However, although error growth starts earlier, in the Southern Hemisphere example, the maximum AHTD remains below 2200 km and the AVTD below 2200 m, which is lower by a factor of 2 compared with the Northern Hemisphere example.Relative transport deviations between the examples are more similar, as the horizontal trajectory length is about 36 400 km in the Northern Hemisphere but only 14 400 km in the Southern Hemisphere.
A common feature of the trajectory calculations we found in the case studies and also in many other situations is that the numerical integration schemes yield solutions that typically agree well up to a specific point in time before rapid error growth begins.Errors grow slowly in the beginning, but at some point, e.g., if there is strong wind shear locally, the trajectories may begin to diverge significantly.Shorter time steps or high-order integration schemes are needed to properly cope with such situations.The case studies also show that transport deviations do not necessarily grow monotonically over time.Trajectories may first diverge from and then reapproach the reference data.Individual local wind fields can bring trajectories back together by chance.The case studies also seem to suggest that vertical errors start to grow earlier than horizontal errors.Furthermore, we note that the Petterssen scheme mostly provides smaller errors than Heun's method.This was expected because the Petterssen scheme provides iterative refinements compared with Heun's method.In both case studies, the midpoint method performs better than the other second-order methods.However, this is not valid in general; we also found counterexamples with the midpoint method performing worse than the other secondorder methods.Both examples generally exhibit large variability of the errors.This indicates that transport deviations need to be calculated for large numbers of air parcels to obtain statistically meaningful results.

Growth rates of trajectory errors
In this section, we discuss the temporal growth rates of the trajectory calculation errors from a more general point of view.Although the magnitude of the truncation errors varies largely between the schemes and with the time step used for numerical integration, we found that the transport deviations typically grow rather monotonically over time if large numbers of particles are considered.Hence, we decided to present errors here using a fixed time step of 120 s for the numerical integration as a representative example.As the magnitude of the calculation errors varies largely between the troposphere and stratosphere, we present the analysis for both regions separately.The results for the UT/LS region are not shown, as they just fall in between.We calculated combined transport deviations considering all seasons and latitude bands in the given altitude range.A more detailed analysis of the total errors in individual latitude bands and for different seasons will follow in Sect.3.3.The influence of the choice of the time step on the accuracy and performance of the trajectory calculations will be discussed in Sect.3.4.
Figure 4 shows the AHTDs and AVTDs of the trajectory calculations for the troposphere and stratosphere as obtained with the different numerical schemes.A common feature is the clustering of the results into three groups, which we attribute to the numerical order of the integration schemes.The largest truncation errors are produced by the Euler method, which is a first-order scheme.After 10 days of simulation time, we found absolute (relative) horizontal transport deviations of 1450 km (14.6 %) in the troposphere and 170 km (1.4 %) in the stratosphere as well as vertical trans-port deviations of 1150 m (13.3 %) in the troposphere and 194 m (3.5 %) in the stratosphere.The errors derived with the second-order methods (midpoint, Heun, and Petterssen scheme) are typically 1-2 orders of magnitude smaller compared to the Euler method.For the midpoint method, we found horizontal transport deviations of up to 320 km (3.4 %) in the troposphere and 11 km (0.086 %) in the stratosphere as well as vertical transport deviations of up to 361 m (3.9 %) in the troposphere and 14 m (0.18 %) in the stratosphere.The RK3 and RK4 methods cluster in the third group, with truncation errors being another factor of 2-4 lower than for the second-order schemes.For the RK3 method, we found horizontal transport deviations of up to 228 km (2.5 %) in the troposphere and 6.7 km (0.048 %) in the stratosphere as well as vertical transport deviations of up to 272 m (2.9 %) in the troposphere and 8 m (0.099 %) in the stratosphere.We attribute the fact that there are nearly no differences between the RK3 and RK4 methods to the use of the 4-D linear interpolation scheme for the meteorological data.Any high-order numerical integration scheme is not expected to provide large benefits in combination with a low-order interpolation scheme.
From the data presented in Fig. 4, we can also estimate the temporal growth rates of the calculation errors as well as the leading polynomial order of the error growth.We found that error growth typically starts off linear, i.e., with a polynomial order close to 1, but gets nonlinear already after 1-2 days, with the polynomial order getting significantly larger than one.For the troposphere, we found a maximum polynomial order of ≈ 3 after 5 days for the AHTDs and of an order ≈ 2 after 4 days for the AVTDs for the Euler method.The higher-order methods show nonlinearity at even higher levels, with a maximum polynomial order of ≈ 5 after 8 days for the AHTDs and of ≈ 4 after 6 days for the AVTDs for the RK3 and RK4 methods.The second-order methods are in between.Due to this nonlinear error growth, the error growth rates also increase rapidly over time until they reach their maxima after 10 days; maximal error growth rates for three selected methods representing first-to third-order Runge-Kutta schemes are given in Table 2.During the course of the simulations, the observed error growth is largely dependent on atmospheric flow patterns.To better assess the influence of individual atmospheric conditions on the trajectory errors, we will in the following discriminate between errors after 1 and 10 days, respectively.

Spatial and temporal variations of trajectory errors
For a more detailed analysis of the regional and seasonal variations of the total trajectory errors, we focus on the errors after 10 days of simulation time for simulations using the third-order Runge-Kutta method with a single time step of 120 s.This is considered to be a representative example, as other schemes and time steps show similar variations.We calculated individual transport deviations for all 15 altitudelatitude regions and for simulations starting at the beginning of January, April, July, and October 2014 and 2015, respectively.The results are shown in Figs. 5 and 6.
Our simulations show that horizontal errors increase from typically 20 km in the stratosphere to 100 km in the UT/LS region and about 200 km in the troposphere.The correspond-  ing maximum AHTDs are 116, 177, and 470 km, respectively.The corresponding relative errors increase from 0.0 to 0.4 % in the stratosphere, to around 0.1 to 1.0 % in the UT/LS region, and 1.0 to 4.0 % in the troposphere.As shown in Fig. 4, the calculation errors in the stratosphere comply on average with the error limit defined in Sect.2.5, while the error limit in the troposphere is reached after about 10 days.However, as can be seen from Figs. 5 and 6, the total errors  can vary considerably seasonally and from year to year as well as between the regions, causing maximum errors of specific regions to exceed the defined limit.In the stratosphere, generally the horizontal transport deviations are smaller than 40 km, far below the error limit.An exception is found for the Northern Hemisphere polar stratosphere in January 2015 with an AHTD of 116 km.Errors are growing from the stratosphere towards the troposphere.While stratospheric wind fields are rather uniform, fluctuations of the wind field become stronger and more frequent at lower altitudes causing wind speeds and wind directions to vary more strongly.So, even if travel distances in the troposphere may be relatively short, transport deviations typically increase with decreasing altitude.
The trajectory errors at all altitude layers vary with latitude.We focus on the horizontal errors in this case, but vertical errors show similar results.The largest trajectory errors in the troposphere are found at northern midlatitudes with errors between 245 and 470 km.The meteorological conditions in tropospheric midlatitudes were expected to cause relatively large errors because of the nature of global circulation: Rossby waves and baroclinic instability occurring predominantly in this region come along with highly variable wind patterns.In addition, stronger fluctuations are expected in the northern midlatitudes compared to the southern midlatitudes due to the larger land-sea ratio and more complex orography of the Northern Hemisphere.The errors obtained in the polar regions are second largest with an average over all seasonal samples of around 200 km and peak errors in polar summer of up to 380 km.The simulations for the tropics and southern midlatitudes show smaller errors of less than 200 km and adhere to the error limit in all test cases.The simulations for the UT/LS region have their largest AHTDs in the northern midlatitudes with 95 to 177 km.These errors are caused by the north-south meandering of the jet (Woollings et al., 2014) and higher small-scale variability of the wind field in this region.The second largest errors of the simulations in the UT/LS region occur in the tropics with about 75 km on average followed by the Arctic and southern midlatitudes with about 50 km on average.Simulations that cover the Antarctic show the smallest errors in the UT/LS region with about 30 km on average.The errors of the simulations in the stratosphere are typically below 25 km, except for January 2015.Stratospheric trajectory errors in the tropics are larger than in the other regions, which is probably due to the close vicinity of the tropical tropopause, which reaches an average altitude of 16 km near the ITCZ.
The variation of the horizontal errors also exhibits some seasonal dependencies.This is most prominent for the northern midlatitudes, where maximum errors in all cases occur in January.During Northern Hemisphere wintertime, land-sea temperature differences as well as the temperature gradient between the Arctic and the subtropical regions are largest, which allows for more intense and complex dynamic patterns to occur than in summer.Our test cases for the Southern Hemisphere and for the Arctic region do not show a seasonal behavior as clearly as one could expect.We need to stress that each simulation lasts only 10 days, which is a relatively short time interval to analyze seasonal effects.Fast temporal variations and changes in medium-range weather patterns can blur out the impact of seasons that is observed here.The small error differences between polar summer and winter additionally can be attributed to the small fraction of parcels that stay in that region.Only 13 % of the parcels that are represented by the statistic remained in the polar regions after 10 days of simulation, which weakens our statistics.
Most of our simulations for the corresponding months in 2014 and 2015 differ by less than 20 %; only deviations of a few individual months differ more strongly but in a similar range than the seasonal variations.The most striking differences occur in January in the stratosphere of the northern polar region.The simulation of 2014 shows small errors of 4 km, while the simulation of 2015 reaches an error of up to 116 km and exceeds the stratospheric error limit.This particular behavior (which is also present in Fig. 6 with an AVTD of 132 m) may be related to a specific meteorological situation during the winter 2014/2015, where a sudden stratospheric warming event occurred during the first days of January 2015 and temporarily caused nearly a split of the Arctic vortex in the lower stratosphere (Manney et al., 2015).Significant disturbances of the wind field during this event may be a reason why trajectory calculations exhibit larger errors.
Vertical and horizontal errors behave very similarly; extrema are found in the same regions.The errors in the stratosphere are usually very small and below 10 m.Typical errors in the UT/LS region and in the troposphere are about 100 and 250 m, respectively.Corresponding maximum errors are 130 m in the stratosphere, 168 m in the UT/LS region, and 470 m in the troposphere.The vertical error limits of 415 m in the stratosphere and 1300 m in the troposphere are easily adhered to.Relative vertical errors range 0.0-0.9% in the stratosphere, 0.2-1.6 % in the UT/LS region, and 1.2-4.4% in the troposphere.
We also calculated the horizontal and vertical median errors for the regions.In general, horizontal and vertical median errors are much smaller than the mean errors.Small median deviations show that most trajectories closely follow the reference.Those parcels that part from the reference usually diverge strongly, which leads to a high average deviation.The median error is somewhat larger for simulations in the troposphere, where particle paths are more likely being affected by synoptic-scale fluctuations of the wind field.
To summarize, the relative errors of 2-4 % in the troposphere show that this layer is more difficult to solve and that relatively large uncertainties remain even if the absolute error limit is adhered to.The stratospheric relative errors of about 1 % are less critical for the integration method.The large difference of the trajectory errors between altitude regions suggests that lower-order integration schemes or larger time steps could be used in the stratosphere to save compu-tation time without causing significant errors.Tropospheric northern midlatitudes are the most challenging areas for numerical integration.

Computational efficiency
In this section, we focus on the computational efficiency of the numerical integration schemes, which is assessed in terms of the trade-off between computational accuracy of and the computation time required for the trajectory calculations.As the computational efficiency depends, to some extent, on the problem size and the computer architecture that is applied, we will discuss the scalability of the application first.Our scalability tests were performed on the Jülich Research on Exascale Cluster Architectures (JURECA) supercomputer (Jülich Supercomputing Centre, 2016).JURECA is equipped with two Intel Xeon E5-2680 v3 Haswell central processing units (CPUs) per compute node.Each node is equipped with 2 × 12 = 24 physical compute cores, operating at 2.5 GHz clock speed.The CPUs support two-way simultaneous multithreading (SMT); i.e., each node provides up to 48 logical cores.A runtime improvement of up to 50 % can be expected due to the SMT feature. 1  As an example, Fig. 7 shows results of scaling tests of the advection module using the midpoint scheme with a time step of 120 s for different numbers of particles and OpenMP threads.Note that the MPI parallelization of MPTRAC is only used for ensemble simulations, which are conducted independently on multiple nodes.The scalability of the MPI parallelization is mostly limited by I/O issues, which is beyond the scope of this study.For the OpenMP parallelization, we found that the CPU time scales linearly with the number of particles for large numbers of particles (on the order of 10 4 to 10 7 ).The computation per time step and particle requires between 0.31 × 10 −6 and 9.0 × 10 −6 s computing time, depending on the number of the OpenMP threads.For small numbers of particles (less than or equal to 10 4 ), the minimum computing time is limited by a constant offset of 6.3 × 10 −5 s to 4.3 × 10 −3 s (depending on the number of threads) that can be attributed to the OpenMP parallelization overhead and load imbalances.Figure 7 also shows the speedup of the OpenMP parallelization for growing numbers of threads.We found that the advection module of MPTRAC provides good to excellent parallel efficiency for large numbers of particles.The computational efficiency is about 83 % for up to 24 physical threads and for 10 5 to 10 6 particles.It is also found that the code provides additional speedup if the SMT capabilities of the compute nodes are used, in particular for very large numbers of particles (on the order of 10 6 to 10 7 ).For smaller number of particles (10 4 or less), the speedup is limited by the overhead of the OpenMP parallelization and by load imbalances, which can also become  significant for larger numbers of parcels if SMT is not enabled for all cores jointly.The fastest simulations for a set of about 10 2 parcels are possible with four cores.A total of 12 cores should be used when 10 3 parcels are simulated.For 10 4 parcels, the simulations with 24 cores are fastest.For 10 5 or more parcels, all 48 cores (which includes SMT) should be used.LPDM simulations will typically use large numbers of particles (more than 10 5 ) to obtain statistically significant results.MPTRAC will not be affected by significant scaling issues on the JURECA supercomputer in this regime.
As a measure of computational efficiency, Fig. 8 illustrates the trade-off between computational accuracy, in terms of the AHTD, and computational time.In particular, Fig. 8 illustrates how this trade-off depends on the selection of the time step for the different integration schemes.Note that the hardware, especially the memory cache, affects the six integration schemes differently.A single call to the wind interpola-  Horizontal lines refer to the maximum tolerable error limits as defined in Sect.2.5.The Petterssen a scheme always uses two inner iterations, which is one more than for Heun's method.The Petterssen b scheme uses up to seven iterations with strict convergence criteria.
tion function is up to 50 % cheaper for a higher-order method compared to Euler's method, because the cache is used more efficiently.Results are shown separately for the troposphere and stratosphere after 24 h and 10 days.The trajectory errors after 24 h are shown, as they are expected to be less affected by individual meteorological conditions than the errors after 10 days.The errors of the higher-order integration schemes with a time step of 120 s are on the order of 80-200 m in the troposphere.The errors of the simulations in the stratosphere are about 10 times smaller and the discrepancy between the error in the troposphere and stratosphere becomes even larger to a factor of about 25 when the global truncation errors after 10 days are analyzed.The errors in the troposphere using a time step of 120 s are on the order of 200-350 km af-ter 10 days, which shows the nonlinear error growth and the large impact of the atmospheric conditions on trajectory errors.The troposphere is much more challenging for the integration methods than the stratosphere, as already discussed in Sect.3.2 and 3.3.From this analysis, we find that despite being the fastest method, the Euler method usually has the lowest computational efficiency because of its low accuracy.The second-order methods as well as the RK3 and RK4 methods yield much smaller truncation errors, in particular for short time steps.Among the second-order methods, our implementation of the Petterssen scheme has the lowest computational efficiency, which is due to the fact that we tuned the convergence criteria for this method for accuracy rather than speed.The accuracy of the Petterssen scheme with one iteration (Heun's method) is somewhat worse than the midpoint method.When two iterations of the Petterssen scheme are computed, the transport deviations are closer to those obtained with the midpoint method.The best efficiency, which we define as lowest computational costs when adhering to our error limit, is mostly obtained with the midpoint and RK3 methods.The additional iterations of the Petterssen scheme improve the accuracy, but they are too computationally expensive for our model.In general, a well-defined convergence limit for the number of iterations is needed for an efficient application.The RK4 method does not provide any benefits in combination with the low-order 4-D linear interpolation scheme for the meteorological data.In fact, the RK4 method is slightly less efficient than the RK3 method due to the higher numerical costs.
Figure 8 also allows us to more accurately establish the individual optimal time steps of the integration methods with respect to the error limits defined in Sect.2.5.This approach is similar to the well-known discrepancy principle (Engl et al., 1996), where the time step is considered as a tuning factor so that the truncation errors of the methods match an a priori known error bound.To provide estimates for all methods, we use linear extra-and interpolation to determine the largest time step that still adheres to the error limit.After 24 h, when trajectory errors are mostly influenced by truncation errors, the diffusivity-based error limit is not particularly strict, which allows us to use large time steps for the calculations.In fact, even the results obtained with the longest time step of 1 h adhere to the error limit for the higher-order methods as shown in Fig. 8.After 10 days, the diffusivitybased error limit is a lot more difficult to achieve.In addition to the truncation errors, the trajectory errors are also significantly affected by the atmospheric flow patterns (e.g., diffluent flows or bifurcations).For the troposphere, we derived time steps of about 100 s for the Petterssen scheme, Heun's method, and the midpoint method, and about 170 s for the RK3 and RK4 methods.For the stratosphere, we found time steps of about 800 s for the Petterssen scheme, Heun's method, and the midpoint method, and time steps of about 1100 s for the RK3 and RK4 methods.

Summary and conclusions
In this study, we characterized global truncation errors of trajectory calculations after 1 and 10 days in the free troposphere, in the UT/LS region, and in the stratosphere.Transport simulations were conducted with the LPDM MPTRAC, driven by wind fields from T1279L137 ECMWF operational analyses and forecasts in 2014 and 2015, with an effective horizontal resolution of about 16 km and 3 h time intervals.We analyzed the computational performance of the simulations in terms of accuracy and CPU-time costs of six explicit integration schemes that belong to the Runge-Kutta family.The truncation errors of the schemes for a given time step were found to cluster into three groups that are related to the order of the method: (i) the first-order Euler method, (ii) the second-order methods (midpoint method, Heun's method, and Petterssen's scheme), and (iii) the higher-order methods, which are the common RK3 and RK4 methods.Different methods within each group provide similar accuracy in terms of error growth rates and transport deviations.
Based on more than 5000 individual transport simulations, each consisting of 500 000 trajectories, we further analyzed horizontal and vertical transport deviations in relation to altitude, latitude, as well as seasonal and year-to-year variability.The trajectory errors after 24 h were analyzed as they are expected to be less affected by individual flow patterns.The errors of the simulations in the troposphere have 10 times larger errors compared to the simulations in the stratosphere.After 10 days, the trajectory errors vary more substantially inside the climatological regions because of the stronger influence of individual atmospheric flow patterns.We found that tropospheric simulations require more accurate integration methods or significantly shorter time steps to keep errors within physically motivated error limits than simulations for the stratosphere.We attribute this to larger small-scale variations in the high-resolution meteorological input data.Calculation errors also depend on the latitude band, with the northern midlatitudes having the largest errors in each altitude layer.Seasonal error variations and differences from year to year are clearly visible from our simulations, but in some cases the number of samples still seems to be too small to deduce robust statistics.One example is the large errors that are associated with a sudden stratospheric warming in the northern stratosphere in January 2015, which suggests that part of the total error is due to situation-dependent factors.However, a robust feature seems to be a northern midlatitude winter maximum in the troposphere and stratosphere, existent in both years (2014 and 2015).
All integration methods discussed here are in principle suited and have already been used for Lagrangian particle dispersion and trajectory model simulations.To decide which method is most efficient in state-of-the-art high-performance computing systems, we analyzed the trade-off between computational accuracy and computational time.This trade-off is largely controlled by the time step used for numerical integration.The Euler method requires very short time steps to achieve reasonably accurate results and is generally not considered to be an efficient method.Heun's method and the iterative Petterssen scheme are more accurate at the same computational costs.The midpoint method and the RK3 method usually provided the most efficient simulations with MPTRAC; i.e., these methods provide the most accurate results at the lowest computational costs.Note that the RK4 method is slightly more expensive than the RK3 method if it is applied together with a low-order linear interpolation scheme for the meteorological data.
This study uses up-to-date meteorological data as provided by current global weather forecast models, with a spatial res-olution that is much finer than in former trajectory studies (Seibert, 1993;Stohl and Seibert, 1998;Stohl et al., 2001;Harris et al., 2005).Previous studies suggest that a time interval of 3 h for the meteorological data may be too large for the fine spatial resolution (Stohl et al., 1995;Brioude et al., 2012;Bowman et al., 2013).Pisso et al. (2010) found that using hourly wind fields in their ensemble LPDM reconstructions did not add significant information on the largescale atmospheric dynamics and suggested to use 3-hourly wind fields for ECMWF operational data in T511 spectral resolution (corresponding to a horizontal resolution of about 40 km).Given the higher spatial resolution of meteorological data and the short integration time steps used here, it would be interesting to study the effect of shorter time intervals for the driving data on the trajectory errors; however, for the time being, we decided to restrict ourselves to the temporal resolution as specified in the user guide to ECMWF forecast products even if finer data are available.
The high resolution requires adjustment of the time step, as the commonly used time steps of 10 min to 1 h are far beyond yielding convergence with high-resolution meteorological data.Given an effective horizontal resolution of 16 km and applying the CFL criterion, the time step needs to be shorter than about 130 s to achieve convergence.From our simulations, we found that time steps of 100 s for the midpoint method and 170 s for the RK3 method provide accurate results in the troposphere for up to 10 days.Purely stratospheric applications can be solved with time steps of 800 s (midpoint method) and 1100 s (RK3 method) because of lower total errors in this altitude layer.
In this study, we considered a range of popular and wellestablished integration schemes for trajectory calculations in LPDMs.However, the large variability of regional and seasonal errors found here suggests that applications may benefit from more advanced numerical techniques.Adaptive quadrature by means of variable time stepping as recommended by earlier studies (Walmsley and Mailhot, 1983;Maryon and Heasman, 1988;Seibert, 1993) could be taken up for future research.
Code and data availability.We downloaded operational analyses and forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF, 2013(ECMWF, , 2015)).See reference for further details on data availability and restrictions.ECMWF data have been processed for usage with MPTRAC by means of the Climate Data Operators (CDO, 2015).The version of the MPTRAC model that was used for this study along with the model initializations are available under the terms and conditions of the GNU General Public License, version 3, from the repository at https: //github.com/slcs-jsc/mptrac-advect(last access: 3 May 2017).
Competing interests.The authors declare that they have no conflict of interest.

578T.
Figure 1.ECMWF operational analysis horizontal wind speed (a, c, e) and vertical velocity (b, d, f) at about 24 km (a, b), 12 km (c, d), and 5 km (e, f) altitude on 1 January 2015 (00:00 UTC).Black lines indicate the latitude bands considered in our analysis.Note that color bar ranges have been adjusted for each height level to make local structures visible.

Figure 2 .
Figure 2. Examples of trajectory calculations using different numerical integration schemes.Circles mark the start positions of the trajectories.Trajectories were launched at an altitude of 10.8 km (a) and 9.7 km (b).The start time is 1 January 2014 (00:00 UTC) for both.Triangles mark trajectory positions at 00:00 UTC on each day.

Figure 3 .
Figure 3. Absolute horizontal (a, b) and vertical (c, d) transport deviations for the case studies for the Northern Hemisphere (a, c) and Southern Hemisphere (b, d) presented in Fig. 2. Please note the different ranges of the y axes.Results of the RK3 and RK4 methods are close to zero in most cases.

Figure 4 .
Figure 4. Absolute horizontal (a, c) and vertical (b, d) transport deviations of trajectory calculations for the stratosphere (a, b) and troposphere (c, d).The trajectory calculations are based on different numerical schemes but use the same time step ( t = 120 s).Grey lines show error limits based on particle diffusion.

Figure 5 .
Figure 5. Mean (thin bars) and median (thick bars) horizontal transport deviations after 10 days of simulation time in different regions for the RK3 method and 120 s time step.Orange lines show the averages of the four months (January, April, July, and October) and both years (2014 and 2015).Grey lines show error limits based on diffusion.

Figure 6 .
Figure 6.Same as Fig. 5 but for vertical transport deviations.Error limits based on diffusion are about 1300 m for the troposphere and 415 m for the stratosphere, which is beyond the AVTD ranges shown here.

Figure 7 .
Figure 7. Scaling behavior in terms of CPU time (a) and speedup of the code (b) used to calculate trajectories with the midpoint method and a time step of 120 s for different numbers of particles (n p ) and OpenMP threads (n t ).Colored curves refer to different numbers of OpenMP threads (a) or different total numbers of particles (b).Dotted lines show ideal scaling behavior.

Figure 8 .
Figure8.Trade-off between computational accuracy and total CPU-time requirements of the trajectory calculations after 24 h (a, c) and 10 days (b, d).Colored curves refer to different integration schemes.Dots along the curves indicate time steps of 3600, 1800, 900, 480, 240, and 120 s (from left to right).Horizontal lines refer to the maximum tolerable error limits as defined in Sect.2.5.The Petterssen a scheme always uses two inner iterations, which is one more than for Heun's method.The Petterssen b scheme uses up to seven iterations with strict convergence criteria.

Table 1 .
Fractions of air parcels remaining in initial regions during the course of the simulations.SH indicates the Southern Hemisphere and NH indicates the Northern Hemisphere.

Table 2 .
Maximal error growth rates of trajectories.Relative growth rates in pp day −1 are given in parenthesis.