The accuracy of trajectory calculations performed by Lagrangian particle dispersion models (LPDMs) depends on various factors. The optimization of numerical integration schemes used to solve the trajectory equation helps to maximize the computational efficiency of large-scale LPDM simulations. We analyzed global truncation errors of six explicit integration schemes of the Runge–Kutta family, which we implemented in the Massive-Parallel Trajectory Calculations (MPTRAC) advection module. The simulations were driven by wind fields from operational analysis and forecasts of the European Centre for Medium-Range Weather Forecasts (ECMWF) at T1279L137 spatial resolution and 3 h temporal sampling. We defined separate test cases for 15 distinct regions of the atmosphere, covering the polar regions, the midlatitudes, and the tropics in the free troposphere, in the upper troposphere and lower stratosphere (UT/LS) region, and in the middle stratosphere. In total, more than 5000 different transport simulations were performed, covering the months of January, April, July, and October for the years 2014 and 2015. We quantified the accuracy of the trajectories by calculating transport deviations with respect to reference simulations using a fourth-order Runge–Kutta integration scheme with a sufficiently fine time step. Transport deviations were assessed with respect to error limits based on turbulent diffusion. Independent of the numerical scheme, the global truncation errors vary significantly between the different regions. Horizontal transport deviations in the stratosphere are typically an order of magnitude smaller compared with the free troposphere. We found that the truncation errors of the six numerical schemes fall into three distinct groups, which mostly depend on the numerical order of the scheme. Schemes of the same order differ little in accuracy, but some methods need less computational time, which gives them an advantage in efficiency. 
The selection of the integration scheme and the appropriate time step should take into account the typical altitude range as well as the total length of the simulations to achieve the most efficient simulations. As a general recommendation, we suggest the third-order Runge–Kutta method with a time step of 170 s or the midpoint scheme with a time step of 100 s for efficient simulations of up to 10 days of simulation time for the specific ECMWF high-resolution data set considered in this study. Purely stratospheric simulations can use significantly larger time steps of 800 and 1100 s for the midpoint scheme and the third-order Runge–Kutta method, respectively.

Lagrangian particle dispersion models (LPDMs) have proven to be
useful for understanding the properties of atmospheric flows, particularly
for problems related to transport, dispersion, and mixing of tracers and
other atmospheric properties

LPDMs simulate transport and diffusion of atmospheric tracers based on
trajectory calculations for many air parcels that move with the fluid
flow in the atmosphere. The accuracy of these calculations has been
the subject of numerous studies

In the following, we present an assessment of six numerical integration
schemes, all belonging to the class of explicit Runge–Kutta methods

In Sect. 2, we present the advection module of the Lagrangian particle dispersion model MPTRAC together with an overview of the meteorological data. The selected numerical integration schemes and the diagnostic variables are introduced and the experimental setup is described. Section 3 shows transport deviations from case studies followed by a general analysis of the error behavior in terms of error growth rates and region-specific characteristics. Scalability and performance on a high-performance computing system are discussed. In Sect. 4, we conclude with suggestions for the best-suited integration schemes and optimal time step choice in order to achieve the most efficient simulations of large-scale problems on current high-performance computing systems.

In this study, we apply the Lagrangian particle dispersion model MPTRAC

Air parcel transport in MPTRAC is driven by given wind fields. In principle,
any gridded data produced by general circulation models, atmospheric
reanalyses, or operational analyses and forecasts can be used for this
purpose. Reanalyses and forecasts benefit from well-established
meteorological data assimilation methods

ECMWF operational analysis horizontal wind speed

Example wind fields from the operational data are presented in
Fig.

Lagrangian particle dispersion models calculate the trajectories of
individual particles or infinitesimally small air parcels over time.
The trajectory of each air parcel is defined by the trajectory
equation

The explicit Euler method is arguably the simplest way to solve the
trajectory equation. The numerical solution is obtained from
Eq. (

MPTRAC currently uses the explicit midpoint method as its default
numerical integration scheme:
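
As a minimal illustration of how the explicit Euler and explicit midpoint updates differ, the following sketch integrates a single air parcel in a made-up wind field. It is not MPTRAC code: the function `wind` is a hypothetical analytic solid-body rotation in a 2-D Cartesian plane (MPTRAC instead interpolates gridded meteorological winds), and all function names are assumptions chosen for this example. Because the exact trajectory returns to its start point after one rotation period, the distance from the start point after one day is the global truncation error.

```python
import math

OMEGA = 2.0 * math.pi / 86400.0  # assumed rotation rate: one revolution per day (rad/s)


def wind(t, x, y):
    """Hypothetical analytic wind (m/s): solid-body rotation about the origin."""
    return -OMEGA * y, OMEGA * x


def euler_step(t, x, y, dt):
    """Explicit Euler: use the wind at the current position for the full step."""
    u, v = wind(t, x, y)
    return x + dt * u, y + dt * v


def midpoint_step(t, x, y, dt):
    """Explicit midpoint: use the wind at an estimated half-step position."""
    u, v = wind(t, x, y)
    xm, ym = x + 0.5 * dt * u, y + 0.5 * dt * v
    um, vm = wind(t + 0.5 * dt, xm, ym)
    return x + dt * um, y + dt * vm


def final_error(step, dt, t_end=86400.0, x0=1000e3, y0=0.0):
    """Distance (m) from the exact end point after one full rotation period."""
    x, y, t = x0, y0, 0.0
    for _ in range(round(t_end / dt)):
        x, y = step(t, x, y, dt)
        t += dt
    return math.hypot(x - x0, y - y0)


for dt in (120.0, 60.0):
    print(dt, final_error(euler_step, dt), final_error(midpoint_step, dt))
```

In this idealized setting, halving the time step roughly halves the Euler error and roughly quarters the midpoint error, which reflects the first- and second-order accuracy of the two schemes. Real wind fields are far less smooth, so the absolute numbers from this toy case carry no meaning for the simulations discussed in this study.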

The scheme of

In this study, we also evaluated specific third- and fourth-order
explicit Runge–Kutta methods (RK3 and RK4). The third-order method
used here is defined by

A common way to compare sets of test and reference trajectories is to
calculate transport deviations

According to the definition, the transport deviations are calculated
as mean absolute deviations of the air parcel distances. Although the
mean absolute deviation is a rather intuitive approach to measure
statistical dispersion, we note that it is not necessarily the most
robust measure, as it can be influenced significantly by outliers.
Such outliers of rather large individual transport deviations exist in
some of our simulations. Strong error growth of individual
trajectories can occur once the test and reference trajectories are
significantly separated from each other, meaning that the air parcels
are located in completely different wind regimes. To mitigate this
issue, we decided to report also the median of the absolute and
relative transport deviations of the individual air parcels as an
additional statistical measure. The median absolute deviation is a
much more robust statistical measure. In all cases considered here, we
found that the median absolute deviation is smaller than the mean
absolute deviation. This indicates that the distributions of
transport deviations are skewed towards larger outliers. Note that
skewed distributions of transport deviations have also been reported
in other LPDM intercomparison and validation studies
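
The effect of outliers on the mean and the median can be sketched with a few lines of Python. This is only an illustration under stated assumptions: positions are given as hypothetical (x, y) pairs in kilometres in a local Cartesian plane, distances are plain Euclidean distances, and the sample values are made up. The function names are not taken from MPTRAC.

```python
import math
import statistics


def horizontal_distances(test, ref):
    """Distances (km) between paired test and reference parcel positions."""
    return [math.hypot(xt - xr, yt - yr) for (xt, yt), (xr, yr) in zip(test, ref)]


def transport_deviation(test, ref):
    """Return (mean, median) of the absolute horizontal parcel distances."""
    d = horizontal_distances(test, ref)
    return statistics.mean(d), statistics.median(d)


# Made-up sample: four parcels stay close to the reference, one is an outlier.
ref = [(0.0, 0.0)] * 5
test = [(3.0, 4.0), (0.0, 2.0), (1.0, 0.0), (0.0, 3.0), (300.0, 400.0)]

mean_dev, median_dev = transport_deviation(test, ref)
print(mean_dev, median_dev)
```

With these sample values, the single outlier at 500 km raises the mean deviation to about 102 km, whereas the median stays at 3 km. This is the kind of skewness that motivates reporting both measures.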

Since our test cases are based on real meteorological data, we
obtained the reference data to calculate the transport deviations
using the most accurate integration method available to us with a
sufficiently short time step. Sensitivity tests using variable time
steps down to 30 s showed that the numerical solution from the RK4
method converges for time steps of 60 s or less, in the sense that
transport deviations relative to simulations with a time step of
120 s do not change significantly.
In particular, comparing simulations with time steps of 120 s and
60 s, the median horizontal deviation is less than 7 km and the
median vertical deviation is less than 10 m for up to 10 days of
simulation time. Alternatively, following

The maximum tolerable error limits for trajectory calculations depend, of course, on the
individual application. However, as a guideline, we provide here
physically motivated error limits that are of particular interest regarding
LPDM simulations. LPDMs consider both advection and diffusion to simulate
dispersion. Clearly, the numerical errors of the trajectory calculations
representing the advective part should be smaller than the particle spread
caused by diffusion. Considering a simple model of Gaussian diffusion, the
standard deviations of the horizontal and vertical particle distributions are
given by

Fractions of air parcels remaining in initial regions during the course of the simulations. SH indicates the Southern Hemisphere and NH indicates the Northern Hemisphere.

In this study, we analyzed the errors of trajectory calculations in 15 regions
of the atmosphere, covering rather distinct conditions in terms of pressure,
temperature, and winds. The globe was divided into five latitude bands: polar
latitudes (90 to 65

Here, the atmospheric regions have been defined by means of fixed
latitude and altitude boundaries. This is arguably a rather simple
approach compared to physically motivated separation criteria based on
equivalent latitudes or the dynamical tropopause. However, the simple
approach may still reflect how the model is initialized and used in
different applications in practice. An important consequence of our
approach is that some of the air parcels leave their initial region
during the course of the simulation. Table

Examples of trajectory calculations using different
numerical integration schemes. Circles mark the start positions of
the trajectories. Trajectories were launched at an altitude of
10.8 km

First, we present two case studies that illustrate some of the common
features related to trajectory calculations using different numerical
integration schemes. Figure

Absolute horizontal

A common feature of the trajectory calculations we found in the case studies and also in many other situations is that the numerical integration schemes yield solutions that typically agree well up to a specific point in time before rapid error growth begins. Errors grow slowly in the beginning, but at some point, e.g., if there is strong wind shear locally, the trajectories may begin to diverge significantly. Shorter time steps or higher-order integration schemes are needed to properly cope with such situations. The case studies also show that transport deviations do not necessarily grow monotonically over time. Trajectories may first diverge from and then reapproach the reference data. Individual local wind fields can bring trajectories back together by chance. The case studies also seem to suggest that vertical errors start to grow earlier than horizontal errors. Furthermore, we note that the Petterssen scheme mostly provides smaller errors than Heun's method. This was expected because the Petterssen scheme adds iterative refinements to Heun's method. In both case studies, the midpoint method performs better than the other second-order methods. However, this does not hold in general; we also found counterexamples with the midpoint method performing worse than the other second-order methods. Both examples generally exhibit large variability of the errors. This indicates that transport deviations need to be calculated for large numbers of air parcels to obtain statistically meaningful results.

In this section, we discuss the temporal growth rates of the trajectory
calculation errors from a more general point of view. Although the
magnitude of the truncation errors varies greatly between the schemes
and with the time step used for numerical integration, we found that
the transport deviations typically grow rather monotonically over
time if large numbers of particles are considered. Hence, we decided
to present errors here using a fixed time step of 120 s for the
numerical integration as a representative example. As the magnitude of
the calculation errors varies greatly between the troposphere and
stratosphere, we present the analysis for both regions separately. The
results for the UT/LS region are not shown, as they fall in
between. We calculated combined transport deviations considering all
seasons and latitude bands in the given altitude range. A more
detailed analysis of the total errors in individual latitude bands and
for different seasons will follow in Sect.

Absolute horizontal

Figure

From the data presented in Fig.

Maximal error growth rates of trajectories. Relative growth rates in
pp day

For a more detailed analysis of the regional and seasonal variations
of the total trajectory errors, we focus on the errors after 10 days
of simulation time for simulations using the third-order Runge–Kutta
method with a single time step of 120 s. This is considered to be a
representative example, as other schemes and time steps show similar
variations. We calculated individual transport deviations for all 15
altitude–latitude regions and for simulations starting at the
beginning of January, April, July, and October 2014 and 2015,
respectively. The results are shown in Figs.

Mean (thin bars) and median (thick bars) horizontal transport deviations after 10 days of simulation time in different regions for the RK3 method and 120 s time step. Orange lines show the averages of the four months (January, April, July, and October) and both years (2014 and 2015). Grey lines show error limits based on diffusion.

Same as Fig.

Our simulations show that horizontal errors increase from typically
20 km in the stratosphere to 100 km in the UT/LS region and about
200 km in the troposphere. The corresponding maximum AHTDs are
116, 177, and 470 km, respectively. The corresponding
relative errors increase from 0.0 to 0.4 % in the stratosphere, to
around 0.1 to 1.0 % in the UT/LS region, and 1.0 to 4.0 % in the
troposphere. As shown in Fig.

The trajectory errors at all altitude layers vary with latitude. We
focus on the horizontal errors in this case, but vertical errors show
similar results. The largest trajectory errors in the troposphere are
found at northern midlatitudes with errors between 245 and
470 km. The meteorological conditions in tropospheric midlatitudes
were expected to cause relatively large errors because of the nature
of global circulation: Rossby waves and baroclinic instability
occurring predominantly in this region come along with highly variable
wind patterns. In addition, stronger fluctuations are expected in the
northern midlatitudes compared to the southern midlatitudes due to
the larger land–sea ratio and more complex orography of the Northern
Hemisphere. The errors obtained in the polar regions are second
largest with an average over all seasonal samples of around 200 km
and peak errors in polar summer of up to 380 km. The simulations for
the tropics and southern midlatitudes show smaller errors of less
than 200 km and adhere to the error limit in all test cases. The
simulations for the UT/LS region have their largest AHTDs in the
northern midlatitudes with 95 to 177 km. These errors are caused by
the north–south meandering of the jet

The variation of the horizontal errors also exhibits some seasonal dependencies. This is most prominent for the northern midlatitudes, where maximum errors in all cases occur in January. During Northern Hemisphere wintertime, land–sea temperature differences as well as the temperature gradient between the Arctic and the subtropical regions are largest, which allows for more intense and complex dynamic patterns to occur than in summer. Our test cases for the Southern Hemisphere and for the Arctic region do not show a seasonal behavior as clearly as one could expect. We need to stress that each simulation lasts only 10 days, which is a relatively short time interval to analyze seasonal effects. Fast temporal variations and changes in medium-range weather patterns can blur out the impact of seasons that is observed here. The small error differences between polar summer and winter additionally can be attributed to the small fraction of parcels that stay in that region. Only 13 % of the parcels that are represented by the statistic remained in the polar regions after 10 days of simulation, which weakens our statistics.

Most of our simulations for the corresponding months in 2014 and 2015
differ by less than 20 %; only a few individual months
differ more strongly, in a range similar to that of the seasonal
variations. The most striking differences occur in January in the
stratosphere of the northern polar region. The simulation of 2014
shows small errors of 4 km, while the simulation of 2015 reaches an
error of up to 116 km and exceeds the stratospheric error limit. This
particular behavior (which is also present in
Fig.

Vertical and horizontal errors behave very similarly; extrema are found in the same regions. The errors in the stratosphere are usually very small and below 10 m. Typical errors in the UT/LS region and in the troposphere are about 100 and 250 m, respectively. Corresponding maximum errors are 130 m in the stratosphere, 168 m in the UT/LS region, and 470 m in the troposphere. The vertical error limits of 415 m in the stratosphere and 1300 m in the troposphere are easily met. Relative vertical errors range from 0.0 to 0.9 % in the stratosphere, from 0.2 to 1.6 % in the UT/LS region, and from 1.2 to 4.4 % in the troposphere.
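
The vertical error limits quoted above can be checked against the standard relation for Gaussian diffusion, in which the standard deviation of the particle distribution after time t is sqrt(2 D t) for a diffusivity D. The sketch below is a plausibility check only: the vertical diffusivities of 0.1 and 1 m² s⁻¹ are assumptions back-calculated from the quoted limits and are not stated in this text, and the function name is hypothetical.

```python
import math


def diffusion_spread(diffusivity, seconds):
    """Standard deviation (m) of a Gaussian particle distribution: sqrt(2 D t)."""
    return math.sqrt(2.0 * diffusivity * seconds)


TEN_DAYS = 10 * 86400.0  # simulation time in seconds
DZ_STRAT = 0.1  # assumed vertical diffusivity in the stratosphere (m^2/s)
DZ_TROP = 1.0   # assumed vertical diffusivity in the troposphere (m^2/s)

sigma_strat = diffusion_spread(DZ_STRAT, TEN_DAYS)
sigma_trop = diffusion_spread(DZ_TROP, TEN_DAYS)
print(round(sigma_strat), round(sigma_trop))
```

Under these assumptions the spread after 10 days is close to 415 m in the stratosphere and slightly above 1300 m in the troposphere, consistent with the limits given in the text.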

We also calculated the horizontal and vertical median errors for the regions. In general, horizontal and vertical median errors are much smaller than the mean errors. Small median deviations show that most trajectories closely follow the reference. Those parcels that depart from the reference usually diverge strongly, which leads to a high average deviation. The median error is somewhat larger for simulations in the troposphere, where particle paths are more likely to be affected by synoptic-scale fluctuations of the wind field.

To summarize, the relative errors of 2–4 % in the troposphere show that trajectories in this layer are more difficult to integrate accurately and that relatively large uncertainties remain even if the absolute error limit is met. The stratospheric relative errors of about 1 % are less critical for the choice of integration method. The large difference of the trajectory errors between altitude regions suggests that lower-order integration schemes or larger time steps could be used in the stratosphere to save computation time without causing significant errors. Tropospheric northern midlatitudes are the most challenging areas for numerical integration.

In this section, we focus on the computational efficiency of the
numerical integration schemes, which is assessed in terms of the
trade-off between the computational accuracy of the trajectory calculations
and the computation time they require. As the computational
efficiency depends, to some extent, on the problem size and the
computer architecture that is applied, we will discuss the scalability
of the application first. Our scalability tests were performed on the
Jülich Research on Exascale Cluster Architectures (JURECA)
supercomputer

See

Scaling behavior in terms of CPU time

As an example, Fig.

Trade-off between computational accuracy and total
CPU-time requirements of the trajectory calculations after 24 h

As a measure of computational efficiency, Fig.

Figure

In this study, we characterized global truncation errors of trajectory calculations after 1 and 10 days in the free troposphere, in the UT/LS region, and in the stratosphere. Transport simulations were conducted with the LPDM MPTRAC, driven by wind fields from T1279L137 ECMWF operational analyses and forecasts in 2014 and 2015, with an effective horizontal resolution of about 16 km and 3 h time intervals. We analyzed the computational performance of the simulations in terms of accuracy and CPU-time costs of six explicit integration schemes that belong to the Runge–Kutta family. The truncation errors of the schemes for a given time step were found to cluster into three groups that are related to the order of the method: (i) the first-order Euler method, (ii) the second-order methods (midpoint method, Heun's method, and Petterssen's scheme), and (iii) the higher-order methods, which are the common RK3 and RK4 methods. Different methods within each group provide similar accuracy in terms of error growth rates and transport deviations.

Based on more than 5000 individual transport simulations, each consisting of 500 000 trajectories, we further analyzed horizontal and vertical transport deviations in relation to altitude, latitude, and seasonal and year-to-year variability. The trajectory errors after 24 h were analyzed as they are expected to be less affected by individual flow patterns. The errors of the tropospheric simulations are 10 times larger than those of the stratospheric simulations. After 10 days, the trajectory errors vary more substantially inside the climatological regions because of the stronger influence of individual atmospheric flow patterns. We found that tropospheric simulations require more accurate integration methods or significantly shorter time steps than stratospheric simulations to keep errors within physically motivated error limits. We attribute this to larger small-scale variations in the high-resolution meteorological input data. Calculation errors also depend on the latitude band, with the northern midlatitudes having the largest errors in each altitude layer. Seasonal error variations and differences from year to year are clearly visible from our simulations, but in some cases the number of samples still seems to be too small to deduce robust statistics. One example is the large errors associated with a sudden stratospheric warming in the northern stratosphere in January 2015, which suggests that part of the total error is due to situation-dependent factors. However, a robust feature seems to be a northern midlatitude winter maximum in the troposphere and stratosphere, present in both years (2014 and 2015).

All integration methods discussed here are, in principle, suitable and have already been used for Lagrangian particle dispersion and trajectory model simulations. To decide which method is most efficient on state-of-the-art high-performance computing systems, we analyzed the trade-off between computational accuracy and computational time. This trade-off is largely controlled by the time step used for numerical integration. The Euler method requires very short time steps to achieve reasonably accurate results and is generally not considered to be an efficient method. Heun's method and the iterative Petterssen scheme are more accurate at the same computational costs. The midpoint method and the RK3 method usually provided the most efficient simulations with MPTRAC; i.e., these methods provide the most accurate results at the lowest computational costs. Note that the RK4 method is slightly more expensive than the RK3 method if it is applied together with a low-order linear interpolation scheme for the meteorological data.

This study uses up-to-date meteorological data as provided by current
global weather forecast models, with a spatial resolution that is much
finer than in former trajectory studies

The high resolution requires adjustment of the time step, as the commonly used time steps of 10 min to 1 h are far too long to yield converged results with high-resolution meteorological data. Given an effective horizontal resolution of 16 km and applying the CFL criterion, the time step needs to be shorter than about 130 s to achieve convergence. From our simulations, we found that time steps of 100 s for the midpoint method and 170 s for the RK3 method provide accurate results in the troposphere for up to 10 days. Purely stratospheric applications can be solved with time steps of 800 s (midpoint method) and 1100 s (RK3 method) because of lower total errors in this altitude layer.
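
The CFL estimate can be reproduced with a one-line calculation: an air parcel should not travel farther than one grid cell per time step, so the time step is bounded by the grid spacing divided by the wind speed. In the sketch below, the maximum wind speed of 123 m s⁻¹ is an assumption back-calculated from the quoted 130 s and is not given in this text. The function name and the `courant` parameter are hypothetical.

```python
def cfl_time_step(grid_spacing_m, wind_speed_ms, courant=1.0):
    """Largest time step (s) for which a parcel moves at most `courant` grid cells."""
    if wind_speed_ms <= 0.0:
        raise ValueError("wind speed must be positive")
    return courant * grid_spacing_m / wind_speed_ms


GRID_SPACING = 16e3     # effective horizontal resolution (m), from the text
MAX_WIND_SPEED = 123.0  # assumed maximum horizontal wind speed (m/s)

dt_max = cfl_time_step(GRID_SPACING, MAX_WIND_SPEED)
print(round(dt_max))
```

With these inputs the bound is about 130 s. Note that the recommended 170 s for the RK3 method exceeds this estimate, which suggests that the CFL value is best read as a rough guideline rather than a strict limit for these simulations.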

In this study, we considered a range of popular and well-established
integration schemes for trajectory calculations in LPDMs. However, the
large variability of regional and seasonal errors found here suggests
that applications may benefit from more advanced numerical
techniques. Adaptive quadrature by means of variable time stepping as
recommended by earlier studies

We downloaded operational analyses and forecasts from
the European Centre for Medium-Range Weather Forecasts (ECMWF, 2013, 2015).
See reference for further details on data availability and restrictions.
ECMWF data have been processed for usage with MPTRAC by means of the Climate
Data Operators (CDO, 2015). The version of the MPTRAC model that was used for
this study along with the model initializations are available under the terms
and conditions of the GNU General Public License, version 3, from the
repository at

The authors declare that they have no conflict of interest.

The authors acknowledge the Jülich Supercomputing Centre (JSC) for providing computing time on the supercomputer JURECA. Yi Heng acknowledges support from the “100 Talents Program” of Sun Yat-sen University, Special Program for Applied Research on Super Computation of the NSFC-Guangdong Joint Fund (the second phase), and the National Supercomputer Center in Guangzhou. We thank Xue Wu for helpful comments on an earlier draft of this paper.

The article processing charges for this open-access publication were covered by a Research Centre of the Helmholtz Association.

Edited by: Ignacio Pisso
Reviewed by: Petra Seibert and one anonymous referee