Collection/aggregation algorithms in Lagrangian cloud microphysical models: Rigorous evaluation in box model simulations

. Recently, several Lagrangian microphysical models have been developed which use a 1 large number of (computational) particles to represent a cloud. In particular, the collision process 2 leading to coalescence of cloud droplets or aggregation of ice crystals is implemented differently 3 in the various models. Three existing implementations are reviewed and extended, and their perfor-4 mance is evaluated by a comparison with well established analytical and bin model solutions. In this 5 ﬁrst step of rigorous evaluation, box model simulations with collection/aggregation being the only 6 process considered have been performed for the three well-known kernels of Golovin, Long and 7 Hall. 8 Besides numerical parameters like the time step and the number of simulation particles (SIPs) 9 used, the details of how the initial SIP ensemble is created from a prescribed analytically deﬁned 10 size distribution is crucial for the performance of the algorithms. Using a constant weight tech-11 nique as done in previous studies greatly underestimates the quality of the algorithms. Using better 12 initialisation techniques considerably reduces the number of required SIPs to obtain realistic re-13 sults. From the box model results recommendations for the collection/aggregation implementation 14 in higher dimensional model setups are derived. Suitable algorithms are equally relevant to treating 15 the warm-rain process and aggregation in cirrus.


Introduction
The collection of cloud droplets and the aggregation of ice crystals are important processes in liquid and ice clouds. By changing the size, number and, in the case of ice, the shape of hydrometeors, collection and aggregation affect the microphysical behaviour of clouds and thereby their role in the climate system.
The warm rain process (i.e. the production of precipitation in clouds in the absence of ice) depends essentially on the collision and subsequent coalescence of cloud droplets. At its initial stage, however, condensational growth governs the activation of aerosols and the following growth of cloud droplets, which might initiate the collection process if they become sufficiently large. Then, collection produces drizzle or raindrops, which are able to precipitate from the cloud, affecting lifetime and organisation of clouds (e.g. Albrecht, 1989;Xue et al., 2008).
In ice clouds, sedimentation, deposition growth and in particular radiative properties depend on the ice crystals' habits (Sölch and Kärcher, 2011, and references therein). Ice aggregates scatter shortwave radiation more strongly than pure ice crystals of the same mass. Recent simulation results suggest that contrail cirrus and natural cirrus can be strongly interwoven. In the mixing area, with ice crystals of both origins being present, a prominent bimodal spectrum occurs and enhances the probability of collisions (Unterstrasser et al., 2016).
The temporal change of an infinite system of droplets by collision and subsequent coalescence (or any other particles) is described by the stochastic collection equation (SCE), also known as the kinetic collection equation, coagulation S. Unterstrasser et al.: Collection/aggregation in Lagrangian cloud microphysics equation, Smoluchowski or population balance equation (e.g. Wang et al., 2007). It yields where f m (m)dm is the number concentration within an infinitesimal interval around the mass m. The first term (gain term) accounts for the coalescence of two smaller droplets forming a new droplet with mass m; the second term (loss term) accounts for the coalescence of droplets with mass m with any other droplets forming a larger droplet. The collection kernel K(m, m ) describes the rate by which collections between a droplet with mass m and a droplet with mass m occur. Due to the symmetry of the collection kernel (K(m, m ) = K(m , m)) the first term on the right-hand side can also be written as For several kernel functions (mostly of polynomial form), analytic solutions exist for specific initial distributions (Golovin, 1963;Berry, 1967;Scott, 1968). The Golovin kernel (sum of masses) is given by Solutions for more realistic kernels (Long, 1974;Hall, 1980;Wang et al., 2006) and arbitrary initial distribution can be obtained with various numerical methods mainly using a bin representation of the droplet size distribution (Berry and Reinhardt, 1974;Tzivion et al., 1987;Bott, 1998;Simmel et al., 2002;Wang et al., 2007). The hydrodynamic kernel is defined as K(r, r ) = π(r + r ) 2 |w sed (r) − w sed (r )| E c (r, r ), based on the radius r and the sedimentation velocity w sed . Parametrisations of the collection efficiency E c are given, e.g. by Long (1974) or Hall (1980). In the above formula, the differential sedimentation is the driver of collections. No same-size collisions can occur, i.e. K(r, r) = 0. More sophisticated expressions for K(r, r ) have been derived to include turbulence enhancement of the collisional growth, which also allow same-size collisions (K(r, r) > 0) (e.g. Ayala et al., 2008;Grabowski and Wang, 2013;Chen et al., 2016). Solving (1) demands simplifications in the representation of the droplet spectrum for which several numerical models have been developed. Spectral-bin models (e.g. Khain et al., 2000) represent the spectrum by dividing it into several intervals (so-called bins). This approach enables the prediction of the temporal development of the droplet number concentration in each bin by using the method of finite differences (e.g. Bott, 1998). The accuracy of these models is primarily determined by the number of used bins (usually on the order of 100), which makes them computationally challenging and prohibits their use in day-to-day applications like numerical weather prediction. Less challenging but less accurate are cloud microphysical bulk models that compute the temporal change of integral quantities of the droplet spectrum (e.g. Kessler, 1969;Khairoutdinov and Kogan, 2000;Seifert and Beheng, 2001). These are usually equations for the temporal evolution of bulk mass (so-called one-moment schemes) and additionally number concentration (two-moment schemes) or radar reflectivity (three-moment schemes), which describe the change of the entities of cloud droplets and rain drops (in the case of warm clouds). The separation radius between cloud droplets and rain drops depends on the details of the bulk scheme, but generally cloud droplets (up to 20 to 40 µm in radius) are assumed to have negligible sedimentation fall velocities, while larger drops, frequently subsumed as rain drops, have a sufficient sedimentation velocity to cause collision/coalescence. The interactions of cloud and rain drops are therefore described in terms of self-collection (coalescence of cloud (rain) drops resulting in cloud (rain) drops), autoconversion (coalescence of cloud droplets resulting in rain drops) and accretion (collection of cloud droplets by rain drops). A third alternative for computing cloud microphysics has been developed in the recent years: Lagrangian cloud models (LCMs). These models represent cloud microphysics on the basis of individual computational particles (SIPs). Similar to spectral-bin models, LCMs enable the detailed representation of droplet spectra.
Due to their specific construction, LCMs offer a variety of advantages in comparison to spectral-bin and bulk cloud models. Their representation of aerosol activation and subsequent diffusional growth closely follows fundamental equations and therefore avoids the possible perils of parametrisations (e.g. Andrejczuk et al., 2008;Hoffmann, 2016). The same applies for the representation of collection or aggregation, which is based on the interaction of individual SIPs. Accordingly, LCMs approximate pure stochastic growth (e.g. Gillespie, 1975), which is the correct description of collection/aggregation within a limited system of interacting particles and results in the SCE, which is used as the basis for spectral-bin and bulk models if the system becomes infinite (e.g. Bayewitz et al., 1974). Moreover, LCMs do not apply the finite differences method to compute microphysics. Accordingly, LCMs are not prone to numerical diffusion and dispersion, and do not suffer from the numerical broadening of a droplet spectrum, which can affect spectral-bin cloud models (Khain et al., 2000). The effect of sedimentation is incorporated in a straightforward manner in the transport equation of the SIPs and avoids numerical artefacts (Wacker and Seifert, 2001). Finally, LCMs enable new ways of analysis by the tracking of individual SIPs. They can be used to reveal the origins of droplets, as well as conditions associated with their growth (e.g. Hoffmann et al., 2015;Naumann and Seifert, 2016). The largest disadvantage of LCMs, so far, might be their relative novelty due to their higher computational demand. Many aspects of this approach have not been validated adequately and there is potential for future improvements. For the process of collection/aggregation, this study will offer a first rigorous evaluation of the available numerical approaches.
So far, no consistent terminology has been used in the latter publications. Various names have been used for the same things by various authors. We point out that super droplet, computational droplet and simulation particle (SIP) all have the same meaning and refer to several identical real cloud droplets (or ice crystals) represented by one Lagrangian particle. The number of real droplets represented in a SIP is denoted as the weighting factor or multiplicity. Moreover, Lagrangian approaches in cloud physics have been named the Lagrangian cloud model (LCM), super droplet method (SDM) or particle-based method. In this paper, we use the terms SIP, weighting factor ν sim and LCM. Here, droplet refers to either real droplets or ice crystals. If we say in the following that "SIP i is larger than SIP j ", this means that the droplets represented in SIP i are larger than those in SIP j . Such a statement it is not related to the weighting factor of the SIPs.
Usually, only the liquid water or the ice of a cloud are described with a Lagrangian representation, whereas all other physical quantities (like velocity, temperature and water vapour concentration) are described in Eulerian space (see also discussion in Hoffmann, 2016). SIPs have discrete positions x p = (x p , y p , z p ) within a grid box. The position is regularly updated, obeying the transport equation ∂x p /∂t = u. Microphysical processes like sedimentation and droplet growth are treated individually for each SIP. Interpolation methods can be used to evaluate the Eulerian fields at the specific SIP positions. This implicitly assumes that all ν sim droplets of the SIPs are located at the same position. On the other hand, the droplets of a SIP are assumed to be well-mixed in the grid box in the LCM treatment of collection and sometimes condensation. Then, the number concentration represented by a single SIP, e.g. is given by ν sim / V , where V is the volume of the grid box.
Lists of used symbols and abbreviation are given in Tables 1 and 2.

Description of the various collection/aggregation implementations
We use the terminology of Berry (1967), where f ln r and g ln r denote the number and mass density function with respect to the logarithm of droplet radius ln r. The relations g ln r (r) = mf ln r (r) and f ln r (r) = 3mf m (m) hold. The latter designates the number density function with respect to mass and obeys the transformation property of distributions: f y (y)dy = f x (x(y))dx. For consistency with previous studies, g ln r is used for plotting purposes, whereas f m and g m are more relevant in the following analytical derivations. The moments of order k of the mass distribution f m (equivalent to the number density function with respect to mass) are defined as The low-order moments represent the droplet number concentration (DNC = λ 0 ) and the mass concentration (liquid water content; LWC = λ 1 ). The analogous extensive properties λ k (t) V are the total droplet number N , total droplet mass M and radar reflectivity (Z = λ 2 V ). For a given SIP ensemble, the moments can be computed by where µ i is the single droplet mass of SIP i and N SIP is the number of SIPs inside a grid box. For reasons of consistency with Wang et al. (2007), we translate the SIP ensemble into a mass distribution g m in bin representation and then compute the moments with the formula (cf. with their Eq. 48). The initialisation is successful for a given parameter set if the moments of the SIP ensemble λ k,SIP are close to the analytical values λ k,anal . For an exponential distribution (as used in this study), the probability density function (PDF) reads as the moments are given analytically by kg bin boundaries of the bin grid m = λ 1 /λ 0 = M/N kg mean mass of all droplets n bin,l 1 droplet number in bin l r, r m droplet radius r lb m threshold radius in ν random,lb -init r critmin m lower cut-off radius in singleSIP-init w sed m s −1 sedimentation velocity DNC = λ 0 m −3 droplet number concentration E c 1 collection/aggregation efficiency K m 3 s −1 collection/aggregation kernel LWC = λ 1 kg m −3 droplet mass concentration, liquid water content M bin,l kg total droplet mass in bin l N SIP 1 number of SIPs N BIN 1 number of bins α low , α med , α high 1 parameters of the ν random -init method t s time step V m 3 grid box volume η 1 parameter in RMA and singleSIP-init method κ 1 number of bins per mass decade λ k kg k m −3 moments of the order k µ kg single droplet mass of a SIP ν critmax 1 maximum number of droplets represented by a SIP ν critmin 1 minimum number of droplets represented by a SIP ν 1 number of droplets represented by a SIP ξ 1 splitting parameter of AON χ = µ ν, χ = χ/M kg, 1 total droplet mass of a SIP N = λ 0 V 1 total droplet number M = λ 1 V kg total droplet mass Z = λ 2 V kg 2 second moment of droplet mass distribution (radar reflectivity) where k! is the factorial of k andm = M/N the mean mass (Rade and Westergren, 2000). Throughout this study, the initial parameters of the droplet size distribution (DSD) are DNC 0 = 2.97 × 10 8 m −3 and LWC 0 = 10 −3 kg m −3 (implying a mean radius of 9.3 µm) as in Wang et al. (2007). The higher moments are λ 2,anal = 6.74 × 10 −15 kg 2 m −3 and λ 3,anal = 6.81 × 10 −26 kg 3 m −3 .

Initialisation
In our test cases, all microphysical processes except collection are neglected and an exponential DSD is initialised. In the results section, we will demonstrate that the outcome of the various collection algorithms critically depends on how this initial, analytically defined, continuous DSD is translated into a discrete ensemble of SIPs. Hence, the SIP initialisation is described in some detail.

SingleSIP-init and multiSIP-init
First, the mass distribution is discretized on a logarithmic scale. The boundaries of bin l are given by m bb,l = m low 10 l/κ and m bb,l+1 , where m low is the minimum droplet mass considered. The bin centre is computed using the arithmetic meanm bb,l = 0.5 (m bb,l+1 + m bb,l ). The bin size is m bb,l = (m bb,l+1 − m bb,l ). The mass increases 10-fold every κ bin. Several previous studies used the parameter s with m bb,l+1 /m bb,l = 2 1/s to characterise the bin resolution. The parameters s and κ are related via s = κ log 10 (2) ≈ 0.3 κ.
For each bin, the droplet number is approximated by ν b = f m (m bb,l ) m bb,l V , and one SIP with weighting factor ν sim = ν b and droplet mass µ sim =m bb,l is created if ν b is greater than a lower cut-off threshold ν critmin . No SIP is created if ν b < ν critmin . Moreover, no SIPs are created from bins with radius r < r critmin . We will refer to this as deterministic singleSIP-init. In its probabilistic version, the mass µ sim is randomly chosen within each bin l and ν sim = f m (µ sim ) m bb,l V is adapted accordingly. By default, r critmin = 0.6 µm and ν critmin = η × ν max , which is determined from the maximal weighting factor within the entire SIP ensemble ν max and the prescribed ratio of the minimal to the maximal weighting factor η = 10 −9 . For larger r critmin , it is advantageous to initialise one additional "residual" SIP that contains the sum of all neglected contributions.
Following Unterstrasser and Sölch (2014, see their Appendix A), we introduce the multiSIP-init technique. It is similar to the singleSIP-init technique, except that we additionally introduce an upper threshold ν critmax . If ν b > ν critmax is fulfilled for a specific bin, then this bin is divided into κ sub = ν b /ν critmax sub-bins and a SIP is created for each sub-bin. The multiSIP-init technique gives a good trade-off between resolving low concentrations at the DSD tails and high concentrations of the most abundant droplet masses. By default, ν critmax = 0.1 ν max .
So far, we introduced initialisation techniques with a strict lower threshold ν critmin with no SIPs created in bins with ν b < ν critmin . We can relax this condition by introducingwhat we call -a weak threshold. This means that, in such a low-contribution bin (with ν b < ν critmin ), we create a SIP with the probability p create = ν b /ν critmin and weighting factor ν sim = ν critmin . Having many realisations of initial SIP ensembles, the expectation value of the droplet number represented by such SIPs, ν critmin ·p create +0·(1−p create ), equals the analytically prescribed value ν b . Using a strict threshold the droplet number would be simply 0 in those low-contribution bins. In a related problem, such a probabilistic approach has been shown to strongly leverage the sensitivity of ice crystal nucleation on the numerical parameter ν critmin . This led to a substantial reduction of the number of SIPs that are required for converging simulation results (Unterstrasser and Sölch, 2014).
Using the probabilistic version and a weak lower threshold is particularly important if different realisations of SIP ensembles of the same analytic DSD should be created. The number of SIPs N SIP depends on κ, ν critmin , ν critmax and the parameters of the prescribed distribution.
Moreover, the singleSIP-init is used in a hybrid version, where different κ values are used in specified radius ranges. Table 3 lists the resulting number of SIPs for the range of κ values used in simulations with the probabilistic singleSIPinit and variants of it.

ν const -init and ν draw -init
The accumulated PDF F (m) is given by m 0 f m (m )dm with the normalised PDF f m = f m /λ 0 . First, the size N SIP of the SIP ensemble that should approximate the initial DSD is specified. For each SIP, its mass µ i is reasonably picked by where rand() generates uniformly distributed random numbers ∈ [0, 1]. In the case of the ν const -init, the weighting factors of all SIPs are equally ν i = ν const = N /N SIP . This init method reproduces SIP ensembles similar to the ones in Shima et al. (2009. As a variety of the ν const -init method, the weighting factors ν i in the ν drawinit method are simply perturbed by ν i = 2 rand() ν const . For the case of an exponential distribution, the following holds for the SIPs i = 1, N SIP : In the literature, this approach is known as inverse transform sampling. A proof of correctness can be found in classical textbooks, e.g. Devroye (1986, their Sect. II.2).

ν random -init
The third approach allows specifying the spectrum of weighting factors that should be covered by the SIP ensemble. Similar to the ν draw -init method, the weighting factors are randomly determined. Whereas the latter method produced a SIP ensemble with weighting factors uniformly distributed in ν, the ν random -init produces weighting factors uniformly distributed in log(ν) and covering the range [N 10 α low , N 10 α high ]. The eventual number of SIPs depends most sensitively on the parameter α high , which controls how big the portion of a single SIP can be. SIPs with weighting factors ν i = N 10 (α low +(α high −α low )·rand()) are created until N SIP j =1 ν j exceeds N . The weighting factor of the last SIP is corrected such that N SIP j =1 ν j = N holds. Now, the mass µ i of each SIP is determined by the following technique: the first SIP represents the smallest droplets and covers the mass interval [0, m 1 ], whereas the last SIP represents the largest droplets in the interval [m N SIP −1 , ∞]. The SIPs i in between cover the adjacent mass intervals [m i−1 , m i ]. The boundaries are implicitly determined by The total mass contained in each SIP is given by χ i = m i m i−1 f m (m )m dm V and the single droplet mass by µ i = χ i /ν i .
For the case of an exponential distribution, the following holds for the interval boundaries and the SIPs i = 1, N SIP : and The above formulas, which involve several differences of similarly valued terms, must be carefully implemented such that numerical cancellation errors are kept tolerable. Experimenting with the SIP-init procedure, several optimisations have been incorporated. First, the ν spectrum is split into two intervals [N 10 α low , N 10 α med ] and [N 10 α med , N 10 α high ]. We alternately pick random values from the two intervals. Without this correction, it happens that several consecutive SIPs with small weights, and hence nearly identical droplet masses, are created, which increases the SIP number without any benefits.
Going through the list of SIPs, the droplet masses increase, and hence the individual SIPs contain gradually increasing fractions of the total grid box mass. This can lead to a rather coarse representation of the right tail of the DSD. Two options to improve this have been implemented. In the ν random,rs option, the ν i values are reduced by some factor that increases as i j =1 ν j approaches N . In the ν random,lb option, ν values are randomly picked up to a certain radius threshold r lb . Above this threshold, SIPs are created with the singleSIP method with linearly spaced bins. Figure 1 shows the weighting factors and other properties of the initial SIP ensemble, which may affect the performance of the algorithms. Each column shows one class of initialisation techniques. For a certain realisation, the first row shows the weighting factors ν i of all SIPs as a function of their represented droplet radius r i . Each dot shows the (ν i , r i ) pair of one SIP. For the singleSIP-init, the dots are homogeneously distributed along the horizontal axis, as one SIP is created from each bin (with exponentially increasing bin sizes). The according ν values relate directly to the prescribed DSD. The higher f m m, the more droplets are represented in a SIP. No SIPs smaller than r critmin = 0.6 µm are initialised and the ν values range over 9 orders of magnitude, consistent with η = 10 −9 . The multiSIP-init introduces an upper bound of ν critmax = 2.6 × 10 6 for ν. This threshold is effective over a certain radius range where the SIPs, compared to the singleSIP-init, have lower ν values and are also more densely distributed along the horizontal axis. For the ν const -init, all SIPs use ν = ν const , whereas for the ν draw -init the ν values scatter around this value. For ν const and ν draw , the ν values are chosen independently of the given DSD contrary to the latter techniques. However, for both techniques, the density of the dots along the r axis is correlated to f m m.

Comparison
The ν random -init technique randomly picks ν values which are distributed over a larger range compared to the ν draw -init. In fact, they are uniformly distributed in log(ν). The range of possible ν values can be adjusted and is chosen similar to the singleSIP/multiSIP by setting α high = −2, α med = −3 and α low = −7, which are the defaults in all simulations presented here. The present method is more flexible compared to the singleSIP approach, as the occurrence of certain ν values is not limited to a certain radius range. In the singleSIP-init, the smallest ν values occur only on the left and right tails of the DSD, whereas in the ν random approach the smallest ν values (down to N 10 α low ) can appear over the whole radius Figure 1. Characteristics of the various SIP initialisation methods (as given on top of each panel): weighting factors ν i (r i ) of an initial SIP ensemble, the mean weighting factorsν(r), the occurrence frequency of the ν i values and the resulting mass density distributions g ln r are displayed (rows 1 to 4). Row 1 displays data of a single realisation, whereas rows 2 to 4 show averages over 50 SIP ensembles. The bottom row shows the moments λ 0 , λ 1 , λ 2 and λ 3 normalised by the respective analytical value. Every symbol depicts the value of a single realisation. The nearly horizontal line connects the mean values over all realisations. In the displayed examples, κ = 10 in the singleSIP-init, κ = 10, ν critmax ≈ 2.6×10 6 in the multiSIP-init, N SIP = 80 in the ν const , ν draw -init and (α high , α med , α low ) = (−2, −3, −7) in the ν randominits. In top right panel, the dashed horizontal lines indicate the values of N 10 α low , N 10 α med and N 10 α high and the dashed vertical line the threshold radius r lb .
range. The horizontal lines in the top right panel indicate the values of N 10 α low , N 10 α med and N 10 α high and the vertical line the threshold radius r lb .
The second row shows average ν value of all SIPs in a certain size bin. All init techniques are probabilistic and the average is taken over 50 independent realisations of SIP ensembles. Not surprisingly, the average ν of the ν draw method is identical to ν const . Moreover, also for the ν random -init, the average ν value is constant over a large radius range. Only in the right tail do the ν values drop as intended. The third row shows the occurrence frequency of weighting factors.
To display DSDs represented by a SIP ensemble, a SIP ensemble must be converted back into a bin representation. For this, we establish a grid with resolution κ plot = 4 and count each SIP in its respective bin; i.e. SIP i with m bb,l < µ i ≤ m bb,l+1 contributes to bin l via M bin,l = M bin,l + µ i ν i and n bin,l = n bin,l + ν i . We note that all displayed DSDs in this study will use κ = 4, irrespective of the κ value cho-sen in the initialisation. The fourth row shows such DSDs again as an average over 50 SIP ensemble realisations. We find that any init technique is, in general, successful in producing a meaningful SIP ensemble as the "back"-translated DSD matches the originally prescribed DSD (black). Hence, the moments λ k,SIP match the analytical values λ k,anal for 0 ≤ k ≤ 3, as shown in the fifth row. Nevertheless, for the ν const -and ν draw -init, the spread between individual realisations can be large and they deviate substantially from the analytical reference. The singleSIP/multiSIP-init and ν randominit, on the other hand, guarantee that each individual realisation is fairly close to the reference. In the results section, the presented simulations mostly use the probabilistic single-SIP initialisation. Table 3 lists the number of SIPs for several init methods and parameter configurations. The right-most column indicates in which figure the simulations using the specific init method are displayed.

Description of hypothetical algorithm
First, we present a hypothetical algorithm for the treatment of collection/aggregation in an LCM, which would probably yield excellent results. However, it is prohibitively expensive in terms of computing power and memory, as N SIP increases drastically over time until the state is reached where each SIP represents exactly one real droplet. Nevertheless, the presentation of this algorithm is useful for introducing several concepts which will partly occur in the subsequently described "real-world" algorithms.
Whereas condensation/deposition and sedimentation may be computed using interpolated quantities which implicitly assume that all droplets of a SIP are located at the same point, the numerical treatment of collection usually assumes that the droplets of a SIP are spatially uniformly distributed, i.e. well-mixed within the grid box. An approach, where the vertical SIP position is retained in the collection algorithm and the process of larger droplets overtaking smaller droplets is explicitly modelled, is described in Sölch and Kärcher (2010) and not treated here.
Following Gillespie (1972) and Shima et al. (2009), the probability P ij that one droplet with mass m i collides with one droplet with mass m j inside a small volume δV within a short time interval δt is given by where K ij = K(m i , m j ).
For SIPs i and j containing ν i and ν j real droplets in a grid box with volume V , on average, ν coll = P ij ν i ν j collections between droplets from SIP i and SIP j occur. The average rate of such i − j collections (i = j ) to occur is So-called self-collections, collisions of the droplets belonging to the same SIP (i = j ), are described by assuming that the SIP is split into two portions, each containing one-half of the droplets of the original SIP. The factor of 2 originates from the collections of each half, which have to be added to gain the total number of self-collections for SIP i. Accordingly, the diagonal elements of the matrices o ij and O ij differ from the off-diagonal elements by an additional factor of 0.5. In terms of concentrations (represented by SIPs in a grid box with volume V ), we can write for collections between different SIPs and for self-collections.
In the hypothetical algorithm, the weighting factor of SIP i is reduced due to collections with all other SIPs and selfcollections and reads as The droplet mass µ i in SIP i is unchanged. For each i − j combination, a new SIP k is generated: To avoid double counting, only combinations with i ≥ j are considered. The rate equations for the weighting factors can be numerically solved by a simple Euler forward step. The weighting factor of existing SIPs is reduced by leading to or, equivalently, For new SIPs k, we have Per construction, the algorithm is mass conserving subject to rounding errors. In each time step, N SIP,add = N SIP (N SIP − 1)/2, new SIPs are produced and the new number of SIPs is N SIP * = N SIP + N SIP,add . After nt time steps, the number of SIPs would be of order (N SIP,0 ) nt which is not feasible. In the following subsections, algorithms are presented that include various approaches to keep the number of SIPs in an acceptable range.
In the following, the various algorithms are described and pseudo-code of the implementations is given. For the sake of readability, the pseudo-code examples show easyto-understand implementations. The actual codes of the algorithms are, however, optimised in terms of computational efficiency. The style conventions for the pseudo-code examples are as follows: commands of the algorithms are written in upright font with keywords in boldface. Comments appear in italic font (explanations are enclosed by {} and headings of code blocks are in boldface).

Description of the remapping algorithm (RMA)
First, the remapping algorithm is described, as its concept follows closely the hypothetical algorithm introduced in the latter section. RMA is based on ideas of Andrejczuk et al. (2010). We call their approach the "remapping algorithm" as N SIP is kept reasonably low by switching between a SIP representation and a bin representation in every time step. A temporary bin grid with a predefined κ is established which stores the total number n bin, * and total mass M bin, * of all contributions belonging to a specific bin. The bin boundaries are given by m bb, * .
Instead of creating a new SIP k (with number ν k obtained by Eq. (19) and mass µ k = µ i + µ j ) from each i − j combination, the according contribution is stored on a temporary bin grid. More explicitly, this means that the droplet number n bin,l of bin l with m bb,l < µ k ≤ m bb,l+1 is increased by ν k . Similarly, the total mass M bin,l of that bin is increased by µ k ν k . Similarly, the reduced contributions ν * i from the existing SIPs with droplet mass µ i are added to their respective bins. Figure 2 illustrates how a collection process between two SIPs is treated in RMA. In this example, ν k = 2 droplets are produced by collections which have a droplet mass of µ k = µ i + µ j = 15. Instead of creating a new SIP k (as in the hypothetical algorithm), the contribution k is recorded in the bin grid. The droplet number n in bin l3 is increased by ν k = 2 and the according total mass M l3 by ν k µ k = 30. The remaining contribution of SIP i falls into bin l1 and n l1 and M l1 are increased by ν * i = ν i − ν k = 2 and µ i ν * i = 12, respectively. The operation for SIP j is analogous.
At the end of each time step, after treating all possible i − j combinations, a SIP ensemble is created from the bin data with ν i = n bin,l and µ i = M bin,l /n bin,l , which resembles a deterministic singleSIP-init with the resolution κ.
Optionally, a lower threshold ν min,RMA can be introduced, such that SIP i is created only if n bin,l > ν min,RMA holds. However, this may destroy the property of mass conservation which can be remedied by the following.
We pick up the concept of a weak threshold introduced earlier and adjust it such that on average the total mass is conserved (instead of total number as before). We introduce the threshold M critmin = ηλ 1 . The parameter η is set to 10 −8 , which implies that each SIP contains at least a fraction of 10 −8 of the total mass in a grid box. If M bin,l > M critmin , a SIP is created representing ν i = n bin,l drops with single mass µ i = M bin,l /n bin,l . If M bin,l < M critmin , a SIP is created with probability p create = M bin,l /M critmin . In this case, the SIP represents ν i = M critmin /µ i droplets with single mass µ i = M bin,l /n bin,l . Pseudo-code of the algorithm is given in Algorithm (1).
Time steps typically used in previous collection/aggregation tests are around t = 0.1 to 10 s depending inter alia on the used kernel. From Eq. (22) it follows that the time step in RMA must satisfy Otherwise, negative ν values can occur which would inevitably lead to a crash of the simulation. In mature clouds, the Long and Hall kernels attain large values which required tiny time steps of 10 −4 s and were smaller in the first test simulations. To be of any practical relevance, RMA had to be modified in order to be able to run simulations with suitable time steps. Hence, several extensions to RMA allowing larger time steps are proposed in the following.
1. The default version uses the algorithm as outlined in Algorithm (1) (i.e. nothing is changed). Negative ν * i values obtained by Eq. (21) are acceptable, as long as n bin,l , from which the SIPs are created at the end of the time iteration, is non-negative for all l. This means that an existing SIP i (which falls into bin l) can lose more droplets (ν i ) than it actually possesses (ν i ) as long as the gain in bin l (from all suitable SIP combinations) compensates this deficit. We will later see that this approach works well for the Golovin kernel; however, it fails for the Long and Hall kernels. for i = 1 N SIP do 10: Calculate ν * i according to Eq. (22) 11: Select bin l with m bb,l < µ i ≤ m bb,l+1 12: n bin,l = n bin,l + ν * i 13: the ν evaluation of all SIPs, only ν i is updated in the subcycling steps and not the whole system of fully coupled equations is solved for a smaller time step. For sufficiently large η i , ν * i,subcycl is positive, as ν i,subcycl < ν i as desired. Basically, we now assume that all collections involving SIP i are equally reduced by a factor of η i = ν i,subcycl /ν i compared to the default time step. In the GAIN block of the algorithm (as termed in Algorithm 1), all computations use the default time step and no subcycling is applied. To be consistent with the reduction in the LOSS block, Eq. (23) is replaced by 4. By using the reduction limiter (abbreviated as RedLim), the effect of an adaptively reduced time step can be reached with simpler and cheaper means. We introduce a threshold parameter 0 < γ < 1.0 similar to the approach in Andrejczuk et al. (2012). Again, we focus on SIPs with ν * i < 0 and simply set the new weight of SIP i to ν * i,RedLim = γ ν i . As above, all contributions involving SIP i have to be rescaled, now with

Description of average impact algorithm (AIM)
The average impact algorithm by Riechelmann et al. (2012) and further developed in Maronga et al. (2015) predicts the temporal change of the weighting factor, ν i , and the total mass of all droplets represented by each SIP, χ i = ν i µ i . In this algorithm, two fundamental interactions of droplets are considered (see also Fig. 7 in Maronga et al., 2015). First, the coalescence of two SIPs of different sizes is considered. It is assumed that the larger SIP collects a certain amount of the droplets represented by the smaller SIP, which is then equally distributed among the droplets of the larger SIP. As a consequence, the total mass and the weighting factor of the smaller SIP decrease, while the total mass of the larger SIP increases accordingly. Figure 2 illustrates how a collection between two SIPs is treated. SIP j is assumed to represent larger droplets than SIP i, i.e. µ j > µ i . As in the RMA example before, we say that ν k = 2 droplets are collected. Then, SIP i loses two droplets to SIP j ; i.e. ν i is reduced by 2 and a mass of µ i ν k is transferred to SIP j where it is distributed among the existing ν j = 8 droplets. Unlike RMA, where droplets with mass µ j + µ i = 15 are produced, AIM predicts a droplet mass of µ j + µ i ν k /ν i = 10.5 in SIP j .
Usually, ν k /ν i 1, and hence the name "average impact" is given for this algorithm.
Moreover, same-size collisions are considered in each SIP. These decrease the weighting factor of each SIP but not its total mass. Accordingly, the radius of the SIP increases.
Both processes are represented in the following two equations which are solved for all colliding SIPs (assuming that and The first term on the right-hand side of Eq. (25) describes the decrease of ν due to same-size collections; the second term describes the decrease of ν due to collection by larger SIPs. The first term on the right-hand side of Eq. (26) describes the gain in total mass due to collections with smaller SIPs, while the second term describes the loss of total mass due to collection by larger SIPs. Using a Euler forward method for time integration, the above equations read as and Finally, the single droplet mass µ i of each SIP is updated: Pseudo-code of the algorithm is given in Algorithm (2). Apply (adaptive) sorting algorithm, such that µ j ≥ µ i for j > i 7: {Compute total mass χ i of each SIP} 8: {Compute reduction of weighting factor due to number loss to all larger SIPs} 11: {Compute mass transfer; mass gain from all smaller SIPs and mass loss to all larger SIPs} 13:  Figure 3 illustrates how AIM works for an example simulation with the Long kernel and singleSIP-init. The top panel shows the (r i , ν i ) evolution of selected SIPs. The black line shows the initial distribution. Each coloured line connects the data points that depict the (r i , ν i ) pair of an individual SIP every 200 s. Clearly, ν i of any SIP decreases over time; however, the decrease is much smaller for the largest SIPs and becomes zero for the largest SIP. The majority of SIPs starting from the smallest radii show an opposite behaviour, as their evolution is dominated by a strong ν i decrease at nearly constant r i . In contrast, the evolution of the two largest SIPs is dominated by a strong r i increase for constant ν i . The SIPs next to the largest SIPs undergo a transition; in the beginning, they primarily grow in size; towards the end, the decrease of ν i is dominant.
The ratio ϕ r is defined as r i (t = 3600 s)/r i (t = 0 s) and, analogously, ϕ ν = ν i (t = 3600 s)/ν i (t = 0 s). We find ϕ r ≥ 1 and ϕ ν ≤ 1. The bottom panel of Fig. 3 shows the ratios ϕ r (red curve) and (ϕ ν ) −1 (black curve) for all SIPs of the simulation. Both ratios are smooth functions of the initial r i , which is plotted on the x axis. By construction, the number of SIPs remains constant over the course of a simulation. Hence, the number of SIPs per radius or mass interval decreases when the DSD broadens over time. In our example, the SIP resolution becomes coarser, particularly in the large droplet tail.
Negative values of ν new i and χ new i may occur. However, this case never occurred in our manifold tests of the algo-rithm. The behaviour appears more benign than in RMA. Moreover, we found that the algorithm preserved the initial size sortedness of the SIP ensemble. However, for an arbitrary kernel function and initial SIP ensemble, this is not guaranteed, and we recommend to use adaptive sorting algorithms that benefit from partially presorted data sets (Estivill-Castro and Wood, 1992). Adaptive sorting is also advantageous when AIM is employed in real world applications, where sedimentation, advection and condensation changes the SIP ensemble in each individual grid box.

Description of the all-or-nothing algorithm (AON)
The all-or-nothing algorithm (AON) is based on the ideas of Sölch and Kärcher (2010) and Shima et al. (2009) . Figure 2 illustrates how a collection between two SIPs is treated. SIP i is assumed to represent fewer droplets than SIP j , i.e. ν i < ν j . Each real droplet in SIP i collects one real droplet from SIP j . Hence, SIP i contains ν i = 4 droplets, now with mass µ i + µ j = 15. SIP j now contains ν j − ν i = 8 − 4 = 4 droplets with mass µ j = 9. Following Eq. (23), only ν k = 2 pairs of droplets would, however, merge in reality. The idea behind this probabilistic AON is that such a collection event is realised only under certain circumstances in the model, namely such that the expectation values of collection events in the model and in the real world are the same. This is achieved if a collection event occurs with probability in the model. Then, the average number of collections in the model, is equal to ν k as in the real world. A collection event between two SIPs occurs if p crit >rand(). The function rand() provides uniformly distributed random numbers ∈ [0, 1]. Noticeably, no operation on a specific SIP pair is performed if p crit <rand().
The treatment of the special case ν k /ν i > 1 needs some clarification. This case is regularly encountered when the singleSIP-init is used, where SIPs with large droplets and small ν i collect small droplets from a SIP with large ν j . The large difference in droplet masses µ led to large kernel values and high ν k with ν i < ν k < ν j . In addition, the case of ν k being even larger than ν j is not considered, as it occurs only with unrealistically large time steps. If p crit > 1, we allow multiple collections, as each droplet in SIP i is allowed to collect more than one droplet from SIP j . In total, SIP i collects ν k droplets from SIP j and distributes them on ν i droplets. A total mass of ν k µ j is transferred from SIP j to SIP i and the droplet mass in SIPs i becomes µ new i = (ν i µ i + ν k µ j )/ν i . The number of droplets in SIP j is reduced by ν k and ν new j = ν j − ν k . Keeping with the exam-ple in Fig. 2 and assuming ν k = 5, each of the ν i = 4 droplets would collect ν k /ν i = 1.25 droplets. The properties of SIP i and SIP j are then ν i = 4, µ i = 17.25, ν j = 3 and µ j = 9.
Another special case appears if both SIPs have the same weighting factor which regularly occurs when the ν const -init is used. After a collection event, SIP j would carry ν j − ν i = 0 droplets, whereas SIP i would still represent ν i droplets. In this case, half of the droplets from SIP i coalesce with half of the droplets from SIP j and vice versa. Accordingly, both SIPs carry ν new j = ν new i = 0.5 × ν i droplets with mass µ i + µ j . Without this correction, zero-ν SIPs would accumulate over time and reduce the effective number of SIPs, causing a poorer sampling. Instead of this equal splitting, one can also assign unequal shares ξ ν i and (1−ξ )ν i to the two SIPs (with ξ being some random number).
Moreover, self-collections can be considered for kernels with K ii > 0. If 2 p crit >rand(), self-collections occur between the droplets in a SIP (note the factor of 2 due to symmetry reasons). Then, every two droplets within a SIP coalesce, implying ν i = ν i /2 and µ i = 2 µ i .
So far, we explained how a single i − j combination is treated in AON. In every time step, the full algorithm simply checks each i −j combination for a possible collection event.
To avoid double counting, only combinations with i < j and self-collections with i = j are considered. Pseudo-code of the algorithm is given in Algorithm (3). The SIP properties are updated on the fly. If a certain SIP is involved in a collection event in the model and changes its properties, all subsequent combinations with this SIP take into account the updated SIP properties. Similar to the update-on-the-fly version of RMA, results may depend on the order in which the i − j combinations are processed.
For most i − j combinations, p crit is small, and usually only a limited number of collection events occurs in the model; AON may suffer from an insufficient sampling of the droplet space. Actual collections are a rare event in this algorithm. In our standard setup, < 1% of all possible collections occur in the model until rain is initiated by very few lucky SIPs (similar to lucky drops; e.g. Kostinski and Shaw, 2005). Indeed, Shima et al. (2009) reported convergence of AON only for tremendously many SIPs (on the order of 10 5 to 10 6 in a box). We will later see that convergence is possible with as few as O(10 2 ) SIPs if the SIPs are suitably initialised.
Hence, it will be demonstrated that AON is a viable option in 2-D/3-D cloud simulations, as already implied in Arabas and Shima (2013).
As for AIM in Fig. 3, Fig. 4 (top) shows the (r i , ν i ) evolution of selected SIPs for AON. The picture looks more chaotic than for AIM, as each individual SIP has its own independent history due to the probabilistic nature of AON. For the initially smallest SIP, only ν i changes for most of the time, as only collections occur where the partner SIPs have smaller weighting factors ν. Towards the end, the still very small SIP is at least once involved in a collection with a very large SIP that has a larger ν. Hence, r i of this SIP increases substantially. In contrast to the smallest SIP, other initially small SIPs i with similar properties are never part of a collection with ν i < ν j . Hence, their radii r i remain small over the total period, and ν i is the only property that changes. The bottom panel summarises the overall changes in ν i (black) and r i (red) for all SIPs of the simulation. Unlike AIM, where only the initially largest SIPs grow, SIPs from both ends of the spectrum grow in AON. Those SIPs have small ν values in common, and in each collection their mass is updated to m i + m j . The SIPs with initially large ν values lie in the radius range [2 µm, 15 µm] and keep their initial radii (at least in the singleSIP-init used here). The reductions in ν i scatter around ∼ 10 3 for most SIPs and fall off to 1 for the largest SIPs. For the generation of the random numbers, the well-proven (L'Ecuyer and Simard, 2007) Mersenne Twister algorithm by Matsumoto and Nishimura (1998) is used. AON simulations may be accelerated if random numbers are computed once a priori. However, this requires saving millions of random numbers for every realisation. An AON simulation with 1000 time steps and 200 SIPs, for instance, implies 200 × 100 potential collections during 1 time step and in total 2 × 10 7 random numbers. Using random numbers with a smaller cycle length deteriorated the simulation results in several tests and is not recommended.
The current implementation differs slightly from the version in Shima et al. (2009). Due to an unfavourable SIP initialisation similar to the ν const technique, Shima et al. (2009) deal with large N SIP values in their simulations, where it becomes prohibitive to evaluate all N SIP (N SIP − 1) SIP combinations. Hence, they resort to N SIP /2 randomly picked i − j combinations, where each SIP appears exactly in one pair (if N SIP is odd, one SIP is ignored). As only a subset of all possible combinations are numerically evaluated, the extent of collisions is underestimated. To compensate for this, the probability p crit is upscaled with a scaling factor N SIP (N SIP −1)/(2 N SIP /2 ) to guarantee an expectation value as desired.
Moreover, in Shima's formulation, the weighting factors are considered to be integer numbers. In contrast, we use real numbers ν which can even attain values below 1.0. This has several computational advantages: (1) better sampling of the DSD, in particular at the tails, (2) simpler AON implementation with fewer arithmetic and rounding operations and (3) more flexibility, e.g. SIP splitting with real-valued ξ in the case of identical weighting factors.
The study of Sölch and Kärcher (2010) makes use of the vertical position of the SIPs and explicitly calculates whether or not a larger droplet overtakes a smaller droplet within a time step. This approach will be thoroughly analysed in a follow-up study.
In RMA and AIM, SIPs with negative weights may be generated depending, e.g. on the condition t N SIP j =1 o ij > 1 in RMA. By construction, this cannot happen in AON and the latter condition implies that N SIP j =1 p crit,ij of SIP i is greater than unity. Then, this SIP is likely to be involved in several collections (for j with p crit,ij < 1) or is involved in one or several multiple collections (for j with p crit,ij > 1).

Box model results
In this section, box model simulations of the three algorithms introduced in the latter section are presented, starting with the results of RMA, then those of AIM and finally AON. The results of each algorithm are tested for three different collection kernels (Golovin, Long and Hall). As default, probabilistic SIP initialisation methods are used. For each parameter setting, simulations are performed for 50 different realisations. Simulations with the Golovin kernel are compared against the analytical solution given by Golovin (1963). Consistent with many previous studies, we choose b = 1.5 m 3 kg −1 s −1 . Simulations with the Long and Hall kernels are compared against high-resolution benchmark simulations obtained by the spectral-bin model approaches of Wang et al. (2007) and Bott (1998). The volume of the box is assumed to be V = 1 m 3 . In all simulations, collision/coalescence is the only process considered in order to enable a rigorous evaluation of the algorithms. The evaluation is based on the comparison of mass density distributions and the temporal development of the zeroth, second and third moments of the droplet distributions. The first moment is not shown since the mass is conserved in all algorithms per construction. The Supplement contains a large collection of figures that systematically report all sensitivity tests that have been performed. The behaviour of the second and third moments is similar, and the λ 3 evolution is shown only in the Supplement. Later, it will be mentioned that Hall kernel simulations are not as challenging as Long kernel simulations from a numerical point of view. Hence, simulation with the Hall kernel are only shortly discussed in the paper and figures are shown in the Supplement.  panel shows an excellent agreement of RMA with the reference solution and proves at least a correct implementation. Figure 6 compares the temporal evolution of the moments. Moreover, the first row shows the number of SIPs used in RMA. Except for the case with a very coarse grid (κ = 5) with fewer than 40 SIPs in the end, the regular RMA results shown in the left column agree perfectly with the reference solution irrespective of the chosen κ (≥ 10) and minimum weak threshold η ranging from 10 −5 to 10 −8 . The number of non-zero bins increases as the DSD broadens over time. In the last step of the time iteration, SIPs are created from such bins. Hence, their number increases over time. Using a strict threshold, the total mass is not conserved; the larger η is, the more mass is lost (see the Supplement). Hence, using a weak threshold or some other measure (e.g. creation of a residual SIP containing contributions of all neglected bins) to avoid this is highly recommended.

Performance of RMA
Next, RMA simulations with the Long kernel are discussed. As already mentioned, the default RMA version would require tiny time steps which would rule out RMA from any practical application. Both approaches introduced before, update on the fly (OTF) and reduction limiter (RedLim), succeed in eliminating negative ν i values and in finishing the simulation within a reasonable time. However, the results are not as desired. Figure 7 shows the DSDs for a simulation with the reduction limiter γ = 0.1, weak threshold η = 10 −8 , κ = 20 and t = 0.1 s. Whereas the algorithm is capable of realistically reducing the number of the smaller If we average over 50 realisations (as usually, left panel) or use a coarse grain visualisation (as usually with κ plot = 4, middle panel), the oscillations are smoothed out (or masked). Nevertheless, the formation of the rain mode is impeded; probably the mass flux across the problematic radius range is too slow, which is a direct consequence of applying the reduction limiter (mostly SIPs in this part of the spectrum obtain negative weights and have to be corrected).
We tested the algorithm for many parameter settings varying all of the aforementioned parameters: t ∈ [0.01 s, 1 s], κ ∈ [5, 100], γ ∈ [0, 1] and η ∈ [10 −15 , 10 −5 ]. Figure 8 shows the evolution of the zeroth and second moments for various t values (at κ = 10, left column) and κ values (at t = 0.1 s right column). Obviously, the simulation results are nearly insensitive to the bin resolution (as long as κ ≥ 10); however, the higher moment does not come close to the reference value. The effect of a t variation is more substantial. Decreasing t, the total droplet numbers become smaller and the λ 2 values become larger, both lead- ing to a better agreement. Despite using already a very small time step of 0.01 s in the end (we will later see that AIM and AON produce reasonable results for t = 10 s), the agreement with the reference solution is still not perfect.
Hence, our RMA implementation is not capable of producing reasonable results for the Long kernel. It is not clear whether the oscillations are inherent to the original RMA or caused by the introduction of the Reduction Limiter. The latter might introduce discontinuities which could trigger instabilities.
At least, the Golovin RMA simulations with the reduction limiter do not show any signs of instability and agree well with the reference. However, this is not surprising. Clearly, the RedLim correction is only performed for SIPs, where negative weights are predicted. In Golovin simulations, this happens less frequently than in Long simulations. Only, in the very end, the abundance of the largest droplets is underestimated (see top right panel in Fig. 5) and the increase of the higher moment levels off slightly (middle column of Fig. 6). Basically, the application of the RedLim correction, which rescales ν i , can be interpreted as an artificial reduction of the time increment (see Eq. 20) and hence slows down the growth of all corrected SIPs.
Another RMA variant uses update on the fly, which also effectively eliminates negative weights. Such Golovin RMA simulations can be close to the reference; however, the results depend on the order in which the SIP combinations are processed. If collections between the smallest SIPs are treated first within each time iteration (OTF s ), then the growth of the largest droplets is too slow (see bottom left panel in Fig. 5). Starting the processing with collections between the largest SIPs (OTF l ), the DSDs are as desired (see bottom right panel in Fig. 5) and the moments agree perfectly with the reference if κ is sufficiently large (see right column of Fig. 6). The update on the fly has the strongest impact on those SIPs where the regular version would predict negative weights. With OTF, the weights of such SIPs strongly decrease during one time iteration, and hence the continuous evaluations of the O ij values depend on the order in which the SIP combinations are processed.
Long kernel simulations with OTF l yield results qualitatively similar to the RedLim version (see the Supplement) and spurious oscillations still appear in the DSDs.
Note that the Golovin simulations used r critmin = 1.6 µm, whereas the Long simulations used r critmin = 5.0 µm (note the truncated left tail in the DSDs in Fig. 7). A higher r critmin value reduces the SIP number and the computational effort and made simulations with small time steps possible. The simulated λ values are insensitive to the choice of r critmin (see the Supplement).
We conclude that, for time steps feasible in operational terms, none of the tested RMA implementations are capable of producing reasonable results with the Long kernel. Andrejczuk et al. (2010) introduced and evaluated RMA and applied it in a simulation of boundary layer stratocumulus. Our findings are seemingly in conflict with the conclusions of their evaluation exercises. What both studies have in common is a similar trend for a κ variation. In their Fig. 13, simulations for κ ranging roughly from 4 to 30 are depicted. The simulations with many bins show oscillations, whereas the coarsest simulation has no oscillations but is clearly far from the real solution (largest droplets around 40 µm compared to 500 µm in the reference simulation). In their Fig. 14, they presented a detailed sensitivity test only for a κ = 4 simulation, which downplays the severity of the oscillation issue. Moreover, their simulations ran up to 2000 s compared to 3600 s in this study and many other studies (e.g. Bott, 1998;Wang et al., 2007). Hence, they missed the regime where the effect of the oscillations is strongest. Despite our extensive tests, we cannot exclude that in Andrejczuk et al. (2010) an RMA implementation was used where oscillations were less cumbersome; however, the study missed to demonstrate this for a conclusive test case and we have come to the conclusion that the evaluation exercises were incomplete and not suited to reveal the deficiencies faced here.
RMA simulations with the Hall kernel are similarly corrupted by oscillations and do not produce useful simulations either (not shown). Figure 9 displays DSDs obtained by AIM for the Golovin kernel. Compared to the reference, the droplets pile up at too-small radii and the algorithm is not capable of reproducing the continuous shift to larger sizes, even if a fine grid with κ = 200 (right) instead of κ = 40 (left) is used. For both κ values, the increase of the higher moments proceeds at a rate that is too low (see Fig. 10), whereas the decrease in droplet number matches the analytical evolution. AIM is a very robust algorithm in the sense that the results are fairly insensitive to most numerical parameter variations as demonstrated for κ and t in the left column of Fig. 10. Most simulations converge to -what we call -the best AIM solution, which is, however, not identical to the correct solution. The results deteriorate slightly if the initial SIP ensemble is generated with the ν const -init or ν draw -init instead of with the singleSIP-init (right column of Fig. 10).

Performance of AIM
The algorithm performs, in general, better for the Long and Hall kernels, as is detailed in the following. Figure 11 displays DSDs obtained by AIM for the Long kernel. Generally, the results are in good agreement with the reference solution, as long as the SIP ensemble is initialised with the singleSIP-init method (left and middle column). Towards the end of the simulated period (magenta and cyan lines), the removal of small droplets is a bit underestimated and too many small droplets are present. For t = 30 and 40 min, the large droplet mode is too weak, as not enough large droplets have formed. At that stage, the droplets grow rapidly by collection and the AIM results lag behind. Although the offset is less than 5 min, it might become crucial in simulations of short-lived clouds. Also, the evolution of the moments (see Fig. 12) confirms this, as the onset of the rapid changes at around t = 30 min is only slightly retarded if parameters are suitably chosen. Towards the end, the AIM results get again very close to the reference solution. The left column of Fig. 12 shows the dependence on the time step. For time steps t ≤ 20 s, all results are similar to the best AIM solution which is close to the reference. Time steps of 50 s and more do not produce good enough results. Moreover, AIM is fairly insensitive to the choice of κ, r critmin and ν critmin . Simulations with κ ranging from 10 to 100 yield similar results (see middle column). Only, for a very coarse resolution (κ = 5) with 25 SIPs, the decrease in droplet number is too small. Increasing the lower cut-off radius r critmin from 0.6 µm to 5 µm, the r < 5 µm part of the DSD is represented by a single SIP and N SIP is reduced by 60% (see Table 3). The predicted moments are unaffected by this variation (see the Supplement). Those small-r i SIPs are not relevant for the AIM performance. They simply carry too-small fractions of the total grid box mass to be important. Their status will not change over time, as already illustrated in Fig. 3. Similarly, a variation of ν critmin or the switch to a strict threshold ν critmin has no effect (see the Supplement). Now we draw the attention to the importance of the SIPinit method. The right panel of Fig. 11 shows the DSDs when the SIPs are initialised with the ν const -init method. The algorithm completely fails and no droplets larger than 70 µm occur after 60 min. Consequently, the moments are far off from the reference solution (solid lines in the right column of Fig. 12). Switching to the ν draw -init method (dotted lines) or using many more SIPs (up to 1600) improves the results, yet they are still useless. This clearly demonstrates how crucial the initial characteristics of the SIP ensemble are. By initialising the SIPs with an appropriate technique like the singleSIP-init, useful results are obtained with as few as 50 SIPs. Using the ν const -init or ν draw -init, on the other hand, solutions are still useless, even though the number of SIPs and the computation time are factors of 30 and 900 higher, respectively.
The ν random simulations give another example of the importance of the init method. Even though both techniques, ν random,rs (dashed line) and ν random,lb (dash-dotted line), are similar in design and differ only in the creation of the largest SIPs (see Fig. 1), the outcome of the simulations is quite different. For the ν random,lb -init, the solution matches the best AIM solution, whereas for ν random,rs the moment λ 2 stagnates at a level that is too low. The latter test pinpoints the main weakness of AIM, which is also reflected in its name (average impact). The initial weighting factors of those initially largest SIPs (in relation to ν of the remaining SIPs) controls how strong this growth is and how the large droplet mode emerges. The curves depict the AIM results (averages over 50 realisations). The default settings are probabilistic singleSIP-init with weak threshold η = 10 −9 , κ = 40 and t = 10 s. The left column shows a variation of t (see legend) and the middle column a variation of κ (see legend). The right column displays simulations with various initialisation techniques: the ν const -init (solid) and ν draw -init (dotted) with various N SIP values (see legend) as well as the ν random,rs -init (green dashed) and ν random,lb -init (green dashdotted).
All quantities shown in Figs. 10 and 12 are averages over 50 realisations of the initial SIP ensemble. All individual realisations yield basically identical simulation results, and it would have been sufficient to carry out and display simulations of a single realisation.
Next, simulations with the Hall kernel are shortly discussed (figures are only shown in the Supplement). Compared to the Long simulations, the reference solution reveals that small droplets are much more abundant, as the collection of small droplets proceeds at a lower rate. This makes the simulation less challenging from a numerical point of view and AIM DSDs come closer to the reference than in the Long simulations. Consequently, the AIM moments agree very well with the reference. For t ≤ 20 s and κ ≥ 20, all solutions are similar to the best AIM solution. Figure 13 shows the AON results for the Golovin kernel. An excellent agreement with the reference solution is found, which proves at least the correct implementation of AON. Switching to a version without multiple collections (i.e. SIP i collects at most ν i droplets in every time step) does not affect the solution, as cases with p crit > 1 ⇔ ν k > ν i occur rarely. The AON moments closely follow the reference solution, even when the time step is increased from 1 to 10 s or fewer SIPs are used by decreasing κ from 40 to 10 (left column of  The curves depict the AON results (averages over 50 realisations). The default settings are probabilistic singleSIPinit with weak threshold η = 10 −9 , κ = 40 and t = 1 s. Left column: default simulation (red), larger time step ( t = 20 s, blue) and fewer SIPs (κ = 10, brown). Right column: ν const -init (brown) and ν draw -init (blue). Fig. 14). Unlike AIM, AON is successful, even when the initial SIP ensemble is created with the ν const -init or ν draw -init (right column of Fig. 14). Figure 15 displays DSDs of an AON simulation for the Long kernel. The simulations exhibit large differences between individual realisations which deserve a closer inspection. The top row shows DSDs of two specific realisations. The * symbol depicts the g value for each bin. Those symbols are connected by default. An interruption of the connecting line indicates one or more empty bins (g = 0) where no SIPs exist in this specific radius interval. This occurs frequently due to the broadening of the DSD. The solutions are Figure 15. Mass density distributions obtained by AON for the Long kernel from t = 0 to 60 min every 10 min (from black to cyan; see legend). The dotted curves show the reference solution; the solid curves show the AON simulation results. The top row shows two specific realisations (each * symbol depicts a non-zero g value). Rows 2 and 3 show averages over 50 and 500 realisations: the left column uses the same format as all DSD plots before. The right column depicts the final DSD at t = 60 min together for each bin; the interquartile range is determined and depicted by diamonds and a dashed bar. If there is only one (or no) diamond(s) in a bin, the 25th (and the 75th) percentile(s) is (are) too small to be visible. The settings are probabilistic singleSIP-init with η = 10 −9 , κ = 40 and t = 20 s.

Performance of AON
full of spikes and irregularly over-and undershoot the reference solution, particularly in the large droplet mode. The small droplet mode is underestimated in the first realisation and overestimated in the second realisation, for instance. The advantages of AON become apparent when the DSDs are averaged over many realisations, as shown in rows 2 and 3. Then, the DSDs come close to the reference solution (left column) and the interquartile range indicates the broad envelope the individual realisations span around the reference solution (right column). Whereas the average over 50 realisations still has some fluctuations (row 2), the average over 500 realisations produces a smooth solution (row 3).
There are two sources that are potentially responsible for the large ensemble spread: the probabilistic SIP initialisation and the probabilistic AON approach. In a sensitivity test, 50 realisations are computed, all using the same SIP initialisation obtained by a deterministic singleSIP-init. Figure 16 compares those simulations to regular simulations with differing SIP initialisations. In both cases, we find a substantial ensemble spread. Starting with identical SIP initialisations, the spread in terms of interquartile range is, however, somewhat smaller, suggesting that both sources contribute to the ensemble spread. Figure 17 shows AON results with 50 realisations and probabilistic initialisation which gives a good trade-off between computational cost and representativeness. Clearly, AON DSDs are less smooth than those of AIM. Column 1 shows a default simulation with singleSIP-init and shows very good agreement with the reference solution. By disenabling multiple collections (column 2), far too few small droplets become collected and their abundance is substantially overestimated. As a consequence, the mass transfer from small to large droplets is slowed down, and the large droplet mode is underestimated. Using the ν const -init, the large droplet mode is not well-matched and results are again useless. Figure 18 shows the temporal evolution of moments λ 0 and λ 2 for a large variety of sensitivity tests. Column 1 shows a variation of t for the singleSIP-init. The larger the t that is chosen, the more often combinations with p crit > 1 occur and the more crucial it becomes to consider multiple collections. Even for the smallest time step considered, the version without multiple collections does not collect enough small droplets and hence overestimates droplet number. With the regular AON version considering multiple collections, reasonable results are obtained for time steps t ≤ 20 s. Column 2 shows a variation of κ for singleSIP-init. Whereas the higher moments perfectly match the reference, the droplet number shows a non-negligible dependence on κ. For κ < 100, droplet number decrease is faster, the finer the resolution is. For κ ≥ 100, a variation of κ has no effect; hence, convergence is reached. However, those simulations underestimate the droplet number. Best results are obtained for an intermediate resolution of κ = 40. Using the multiSIP-init, the simulations show the same undesired behaviour (see left panel of Fig. 19). Hence, increasing the SIP concentration in the middle part of the initial DSD has no positive effect despite using around 160 % more SIPs (see N SIP values listed in the figure legend). In another experiment, a hybrid singleSIP-init was used. Below r = 16 µm, SIPs are initialised as usually with the prescribed κ. Above this radius, a high resolution with κ = 100 is always used irrespective of the chosen κ. Clearly, more SIPs are initialised with this hybrid version relative to the original version (see N SIP values listed in the figure legend). The middle panel of Fig. 19 shows the droplet number evolution for the original singleSIP-init and the new hybrid version. The sensitivity to κ is basically suppressed when the hybrid version is used. This implies that AON is more or less insensitive to the resolution in radius range r < 16 µm; however, it is sensitive to the SIP resolution in the right tail. For example, the κ = 5 simulation with the hybrid version and 87 SIPs performs better than the κ = 20 simulation with the regular init and 98 SIPs.
In the conventional version, SIPs are initialised down to a radius of 0.6 µm (as can be seen in the top left panel of Fig. 1). Another variation of the singleSIP-init is shown in the right panel of Fig. 19 where this lower cut-off radius is raised to 1.6 µm and around 25 % fewer SIPs are used to de- scribe the DSD. The simulation results are basically identical to the conventional init version and suggest that those initially small-r i , small-ν i SIPs are not relevant for the performance of AON.
Further tests with the singleSIP-init include a variation of the threshold parameter η and a switch from weak thresholds to strict thresholds. Moreover, we investigated the implications of update on the fly of the SIP properties. The singleSIP-init produces an initially radius-sorted SIP ensemble, and looping over the i − j combinations in the algorithm starts with combinations of the smallest droplets, which may introduce a bias. We reversed the order (i.e. started with largest droplet combinations) or randomly rearranged the order of the SIP combinations. None of those variations had a significant effect on the ensemble-averaged results (see the Supplement). The latter insensitivity is in contrast to the RMA behaviour. The reason for this is the comparably small number of SIP combinations that actually result in collections, as well as probabilistic determination of these combinations. This prevents any pronounced bias due to size sorting. Moreover, AON does not preserve the size sortedness of the SIP list (cf. Fig. 4).
Finally, the AON performance for other SIP initialisations is discussed (right column of Fig. 18). As already demonstrated in Fig. 17, AON is not able to produce a realistic Figure 19. Droplet number as a function of time obtained by AON for the Long kernel. The black symbols show the moments of the reference solution. In each panel, the dotted curves depict the results with the regular singleSIP-init as already shown in column 2 of Fig. 18. The solid curves depict results with a modified initialisation: the right panel shows results with the multiSIP-init, the middle column with the hybrid init and the right column with the singleSIP-init with r critmin = 1.6 µm. Each panel shows results for various κ values (see corresponding legend). The hybrid version uses κ = 100 for radii above 15 µm and κ as labelled for radii below 15 µm. The multiSIP-init and hybrid version use more SIPs than the regular singleSIP-init. An r critmin increase leads to a N SIP reduction (see listed N SIP values in the plots for a comparison).
large droplet mode if a moderate number of SIPs are initialised with the ν const technique. Hence, the higher moments are underestimated and droplet number is overestimated. By increasing the number of SIPs up to 1600, the solutions get closer to the reference, yet the agreement is still not satisfactory. The performance for the ν draw -init is similar. Keeping in mind the previous sensitivity studies (hybrid singleSIP-init, multiSIP-init), it is apparent that the ν const -init and ν draw -init suffer from an undersampling of the initially largest droplets. Due to its simplicity, using constant weights for initialisation has been a common approach in previous 3-D LCM cloud simulations (Shima et al., 2009;Hoffmann et al., 2015). Hence, we tested AON extensions aiming at a better performance for such equal weight initialisations.
Let us consider the possible weighting factors the SIPs can attain in the course of a simulation. In the beginning, all SIPs have ν = ν init . After a collection event, for both involved SIPs ν = ν init /2. If such a ν = ν init /2-SIP collects a ν = ν init -SIP, both SIPs carry ν init /2 droplets. Subsequent collections can generate SIPs with weighting factors ν init /4, 3 ν init /4 and so on. It may be advantageous if AON generates a broader spectrum of possible ν values and produces SIPs with smaller weights more efficiently. So far, the equal splitting approach with ξ = 0.5 in a collection event of two equal-ν SIPs has been used. In sensitivity tests, a random number for ξ is drawn in each collection event, either from a uniform distribution ξ ∈ [0, 1] or from a log-uniform distribution ξ ∈ [10 −10 , 10 0 ]. Enhancing the spread of ν values, more collection events occur in the algorithm, as p crit is larger when small-ν SIPs are involved. Once most SIPs were part of a collection event, the first option with ξ ∈ [0, 1] produces a distribution of ν values that is similar to the initial ν distribution of the ν draw -init technique and further equal weights combinations are unlikely to occur. Hence, the new version does not improve the simulation results, as the outcome for the ν draw -init and the standard ν const -init are similar (see the Supplement). Other variations produce smaller weights with ξ = 10 −10 rand() or ξ = 10 −10 rand() 2 , yet without any noticeable improvement in the simulation results (see the Supplement).
To complete the analysis for the Long kernel, the right column of Fig. 18 shows simulation results for ν random,lb and ν random,rs . In short, AON can cope with those initialisations and produces useful results.
As already noted in the AIM section, Hall simulations are not as challenging as Long simulations from a numerical point of view. As the collection of small droplets proceeds at a lower rate for the Hall kernel, disenabling multiple collections in the AON simulations does not deteriorate the results as much as in the Long simulations (see the Supplement). Besides this, simulations with the Hall kernel led to similar conclusions as for the Long simulations and are therefore not discussed in more detail.

Discussion
The presented box model simulations can be regarded as a first evaluation step of collection/aggregation algorithms in LCMs. The final goal is the evaluation in (multidimensional) applications of LCMs with full microphysics. In order to isolate the effect of collection, other microphysical processes like droplet formation and diffusional droplet growth have been switched off, and all box model simulations started with a prescribed SIP ensemble following a specific exponential distribution. In Sect. 4.1, the performance of the different algorithms is compared and we summarise the findings from Sect. 3. Section 4.2 discusses implications of our results and provides further insights.

Summarising comparison of the algorithms' performance
The initialisation techniques for the SIP population generation are mostly probabilistic and, by default, each simulation was performed for 50 different realisations. For RMA and AIM, we found the ensemble spread to be small; hence, a single realisation is as good as the ensemble mean. AON is inherently probabilistic, and we highlighted the substantial ensemble spread. Reasonable results are only obtained only by averaging over many realisations. One may argue that this precludes the usage of AON in real-world applications, as it is not feasible to run 50 realisations in each grid box of a 2-D/3-D model simulation. However, we are not that pessimistic. In such simulations, many grid boxes have similar atmospheric conditions and averaging will occur across such grid boxes. We made a similar experience in simulations of contrail cirrus, where we tested the N SIP sensitivity of the deposition/sublimation process (see Sect. 3.1 in Unterstrasser and Sölch, 2014). We found that very few SIPs per grid box sufficed to reach convergence, even though the few SIPs in a single grid box could not realistically represent a smooth DSD, and reasonable DSDs could only be obtained by averaging over several grid boxes. RMA simulations for the Long kernel require around a factor of 1000 smaller time steps than the respective AON and AIM simulations ( t = 0.01 s versus 10 s). Using the Long kernel, rapid collection growth occurs in a certain size range. In RMA, this puts a strong constraint on the time step (see Eq. 24). In AON, the inclusion of multiple collections allows simulating the rapid growth without the need to reduce the time step. Without multiple collections, the AON requirements on t would be similar to RMA. AIM seems to be unaffected by rapid collections resulting in negative weighting factors as observed in RMA. The reason for this might originate from AIM's typical behaviour. If large, and therefore most effectively collecting, SIPs are produced at all, they will exhibit very small weighting factors. This property reduces the potentially hazardous impact of multiple collections at larger time steps in the tested setups. However, this might not be a universal feature of AIM.
If the initial SIP ensemble is created with the singleSIPinit, 50 to 100 SIPs are needed for convergence in any of the three algorithms. This value is similar to the number of bins used in traditional algorithms for spectral-bin models (Bott, 1998;Wang et al., 2007).
For a given N SIP , the number of floating point operations performed in one time iteration is roughly similar for all three algorithms but depends ultimately on details of the implementations. The RMA RedLim variant is, e.g. more demanding than its OTF counterpart. In AON, the generation of the random numbers needs a non-negligible share of the computing time.
The time complexity of all presented algorithms is O(N 2 SIP ), as computations are carried out for all pairwise combinations of SIPs. A linear sampling approach, as introduced by Shima et al. (2009), which processes only N SIP /2 SIP pairs, has complexity O(N SIP ) and can be applied in RMA or AON. However, more SIPs may be required to reach convergence, and in full microphysical models this may slow down the calculation of all other microphysical processes (which have usually linear time complexity).
All in all, the time step t, which controls the number of iterations, is the most critical parameter for the computing time.

Implications and further insights
In this section, we provide further insight and discuss the implications from the box model tests. Since our results have been gained with typical assumptions for warm clouds, we discuss their representativeness for ice clouds.
The evaluation of different initialisation methods showed that the performance of the collection/aggregation approaches depends essentially on the way the SIPs are initialised, a problem which is inherently absent in spectralbin models. Their initialisation resembles the singleSIP technique used here; i.e. the number concentration (the weighting factor) within a bin (for a certain mass range represented by one SIP) is directly prescribed. However, LCMs exhibit a larger variety of how an initial droplet spectrum can be translated into the SIP space. The study showed that the sin-gleSIP is advantageous for the correct representation of the collisional growth, since they initialise large SIPs with small weighting factors, which are responsible for the strongest radius growth. On the other hand, the ν const initialisation technique, in which all SIPs have the same weighting factor initially, as it is done in many current (multidimensional) applications of LCMs, significantly impedes the correct representation of collisional growth.
In this idealised study, we were able to control (to a certain extent) the representation of droplet spectra by various initialisation methods. In more-dimensional simulations with full microphysics, however, this is not straightforward nor has it been intended. So far, convergence tests in realworld LCM applications simply included variations of the SIP number and have not focused on more detailed characteristics of the SIP ensemble (i.e. the properties that have been discussed in Fig. 1). Droplet formation and diffusional droplet growth, which usually create the spectrum from which collisions are triggered, should be implemented such that good SIP ensembles are generated or evolve before collection becomes important. Here, "good" refers to a SIP ensemble for which the collection/aggregation algorithm performs well. For instance, the basic idea of the ν random initialisation technique (weighting factors are uniformly distributed in log(ν)) might also improve multidimensional simulations.
Generally, the performance of the algorithms is better when the SIP ensemble features a broad range of weighting factors. One viable option to achieve this is the introduction of a SIP splitting technique (Unterstrasser and Sölch, 2014). How this may improve the performance of the collection/aggregation algorithms is outlined next.
Mass fractions represented by individual SIPs, χ i , are analysed. χ i is defined as χ i /M; i.e. the total droplet mass in a SIP χ i is normalised by the total mass within the grid box M. Figure 20 shows the initial χ i values of all SIPs as a function of their initial radius r i . Results are shown for AIM and AON with the singleSIP-init method and two bin resolutions κ = 20 and 100. This corresponds to 99 and 493 SIPs for the specific realisation depicted here. The two rows show the same data using a logarithmic (top row) or linear y scale (bottom). The log-scale version highlights that χ i values spread over many orders of magnitudes. Mainly, the parameter ν critmin controls the minimum value of χ i . The Figure 20. Normalised SIP mass χ i as a function of the initial SIP radius r i . χ i is defined as = χ i /M = (ν i µ i )/M; i.e. the total droplet mass in a SIP is normalised by the total mass within the grid box. χ init denotes χ i of the initial SIP ensemble. χ max denotes the maximum χ i value each SIP attains over the course of a simulation. The left and right panels show AIM/AON simulations with κ = 20 or 100 (see legend). Both algorithms use the singleSIP-init and t = 10 s. The plots show results from a single realisation.
heaviest SIPs carry initially up to 6.5% (κ = 20) or 1.2 % (κ = 100) of the total mass M (see bottom row). Clearly, the values of the κ = 20 simulation are larger, as the total mass is distributed over fewer SIPs. For each SIP, χ i is tracked over time and the maximum value, χ i,max(t) , is recorded (red and brown curves in the graphs). Characteristically of AIM, only the largest SIPs grow substantially and collect mass from other SIPs. Hence, only χ i of those SIPs increases. In addition, this also illustrates that the χ i values of the smallest SIPs are so small that all those SIPs can be merged into a single SIP without changing the AIM outcome (see r critmin variation before). Using the fine resolution (κ = 100), heavy SIPs (i.e. those with largest χ i ) carry up to 10 % of the total grid box mass at some point in time. In the κ = 20 simulation, this ratio can be higher than 50 %, meaning that one specific SIP accumulated more than 50 % of the total grid box mass at some time. Hence, the grid box mass is distributed fairly unevenly over the SIP ensemble. Astonishingly, this has no effect on the performance of AIM as the predicted λ k,SIP values for both AIM simulations are basically identical (see middle column of Fig. 12). In AON simulations, we similarly find that the grid box mass is unevenly distributed over the SIP ensemble. Different from AIM, many initially small SIPs and a few initially medium-sized SIPs also carry a relevant portion of the grid box mass at some time. The algorithms may converge better if those heavy SIPs are split into several SIPs during the simulation.
In all simulations so far, the mean radius of the initial DSD was 9.3 µm. Then, the abundance of droplets larger than around 10 µm drops strongly, which poses a challenge to representing this part of the droplet spectrum in SIP space. In a sensitivity test, we start with more "mature" DSDs. The simulations are initialised with the reference solution from Wang et al. (2007) after t init = 10, 20 or 30 min (cf. red, green and blue solid curves in previous plots of mass density distributions) using the singleSIP-init. Figure 21 shows λ 0 and λ 2 of the DSD for AIM and AON for t init = 20 min and the default t init = 0 min (cases t init = 10 and 30 min are shown in the Supplement). The initial DSD is broader for a later initialisation time, and hence more SIPs are initialised for a given κ (see Table 3 for the resulting N SIP values). This implies in particular that the spectrum above 10-20 µm is sampled with more SIPs. For both algorithms, the simulation results are close to the reference solution. Compared to the default t init = 0 case, a much weaker κ dependence of the AONpredicted droplet number is apparent, and the AIM results do not lag behind. Even though this sensitivity test cannot be repeated for other init methods (as they require an analytical description of the initial DSD), the singleSIP-init simulations already indicate that the SIP initialisation is not as crucial when a later initialisation time is chosen and that our default setup with a narrow DSD may overrate the importance of the SIP initialisation. What are the implications of this for simulations with full microphysics? Clearly, the t init = 20 and 30 min case oversimplify the problem, as such DSDs cannot be produced by diffusional growth only. The t init = 10 min DSD, on the other hand, is still close to the t init = 0 min DSD and may be produced by diffusional growth. RMA simulations with non-zero t init again show spurious oscillations and fail to predict the higher moments correctly (see the Supplement).
In multidimensional models, collection/aggregation might be further influenced by the movement of SIPs due to sedimentation or flow dynamics. For instance, sedimentation removes the largest SIPs with the potentially smallest weighting factors, while turbulent mixing may add SIPs with their initial weighting factor into matured grid boxes, where collection has already decreased the weighting factors of the older SIPs. Indeed, the additional variability in moredimensional simulations might compensate for the missing variability in the weighting factors usually present in simulations using the ν const initialisation technique.
It is not clear which findings of our evaluation efforts are the most relevant aspects that control the performance of collection/aggregation algorithms in more complex LCM simulations. Nevertheless, the idealised box simulations are an essential prerequisite towards more comprehensive evaluations, as they disclosed the potential importance of the SIP initialisation (an aspect that is inherently absent in spectralbin models). All in all, we can state that the behaviour of Lagrangian collection algorithms in more complex simulations demands further investigation. Nevertheless, we have already learned a lot from the box model simulations. A summary will be given in the concluding section.
Besides the academic Golovin kernel, our simulations used the hydrodynamic kernel with collection efficiencies that are usually employed for warm clouds (Long and Hall). We found that Hall simulations are not as challenging as Long simulations from a numerical point of view. For ice clouds, usually a constant aggregation efficiency E a (the analogon to collection efficiency E c ) is chosen, partly due to the lack of better estimates (Connolly et al., 2012). AON simulations with E a = 0.2 indicated that using a constant efficiency makes the computational problem less challenging; e.g. we find a smaller sensitivity to κ compared to the Long simulations shown in Fig. 18 (see the Supplement). Hence, the presented algorithms can be equally employed for aggregation. Certainly, the assumption of spherical particles used here is overly simplistic for ice clouds, in particular if aggregates form. However, including mass-area relationships (e.g. Mitchell, 1996;Schmitt and Heymsfield, 2010) in the kernel expression and using parametrisations of ice crystal fall speed (e.g. Heymsfield and Westbrook, 2010) should not change the nature of the problem.

Conclusions
In the recent past, Lagrangian cloud models (LCMs), which use a large number of simulation particles (SIPs, also called super droplets in the literature) to represent a cloud, have been developed and become more and more popular. Each SIP represents a certain number of real droplets; this number is termed the weighting factor (or multiplicity) of a SIP. In particular, the collision process leading to coalescence of cloud droplets or aggregation of ice crystals is implemented differently in the various models described in the literature. The present study evaluates the performance of three different collection algorithms in a box model framework. All microphysical processes except collection/aggregation are neglected, and an exponential droplet mass distribution is used for initialisation. The box model simulation results are compared to analytical solutions (in the case of the Golovin kernel) and to a reference solution obtained from a spectral-bin model approach by Wang et al. (2007) (in the case of the Long or Hall kernels).
LCMs exhibit a large variety of how an initial droplet spectrum can be translated into the SIP space and various initialisation methods are thoroughly explained. The performance of the algorithms depends crucially on details of the SIP initialisation and various characteristics of the initialised SIP ensemble (an issue that is inherently absent in spectral-bin models and has not been paid much attention in previous LCM studies).
The remapping algorithm (RMA, based on the ideas of Andrejczuk et al., 2010) produces perfect solutions in simulations with the Golovin kernel; however, it shows a poor performance when we switch to the Long kernel. Spurious oscillations occur in the intermediate radius range [100 µm, 200 µm] which impedes the development of a realistic rain mode. Only for unfeasibly small time steps of 0.01 s, the simulation results get close to the reference solution. The evaluation exercises presented in Andrejczuk et al. (2010) were not suited to reveal these shortcomings or downplay their severity. Based on our extensive tests, we cannot recommend the algorithm at its present state for further LCM applications, unless some mechanism to eliminate those oscillations is developed.
The average impact algorithm (AIM, based on the ideas of Riechelmann et al., 2012) can produce very good results; however, it appears to be inflexible inasmuch as only the initially largest SIPs are allowed to grow in radius space. The performance depends on details of the SIP initialisation much more than, e.g. on the time step or the SIP number.
The probabilistic all-or-nothing algorithm (AON, based on the ideas of Shima et al., 2009;Sölch and Kärcher, 2010) yields the best results and is the only algorithm that can cope with all tested kernels. Unlike AIM, in AON it is not predetermined which SIPs will eventually contribute to the large droplet mode. By design, any SIP can become significant at some point, and the algorithm can cope with SIP initialisations that guarantee a broad spectrum of weighting factors. If an equal weight initialisation is used, a tremendous number of SIPs are necessary for AON convergence, as reported by Shima et al. (2009).
Many current (multidimensional) applications of LCMs use such SIP ensembles with a narrow spectrum of weighting factors causing a poor performance of the collection/aggregation algorithms. This should be clearly avoided in order to have collection/aggregation algorithms to work properly and/or efficiently. The time step and the bin resolution κ (used in the singleSIP-init) have values similar to those used in traditional spectral-bin models, and hence the computational efforts of both approaches for the collection/aggregation treatment are in the same range. The presented box model simulations are a first step towards a rigourous evaluation of collection/aggregation algorithms in more complex LCM applications (multidimensional domain, full microphysics).
Code availability. The programming language IDL was used to perform the simulations and produce the plots. The source code can be obtained from the first author. Pseudo-code of the algorithms is given in the text.
The Supplement related to this article is available online at doi:10.5194/gmd-10-1521-2017-supplement.