A new subgrid-scale representation of hydrometeor fields using a multivariate PDF

The subgrid-scale representation of hydrometeor fields is important for calculating microphysical process rates. In order to represent subgrid-scale variability, the Cloud Layers Unified By Binormals (CLUBB) parameterization uses a multivariate probability density function (PDF). In addition to vertical velocity, temperature, and moisture fields, the PDF includes hydrometeor fields. Previously, hydrometeor fields were assumed to follow a multivariate single lognormal distribution. Now, in order to better represent the distribution of hydrometeors, two new multivariate PDFs are formulated and introduced. The new PDFs represent hydrometeors using either a delta-lognormal or a delta-double-lognormal shape. The two new PDF distributions, plus the previous single lognormal shape, are compared to histograms of data taken from largeeddy simulations (LESs) of a precipitating cumulus case, a drizzling stratocumulus case, and a deep convective case. Finally, the warm microphysical process rates produced by the different hydrometeor PDFs are compared to the same process rates produced by the LES.


Introduction
The atmospheric portion of the hydrological cycle depends on the formation and dissipation of precipitation.In a numerical model, precipitation processes are represented by the microphysics process rates.These process rates are highly dependent on the values of hydrometeor fields at any place and time.Hydrometeors (such as rain water mixing ratio) can vary significantly on spatial scales smaller than the size of a numerical model grid box (Boutle et al., 2014;Lebsock et al., 2013).This means that a good representation of subgrid-scale variability is important for the parameterization of microphysical process rates.
Subgrid-scale variability (but not spatial organization) can be accounted for through use of a probability density function (PDF).PDFs have been used in atmospheric modeling to account for subgrid variability in moisture and temperature (e.g., Mellor, 1977;Sommeria and Deardorff, 1977;Tompkins, 2002;Naumann et al., 2013) in order to calculate such fields as cloud fraction and mean (liquid) cloud mixing ratio, and have been extended to vertical velocity in order to calculate fields such as liquid water flux (Lewellen and Yoh, 1993;Lappen and Randall, 2001;Larson et al., 2002;Bogenschutz et al., 2010;Firl and Randall, 2015).PDFs have been used in microphysics to account for subgrid variability in cloud water (Zhang et al., 2002;Morrison and Gettelman, 2008) and in warm hydrometeor fields (Larson andGriffin, 2006, 2013;Cheng and Xu, 2009;Kogan andMechem, 2014, 2016) in order to calculate warm microphysics process rates.They also have been used to represent cloud ice (Kärcher and Burkhardt, 2008).
Regarding the PDF's functional form, generality is highly desired.For instance, we would like the PDF to be capable of representing interactions among species, such as accretion (collection) of cloud droplets by rain drops.In addition, the PDF should be able to represent a variety of cloud types, such as cumulus and stratocumulus.Generality in the PDF's functional form is important because it facilitates the formulation of unified cloud parameterizations (e.g., Lappen and Randall, 2001;Neggers et al., 2009;Sušelj et al., 2013;Bogenschutz and Krueger, 2013;Guo et al., 2015;Cheng and Xu, 2015;Thayer-Calder et al., 2015).
Cloud Layers Unified By Binormals (CLUBB) is a singlecolumn model that uses a multivariate PDF to account for the subgrid-scale variability of model fields (Golaz et al., 2002a, b;Larson and Golaz, 2005).The original PDF used by CLUBB consisted of only vertical velocity, w, total water mixing ratio (vapor + liquid cloud), r t , and liquid water potential temperature, θ l .The PDF is a weighted mixture, or sum, of two multivariate normal functions.Each one of these multivariate normal functions is known as a PDF component.Although a normal distribution is unskewed, the twocomponent shape makes it possible to include skewness in model fields.Larson and Griffin (2013) extended CLUBB's PDF to account for subgrid variability in rain water mixing ratio, r r , and rain drop concentration (per unit mass), N r .Each of these hydrometeor species was assumed to follow a single lognormal (SL) distribution on the subgrid domain.This treatment worked well for calculating microphysics process rates in a drizzling stratocumulus case (Griffin and Larson, 2013).Subsequently, CLUBB's PDF was extended to other hydrometeor species involving ice, snow, and graupel.
However, the single lognormal treatment of hydrometeors is less successful when it is applied to a partly cloudy, precipitating case.The problem is that the single lognormal assumes that a hydrometeor is found (that is, has a value greater than 0) at every point on the subgrid domain.This is not realistic in a partly cloudy regime, such as precipitating shallow cumulus, which has non-zero precipitation over only a small fraction of the domain.
Consider an example in which rain covers 10 % of the grid level.Then the in-precipitation mean of r r is 10 times greater than the grid-mean value.This can cause problems when microphysics process rates are calculated using the SL.The accretion rate of r r is proportional to the value of r r inside cloud.In this example, the SL, which distributes the lognormal around the grid mean, would underpredict accretion rate because it causes r r to be too small in cloud.Likewise, evaporation rate is proportional to the value of r r outside cloud.The SL would overpredict evaporation rate because it spreads r r throughout the domain, including the clear portion.
The solution to this problem is to account for the nonprecipitating region of the subgrid domain.This is done by representing the non-precipitating region of the domain with a delta function at a value of the hydrometeor of zero.The in-precipitation portion of the subgrid domain can still be handled by using a single lognormal distribution to represent subgrid variability in the hydrometeor species.The resulting distribution is called a delta-lognormal (DL).In the above example with 10 % rain fraction, the (in-precipitation) lognormal from the DL PDF would be distributed around the in-precipitation mean, as desired, rather than around the grid mean, which is a factor of 10 smaller.
Further improvements in accuracy can be achieved with relatively minor modifications to the PDF.As previously mentioned, CLUBB's PDF contains two components.Each of these components can be easily subdivided into an inprecipitation sub-component and an outside-precipitation sub-component.The result is a delta-lognormal representation of the hydrometeor field in each PDF component.Both delta functions are at zero and represent the region outside of precipitation, but the in-precipitation hydrometeor values are distributed as two lognormals that may have different means and/or variances.When the two lognormals differ in some way, the resulting distribution is called a delta-doublelognormal (DDL).Figure 1 illustrates the SL, DL, and DDL hydrometeor PDF shapes.
The main purpose of this paper is to present the formulation of an updated multivariate PDF that extends CLUBB's traditional PDF to include the DL and DDL hydrometeor PDF shapes.Additionally, a new method is derived to divide the grid-box mean and variance of a hydrometeor species into PDF component means and standard deviations.A secondary purpose of this paper is to present a preliminary comparison of the new PDF shapes with PDF output by largeeddy simulations (LESs).The SL, DL, and DDL hydrometeor PDF shapes are compared to histograms of hydrometeor data taken from precipitating LESs.Additionally, microphysics process rates are calculated using each of the idealized PDF shapes and compared to microphysics process rates taken from the LES.
The remainder of the paper is organized as follows.Section 2 gives a detailed description of the new PDF.Section 3 discusses the PDF parameters and includes the derivation of a new method to divide the grid-box mean and variance into PDF component means and standard deviations for a hydrometeor species.Section 4 describes the LES setup and the test cases, as well as the driving of CLUBB's PDF for the tests.Section 5 presents a comparison of hydrometeors between the LES and the SL, DL, and DDL PDF shapes.The comparison includes plots of PDFs, Kolmogorov-Smirnov and Cramer-von Mises scores, and microphysics process rates.Section 6 contains all conclusions.

Description of the multivariate PDF
We now describe how the multivariate PDF used by CLUBB is modified to improve the representation of hydrometeors.Perhaps the most important modification is the introduction of precipitation fraction, f p , to the PDF.Precipitation fraction is defined as the fraction of the subgrid domain that contains any kind of precipitation (where any hydrometeor species has a positive value).In order to account for any precipitation-less region in the subgrid domain, the PDF is modified to add a delta function at a value of zero for all hydrometeor species.Each PDF component contains its own precipitation fraction.Expressed generally for a PDF of n components, the overall precipitation fraction is related to the component precipitation fractions by

DL DDL SL
Figure 1.A schematic of the single lognormal (SL), delta-lognormal (DL), and delta-double-lognormal (DDL) hydrometeor PDF shapes.The SL PDF shape is precipitating over the entire subgrid domain, whereas the DL and DDL shapes are not.In all three plots of the PDFs (where each PDF is a function of a hydrometeor species, such as r r ), the weighted PDF from each PDF component is shown (black dashes and black dots).The sum of the two are the SL (solid magenta), the DL (solid green), and the DDL (solid blue).The SL does not contain a delta at 0, and the mean and variance of each PDF component are the same.Each component of the DL has a delta at 0 (upward pointing black arrows on the y axis).The sum of the two component deltas forms the DL's delta at 0 (upward pointing green arrow).The mean and variance of each DL PDF component are the same within precipitation.Each component of the DDL also has a delta at 0 (upward pointing black arrows).The sum of the two component deltas forms the DDL's delta at 0 (upward pointing blue arrow).The mean and/or variance differ between DDL PDF components within precipitation.
where f p(i) denotes precipitation fraction in the ith PDF component, and where 0 ≤ f p(i) ≤ 1 for all f p(i) .Additionally where ξ (i) is the relative weight, or mixture fraction, of the ith PDF component, and where 0 < ξ (i) < 1 for all ξ (i) .A PDF with more than one component requires that each PDF component have a mixture fraction.Before writing the form of the multi-component PDF, we digress to discuss a special case, the cloud droplet concentration (per unit mass), N c .In Larson and Griffin (2013), N c was introduced to the PDF and was assumed to follow a single lognormal distribution.This assumption for N c means that when any cloud is found at a grid level, N c > 0 at every point on the subgrid domain.This is unphysical in a partly cloudy situation, because cloud droplets would be found at points where cloud water is not found.Additionally, the single lognormal treatment of N c can cause problems with the microphysics.The grid-level mean of N c , denoted N c (for the remainder of this paper, an overbar denotes a grid-level mean and a prime denotes a turbulent value), is handed to the PDF by the model, and this mean value includes clear air in a partly cloudy situation.This results in a value of N c that is much smaller than the in-cloud values of N c .Since the single lognormal in N c is distributed around N c , N c is much too small in cloud for cases with small cloud fraction, leading to an excessive autoconversion (raindrop formation) rate.
In order to distribute N c where (and only where) cloud water mixing ratio, r c , is found on the subgrid domain, it cannot use the same method as the other hydrometeors.Hydrometeors such as r r can be found outside clouds where r c is not found, or alternatively hydrometeors might be absent inside cloud where r c is found.Instead the PDF is modified so that a new variable, N cn , replaces N c in the PDF.The variable N cn is a mathematical construct that can be viewed as an extended cloud droplet concentration or even as a simplified, conservative cloud condensation nuclei concentration.It is distributed as a single lognormal over the subgrid domain.At points where cloud water is found, N c is set equal to N cn .Otherwise, N c is set to 0 at points where no cloud water is found (see Eq. 4 below).The value of N cn is approximately the in-cloud mean of N c , and in special cases, is exactly the in-cloud mean of N c .Please see Appendix B for a more detailed explanation.
The PDF includes all the hydrometeor species found in the chosen microphysics scheme with the exception of r c , which is calculated from other variables in the PDF through a saturation adjustment, and N c , which is described above.In addition to r r and N r , a microphysics scheme may include hydrometeor species such as ice mixing ratio, r i , ice crystal concentration (per unit mass), N i , snow mixing ratio, r s , snowflake concentration (per unit mass), N s , graupel mixing ratio, r g , and graupel concentration (per unit mass), N g .The vector containing all the hydrometeor species included in the PDF will be denoted h.The full PDF can be written as P (w, r t , θ l , N cn , h).In order to calculate quantities that depend on saturation, such as r c and cloud fraction, a PDF transformation is required.The PDF transformation is a change of coordinates.The multivariate PDF undergoes translation, stretching, and rotation of the axes (Larson et al., 2005;Mellor, 1977).Within each PDF component, a separate PDF transformation takes place.The ith component PDF, P (i) (w, r t , θ l , N cn , h), is transformed to P (i) (w, χ , η, N cn , h), where χ is an "extended" liquid water mixing ratio that, when the air is supersaturated, has a positive value and furthermore is equal to r c .When the air is subsaturated, χ has a negative value.The variable η is orthogonal to χ.The variables r c and N c can now be written as (3) where H (x) is the Heaviside step function.
The general form of a PDF with n components and D variables (whether D includes all the variables in the PDF or any subset of those variables in a multivariate marginal PDF) can be written as (5) Of the D variables listed, the first J variables are normally distributed in each PDF component (i.e., w, r t , and θ l , or w, χ and η), the next K variables are lognormally distributed (i.e., N cn ), and the last variables are the hydrometeor species, such that D = J + K + .The ith component of the PDF, P (i) (x 1 , x 2 , . .., x D ), accounts for both the precipitating and precipitation-less regions, and is given by The subscripts in the ith component, P (J,K)(i) or P (J,K+ )(i) , denote the number of normal variates, J , and the number of lognormal variates, K or K + , used in Eq. ( 7).Each original PDF component is split into precipitating and precipitation-less sub-components.The component means, variances, and correlations for variables x 1 . ..xJ +K do not differ between the precipitating and precipitation-less parts of Eq. ( 6).This greatly simplifies the procedure for parameterizing the component means and variances, given the grid-level means and variances.Additionally, keeping the component means and variances the same between the inprecipitation and outside-precipitation parts of Eq. ( 6) allows the PDF to be reduced back to prior versions.For instance, the multivariate PDF in Eqs. ( 5) and ( 6) reduces to the version given in Larson and Griffin (2013) when all f p(i) = 1 and various PDF parameters are chosen appropriately.Furthermore, when microphysics is not used in a simulation, hydrometeors are not found in the PDF.In this scenario, the PDF reduces to the original version found in Golaz et al. (2002a).
The PDF does not contain a fraction for each hydrometeor species or type, but rather one precipitation fraction.Each PDF component is split into two sub-components (inprecipitation and outside-precipitation).Including a fraction for each hydrometeor type (rain, snow, etc.) would cause the number of sub-components to grow exponentially with the number of fractions.Using n f hydrometeor fractions increases the number of sub-components to 2 n f in each PDF component.This would make setting the PDF parameters associated with each sub-component increasingly difficult.
The multivariate PDF can be adjusted to account for a situation when a variable has a constant value in a PDF (sub-)component.In that situation, the variable can be reduced to a delta function at the (sub-)component mean value.
A good example of this would be setting N cn to a constant value in order to use a constant in-cloud value of cloud droplet concentration.This is also especially useful when dealing with more than one hydrometeor.If one hydrometeor species is found at a grid level, but another hydrometeor species is not found at that level, the hydrometeor that is not found can reduce to a delta function at zero in the precipitating sub-component of Eq. ( 6).
The general form of the m-variate hybrid normal/lognormal distribution in the ith PDF component, P (j,k)(i) (x 1 , x 2 , . .., x m ), which is found in each subcomponent of Eq. ( 6), consists of j normal variates and k lognormal variates, where m = j + k.The first j variables are normally distributed and the remaining k variables are lognormally distributed.The multivariate normal/lognormal PDF is given by (Fletcher and Zupanski, 2006) Both x and µ (i) are m × 1 vectors, where x is a vector of the variables (in normal-space) in the PDF and µ (i) is a vector of the (normal-space) PDF sub-component means.The notation T denotes the transpose of the vector.The m × m (normal-space) covariance matrix is denoted (i) and its determinant is denoted (i) (Fletcher and Zupanski, 2006).The advantage of a single multivariate PDF, as opposed to a collection of individual marginal PDFs, is that the multivariate PDF accounts for correlations among the variables in the PDF.This is advantageous when calculating quantities such as rain water accretion rate and rain water evaporation rate.
Although the multivariate PDF allows for the calculation or specification of the (horizontal) correlation between any two variables at the same grid level, the PDF does not contain information about vertical correlations.Vertical correlations can arise in calculations of radiative transfer, diagnosed hydrometeor sedimentation, or other processes that involve the correlation of a variable with itself at different vertical levels.Such processes are excluded from this study, and hence information about vertical correlations is not needed here.For one possible method to parameterize vertical correlations, see Larson and Schanen (2013).
When variables are integrated out of the full multivariate PDF, the result is a multivariate marginal PDF consisting of fewer variables.When all variables but one are integrated out of the PDF, the result is a univariate marginal or individual marginal PDF.For any hydrometeor species, h, found in the full multivariate PDF in Eq. ( 5), the univariate marginal distribution is where P L(i) (h) is a lognormal distribution in the ith PDF component, which is given by The in-precipitation mean of h in the ith PDF component is µ h(i) .This is the mean of the ith lognormal of h.However, µ h(i) , as in Eq. ( 9), is the normal-space component mean of h.It is the in-precipitation mean of ln h in the ith PDF component and is given by where σ h(i) is the in-precipitation standard deviation of h in the ith PDF component.The quantity σ h(i) is the standard deviation of the ith lognormal of h.The normal-space component standard deviation of h is σ h(i) , as found in Eq. ( 9).It is the in-precipitation standard deviation of ln h in the ith PDF component and is given by The variables that are distributed marginally as binormals use similar notation.For example, µ w(i) is the mean of w in the ith PDF component, or the mean of the ith normal.Likewise, σ w(i) is the standard deviation of w in the ith PDF component, or the standard deviation of the ith normal.

PDF parameters
This paper will use the phrase "PDF parameters" to refer to the PDF component means, standard deviations, and correlations involving variables in the PDF, as well as the mixture fractions and the PDF component precipitation fractions.The PDF parameters are calculated from various grid-mean input variables.In this paper, the component means, standard deviations, and correlations involving w, r t , and θ l , and the mixture fractions, ξ (1) and ξ (2) , are calculated according to the Analytic Double Gaussian 2 (ADG2) PDF, as described in Sect.(e) of the Appendix of Larson et al. (2002).ADG2 requires the following quantities as input: the overall (gridbox) mean, variance, and third-order central moment of w (w, w 2 , and w 3 , respectively), the overall mean and variance of r t (r t and r t 2 , respectively), and the overall mean and variance of θ l (θ l and θ l 2 , respectively).ADG2 preserves the values of these input variables, meaning that the PDF parameters can be used to successfully reconstruct the values of the input variables.Additionally, ADG2 requires and preserves the overall covariance of w and r t (w r t ), the overall covariance of w and θ l (w θ l ), and the overall covariance of r t and θ l (r t θ l ).All of the aforementioned quantities are prognosed or diagnosed in CLUBB and are not the subject of this paper.
The individual marginal distribution for N cn is specified to be a single lognormal over the entire subgrid domain.This requires that both PDF component means equal the overall (grid-box) mean (µ ).When no hydrometeor species are found at a grid level (h = 0), f p = f p(1) = f p(2) = 0. Otherwise, if any hydrometeor species in h is found at a grid level (has a value greater than 0), f p tol ≤ f p ≤ 1, where f p tol is the minimum value allowed for precipitation fraction when hydrometeors are present.We now describe how CLUBB parameterizes f p(1) and f p(2) , given f p .First, we note that A tunable parameter, υ * (where the * subscript denotes a tunable or adjustable parameter), is introduced and is defined as the ratio of ξ (1) f p(1) to f p , where 0 ≤ υ * ≤ 1.The precipitation fraction of PDF component 1 is solved by The precipitation fraction of PDF component 2 can now be solved by When f p(1) calculated by Eq. ( 13) is small enough to force f p(2) calculated by Eq. ( 14) to be limited at 1, the value of f p(1) is recalculated (with f p(2) = 1) and is increased enough to satisfy Eq. ( 12).
B. M. Griffin and V. E. Larson: A new subgrid-scale representation of hydrometeor fields

Hydrometeor PDF parameters
A mean-and-variance-preserving method is used to calculate the in-precipitation means of the hydrometeor field in the two PDF components, µ h(1) and µ h(2) , and the in-precipitation standard deviations of the hydrometeor field in the two PDF components, σ h(1) and σ h(2) .The fields that need to be provided as inputs are the overall (grid-box) mean of the hydrometeor, h, the overall variance of the hydrometeor, h 2 , the mixture fraction in each PDF component, ξ (1) and ξ (2) , the overall precipitation fraction, f p , and the precipitation fraction in each PDF component, f p(1) and f p(2) .Given these inputs, the in-precipitation mean of the hydrometeor, h| ip , can be calculated by and the in-precipitation variance of the hydrometeor, h| ip 2 , can be calculated by The grid-level mean value of any function that is written in terms of variables involved in the PDF can be found by integrating over the product of that function and the PDF.For example, After integrating, the equation for h expressed in terms of PDF parameters is Likewise, the equation for h 2 expressed in terms of PDF parameters is When the hydrometeor is not found at a grid level, h = h 2 = 0 and the component means and standard deviations of the hydrometeor also have a value of 0. When the hydrometeor is found at a grid level, h > 0. Precipitation may be found in only PDF component 1, only PDF component 2, or in both PDF components.When precipitation is found in only PDF component 1, µ h(2) = σ h(2) = 0 and µ h(1) and σ h(1) can easily be solved by Eqs. ( 18) and ( 19).Likewise, when precipitation is found in only PDF component 2, µ h(1) = σ h(1) = 0 and µ h(2) and σ h(2) can easily be solved by the same equation set.
When there is precipitation found in both PDF components, further information is required to solve for the two component means and the two component standard deviations.The variable R is introduced such that . (20) In order to allow the ratio of σ 2 h(1) to µ 2 h(1) to vary, the pa- where , which decreases.Combining Eqs. ( 19), (20), and ( 21), the equation for h 2 can be rewritten as Both the variance of each PDF component and the spread between the means of each PDF component contribute to the in-precipitation variance of the hydrometeor ( h| ip 2 ).
At one extreme, the standard deviation of each component could be set to 0 and the in-precipitation variance could be accounted for by spreading the PDF component (inprecipitation) means far apart.The value of R in this scenario would be its minimum possible value, which is 0. At the other extreme, the means of each component could be set equal to each other and the in-precipitation variance could be accounted for entirely by the PDF component (in-precipitation) standard deviations.The value of R in this scenario would be its maximum possible value, which is R max .
In order to calculate the value of R max , set µ h(1) = µ h(2) = h| ip and R = R max .Eq. ( 22) becomes When Eq. ( 16) is substituted into Eq.( 23), R max is solved for and the equation is Geosci.Model Dev In order to calculate the value of R, a parameter is used to prescribe the ratio of R to its maximum value, R max .The prescribed parameter is denoted o * , where and where 0 ≤ o * ≤ 1.Both R and R max are known functions of the inputs and tunable parameters.When o * = 0, the standard deviation of each PDF component is 0, and µ h( 1) The two remaining unknowns, µ h(1) and µ h(2) , can be solved by a set of two equations, Eq. ( 18) for h and Eq. ( 26) for h 2 .All other quantities in the equation set are known quantities.To find the solution, Eq. ( 18) is rewritten to isolate µ h(2) such that The above equation is substituted into Eq.( 26).The resulting equation is rewritten in the form so the solution to the quadratic equation for µ h(1) is where The value of Q a is always positive and the value of Q b is always negative.The value of Q c can be positive, negative, or zero.Since 1 negative and h 2 is always positive, the sign of Q c depends on which term is greater in magnitude.When h 2 is greater, the sign of Q c is negative.This means that −4Q a Q c is positive, which in turn means that If the subtraction option of the ± were to be chosen, the value of µ h(1) would be negative in this scenario.At first glance, it might appear natural to always choose the addition option.However, this set of equations was derived with the condition that µ h(1) equals µ h(2) when o * = 1.When ζ * ≥ 0, this happens when the addition option is chosen, but not when the subtraction option is chosen.However, when ζ * < 0, this happens when the subtraction option is chosen, but not when the addition option is chosen.So, the equation for µ h(1) becomes , when ζ * ≥ 0; and The value of µ h(2) can now be found using Eq. ( 27).After µ h(1) and µ h(2) have been solved, σ h(1) and σ h(2) can be solved by plugging Eq. ( 25) back into Eqs.( 21) and ( 20), respectively.
As the value of h| ip 2 / h| ip 2 increases and as the value of o * decreases (narrowing the in-precipitation standard deviations and increasing the spread between the in-precipitation means), one of the component means may become negative.This happens because there is a limit to the amount of inprecipitation variance that can be represented by this kind of distribution.In order to prevent out-of-bounds values of µ h(1) or µ h(2) , a lower limit is declared, called µ h | min , where µ h | min is a small, positive value that is typically set to be 2 orders of magnitude smaller than h| ip .The value of µ h(1) or µ h(2) will be limited from becoming any smaller (or negative) at this value.From there, the value of the other hydrometeor in-precipitation component mean is easy to calculate.Then, both values will be entered into the calculation of hydrometeor variance in Eq. ( 22), which will be rewritten to solve for R.Then, both the hydrometeor mean and hydrometeor variance will be preserved with a valid distribution.
When the value of ζ * ≥ 0, the value of µ h(1) tends to be larger than the value of µ h(2) .Likewise when the value of ζ * < 0, the value of µ h(2) tends to be larger than the value of µ h(1) .Since most cloud water and cloud fraction tends to be found in PDF component 1, it is appropriate and advantageous to have the larger in-precipitation component mean of the hydrometeor also found in PDF component 1.The recommended value of ζ * is a value greater than or equal to 0.
This method of closing the hydrometeor PDF parameter equation set produces a DDL hydrometeor PDF shape when , which result in a single lognormal within the precipitating portion of the subgrid domain.Furthermore, if, in addition to setting o * = 1 and ζ * = 0, one simply sets f p(1) = f p(2) = 1, then precipitation is found everywhere within the subgrid domain, producing the SL hydrometeor PDF shape.Hence, it is very easy to change between DDL, DL, and SL hydrometeor PDF shapes.
Additionally, it should be noted that there is only one o * and only one ζ * applied to all the hydrometeor species in h.
In limited testing, the value of the tunable parameter ζ * did not affect the results much for CLUBB's DDL PDF shape.The value of ζ * has been left at 0, effectively eliminating a tunable or adjustable parameter from the scheme.When ζ * = 0, the DDL shape approaches the DL shape as o * approaches 1.As o * approaches 0, the DDL shape approaches a double delta in precipitation (in addition to the delta at 0).Additionally, when 0 < o * < 1, the in-precipitation skewness of the hydrometeor field is influenced by υ * .As υ * approaches 0, the in-precipitation distribution becomes more highly (positively) skewed.In Gaussian space (see Sect. 5), the in-precipitation distribution is positively skewed.As υ * approaches 1, the in-precipitation distribution is less (positively) skewed.In Gaussian space, the in-precipitation distribution is negatively skewed.For the results presented in this paper for the DDL hydrometeor PDF shape, the remaining two tunable parameters have been set to the values o * = 0.5 and υ * = 0.55.

Model setup and testing
There is insufficient data from observations to calculate all the fields that need to be input into CLUBB's PDF.However, this data can be supplied easily and plentifully by a LES.In this paper, LES output of precipitating cases is simulated by the System for Atmospheric Modeling (SAM) (Khairoutdinov and Randall, 2003).SAM uses an anelastic equation set that predicts liquid water static energy, total water mixing ratio, vertical velocity, and both the south-north and west-east components of horizontal velocity.Additionally, it predicts hydrometeor fields as directed by the chosen microphysics scheme.A predictive 1.5-order subgrid-scale turbulent kinetic energy closure is used to compute the subgrid-scale fluxes (Deardorff, 1980).SAM uses a fixed, Cartesian spatial grid and a third-order Adams-Bashforth time-stepping scheme to advance the predictive equations of motion.It uses periodic boundary conditions and a rigid lid at the top of the domain.The second-order MPDATA (multidimensional positive definite advection transport algorithm) scheme is used to advect the predictive variables (Smolarkiewicz and Grabowski, 1990).
In order to assess the generality of the different hydrometeor PDF shapes for different cloud regimes, SAM was used to run three idealized test cases -a precipitating shallow cumulus case, a drizzling stratocumulus case, and a deep convective case.The use of cases from differing cloud regimes help avoid overfitting the parameterizations of PDF shape.The setup for the precipitating shallow cumulus test case was based on the Rain in Cumulus over the Ocean (RICO) LES intercomparison (van Zanten et al., 2011).The horizontal resolution was 100 m, and 256 grid boxes were used in each horizontal direction.The vertical resolution was a constant 40 m and 100 grid boxes were used in the vertical.The model top was located at 4000 m in altitude.The model time step was 1 s and the duration of the simulation was 72 h.A vertical profile of level-averaged statistics was output every minute and a three-dimensional snapshot of hydrometeor fields was output every hour.
The RICO simulation was run with SAM's implementation of the Khairoutdinov and Kogan (2000, hereafter KK) warm microphysics scheme.KK microphysics predicts both r r and N r .SAM's implementation of KK microphysics uses a saturation adjustment to diagnose r c , and cloud droplet concentration is set to a constant value (which is 70 cm −3 for RICO).
The setup for the drizzling stratocumulus test case was taken from the LES intercomparison based on research flight 2 (RF02) of the second Dynamics and Chemistry of Marine Stratocumulus (DYCOMS-II) field experiment (Ackerman et al., 2009).The horizontal resolution was 50 m and 128 grid boxes were used in each horizontal direction.An unevenly spaced vertical grid was used containing 96 grid boxes and covering a domain of depth 1459.3 m.The model time step was 0.5 s and the duration of the simulation was 6 hours.A vertical profile of level-averaged statistics was output every minute and a three-dimensional snapshot of hydrometeor fields was output every 30 min.The DYCOMS-II RF02 simulation was also run with SAM's implementation of KK microphysics and used a constant cloud droplet concentration of 55 cm −3 .
The setup for the deep convective test case was taken from the LES intercomparison based on the Large-Scale Biosphere-Atmosphere (LBA) experiment (Grabowski et al., 2006).The horizontal resolution was 1000 m, and 128 grid boxes were used in each horizontal direction.An unevenly spaced vertical grid was used, containing 128 grid boxes and covering a domain of depth 27 500 m.The model time step was 6 s and the duration of the simulation was 6 hours.
A vertical profile of level-averaged statistics was output every minute and a three-dimensional snapshot of hydrometeor fields was output every 15 min for the final 3.5 h of the simulation.
The LBA case requires a microphysics scheme that can account for ice-phase hydrometeor species.The LBA simulation was run with Morrison et al. (2005)  ing a saturation adjustment right before the microphysics is called and then allows microphysics to update the value of r c , which in turn is used to update the value r t .Cloud droplet concentration was set to a constant value of 100 cm −3 .CLUBB's hydrometeor PDF shapes will be compared to histograms of hydrometeors produced by SAM LES data.Our goal is to isolate errors in the PDF shape itself.In order to eliminate sources of error outside of the PDF shape and provide an "apples-to-apples" comparison of CLUBB's PDF shapes to SAM data, we drive CLUBB's PDF using SAM LES fields, rather than perform interactive CLUBB simulations.The following fields are taken from SAM's statistical profiles and are used as inputs to CLUBB's PDF: r t , θ l , w 2 , r t 2 , θ l 2 , w r t , w θ l , r t θ l , w 3 , f p , r r , r r 2 , N r , and N r 2 .For the LBA case, we add r i , r i 2 , N i , N i 2 , r s , r s 2 , N s , N s 2 , r g , r g 2 , N g , and N g 2 .Another input to CLUBB's PDF is w.The value of w from large-scale forcing is set according to case specifications in both SAM and CLUBB.CLUBB's PDF is generated at every SAM vertical level and at every output time of SAM level-averaged statistical profiles.
Additionally, covariances that involve at least one hydrometeor are added to the above list and are used to calculate the PDF component correlations of the same two variables.These covariances are r t r r , θ l r r , r t N r , θ l N r , and r r N r .Please see Appendix A for more details on the calculation of PDF component correlations.The values of the component correlations do not affect the individual marginal PDFs of the hydrometeors.They are included for the calculation of microphysics process rates (see Sect. 5.2).
Owing to differences between the KK and Morrison microphysics schemes in SAM, f p used by CLUBB's PDF is computed slightly differently depending on which microphysics scheme is used by SAM.The differences are due to the number of hydrometeor species involved in the microphysics, the thresholding found internally in the microphysics codes, and the variables that are output to statistics by SAM.KK microphysics contains only rain, and SAM's implementation of KK microphysics clips any value of r r (and with it N r ) below a threshold value in clear air.Therefore, it is simple to set f p to the fraction of the domain occupied by non-zero values of r r and N r .Morrison microphysics predicts rain, ice, snow, and graupel.For each of these species, SAM outputs a fraction.To provide an apples-to-apples comparison with CLUBB, f p is approximated as the greatest of these four fractions at any particular grid level.
Although f p is provided by the LES for this study, it can be diagnosed based on the cloud fraction using a method such as that of Morrison and Gettelman (2008).If the cloud fraction, in turn, is diagnosed based on the omnipresent prediction of means, variances, and other moments -as in higher-order moment parameterizations such as CLUBB -then the onset of partial cloudiness is well defined and indeterminacy about the time of cloud initiation is avoided.In contrast, parame-terizations that diagnose cloud fraction based on, e.g., cloud water mixing ratio, lack crucial information in cloudless grid boxes, as discussed in Tompkins (2002).The well-defined onset of CLUBB's cloud fraction is inherited by the precipitation fraction.

Results
We first evaluate the shape of the idealized PDFs directly against SAM LES output data.Histograms of SAM LES data are generated from the three-dimensional snapshots of hydrometeor fields.One histogram is generated at every vertical level for each hydrometeor field.A histogram of a SAM hydrometeor field is compared to the CLUBB marginal PDF of that hydrometeor field at the same vertical level and output time.The comparison is done with each of the SL, DL, and DDL PDF shapes.
Figure 2 compares marginal PDFs involving r r and N r for the RICO case at an altitude of 380 m and a time of 4200 min.For the plot of the PDF of r r in Fig. 2a, the delta function at r r = 0 has been omitted.The SAM data are divided into 100 bins, equally sized in r r , that range from the largest value of r r to the smallest positive value of r r .(In what follows, all histograms use 100 equal-size bins, arranged from smallest to largest value.)The SL hydrometeor PDF shape significantly overpredicts the PDF at small values of r r and significantly underpredicts it at large values of r r .These errors are an expected consequence of the single lognormal's attempt to fit the precipitation-less area.The DL and DDL PDF shapes provide a much closer match qualitatively to the SAM data.A quantitative assessment of the quality of the fit will follow in Sect.5.1.
Each of the CLUBB hydrometeor PDF shapes has a lognormal distribution within precipitation in each PDF component.Taking the natural logarithm of every point of a lognormal distribution produces a normal distribution, and so the plot of the PDF of ln r r in Fig. 2b is a normal distribution in each PDF component for each of the DDL, DL, and SL PDF shapes.The plot of the PDF of ln r r (hereafter referred to as the PDF of r r in Gaussian space) complements the aforementioned plot of the PDF of r r (Fig. 2a).The plot of the PDF of r r is log-scaled on the y axis, accentuating the small values of P (r r ) that are found at large values of r r .The plot of the PDF of ln r r accentuates the PDF at small values of r r .
The plot of the PDF of ln r r is a plot of only the inprecipitation portion of the distribution, omitting all zerovalues.The in-precipitation portion of the PDF is divided by f p , which allows the area under the curve to integrate to 1.The PDF shown in Fig. 2b is the Gaussianized form of Eq. (32).
Figure 2b shows that the SL hydrometeor PDF shape significantly misses the mark, for its peak is located too far to the left of the bulk of the SAM LES data.This shift of the peak to excessively small values is to be expected of a con- The marginal distribution of ln r r using the "in-precipitation PDF".This is the in-precipitation marginal PDF in Gaussian space.(c) The marginal distribution of N r with the delta at N r = 0 omitted.(d) The marginal distribution of ln N r using the "in-precipitation PDF".Again, this is the in-precipitation marginal PDF in Gaussian space.The DDL provides a better fit to SAM LES than the DL, which in turn provides a better fit than the SL.
tinuous PDF shape that tries to include a delta function at zero.The DL PDF shape is far too peaked in comparison to the SAM LES data, which is spread out broadly in Gaussian space.The DDL PDF shape is able to achieve a spread-out shape because it has two different means within precipitation.This allows it to better fit the more platykurtic shape of the SAM LES data in Gaussian space.
The plot of the PDF of RICO N r is found in Fig. 2c and the Gaussian-space plot of N r is found in Fig. 2d.Similar to r r , the SL shape overpredicts the PDF at small values of N r and underpredicts it at large values of N r .In Gaussian space, it is easy to see that SL's peak is located too far to the left.The DDL shape provides a better fit than the DL shape to SAM LES data in Fig. 2c.Again, the DL shape is too peaked in Fig. 2d, whereas the bimodal DDL is able to spread out, which provides a better match to SAM LES data.
Figure 3 contains scatter plots that show the bivariate PDF of r r and N r for both SAM LES and CLUBB's PDF in RICO at the same altitude and time as Fig. 2. The CLUBB PDF scatter points were generated by sampling the DDL PDF using an unweighted Monte Carlo sampling scheme.This demonstrates the advantages of the multivariate nature of CLUBB's PDF.The hydrometeor fields are correlated the same way in CLUBB's PDF as they are in SAM LES.
Figure 4 compares marginal PDFs involving r r and N r for the DYCOMS-II RF02 case at an altitude of 400 m and a time of 330 min.All three hydrometeor PDF shapes provide a decent match to the SAM LES data.In Fig. 4a and c, the SL and DL PDF shapes dip a little below the SAM LES line in the middle of the data range for r r and N r , respectively.The DDL PDF shape stays closer to the SAM LES line in this region.Additionally, the SL PDF shape overestimates the SAM LES line close to the y axis.In Fig. 4b and d, the Gaussian-space plots show that the two components of the DDL shape superimpose more than they did for the RICO case, owing to the reduced in-precipitation variance in the drizzling stratocumulus case.
In order to assess how well the PDF shapes are able to capture ice PDFs as well as liquid PDFs, we turn to the LBA case.In LBA, liquid and ice appear at different altitudes and times.Figure 5  330 min.Compared to SAM's PDF, the DDL hydrometeor PDF shape is too bimodal, but it still provides the best visual match of the three hydrometeor PDF shapes to SAM data.The fit will be quantified in Sect.5.1.To indicate whether the three PDF shapes work for icephase hydrometeors, we compare marginal PDFs involving r i and N i for the LBA case at an altitude of 10 500 m and a time of 360 min (Fig. 6).Similar to the r r and N r plots for RICO and LBA, Fig. 6a and c show that the SL PDF shape overpredicts the PDF at small values of r i and N i and underpredicts it at large values of r i and N i .The DL shape provides a better fit than the SL, and the DDL has a slightly better fit than the DL.The Gaussian-space plots in Fig. 6b and d show that the SAM LES distribution of ln r i and ln N i is again platykurtic.The SL PDF shape has a peak that is shifted to the left.The DDL hydrometeor PDF shape is able to spread out the most to cover the platykurtic shape of the LES in Gaussian space.
Why does the DDL PDF shape match LES output better than the DL shape in the aforementioned figures?The PDFs (in Gaussian space) for the LES of RICO and LBA show a broad, flat distribution of hydrometeor values from the LES.The DL shape is too peaked in comparison to the LES data.The DDL PDF is able to spread out the component means and thereby represent the platykurtic shape more accurately.However, even the DDL PDF fails to capture the far left-hand tail of the LES PDF.In the RICO, DYCOMS-II RF02, and LBA cases, between about 5 and 20 % of the LES PDF is found to the left of the DDL PDF (see Figures 2b, 4b, 5b,  and 6b).However, these values of hydrometeor mixing ratios are small.They are roughly a factor of 20 or more smaller than the median value.By combining these factors, we see that the percentage contribution of hydrometeor mixing ratios that are omitted on the left-hand tail is only about 1 %.
Why does SAM LES data have a platykurtic shape in Gaussian space in these cases?One possible cause is the partly cloudy (and partly rainy) nature of these cases.In these partly rainy cases, a relatively high percentage of the precipitation occurs in "edge regions" near the non-precipitating region.These regions usually correspond to the edge of cloud or outside of cloud.Evaporation (or less accretion) occurs in these regions, increasing the area occupied by smaller amounts of rain.Yet, there is also an area of more intense precipitation near the center of the precipitating region, which produces larger amounts of rain.Collectively, the areas of small and large rain amount produce the large spread in the hydrometeor spectrum.
The DYCOMS-II RF02 PDFs from the LES tend not to share the platykurtic shape seen in the other cases.The RF02  The marginal distribution of ln r r using the "in-precipitation PDF."This is the in-precipitation marginal PDF in Gaussian space.(c) The marginal distribution of N r with the delta at N r = 0 omitted.(d) The marginal distribution of ln N r using the "in-precipitation PDF", which is the in-precipitation marginal PDF in Gaussian space.Owing to relatively low in-precipitating variance, the three hydrometeor PDF shapes are all a close match to SAM LES.case is overcast, so there are not as many "edge" regions of precipitation as found in partly rainy cases.There is much less in-precipitation variance in the RF02 case.The simpler PDF shape is easier to fit by all the PDF shapes (SL, DL, and DDL).To further illuminate the physics underlying the PDF shapes produced by LESs, further study would be needed.

Quality of fit: general scores
While a lot can be learned by looking at plots of the hydrometeor PDFs, they are anecdotal and cannot tell us how well the idealized PDF shapes work generally.To obtain an overall quantification of the quality of the fit, we calculate the Kolmogorov-Smirnov (K-S) and the Cramer-von Mises (C-vM) scores.
Both the K-S and C-vM tests compare the cumulative distribution function (CDF) of the idealized distribution to the CDF of the empirical data (in this case, SAM LES data).Both tests require that the CDFs be continuous.Therefore, the scores are calculated using only the in-precipitation por-tion of the hydrometeor PDF in Eq. ( 8).The DDL, DL, and SAM LES data all have the same precipitation fraction.The in-precipitation portion of the PDF is normalized by dividing by precipitation fraction so that it integrates to 1.The equation for the in-precipitation portion of the marginal PDF, P (h)| ip , is where P L(i) is given by Eq. ( 9).The K-S score is the greatest difference between the empirical in-precipitation CDF, C e (h)| ip , and the idealized inprecipitation CDF, C (h)| ip , at any point in h > 0. In order to run the tests, the SAM LES data from the requested level and time was sorted in the order of increasing value.This was done only for points where the requested hydrometeor was found.The K-S score is given by (Stephens, 1970)  The marginal distribution of ln r r using the "in-precipitation PDF."This is the in-precipitation marginal PDF in Gaussian space.(c) The marginal distribution of N r with the delta at N r = 0 omitted.(d) The marginal distribution of ln N r using the "in-precipitation PDF", which is the in-precipitation marginal PDF in Gaussian space.Again, the DDL provides the best fit to SAM LES.

KS
The number of data points in SAM LES where the hydrometeor is found is denoted n p , and h κ is the value of the hydrometeor at SAM LES-ordered data point κ.
Unlike the K-S test, which only considers the greatest difference between the CDFs, the C-vM test is based on an integral that includes the differences between the CDFs over the entire distribution.The integral is (Anderson, 1962) The C-vM score is calculated by (Anderson, 1962;Stephens, 1970) The K-S and C-vM test scores are produced at every LES vertical level and three-dimensional statistical output time for every hydrometeor species.This results in a large number of scores.We desire that each hydrometeor species have a single K-S score and a single C-vM score in order to more easily compare the DDL, DL, and SL hydrometeor shapes.We calculate this score by averaging the individual level scores over multiple levels and multiple output times.For K-S this is simple, and the result is KS (where angle brackets denote an average over multiple levels and times).The C-vM test score in Eq. ( 35) is dependent on the number of precipitating grid points.This number changes between vertical levels The marginal distribution of ln r i using the "in-precipitation PDF."This is the in-precipitation marginal PDF in Gaussian space.(c) The marginal distribution of N i with the delta at N i = 0 omitted.(d) The marginal distribution of ln N i using the "in-precipitation PDF." Again, this is the in-precipitation marginal PDF in Gaussian space.The method works for frozen hydrometeor species as well, as the DDL provides a better fit to SAM LES than the DL, which in turn provides a better fit than the SL. and output times, so the C-vM scores cannot simply be averaged.Rather, they are normalized first by dividing CVM by n p to produce ω 2 at every level and time.Those results are averaged to calculate ω 2 .
After inspecting profiles of SAM LES results for mean mixing ratios in height and time, regions were identified in height and time where the mean mixing ratio of a species was always at least 5.0 × 10 −6 kg kg −1 .Averaging of the scores was restricted to these regions in order to eliminate from consideration levels that do not contain the hydrometeor or contain only small amounts of the hydrometeor with a small number of samples.RICO test scores for r r and N r were averaged from the surface through 2780 m and from 4200 through 4320 min.DYCOMS-II RF02 test scores for r r and N r were averaged from 277 through 808 m and from 300 to 360 min.
The LBA case contains both liquid and frozen-phase hydrometeor species that evolve as the cloud system transitions from shallow to deep convection.The various hydrometeor species develop and maximize at different altitudes and times, so different periods and altitude ranges are chosen for averaging test scores for each species.LBA test scores for r r and N r were averaged from the surface through 6000 m and from 285 through 360 min.The test scores for r g and N g were averaged from 4132 through 9750 m and from 315 through 360 min.The test scores for r s and N s were averaged from 5026 m through 9000 m and from 345 through 360 min.Finally, the test scores for r i and N i were averaged from 10 250 through 11 750 m at 360 min.For the LBA case, the value of f p used by CLUBB's PDF was based on the greatest value of SAM output variables for rain fraction, ice fraction, snow fraction, and graupel fraction.Each of these statistics is the fraction of the SAM domain occupied by values of the relevant mixing ratio of at least 1.0 × 10 −6 kg kg −1 .In order to keep the comparison of the PDF shapes to SAM data consistent, values lower than this threshold were omitted from the calculations of the individual level-and-time scores for K-S and C-vM.
The results of KS are listed in Table 1 for every hydrometeor species in every case.The DDL PDF shape has the The results of ω 2 are listed in Table 2.The DDL PDF shape has the lowest average score for every case and hydrometeor species, the DL shape has the second-lowest average score, and the SL shape has the highest average score.We note the important caveat that, as compared to DL, DDL has more adjustable parameters.A parameterization with more free parameters would be expected to provide a better fit to a training data set.Therefore, although DDL matches the LES output more closely than does DL, we cannot be certain, based on the analysis presented here, that DDL will outperform DL on a different validation data set.For a deeper analysis, one could use a model selection method that penalizes parameterizations with more parameters.We leave such an analysis for future work.

Microphysical process rates
A primary reason to improve the accuracy of hydrometeor PDFs is to improve the accuracy of the calculation of microphysical process rates.In this section, we compare the accuracy of calculations of microphysical process rates based on the SL, DL, and DDL PDF shapes.
In the simulations of RICO and DYCOMS-II RF02, both SAM LES and CLUBB use KK microphysics.The process rates output are the mean evaporation rate of r r , the mean accretion rate of r r , and the mean autoconversion rate of r r .Also recorded is rain drop mean volume radius, which is important for sedimentation velocity of rain.In order to account for subgrid variability in the microphysics, the KK microphysics process rate equations have been upscaled (to gridbox scale) using analytic integration over the PDF (Larson and Griffin, 2013;Griffin and Larson, 2013).The updates to the multivariate PDF (see Sect. 2) require updates to the upscaled process rate equations.The updated forms of these equations are listed in the Supplement.
Figure 7 shows profiles of RICO mean microphysics process rates.The mean evaporation rate profile in Fig. 7a shows that all three shapes over-evaporate at higher altitudes, but that SL and DL over-evaporate more than DDL.It should be noted that the reason for the over-evaporation at higher altitudes in the RICO case is the marginal PDF of χ produced by ADG2.While it provides a good match between CLUBB and SAM LES in the fields of cloud fraction and r c , the value of σ χ (1) is far too large.When χ and r r (or N r ) are distributed jointly, this results in too many large values of r r (or N r ) being placed in air that is far too dry.RICO mean evaporation rate could benefit from an improved ADG2 in order to produce a better marginal distribution of χ, but that is beyond the scope of this paper.
Figure 7b shows that both the DL and DDL PDF shapes match the LES mean accretion rate profile much better than does the SL shape.The mean autoconversion rate depends on χ and N cn but not hydrometeor variables, and so the autoconversion rate is the same for all three PDF shapes (not shown).The overall mean microphysics rate -i.e., the sum of the evaporation, accretion, and autoconversion rates -is fit best by the DDL shape and worst by the SL shape.Both DDL and DL are a much better match to the SAM profile of rain drop mean volume radius than SL (Fig. 7d).
Figure 8 shows that all three hydrometeor PDF shapes provide a good match to SAM LES for DYCOMS-II RF02.In Fig. 8d, the SL PDF shape deviates more strongly from SAM LES than does DL or DDL near the bottom of the profile of rain drop mean volume radius.
In the simulation of LBA, Morrison microphysics was used in both SAM LES and CLUBB.In order to account for subgrid variability in the microphysics, sample points from the PDF are produced at every grid level using the Subgrid Importance Latin Hypercube Sampler (SILHS) (Raut and Larson, 2016;Larson and Schanen, 2013;Larson et al., 2005).For the LBA case, 128 sample points were drawn.Morrison microphysics is then called using each set of sample points, and the results are averaged to calculate the mean microphysics process rates.
Figure 9 shows the same mean microphysics process rates as in previous figures, but here for LBA.The profile of mean evaporation rate in Fig. 9a shows that DDL is the best match to SAM LES.The profile of mean accretion rate in Fig. 9b shows that DDL is the best match to SAM, followed by DL and then SL.  matched by the DDL hydrometeor PDF shape, followed by the DL shape, which in turn is followed by the SL shape (Fig. 9c).

Conclusions
The multivariate PDF used by CLUBB has been updated to improve the subgrid representation of hydrometeor species.
The most important update is the introduction of precipitation fraction to the PDF.The precipitating fraction contains any non-zero values of any hydrometeor species included in the microphysics scheme.The remainder of the subgrid domain is precipitation-less and is represented by a delta function where every hydrometeor species has a value of zero.When a hydrometeor is found at a grid level, its rep-resentation in the precipitating portion of the subgrid domain is a lognormal or double-lognormal distribution.The introduction of precipitation fraction increases accretion and decreases evaporation in cumulus cases, allowing more precipitation to reach the ground.Additionally, a new method has been developed to calculate the in-precipitation mean and standard deviation of a hydrometeor species in each component of CLUBB's twocomponent PDF.This method preserves the grid-box mean and variance of the hydrometeor species.By simply changing the values of tunable parameters, CLUBB's marginal PDF for a hydrometeor can be changed from a delta-doublelognormal (DDL) to a delta-lognormal (DL) or to a singlelognormal (SL) shape.In order to compare the effectiveness of the three hydrometeor PDF shapes, three simulations -a precipitating shallow cumulus case (RICO), a drizzling stratocumulus case (DYCOMS-II RF02), and a deep convective case (LBA)were run using SAM LES.Statistical output values from the LES for the grid-level mean and turbulent fields were used to drive the PDF for each hydrometeor PDF shape.The idealized PDF shapes were compared to the SAM LES results.The DDL PDF shape produced the lowest average K-S and average normalized C-vM scores when compared to SAM LES results, followed by the DL PDF shape.Both produced lower scores than the original SL PDF shape.However, for DYCOMS-II RF02, all three PDF shapes were in almost equal agreement with SAM LES results.
The DL and DDL PDFs possess three important properties: (1) they are multivariate, and hence can represent interactions among multiple hydrometeor species; (2) they admit a precipitation-less region, which is necessary to permit realistic process rates in cumulus cloud layers; and (3) they have realistic tails, as evidenced by the comparisons with LES shown here.Because of these three properties, the DL and DDL PDFs may be general enough and accurate enough to adequately represent hydrometeor variability over a range of important cloud types, including shallow cumulus, deep cumulus, and stratocumulus clouds.This generality, in turn, may help enable parameterization of these clouds types in a more unified way.Indeed, an early version of the DDL PDF has already been used to represent hydrometeor subgrid variability in some interactive simulations with a uni-  fied cloud parameterization.Namely, the DDL PDF was used in the interactive single-column simulations of these cloud types by Storer et al. (2015) and in the global simulations by Thayer-Calder et al. (2015).Further testing would be required, however, to better understand the limits of the DL and DDL PDFs.Better understanding is particularly desirable in, for instance, mixed-phase and glaciated clouds.This has been left for future work.

Code availability
The CLUBB code is freely available for non-commercial use after registering for an account on the website http: //clubb.larson-group.com.The specific version of CLUBB used in this paper is available in the SVN repository located at http://carson.math.uwm.edu/repos/clubb_repos/tags/Hydromet_PDF_shapes.In the repository is a file README_Hydromet_PDF_shapes which gives instructions for generating the results found in this paper.

Figure 2 .
Figure 2. PDFs of rain in the RICO precipitating shallow cumulus case at an altitude of 380 m and a time of 4200 min.The SAM LES results are in red, the DDL results are blue solid lines, the DL results are green dashed lines, and the SL results are magenta dashed-dotted lines.(a) The marginal distribution of r r with the delta at r r = 0 omitted.(b)The marginal distribution of ln r r using the "in-precipitation PDF".This is the in-precipitation marginal PDF in Gaussian space.(c) The marginal distribution of N r with the delta at N r = 0 omitted.(d) The marginal distribution of ln N r using the "in-precipitation PDF".Again, this is the in-precipitation marginal PDF in Gaussian space.The DDL provides a better fit to SAM LES than the DL, which in turn provides a better fit than the SL.

Figure 3 .
Figure3.Joint PDF of r r and N r in the RICO precipitating shallow cumulus case at an altitude of 380 m and a time of 4200 min.SAM LES results are the red scatter points.CLUBB PDF scatter points were generated by sampling the DDL PDF using an unweighted Monte Carlo scheme.The SAM LES domain is 256×256 grid points.So to provide for the best comparison of LES points to CLUBB PDF sample points, 65 536 CLUBB PDF sample points were used.The light blue scatter points are from PDF component 1 and the dark blue scatter points are from PDF component 2. Every 10th point was plotted from both SAM LES and CLUBB's PDF.The joint nature of the PDF allows r r and N r to correlate the same way in CLUBB as they do in SAM.

Figure 4 .
Figure 4. PDFs of rain in the DYCOMS-II RF02 drizzling stratocumulus case at an altitude of 400 m and a time of 330 min.The SAM LES results are in red, the DDL results are blue solid lines, the DL results are green dashed lines, and the SL results are magenta dashed-dotted lines.(a) The marginal distribution of r r with the delta at r r = 0 omitted.(b)The marginal distribution of ln r r using the "in-precipitation PDF."This is the in-precipitation marginal PDF in Gaussian space.(c) The marginal distribution of N r with the delta at N r = 0 omitted.(d) The marginal distribution of ln N r using the "in-precipitation PDF", which is the in-precipitation marginal PDF in Gaussian space.Owing to relatively low in-precipitating variance, the three hydrometeor PDF shapes are all a close match to SAM LES.

Figure 5 .
Figure 5. PDFs of rain in the LBA deep convective case at an altitude of 2424 m and a time of 330 min.The SAM LES results are in red, the DDL results are blue solid lines, the DL results are green dashed lines, and the SL results are magenta dashed-dotted lines.(a) The marginal distribution of r r with the delta at r r = 0 omitted.(b)The marginal distribution of ln r r using the "in-precipitation PDF."This is the in-precipitation marginal PDF in Gaussian space.(c) The marginal distribution of N r with the delta at N r = 0 omitted.(d) The marginal distribution of ln N r using the "in-precipitation PDF", which is the in-precipitation marginal PDF in Gaussian space.Again, the DDL provides the best fit to SAM LES.

Figure 6 .
Figure 6.PDFs of ice in the LBA deep convective case at an altitude of 10 500 m and a time of 360 min.The SAM LES results are in red, the DDL results are blue solid lines, the DL results are green dashed lines, and the SL results are magenta dashed-dotted lines.(a) The marginal distribution of r i with the delta at r i = 0 omitted.(b)The marginal distribution of ln r i using the "in-precipitation PDF."This is the in-precipitation marginal PDF in Gaussian space.(c) The marginal distribution of N i with the delta at N i = 0 omitted.(d) The marginal distribution of ln N i using the "in-precipitation PDF." Again, this is the in-precipitation marginal PDF in Gaussian space.The method works for frozen hydrometeor species as well, as the DDL provides a better fit to SAM LES than the DL, which in turn provides a better fit than the SL.

Figure 7 .
Figure7.Profiles of mean microphysics process rates in the RICO precipitating shallow cumulus case time-averaged over the last 2 hours of the simulation (minutes 4200 through 4320).The SAM LES results are red solid lines, the DDL results are blue solid lines, the DL results are green dashed lines, and the SL results are magenta dashed-dotted lines.(a) The mean evaporation rate of r r .(b) The mean accretion rate of r r .(c) The overall mean microphysics tendency for r r .(d) The mean volume radius of rain drops.Overall, the DDL provides a better fit to SAM LES than the DL, which in turn provides a better fit than the SL.

Figure 8 .
Figure 8. Profiles of mean microphysics process rates in the DYCOMS-II RF02 drizzling stratocumulus case time-averaged over the last hour of the simulation (minutes 300 through 360).The SAM LES results are red solid lines, the DDL results are blue solid lines, the DL results are green dashed lines, and the SL results are magenta dashed-dotted lines.(a) The mean evaporation rate of r r .(b) The mean accretion rate of r r .(c) The overall mean microphysics tendency for r r .(d) The mean volume radius of rain drops.All hydrometeor PDF shapes provide a good fit to SAM LES.

Figure 9 .
Figure 9. Profiles of mean warm microphysics process rates in the LBA deep convective case time-averaged over the last hour of the simulation (minutes 300 through 360).The SAM LES results are red solid lines, the DDL results are blue solid lines, the DL results are green dashed lines, and the SL results are magenta dasheddotted lines.(a) The mean evaporation rate of r r .(b)The mean accretion rate of r r .(c) The overall mean microphysics tendency for r r .Again, the DDL provides a better fit to SAM LES than the DL, which in turn provides a better fit than the SL.
In the scenario that ζ * = 0 the equation for R max reduces to the ratio of h| ip 2 to h| ip 2 .
., 9, 2031-2053, 2016 www.geosci-model-dev.net/9/2031/2016/B. M. Griffin and V. E. Larson: A new subgrid-scale representation of hydrometeor fields 2037 and the standard deviations of the PDF components account for all of the in-precipitation variance.At intermediate values of o * , the means of each PDF component are somewhat spread apart and each PDF component has some width.The new equation for hydrometeor variance becomes microphysics, which predicts the mixing ratio and number concentration (per unit mass) of rain, cloud ice, snow, and graupel.SAM's implementation of Morrison microphysics diagnoses r c us- Geosci.Model Dev., 9, 2031-2053, 2016www.geosci-model-dev.net/9/2031/2016/B. M. Griffin and V. E. Larson: A new subgrid-scale representation of hydrometeor fields 2039 Griffin and V. E. Larson: A new subgrid-scale representation of hydrometeor fields www.geosci-model-dev.net/9/2031/2016/Geosci.Model Dev., 9, 2031-2053, 2016 2042 B. M.

Table 1 .
Kolmogorov-Smirnov statistic averaged over multiple grid levels and statistical output timesteps comparing each of DDL, DL, and SL hydrometeor PDF shapes to SAM LES results.The best (lowest) average score for each case and hydrometeor species is listed in bold.The DDL has the lowest average score most often, and the DL has the second-lowest average score most often.The DL PDF shape edges out the DDL in the DYCOMS-II RF02 N r comparison.The SL PDF shape has the highest average score for every case and hydrometeor species, except for the LBA r r comparison, where it has the second-lowest score and the DL has the highest score.

Table 2 .
Normalized Cramer-von Mises statistic averaged over multiple grid levels and statistical output timesteps comparing each of DDL, DL, and SL hydrometeor PDF shapes to SAM LES results.The best (lowest) average score for each case and hydrometeor species is listed in bold.The DDL has the lowest average score every time, and the DL has the second-lowest average score every time.