Prediction of cloud condensation nuclei activity for organic compounds using functional group contribution methods

A wealth of recent laboratory and field experiments demonstrate that organic aerosol composition evolves with time in the atmosphere, leading to changes in the influence of the organic fraction on cloud condensation nuclei (CCN) spectra. There is a need for tools that can realistically represent the evolution of CCN activity to better predict indirect effects of organic aerosol on clouds and climate. This work describes a model to predict the CCN activity of organic compounds from functional group composition. Following previous methods in the literature, we test the ability of semi-empirical group contribution methods in Köhler theory to predict the effective hygroscopicity parameter, kappa. However, in our approach we also account for liquid-liquid phase boundaries to simulate phase-limited activation behavior. Model evaluation against a selected database of published laboratory measurements demonstrates that kappa can be predicted within a factor of 2. Simulation of homologous series is used to identify the relative effectiveness of different functional groups in increasing the CCN activity of weakly functionalized organic compounds. Hydroxyl, carboxyl, aldehyde, hydroperoxide, carbonyl, and ether


Introduction
Organic compounds are an important contributor to the atmospheric submicron aerosol (Jimenez et al., 2009). The organic fraction is projected to increase in the future due to the confluence of a decreasing sulfate and nitrate burden and increases in the global secondary organic aerosol burden (Heald et al., 2008). An important unanswered question is how the organic fraction influences the aerosol's ability to serve as cloud condensation nuclei (CCN), and in turn modulates climate via indirect effects of aerosols on clouds and precipitation (Andreae and Rosenfeld, 2008). Realistic prescribed variations in secondary organic aerosol hygroscopicity have demonstrable impacts on CCN number concentration (Mei et al., 2013) and can change the simulated global aerosol indirect forcing (AIF) by approximately one-sixth of the AIF simulated in a control case (Liu and Wang, 2010). To obtain a prognostic understanding of the contribution of the organic fraction to indirect aerosol forcing in future climates, models need improved schemes that map simulated organic aerosol composition to hygroscopicity and CCN activity.
Several organic aerosol types (e.g., freshly emitted diesel oil particles or first generation oxidation products of sesquiterpenes) consist of mostly hydrophobic hydrocarbon chains with few functional groups attached. Pure hydrocarbons with a carbon number less than C30 are expected to be semi-volatile and in the liquid phase. Over time the compounds evolve by functionalization, fragmentation, and oligomerization (Kroll and Seinfeld, 2008; Ziemann and Atkinson, 2012). As functional groups are added to the carbon chain, the products usually, but not always, become less
Published by Copernicus Publications on behalf of the European Geosciences Union.
Laboratory (George and Abbatt, 2010; Poulain et al., 2010; Cappa et al., 2011; Massoli et al., 2010; Lambe et al., 2011; Duplissy et al., 2011; Kuwata et al., 2013; Rickards et al., 2013; Suda et al., 2014) and field studies (Jimenez et al., 2009; Chang et al., 2010; Mei et al., 2013) have demonstrated a robust link between the aerosol oxidation state and the ability of the organic fraction to promote hygroscopic water uptake and CCN activity. Proxies from mass spectrometry such as the fragmentation peak $f_{44}$ or the atomic oxygen-to-carbon ratio are often used to model the increase in hygroscopicity. However, these correlations exhibit significant variability between studies and break down when applied at the compound level (Rickards et al., 2013; Suda et al., 2014).
Chemistry models are already capable of simulating the molecular identities of species present in the condensed phase during multi-day evolution of diluting air parcels (Lee-Taylor et al., 2015). Mapping this speciated aerosol composition to the aerosol hygroscopicity should ultimately permit quantification of changes in CCN number concentration (provided that the size distribution is also simulated) and associated effects on clouds and climate. Thermodynamic models should be able to predict CCN activity. Many thermodynamic models have made use of activity coefficients predicted by the universal functional group activity coefficient (UNIFAC) group contribution method (Fredenslund et al., 1975). Several investigators have compared UNIFAC predictions of organic aerosol water content to experimental data (Saxena and Hildemann, 1997; Ming and Russell, 2001; Peng et al., 2001; Choi and Chan, 2002; Mochida and Kawamura, 2004; Marcolli and Peter, 2005; Moore and Raymond, 2008). Some of these comparisons prompted proposed revisions of specific group interaction parameters, e.g., [OH] and [H2O]. Several thermodynamic models that treat complex phase equilibria of multifunctional, multicomponent organic mixtures are based on UNIFAC activity coefficients (Ming and Russell, 2002; Raatikainen and Laaksonen, 2005; Topping et al., 2005; Amundson et al., 2007; Zuend et al., 2008; Compernolle et al., 2009). The development of these models has been driven by the need to enable predictions over a wide range of conditions and compositions, including the effect of liquid-liquid phase separation on gas-to-particle partitioning (Zuend and Seinfeld, 2012; Topping et al., 2013). The prediction of CCN activity of organic compounds has received less attention. Rissman et al. (2007) used the aerosol diameter-dependent equilibrium model (ADDEM; Topping et al., 2005) with an underlying UNIFAC core to predict the relationship between critical supersaturation and dry diameter for several dicarboxylic acid aerosols. To our knowledge no study to date has systematically focused on the prediction of CCN activity from thermodynamic models.
Here we build on this body of work to predict the contribution of a compound with known chemical structure to the CCN activity of a particle of known size. The proposed model uses the UNIFAC equations (Fredenslund et al., 1975) with group interaction parameters from Hansen et al. (1991), Raatikainen and Laaksonen (2005), and Compernolle et al. (2009) to model activity coefficients and free energy of mixing. Liquid-liquid phase boundaries are determined using the area method of Eubank et al. (1992). Molecular volume is estimated from elemental composition and adjustments for functional group composition using the approach of Girolami (1994). The relationship between critical supersaturation and dry diameter is then predicted using Köhler theory (Seinfeld and Pandis, 2006). The basic model mechanics are similar to those employed in multicomponent phase equilibrium models (Ming and Russell, 2002; Raatikainen and Laaksonen, 2005; Topping et al., 2005; Amundson et al., 2007; Zuend et al., 2008) but limited in scope to binary compositions and focused on accurately representing phase and water activity only at conditions relevant to the point of CCN activation. These predictions are validated by manually mapping chemical composition to UNIFAC groupings and comparing modeled CCN activity against observations from a compiled library of recently published CCN data of mostly weakly oxidized hydrocarbons containing a mixture of alcohol, carbonyl, aldehyde, ether, carboxyl, nitrate, and hydroperoxide moieties. The model is used to predict how the addition of one or more functional groups to otherwise similar molecules promotes CCN activity. Envisioned application to multi-component aerosols and contrasts with more complete thermodynamic models are discussed.

Köhler theory
The saturation ratio $S$ over a curved droplet is given by the Köhler equation
$$S = a_w \exp\left(\frac{4\,\sigma_{s/a} M_w}{R\,T\,\rho_w D}\right) \tag{1}$$
where $a_w$ is the water activity, $\sigma_{s/a}$ is the surface tension of the solution/air interface, $T$ is temperature, $M_w$ is the molecular weight of water, $\rho_w$ is the density of pure water, $R$ is the universal gas constant, and $D$ is the wet drop diameter.
Water activity depends on the water content and the amounts and identities of solutes in the nucleus. The principal water content variable used in this work is the mole fraction
$$x_w = \frac{n_w}{n_w + \sum_i n_{s,i}} \tag{2}$$
where $x_w$ is the mole fraction of water, $n_w$ and $n_{s,i}$ are the number of moles of water and solutes, and $i$ indexes the dry components. The wet drop diameter can be calculated from $x_w$ if the dry diameter, $D_d$, is specified and it is assumed that the particle is spherical and that the volumes of water and solute are additive:
$$D = D_d \left(1 + \frac{x_w}{1 - x_w}\, v_w \sum_i \frac{\varepsilon_i}{v_{s,i}}\right)^{1/3} \tag{3}$$
In Eq. (3) $v_w$ and $v_{s,i}$ are the molar volumes of water and the solutes and $\varepsilon_i$ are the volume fractions in the dry particle. Equation (3) is obtained by rearranging Eq. (7) in Petters et al. (2009a). The critical supersaturation required for an aqueous solution droplet to activate into a cloud droplet is found by combining Eqs. (1) and (3) and finding the $x_w$ (or $D$) that maximizes the supersaturation:
$$s_c = \max_{x_w}\left[a_w \exp\left(\frac{4\,\sigma_{s/a} M_w}{R\,T\,\rho_w D}\right) - 1\right] \times 100\,\% \tag{4}$$
where $s_c$ is the critical supersaturation in %. The variables that control $s_c$ are $v_s$, $a_w$, and $\sigma_{s/a}$. In this work it is assumed that the surface tension is that of pure water. This and other assumptions are discussed at the end of this section.
First the prediction of $v_s$ and $a_w$ for organic compounds with known chemical structure is described.
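The search for the critical supersaturation in Eq. (4) can be sketched as follows. This is a minimal illustration, not the model code: it assumes a single solute, an ideal solution ($a_w = x_w$) in place of the UNIFAC water activity, and illustrative values for the molar mass and density of water. The function name `critical_supersaturation` and the example molar volume of 120 cm$^3$ mol$^{-1}$ are hypothetical.

```python
import numpy as np

# Constants in SI units. SIGMA and T follow the values stated in the text;
# M_W, RHO_W and R_GAS are standard values assumed for this sketch.
SIGMA = 0.072      # surface tension of pure water, J m^-2
T = 298.15         # temperature, K
M_W = 0.018015     # molar mass of water, kg mol^-1
RHO_W = 997.0      # density of water, kg m^-3
R_GAS = 8.314      # universal gas constant, J mol^-1 K^-1
V_W = M_W / RHO_W  # molar volume of water, m^3 mol^-1


def critical_supersaturation(x_w, a_w, d_dry, v_s):
    """Return s_c in % for a single-solute particle (Eqs. 1, 3 and 4).

    x_w, a_w : arrays of water mole fraction and matching water activity
    d_dry    : dry diameter in m
    v_s      : solute molar volume in m^3 mol^-1
    """
    x_w = np.asarray(x_w, dtype=float)
    a_w = np.asarray(a_w, dtype=float)
    # Eq. (3) for one dry component (volume fraction of 1).
    d_wet = d_dry * (1.0 + x_w / (1.0 - x_w) * V_W / v_s) ** (1.0 / 3.0)
    # Kelvin term of Eq. (1).
    kelvin = np.exp(4.0 * SIGMA * M_W / (R_GAS * T * RHO_W * d_wet))
    # Eq. (4): maximum of the Koehler curve, expressed in percent.
    return float(np.max(a_w * kelvin - 1.0) * 100.0)


# Ideal-solution example: 100 nm dry particle, v_s = 120 cm^3 mol^-1.
x_grid = np.linspace(0.0001, 0.9999, 10000)
s_c_ideal = critical_supersaturation(x_grid, x_grid, d_dry=100e-9, v_s=120e-6)
print(f"s_c = {s_c_ideal:.3f} %")
```

Because the maximum lies at $x_w$ very close to 1, a finely resolved grid near the dilute end is needed; the sketch simply uses a dense linear grid.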

Molar volume
Molar volume is calculated from the molecular formula using the method of Girolami (1994). Each element is assigned a relative volume based on its location in the periodic table.
The elemental volumes are summed and scaled by a constant factor to compute $v_s$. If the oxygen is bound in the form of alcohol [OH] or carboxyl [C(=O)OH] moieties, the actual $v_s$ is smaller due to intramolecular bonding. Therefore, $v_s$ is decreased by 10 % for each [OH] or [C(=O)OH] group but by no more than 30 % of the molar volume derived from the elemental composition. Girolami (1994) tested this method for 166 liquids and reports agreement with observed $v_s$ to within approximately ±10 %. Barley et al. (2013) reviewed the performance of various methods for predicting molar volume using a test set of 56 multifunctional organic compounds and report similar scatter.
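The molar volume estimate described above can be sketched in a few lines. The relative atomic volumes (H = 1; C, N, O = 2) and the scaling factor of 5 cm$^3$ mol$^{-1}$ are assumptions recalled from Girolami (1994) and should be checked against that reference; the function name `molar_volume` is hypothetical. The 10 % reduction per [OH] or [C(=O)OH] group and the 30 % cap follow the description in the text.

```python
# Relative atomic volumes by periodic-table row (assumed values, to be
# verified against Girolami, 1994). Only H, C, N and O are listed here.
RELATIVE_VOLUME = {"H": 1.0, "C": 2.0, "N": 2.0, "O": 2.0}
SCALE = 5.0  # cm^3 mol^-1 per unit of relative volume (assumed)


def molar_volume(formula, n_hbond_groups=0):
    """Estimate molar volume in cm^3 mol^-1.

    formula        : dict mapping element symbol to atom count
    n_hbond_groups : number of [OH] plus [C(=O)OH] groups in the molecule
    """
    elemental = SCALE * sum(RELATIVE_VOLUME[el] * n for el, n in formula.items())
    # 10 % reduction per group, capped at 30 % of the elemental estimate.
    reduction = min(0.10 * n_hbond_groups, 0.30)
    return elemental * (1.0 - reduction)


# Adipic acid, C6H10O4, with two carboxyl groups.
v_adipic = molar_volume({"C": 6, "H": 10, "O": 4}, n_hbond_groups=2)
print(f"v_s = {v_adipic:.1f} cm^3 mol^-1")
```

For a molecule with more than three such groups the cap applies, so six hydroxyl groups give the same 30 % reduction as three.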

Water activity
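Water activity is related to the mole fraction via
$$a_w = \gamma_w x_w \tag{5}$$
where $\gamma_w$ is the activity coefficient of water. Activity coefficients are estimated using the semi-empirical group contribution method UNIFAC (Fredenslund et al., 1975). The UNIFAC model describes a liquid solution that consists of $i$ components. Each component is divided into $k$ groups. The activity coefficient of component $i$ in solution ($\gamma_i$) has contributions from combinatorial ($\gamma_i^C$) and residual ($\gamma_i^R$) parts:
$$\ln\gamma_i = \ln\gamma_i^C + \ln\gamma_i^R \tag{6}$$
The combinatorial part is computed via
$$\begin{aligned}
\ln\gamma_i^C &= \ln\frac{\Phi_i}{x_i} + \frac{z}{2}\, q_i \ln\frac{\theta_i}{\Phi_i} + l_i - \frac{\Phi_i}{x_i}\sum_j x_j l_j, \\
l_i &= \frac{z}{2}\,(r_i - q_i) - (r_i - 1), \\
\theta_i &= \frac{q_i x_i}{\sum_j q_j x_j}, \qquad \Phi_i = \frac{r_i x_i}{\sum_j r_j x_j}, \\
r_i &= \sum_k \nu_k^{(i)} R_k, \qquad q_i = \sum_k \nu_k^{(i)} Q_k
\end{aligned} \tag{7}$$
In Eqs. (7), $x_i$ is the mole fraction of component $i$, $\theta_i$ and $\Phi_i$ are the average surface and segment fractions, $z$ is the lattice coordination number, $\nu_k^{(i)}$ is the number of groups of type $k$ in component $i$, $R_k$ and $Q_k$ are the group volume and surface area parameters derived from Bondi (1964), and $r_i$ and $q_i$ are the normalized van der Waals volume and surface area. The summation over $j$ runs over all components in the mixture, including component $i$.
The residual part is computed via
$$\begin{aligned}
\ln\gamma_i^R &= \sum_k \nu_k^{(i)}\left[\ln\Gamma_k - \ln\Gamma_k^{(i)}\right], \\
\ln\Gamma_k &= Q_k\left[1 - \ln\left(\sum_m \Theta_m \Psi_{mk}\right) - \sum_m \frac{\Theta_m \Psi_{km}}{\sum_n \Theta_n \Psi_{nm}}\right], \\
\Theta_m &= \frac{Q_m X_m}{\sum_n Q_n X_n}, \qquad \Psi_{mn} = \exp\left(-\frac{a_{mn}}{T}\right)
\end{aligned} \tag{8}$$
In Eqs. (8), $a_{mn}$ are empirically determined parameters, $\Psi_{mn}$ is the group interaction parameter of group $m$ with $n$, $X_m$ is the mole fraction of group $m$ in the mixture, $\Theta_m$ is the area fraction of group $m$, $\Gamma_k$ is the group residual activity coefficient, and $\Gamma_k^{(i)}$ is the residual activity coefficient of group $k$ in a reference solution containing only molecules of type $i$. Equations (8) are also used to compute $\Gamma_k^{(i)}$. The summations over $n$ and $m$ run over all different groups in the mixture, and the summation over $k$ runs over all groups in component $i$.
The groups used are listed in Table S1 in the Supplement. Some of the main groups have several subgroups, with each subgroup having unique volume and surface area parameters $R_k$ and $Q_k$. These are summarized in Table S2.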
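A compact sketch of the UNIFAC calculation for a small group set is given below. It is an illustration only and not the model implementation: it covers four subgroups (CH3, CH2, OH, H2O), and the $R_k$, $Q_k$, and $a_{mn}$ values are commonly tabulated UNIFAC numbers quoted here as assumptions that should be verified against Hansen et al. (1991) and Tables S1 and S2. The coordination number $z = 10$ is the conventional choice and is likewise an assumption. The function name `unifac_ln_gamma` is hypothetical.

```python
import numpy as np

# Subgroup order: CH3, CH2, OH, H2O. Parameter values are assumed
# (commonly tabulated UNIFAC numbers) and should be verified.
R_K = np.array([0.9011, 0.6744, 1.0000, 0.9200])
Q_K = np.array([0.848, 0.540, 1.200, 1.400])
MAIN = np.array([0, 0, 1, 2])  # main group of each subgroup: CHx, OH, H2O
A_MAIN = np.array([[0.0, 986.5, 1318.0],    # a_mn in K, rows m, columns n
                   [156.4, 0.0, 353.5],
                   [300.0, -229.1, 0.0]])


def unifac_ln_gamma(x, nu, temp=298.15, z=10.0):
    """Return ln(gamma_i) for all components (Eqs. 6-8).

    x  : mole fractions of the components (all > 0)
    nu : array nu[i, k], number of subgroups k in component i
    """
    x = np.asarray(x, dtype=float)
    nu = np.asarray(nu, dtype=float)

    # Combinatorial part, Eqs. (7).
    r = nu @ R_K
    q = nu @ Q_K
    phi = r * x / np.sum(r * x)
    theta = q * x / np.sum(q * x)
    l = z / 2.0 * (r - q) - (r - 1.0)
    ln_gc = (np.log(phi / x) + z / 2.0 * q * np.log(theta / phi)
             + l - phi / x * np.sum(x * l))

    # Residual part, Eqs. (8).
    psi = np.exp(-A_MAIN[np.ix_(MAIN, MAIN)] / temp)  # psi[m, n]

    def ln_group_gamma(group_fraction):
        area = Q_K * group_fraction / np.sum(Q_K * group_fraction)
        s = area @ psi  # s[k] = sum_m Theta_m * Psi_mk
        return Q_K * (1.0 - np.log(s) - psi @ (area / s))

    group_counts = x @ nu
    ln_g_mix = ln_group_gamma(group_counts / group_counts.sum())
    ln_gr = np.array([
        np.sum(nu[i] * (ln_g_mix - ln_group_gamma(nu[i] / nu[i].sum())))
        for i in range(len(x))
    ])
    return ln_gc + ln_gr


# Water (component 0) and 1-butanol (component 1: 1 CH3, 3 CH2, 1 OH).
nu_example = [[0, 0, 0, 1], [1, 3, 1, 0]]
ln_gamma = unifac_ln_gamma([0.999, 0.001], nu_example)
print("ln(gamma_water), ln(gamma_butanol) =", ln_gamma)
```

Two limits provide a quick check of any implementation: the activity coefficient of a component approaches unity as its mole fraction approaches 1, and a mixture of two identical components is ideal.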

Phase equilibrium
For some $x_w$ liquid-liquid phase separation can occur. The normalized Gibbs free energy of the mixture, defined as the actual Gibbs free energy divided by the thermal energy, is needed to compute the number of thermodynamically stable phases in the system. For a binary system consisting of water (w) and a single solute (s), Gibbs energy is calculated from the activity coefficients via standard thermodynamic relationships (Prausnitz et al., 1999; Petters et al., 2009a)
$$\begin{aligned}
g_{mix} &= g_{ideal} + g_{excess}, \\
g_{ideal} &= x_w \ln x_w + x_s \ln x_s, \\
g_{excess} &= x_w \ln\gamma_w + x_s \ln\gamma_s
\end{aligned} \tag{9}$$
where $g_{mix}$ is the normalized change in Gibbs free energy of the mixture, $g_{ideal}$ is the change in ideal Gibbs free energy of the mixture (Raoult's law), and $g_{excess}$ is the excess Gibbs free energy of mixing quantifying the deviation from Raoult's law. In highly non-ideal solutions liquid-liquid phase separation may occur. Two compositions $x_a$ and $x_b$ define the water mole fraction of the two co-existing phases.
Computationally, $x_a$ and $x_b$ can be obtained from $g_{mix}$ using the area method (Eubank et al., 1992). Briefly, the state space is evaluated by computing the following area for all possible combinations $x_I$ and $x_{II}$:
$$A(x_I, x_{II}) = \int_{x_I}^{x_{II}} g_{mix}\,\mathrm{d}x_w - \frac{1}{2}\left[g_{mix}(x_I) + g_{mix}(x_{II})\right](x_{II} - x_I) \tag{10}$$
Phase boundaries $x_a$ and $x_b$ exist if the condition $\max A(x_I, x_{II}) > 0$ is satisfied, in which case $x_a$ and $x_b$ are the pair that maximizes $A$. If multiple phases coexist in phase equilibrium, the Gibbs-Duhem relationship dictates that the chemical potential of each component is equal in all phases. Therefore the water activity inside the miscibility gap is constant and the values entering Eq. (4) are subject to the constraint
$$a_w(x_w) = a_w(x_a) = a_w(x_b) \quad \text{for } x_w \text{ between } x_a \text{ and } x_b \tag{11}$$
We note that the Eubank et al. (1992) algorithm can be extended to n components. Other numerically efficient approaches to find phase equilibrium, including those of n component mixtures, are available in the literature (e.g., Amundson et al., 2005, 2007; Zuend et al., 2010). Phase boundaries ($x_a$, $x_b$) calculated using standard UNIFAC parameters with the Eubank method used in this model and with the algorithm in the UHAERO model (Amundson et al., 2007) are in good agreement; the comparison is summarized in the Supplement.
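The brute-force area search can be illustrated with a toy mixture. The sketch below is an assumption-laden stand-in, not the model code: it replaces the UNIFAC excess energy with a two-suffix Margules term $g_{excess} = A_M\, x_w (1 - x_w)$, which phase separates for $A_M > 2$, and it takes the area as the integral of $g_{mix}$ minus the area under the chord, so that a positive value means the curve lies above the chord. The function names are hypothetical.

```python
import numpy as np


def g_mix_margules(x, a_m):
    """Normalized Gibbs energy of mixing for a two-suffix Margules mixture
    (a stand-in for the UNIFAC-based g_mix, assumed for illustration)."""
    return x * np.log(x) + (1.0 - x) * np.log(1.0 - x) + a_m * x * (1.0 - x)


def find_phase_boundaries(x, g, tol=1e-10):
    """Return (x_a, x_b) maximizing the area between g_mix and its chord,
    or None if no pair encloses a positive area (single stable phase)."""
    # Cumulative trapezoid integral of g from x[0] to each grid point.
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(x))))
    integral = cum[None, :] - cum[:, None]  # integral[i, j] from x[i] to x[j]
    chord = 0.5 * (g[:, None] + g[None, :]) * (x[None, :] - x[:, None])
    area = np.triu(integral - chord, k=1)   # keep only pairs with j > i
    i, j = np.unravel_index(np.argmax(area), area.shape)
    if area[i, j] <= tol:
        return None
    return float(x[i]), float(x[j])


x_grid = np.linspace(0.0001, 0.9999, 401)
gap = find_phase_boundaries(x_grid, g_mix_margules(x_grid, 3.0))
print("miscibility gap:", gap)
```

Evaluating all pairs makes the search quadratic in the number of grid points, which is why the grid resolution is kept modest here.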

Model implementation
The model was implemented to run on a personal computer using the commercial MATLAB environment (MathWorks, Inc.). Alternatively, the code runs under the Octave environment, which is available as free software under the GNU General Public License. Correct implementation of the UNIFAC model was confirmed by comparing results from test mixtures against output from existing implementations, which is further described in the Supplement. A compound is defined by specifying a count of subgroups comprising the molecule. Equations (6)-(8) are solved to find $\gamma_w$ for $n$ linearly spaced values within the domain $x_w \in [0.0001, 0.9999]$. Resulting $\gamma_w$ are parsed through Eqs. (9)-(11) to find the number of stable phases and to define $a_w$ over the entire domain. These $a_w$ are interpolated onto a higher resolution linearly gridded domain ($m$ points) to improve the accuracy of the computation of $s_c$ using Eq. (4). Values for $n$ and $m$ are selected to balance computational speed and solution accuracy. Equations (6)-(8) have linear time complexity, $O(n)$, and Eqs. (9)-(11) have quadratic time complexity, $O(n^2)$. For $n > 200$, the overall model time complexity is $O(n^2)$. For $n \gtrsim 800$ and $m = 10000$, the resolution is sufficiently high so that the computed $s_c$ becomes independent of the choice of $n$. All computations in this work were carried out for $n = 1000$ and $m = 10000$. Total model execution times for a single compound on an Intel(R) Core(TM) i7-2600 3.4 GHz microprocessor using a single core were 39 s with MATLAB version R2013a (8.1.0.604) 64 bit and 282 s with GNU Octave version 3.8.1 configured for 64 bit.
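The last two steps of this pipeline, holding $a_w$ constant across a miscibility gap and interpolating from the coarse grid of $n$ points onto the fine grid of $m$ points, might look as follows. This is a sketch in Python rather than the MATLAB/Octave code used for the model; the function name `refine_water_activity` and the choice of linear interpolation are assumptions.

```python
import numpy as np


def refine_water_activity(x_w, a_w, gap=None, m=10000):
    """Apply the constant-activity constraint inside a miscibility gap and
    interpolate a_w onto a finer, linearly spaced x_w grid of m points.

    x_w, a_w : coarse-grid water mole fraction and water activity
    gap      : optional (x_a, x_b) phase boundaries with x_a < x_b
    """
    x_w = np.asarray(x_w, dtype=float)
    a_w = np.array(a_w, dtype=float)  # copy so the input is not modified
    if gap is not None:
        x_a, x_b = gap
        a_gap = np.interp(x_a, x_w, a_w)          # activity at the boundary
        a_w[(x_w >= x_a) & (x_w <= x_b)] = a_gap  # constant inside the gap
    x_fine = np.linspace(x_w[0], x_w[-1], m)
    return x_fine, np.interp(x_fine, x_w, a_w)


# Illustration with an ideal solution and an arbitrary, hypothetical gap.
x_coarse = np.linspace(0.0001, 0.9999, 1000)
x_fine, a_fine = refine_water_activity(x_coarse, x_coarse, gap=(0.2, 0.6))
```

The refined arrays can then be passed to the search for the maximum in Eq. (4).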

Hygroscopicity parameter
Equation (4) is solved to find $s_c$ for a specified dry diameter, fixed $T = 298.15$ K and $\sigma_{s/a} = 0.072$ J m$^{-2}$. The result is expressed in terms of the hygroscopicity parameter κ (Petters and Kreidenweis, 2007) that is defined via
$$s_c = \max_D\left[\frac{D^3 - D_d^3}{D^3 - D_d^3\,(1 - \kappa)} \exp\left(\frac{4\,\sigma_{s/a} M_w}{R\,T\,\rho_w D}\right) - 1\right] \times 100\,\% \tag{12}$$
The hygroscopicity parameter is obtained by iteratively seeking the κ value that satisfies Eq. (12) for a given $D_d$, $s_c$ pair.
Kappa values obtained by fitting a $D_d$, $s_c$ pair to Eq. (12) with the assumed temperature and surface tension conceptually correspond to "apparent hygroscopicity at standard state" (Christensen and Petters, 2012). All values in this work are apparent κ's. For simplicity these are denoted as κ without further qualification. Observations against which the model is evaluated are summarized in the Supplement and will be discussed further in Sect. 3.
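The iterative search for κ can be sketched with a simple bisection. The sketch assumes the κ-Köhler form of Petters and Kreidenweis (2007) for Eq. (12), standard values for the molar mass and density of water, and search bounds of $10^{-6}$ to 2 for κ; the function names and the grid used to locate the Köhler maximum are hypothetical choices.

```python
import numpy as np

# Kelvin parameter 4*sigma*M_w/(R*T*rho_w) in m, using sigma = 0.072 J m^-2
# and T = 298.15 K from the text; the water properties are assumed values.
A_KELVIN = 4.0 * 0.072 * 0.018015 / (8.314 * 298.15 * 997.0)


def sc_from_kappa(kappa, d_dry, n_grid=20000):
    """Critical supersaturation in % from Eq. (12) for a given kappa."""
    d = d_dry * np.geomspace(1.001, 1000.0, n_grid)  # wet diameters, m
    s = ((d**3 - d_dry**3) / (d**3 - d_dry**3 * (1.0 - kappa))
         * np.exp(A_KELVIN / d))
    return float(np.max(s - 1.0) * 100.0)


def kappa_from_sc(s_c, d_dry, lo=1e-6, hi=2.0, n_iter=60):
    """Bisect in log space for the kappa that reproduces s_c (in %)."""
    for _ in range(n_iter):
        mid = np.sqrt(lo * hi)
        if sc_from_kappa(mid, d_dry) > s_c:
            lo = mid  # predicted s_c too high, so kappa must be larger
        else:
            hi = mid
    return float(np.sqrt(lo * hi))


# Round trip for a 100 nm dry particle with kappa = 0.1.
s_c_example = sc_from_kappa(0.1, 100e-9)
kappa_back = kappa_from_sc(s_c_example, 100e-9)
print(f"s_c = {s_c_example:.3f} %, recovered kappa = {kappa_back:.4f}")
```

Bisection works here because the critical supersaturation decreases monotonically with increasing κ at fixed dry diameter.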

Model assumptions and limitations
The model approach presented here is limited to liquid organic compounds. This assumption is implied in both molar volume and UNIFAC activity coefficient calculations. Comparison with observational CCN data where the reference phase state may be crystalline should be interpreted with caution. For example, CCN experiments performed with crystalline dicarboxylic acids demonstrate that for some compounds deliquescence, i.e., a solubility-controlled phase transition, must precede droplet activation (Petters and Kreidenweis, 2008). The UNIFAC approach is unable to accurately predict the solubility of these compounds if they existed in their crystalline solid state. If, however, the compound is in metastable aqueous solution, the UNIFAC prediction is expected to be valid to within the general accuracy of the specific model implementation. Under atmospheric conditions where the organic compounds are embedded in a matrix comprising a multitude of organic compounds, a liquid or amorphous solid is the prevailing stable phase (Marcolli et al., 2004). Furthermore, since metastable states with hygroscopically bound water appear to dominate in the atmosphere (Rood et al., 1989; Nguyen et al., 2014) the liquid assumption may not be a serious limitation. Nonetheless, it is unclear whether the assumption of a liquid-like reference state is a serious limitation if the organic particles are highly viscous (Vaden et al., 2011; Shiraiwa et al., 2011; Zobrist et al., 2011; Renbaum-Wolff et al., 2013).
Other limitations of the UNIFAC method are the problems of accounting for group proximity effects and the inability to distinguish between isomers. Proximity effects occur when polar groups are separated by less than three to four carbon atoms (Topping et al., 2005). Since only the number of groups of each type is specified, all isomers are modeled to have identical κ values. Although experiments show that the location of the functional group has a systematic effect on the observed κ (Suda et al., 2014), those effects are relatively small and beyond the resolution of the model presented here.
The application of Eq. (4) assumes that the surface tension is that of pure water. Many organic compounds found in ambient organic aerosol lower the surface tension at the solution-air interface (Tuckermann and Cammenga, 2004; Tuckermann, 2007). However, several studies have demonstrated via experiment and theory that surfactant partitioning between the bulk solution and the Gibbs surface phase greatly diminishes the effect one would predict by applying macroscopic surface tensions in Köhler theory (Li et al., 1998; Rood and Williams, 2001; Sorjamaa et al., 2004; Prisle et al., 2011). Neglecting to account for reduced surface tension and using water activity to estimate CCN activity results in an underestimate of κ by ∼ 30 % for the strong surfactant sodium dodecyl sulfate (Petters and Kreidenweis, 2013). We note that estimates of surface tension reduction for pure organic liquids can be obtained from critical pressure and boiling point (Sastri and Rao, 1995) and the Sprow and Prausnitz (1966) expression coupled with UNIFAC activity coefficients (Topping et al., 2005; Rafati et al., 2011). Combined with predictions of critical properties from functional group data (Joback and Reid, 1987), predicted binary surface tensions could be obtained for each compound. Including surfactant partitioning in Eq. (4) is possible using the expressions in Petters and Kreidenweis (2013) or similar approaches (Sorjamaa et al., 2004; Raatikainen and Laaksonen, 2011). Thorough validation against experimental data, including measurements of surface tension and CCN activity, is needed before this approach should be adopted.

Relationship to other thermodynamic models and application to multicomponent systems
The basic model functionality described here can also be obtained by appropriately initializing other multicomponent equilibrium models (Ming and Russell, 2002; Raatikainen and Laaksonen, 2005; Topping et al., 2005; Clegg and Seinfeld, 2006; Amundson et al., 2007; Zuend et al., 2008) with a set of binary water/organic solutions, parsing the output through a phase equilibrium module (if not included in the thermodynamic model itself) and the Köhler model. The predicted CCN activity mostly depends on the underlying set of group interaction parameters. The output should match the solution presented here if the same interaction parameter matrix is used. The main conceptual distinction between the approach proposed here and the approach employed by the more complex multicomponent models is our focus on predictions for binary organic/water solutions and limitation of the scope to a narrow range of water activities relevant to CCN activation only. Accurate representation of hygroscopic growth at $a_w \lesssim 0.99$ is not required and would be of secondary concern when tuning interaction parameters. We envision that the proposed specialized model approach can be used to categorize individual compounds into three miscibility regimes, analogous to the solubility regimes defined in Petters and Kreidenweis (2008). Regime I: the compound is CCN inactive and can be effectively modeled as κ = 0. Regime II: the compound is CCN active without any additional phase constraints. In turn κ is mostly determined by molar volume and slightly modulated by activity coefficients. Regime III: the compound's CCN activity is limited due to miscibility constraints. In turn κ is highly sensitive to overall water content and can either have κ ∼ 0 or express κ according to its molar volume. Once pure component κ's are predicted and stored in a database, the overall organic aerosol (OA) κ in mixed particles can be calculated quickly using the volume-weighted mixing rule (Petters and Kreidenweis, 2007). This compound-by-compound treatment of multicomponent mixtures assumes that solute-solute interactions are negligible. Salting-in and salting-out effects are not captured. Effective κ values for compounds falling into the limited miscibility regime may be misrepresented in this treatment. Whether such effects are important will depend on the fraction of compounds in a mixture that fall into the limited miscibility regime and on whether the proposed approach of intermediate complexity (modeling binary solutions coupled with a linear mixing rule) ultimately proves sufficiently accurate to model the evolution of ambient OA. In the following we use experimental data to demonstrate that the outlined UNIFAC model is suitable to categorize compounds into these three regimes.
www.geosci-model-dev.net/9/111/2016/ Geosci. Model Dev., 9, 111-124, 2016
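The volume-weighted mixing rule of Petters and Kreidenweis (2007) amounts to a weighted sum of the pure-component κ values. A minimal sketch follows; the function name `mixed_kappa` and the example values are hypothetical, with κ = 0.6 chosen only as an illustrative value for a hygroscopic salt.

```python
def mixed_kappa(volume_fractions, kappas):
    """Overall kappa of an internally mixed particle from the dry volume
    fractions and pure-component kappa values (linear mixing rule)."""
    if len(volume_fractions) != len(kappas):
        raise ValueError("volume_fractions and kappas must have equal length")
    if abs(sum(volume_fractions) - 1.0) > 1e-9:
        raise ValueError("volume fractions must sum to 1")
    return sum(eps * k for eps, k in zip(volume_fractions, kappas))


# A CCN-inactive organic (kappa = 0) containing 1 % by volume of a
# hygroscopic component with kappa = 0.6 (illustrative value).
kappa_mix = mixed_kappa([0.99, 0.01], [0.0, 0.6])
print(f"mixed kappa = {kappa_mix:.4f}")
```

The example also shows why trace amounts of a hygroscopic component matter most for compounds whose own κ is very small.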

Results and discussion
Experimental data for validation were compiled from the literature. A detailed summary of the compound names, chemical structures, physicochemical properties, CCN observations, and observed $\kappa_{app}$ values is provided in the Supplement (Tables S3-S7). This set features compounds with mostly linear carbon backbones C4 to C18 and O : C ratio between 0.1 and 1. The data are grouped into model compounds for primary organic aerosol (POA; Table S3), functionalized hydroperoxy ethers (Table S4), hydroxy nitrates (Table S5), carboxylic acids (Table S6), and carbohydrates (Table S7). Compounds included in Table S3 are taken from Raymond and Pandis (2002) and Shilling et al. (2007). Data in Tables S4 and S5 are taken from the Supplement of Suda et al. (2014). Data in Tables S6 and S7 are from various sources and are summarized in the Supplement of Petters et al. (2009b), which was updated with new compounds from Christensen and Petters (2012), and data were re-screened for quality. The compounds were selected to provide systematic variation in the number and type of functional groups with otherwise similar structure, i.e., linear or weakly branched alkane backbone with variable carbon chain length.
To illustrate model initialization and model output, two example compounds from the Supplement, C12 dihydroxy nitrate and C13 trihydroxy nitrate, are presented in Table 1. For some of the compounds density and solubility data are available and those data are included in the Supplement. Table 1 shows how the molecular structure is decomposed into the subgroups understood by the UNIFAC and Girolami (1994) model framework. Detailed model output for the two example compounds is illustrated in Fig. 1. The predicted mole fraction dependence of $g_{mix}$ suggests that the C13 trihydroxy nitrate is miscible with water in all proportions while the C12 dihydroxy nitrate is not. The dashed black line connecting $x_a$ and $x_b$ encloses the maximum positive area with the $g_{mix}$ line and defines the two-phase region. Water activity derived from $g_{mix}$ is graphed in the middle panel. It shows that the miscibility gap for the C12 dihydroxy nitrate occurs at water activity close to unity. Phase gaps at water activity near unity may result in miscibility-controlled cloud droplet activation (Petters et al., 2006), which is analogous to solubility-/deliquescence-limited cloud droplet activation (Shulman et al., 1996; Hori et al., 2003; Bilde and Svenningsson, 2004; Kreidenweis et al., 2006; Petters and Kreidenweis, 2008). Köhler curves in the right panel demonstrate miscibility-limited activation behavior. For the C13 trihydroxy nitrate, the Köhler curve is smooth and exhibits a single maximum corresponding to the model critical supersaturation. For the C12 dihydroxy nitrate two maxima appear. The first maximum corresponds to the point of incipient phase separation $x_a$. The height of the miscibility barrier depends on the dry diameter. For large dry particles where the Kelvin term does not play a significant role, the supersaturation of point $x_a$ is reduced and the second classical Köhler maximum will control droplet activation. Similar complex Köhler curves have been reported previously (e.g., Bilde and Svenningsson, 2004; Petters and Kreidenweis, 2008). Experiments with pure crystalline sparingly soluble organic compounds have demonstrated convincingly that the larger maximum indeed controls cloud droplet activation for solubility-limited cases (Hori et al., 2003; Bilde and Svenningsson, 2004; Hings et al., 2008). The $s_c$ vs. $D_d$ relationship for phase-controlled activation does not result in a $\kappa_{app}$ that is independent of $D_d$ (Petters and Kreidenweis, 2008). Therefore, for compounds having κ ≲ 0.06 where phase separation might play a role, the observed $s_c$, $D_d$ pair is included in the data tables (Tables 1, S3-S7) and κ values are computed from the observation and the model (Eq. 12) at the same $D_d$. Note that the $D_d$-dependent κ only plays a role in a narrow range of miscibilities. Sufficiently soluble and truly insoluble substances are not affected. In summary, Table 1 and Fig
Table 1. Properties for two example chemical compounds. UNIFAC representation indicates the number and type of subgroups used to represent the chemical structure; MW denotes molecular weight (g mol$^{-1}$) and $v_s$ denotes the model-predicted molar volume (cm$^3$ mol$^{-1}$). CCN reflects the observed supersaturation and dry diameter data pair obtained from the source (Suda et al., 2014)
Figure 1. [...] Open circles denote the mole fractions $x_a$ and $x_b$ that correspond to the envelope of compositions where liquid-liquid phase separation is predicted for the C12 dihydroxy nitrate.
How well do data-derived and model-derived $\kappa_{app}$ compare? For numerical comparison both κ's are included in Tables S3-S7. A graphical illustration of these is presented in Fig. 2. To improve clarity, compounds with observed and modeled κ < 0.001 are clustered in the lower left corner. Such low κ's correspond to compounds that are effectively CCN inactive. The range between κ = $10^{-3}$ and $10^{-5}$ spans a narrow range in the $s_c$-$D_d$-κ state space that characterizes CCN activity (cf. Fig. 1 in Petters and Kreidenweis, 2007). Resolving these differences is not particularly meaningful for organic dominated particles that typically have $D_d$ < 300 nm. Furthermore, the κ of an internally mixed particle is approximately the volume-fraction-weighted average of the component κ's. For κ < $10^{-3}$ the contribution to a mixed particle's κ is insensitive to the exact value. Finally, although state-of-the-science size-resolved CCN measurements can resolve differences in κ < $10^{-3}$, compound impurities can interfere. A 1 % impurity having κ similar to ammonium sulfate (κ ≈ 0.6) would contribute ∼ 0.006 to a measured particle κ. In addition, solvent residuals (Huff Hartz et al., 2006; Shilling et al., 2007; Rissman et al., 2007) and control over the dry particle phase state (Raymond and Pandis, 2002; Hori et al., 2003; Broekhuizen et al., 2004; Bilde and Svenningsson, 2004) can disproportionately bias the characterization of low κ's. Combined, these points justify the definition of κ < 0.001 as effectively CCN inactive. Compounds in the CCN inactive corner include all compounds from Table S3, the C14 and C15 hydroxynitrate, and the C14 trinitrate. These compounds all have 11 or more methylene groups and O : C ratios between 0.11 and 0.65. CCN activity of these compounds is satisfactorily predicted by the model. Nine compounds are predicted to be CCN inactive but have measurements indicating $0.001 < \kappa_{obs} \lesssim 0.03$. These are graphed below the dashed line and include C14 di- and tetra-nitrate, C13 hydroxy nitrate, C14 and C15 dihydroxy nitrate, the remaining hydroperoxide ethers from Table S4, and cis-pinonic acid. The observed κ values of the C14 di- and tetra-nitrate are barely larger than the cutoff for CCN inactive. Variation of κ between the C14 di-, tri- and tetra-nitrate (cf. Fig. 2 in Suda et al., 2014) implies that the trinitrate has lower κ than the di- and tetra-nitrate, which suggests that some random variability in the data is superimposed on the trend. Similarly, the observations show that the C14 and C15 dihydroxy nitrate are slightly more CCN active than the C13 dihydroxy nitrate. Although this is possible, such behavior is not expected given the well-established hydrophobic nature of the added CHx groups. One possible explanation for the discrepancies is the sensitivity of observed κ's to trace contamination. Each of the compounds was purified via high-performance liquid chromatography (HPLC; Suda et al., 2014) but the degree of purification likely varied between compounds. Furthermore, experimental uncertainty for the HPLC-CCN method used is slightly larger than for standard methods since it requires application of fast-flow scans. Finally, the data are from a single set of experiments. More data are needed before attributing the mismatch to either model or measurement error.
Figure 2 legend: carboxylic acids (Table S6), hydroxynitrates (Table S5), carbohydrates (Table S7), hydroperoxide ethers (Table S4), POA model compounds (Table S3).

Another notable outlier is adipic acid. Here, the observed κ < 0.01 corresponds to the solubility-limited value that is referenced against its solid crystalline phase state. In contrast, the predicted value κ = 0.14 is in good agreement with the molar volume prediction (κ = 0.17; cf. Fig. 4 in Christensen and Petters, 2012) and the observed κ that adipic acid particles express when solubility limitations are removed (cf. Fig. 1 in Hings et al., 2008). This scenario was selected to illustrate the inability of the UNIFAC model to treat solid phases. It therefore cannot capture deliquescence and deliquescence-/solubility-limited activation. In atmospheric OA multiple organic compounds likely form an amorphous supercooled melt (Marcolli et al., 2004) and metastable aqueous solutions are ubiquitous (Rood et al., 1989). Thus the metastable prediction would be valid to account for adipic acid in the context of atmospheric OA.
A series of carboxylic acids and carbohydrates cluster near the 1 : 1 line at κ > ∼0.06. These compounds are generally highly functionalized, having at least two carboxyl, hydroxyl, or carbonyl groups for every four carbon atoms. The O : C ratio always exceeds 0.5 and is close to 1 for many of the compounds. For the predictions, activity coefficients approach unity, compounds are miscible in water in all proportions, and model κ's closely track the prediction based on estimated molar volume. Overall, predicted and observed κ agree approximately within a factor of 2, and this range is similar to that of predictions based on actual molar volume (cf. Fig. 2 in Petters et al., 2009b).
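The evaluation criteria used here (a CCN-inactive cutoff of κ < 0.001 and agreement within a factor of 2) can be written down as a small sketch. The helper names are hypothetical and the logic is an assumed reading of the criteria, not the authors' evaluation script. The example values are the adipic acid numbers quoted in the text, with the observed κ < 0.01 treated as an upper bound.

```python
# Minimal sketch of the evaluation criteria (hypothetical helper names).
CCN_INACTIVE_CUTOFF = 1e-3  # kappa below this is classified as CCN inactive


def is_ccn_inactive(kappa):
    """Classify a kappa value as effectively CCN inactive."""
    return kappa < CCN_INACTIVE_CUTOFF


def agrees(kappa_pred, kappa_obs, factor=2.0):
    """True if both values are CCN inactive, or both are active and
    their ratio lies within [1/factor, factor]."""
    pred_inactive = is_ccn_inactive(kappa_pred)
    obs_inactive = is_ccn_inactive(kappa_obs)
    if pred_inactive or obs_inactive:
        return pred_inactive and obs_inactive
    ratio = kappa_pred / kappa_obs
    return 1.0 / factor <= ratio <= factor


# Adipic acid: UNIFAC prediction vs. molar volume prediction (values from the text).
print(agrees(0.14, 0.17))
# Adipic acid: UNIFAC prediction vs. solubility-limited observation (upper bound).
print(agrees(0.14, 0.01))
```

Treating the two CCN-inactive cases as agreement regardless of their ratio reflects the fact that values below the cutoff are clustered together rather than compared quantitatively.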
The series of hydroxy nitrates, dihydroxy nitrates, and trihydroxy nitrates for different carbon chain lengths also clusters near the 1 : 1 line. The spread is within approximately a factor of 2 and similar to that of the carboxylic acids and carbohydrates. These compounds span the entire range from κ < 0.001 to κ ∼ 0.1 and have as few as two hydroxyl and one nitrate group per 13 carbon atoms (C13 dihydroxy nitrate). The model appears to accurately predict the influence of the methylene and hydroxyl groups on the transition from immiscible and CCN inactive to sufficiently miscible and CCN active according to the molar volume of the compound. For the C11, C12, and C13 dihydroxy nitrates, the predicted miscibility-limited activation demonstrated in Fig. 1 seems to adequately explain the transition. The accurate model prediction of this sensitive transition regime is encouraging, especially since no adjustment was made to the a_mn group interaction parameters for the [OH], [CHx], and [H2O] groups.
In summary, Fig. 2 demonstrates four capabilities of the model. First, the model has good skill in correctly classifying effectively CCN inactive compounds (κ < 0.001). Second, the model captures the molar volume-dependent activation of highly functionalized compounds (low molecular weight dicarboxylic acids and polysaccharides). Scatter between predicted and observed κ is approximately within a factor of 2 and is considered acceptable given the considerable diversity in the underlying CCN data. We note that uncertainties in molar volume estimation of v_s ∼ ±10 % stemming from the Girolami (1994) method correspond to ±10 % error in predicted κ for these compounds, which is significantly less than the observed scatter in the data (Petters et al., 2009b). Third, the model predicts that miscibility limitations are the cause of the poor CCN activity of weakly functionalized hydrocarbons, and the phase separation information can be used to quantitatively predict the transition between sufficiently miscible and effectively immiscible species. Finally, the model seems to accurately capture the main functional group dependencies observed previously (Suda et al., 2014): a strong promoting effect of hydroxyl, a weak promoting effect of hydroperoxides, a negligible or inhibiting effect of nitrate, and an inhibiting effect of methylene groups on CCN activity. How, then, can one quantify the model sensitivity of κ to the addition of functional groups to otherwise similar molecules?
Simulation of homologous series can be used to derive these sensitivities. Figure 3 shows modeled κ's for a series of functionalized n-alkanes. The gradual decreasing trend of κ with increasing carbon number is due to the increase in molar volume. A steep decline is observed when a critical carbon number is exceeded. Beyond this point the additional methylene groups reduce the miscibility with water and render the compound effectively CCN inactive. For example, CCN activity for a C16 trihydroxy alkane is controlled mostly by molar volume, while a C18 trihydroxy alkane is effectively CCN inactive. The critical carbon number is C7, C12, C16, C20, and C24 for the mono-, di-, tri-, tetra-, and penta-hydroxy alkanes, respectively. Starting with an n-alkane, the most dramatic effect of adding functional groups is to render the molecule miscible with water. Contrasting the critical carbon number for different homologous series can be used as a measure of a particular group's ability to transform the molecule such that it is sufficiently miscible in water and can express its molar volume κ. The hydroxy alkane series shows that approximately one hydroxyl group is needed to compensate for the addition of four methylene groups (i.e., to maintain miscibility at the composition of the critical carbon number), expressed as a ratio, [CHn] / [OH] ∼ −4/1. Similar ratios for the other groups are derived from the shifts in the dihydroxy alkane series upon further functionalization.
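The hydroxyl ratio can be recovered directly from the critical carbon numbers listed above. The following sketch uses only those five numbers; the variable names are illustrative and the simple differencing is an assumed way of reading the series, not necessarily how the ratio was derived in the original analysis.

```python
# Critical carbon numbers for mono- to penta-hydroxy alkanes (from the text).
n_hydroxyl = [1, 2, 3, 4, 5]
critical_carbon = [7, 12, 16, 20, 24]

# Extra methylene groups tolerated per additional hydroxyl group.
steps = [hi - lo for lo, hi in zip(critical_carbon, critical_carbon[1:])]
mean_step = sum(steps) / len(steps)

print("steps per added OH:", steps)
print(f"mean step: {mean_step:.2f} CHn per OH")
```

The steps are close to four methylene groups per hydroxyl group across the series, consistent with the ratio [CHn] / [OH] ∼ −4/1.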
This leads to a sorting of relative effectiveness of the groups in promoting miscibility, hydroxyl (−4) > acid (−2.5) > aldehyde (−2) > hydroperoxide (−1) > carbonyl (−0.66) > ether (−0.5) > nitrate (0.66), where the number in parentheses corresponds to the [...] NO3 radicals (Suda et al., 2014, their Supplement), and the known low miscibility of organic nitrates in water (Boschan et al., 1955). Furthermore, sorting of the different functional groups is qualitatively consistent with the sensitivity of κ to the addition of functional groups derived from CCN data (Table S5, Suda et al., 2014).
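The ranking above can be tabulated with a short sketch. The ratios are those quoted in the text. The estimated critical carbon number for a dihydroxy alkane carrying one additional group is a rough linear reading (12 minus the ratio), introduced here as an assumption for illustration; it is not model output, and it ignores whether the added group itself contributes a carbon atom.

```python
# [CHn]/[group] ratios quoted in the text; negative values promote miscibility.
RATIOS = {
    "hydroxyl": -4.0,
    "acid": -2.5,
    "aldehyde": -2.0,
    "hydroperoxide": -1.0,
    "carbonyl": -0.66,
    "ether": -0.5,
    "nitrate": 0.66,
}

# Most effective promoter first (most negative ratio).
ranked = sorted(RATIOS, key=RATIOS.get)

# Assumed linear reading: shift of the dihydroxy alkane critical carbon number.
C_CRIT_DIHYDROXY = 12
shifted = {group: C_CRIT_DIHYDROXY - ratio for group, ratio in RATIOS.items()}

for group in ranked:
    print(f"{group:14s} ratio = {RATIOS[group]:+.2f}  "
          f"estimated critical C ~ {shifted[group]:.1f}")
```

For hydroxyl the linear reading returns 16, matching the critical carbon number of the trihydroxy alkane series, which serves as a consistency check on the sign convention.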

Treatment of OA evolution in the atmosphere
The model is computationally slow because the entire range of mole fractions must be evaluated in order to determine the phase boundaries. Improvement in model execution speed is likely possible via algorithm optimization. Furthermore, parallel execution of the code is possible. With a regular workstation it is feasible to perform offline computation of ∼10^6 κ's for a large set of compounds produced by the Generator of Explicit Chemistry and Kinetics of Organics in the Atmosphere (GECKO-A) or similar models. Once pure component κ's are predicted, the evolution of the overall OA κ in mixed particles can be calculated quickly using the linear mixing rule (Petters and Kreidenweis, 2007), subject to the limitations of this approach discussed in Sect. 2. One additional limitation is the need for algorithms that automatically map the computer-generated simplified molecular-input line-entry system (SMILES) structures (e.g., Table 3 in Lee-Taylor et al., 2015) to UNIFAC groups. Several of these structures are bridged, and even manual mapping of those structures to UNIFAC groupings will necessitate definition of new groups with unknown volume, surface, and interaction parameters. Separate studies are needed to establish the minimal number of new groups that would be needed to obtain optimal coverage for the set of compounds of interest.
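The linear mixing rule referred to above weights the pure component κ's by their dry volume fractions. A minimal sketch follows; the function name and the three-component example values are hypothetical and serve only to show the arithmetic.

```python
def kappa_mixture(volumes, kappas):
    """Volume-weighted mean kappa: sum_i eps_i * kappa_i, where eps_i is the
    dry volume fraction of component i. Volumes may be in any common unit."""
    if len(volumes) != len(kappas):
        raise ValueError("volumes and kappas must have the same length")
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total dry volume must be positive")
    return sum(v / total * k for v, k in zip(volumes, kappas))


# Hypothetical three-component particle: one CCN-inactive and two active species.
volumes = [0.5, 0.3, 0.2]   # relative dry volumes (assumed)
kappas = [0.0, 0.1, 0.2]    # pure component kappas (assumed)
print(f"mixture kappa = {kappa_mixture(volumes, kappas):.3f}")
```

Because the rule is a single weighted sum, it can be applied at every time step of a composition model once the pure component κ's have been precomputed offline.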

Summary and conclusions
This paper describes how functional group contribution methods can be used to estimate the CCN activity of pure organic compounds. Group interaction parameters were taken from a mix of sources and used without further tuning. Model fidelity was evaluated against a database of published CCN data. Weakly functionalized alkanes are correctly classified as effectively CCN inactive (defined as κ < 0.001). Highly functionalized and water-soluble molecules are predicted to activate in accordance with the estimated molar volume, and predictions generally agree with observations within a factor of 2. Liquid-liquid phase separation is predicted to occur for compounds with few functional groups, and in these cases phase separation is predicted to control κ. The model adequately reproduces the observation that hydroxyl groups strongly promote CCN activity while nitrate groups inhibit CCN activity. A few outliers in the model evaluation may be explained by the combination of CCN measurement uncertainty, compound purity, uncertainty in dry particle phase state, and insufficiently tuned group interaction parameters. However, more systematic data on weakly functionalized compounds, including repeat studies, are needed before a retuning of parameters is justified. The model makes new predictions about the relative effectiveness of the groups in promoting miscibility. Most notably, it predicts that hydroperoxides have much less of an effect than hydroxyl groups, which is slightly surprising since one would expect the hydrogen bonding to be similar. The model state space can serve as a rough guide to define test conditions to quantify via experiment the effectiveness of adding one or more functional groups to a carbon backbone.
Although this work is limited to a few functional groups, the presented framework is general since interaction parameters are available for a wide range of groups. For atmospheric purposes, amines, olefins, and aromatic compounds are the most relevant groups that need to be added. Few, if any, systematic CCN data for these groups are available. However, the success of the current model in estimating κ without the need to tune parameters could be taken as an indication that first-order predictions can be obtained until such data become available.

Code availability
Source code and example scripts demonstrating model initialization for the compounds presented in this study are available as Supplement to this manuscript.
The Supplement related to this article is available online at doi:10.5194/gmd-9-111-2016-supplement.
. 1 demonstrate model input, illustrate model mechanics, and identify model outputs.

Figure 2. Model predicted vs. experimentally determined κ values. Values κ < 0.001 are classified as CCN inactive and are clustered in the lower left corner of the graph. Colors are used to delineate the grouped source data in the Supplement. Selected structures from the Supplement are included in the graph. CxHN, CxDHN, and CxTHN denote hydroxy nitrate, dihydroxy nitrate, and trihydroxy nitrate, and x denotes the total number of carbon atoms. C14DiN, C14TriN, and C14TetraN denote the C14 dinitrate, trinitrate, and tetranitrate, respectively. Points below the dashed line correspond to compounds with predicted κ < 0.001 and observed κ > 0.001. The typical range of observed κ_CCN for peroxides is indicated by the horizontal bar.

Figure 3. Modeled κ values for homologous series of functionalized n-alkanes. Solid lines correspond to alkanes with 1-5 non-terminal hydroxyl groups. Orange dashed lines correspond to further functionalized dihydroxy alkanes as described in the legend. Colored carbon numbers (C7, C12, C16, C20, and C24) correspond to the largest carbon number without miscibility-limited activation for the respective hydroxy alkane series.
within UNIFAC are represented as main groups and subgroups. The main groups evaluated in this work are alkane [CHn], alcohol [OH], water [H2O], carbonyl [CHnC(=O)], aldehyde [HC(=O)], ether [CHn(O)], carboxyl [C(=O)OH], nitrate [CHnONO2], and hydroperoxide [CHn(OOH)]. Interaction parameters a_mn between the main groups that are used in this work are tabulated in Table

Table S3
from which observed κ was determined.