Influence of bulk microphysics schemes upon Weather Research and Forecasting (WRF) version 3.6.1 nor’easter simulations

This study evaluated the impact of five single- or double-moment bulk microphysics schemes (BMPSs) on Weather Research and Forecasting model (WRF) simulations of seven intense wintertime cyclones impacting the mid-Atlantic United States; 5-day-long WRF simulations were initialized roughly 24 h prior to the onset of coastal cyclogenesis off the North Carolina coastline. In all, 35 model simulations (five BMPSs and seven cases) were run and their associated microphysics-related storm properties (hydrometeor mixing ratios, precipitation, and radar reflectivity) were evaluated against model analysis and available gridded radar and ground-based precipitation products. Inter-BMPS comparisons of column-integrated mixing ratios and mixing ratio profiles reveal little variability in non-frozen hydrometeor species due to their shared programming heritage, yet their assumptions concerning snow and graupel intercepts, ice supersaturation, snow and graupel density maps, and terminal velocities led to considerable variability in both simulated frozen hydrometeor species and radar reflectivity. WRF-simulated precipitation fields exhibit minor spatiotemporal variability amongst BMPSs, yet their spatial extent is largely conserved. Compared to ground-based precipitation data, WRF simulations demonstrate low-to-moderate (0.217–0.414) threat scores and a rainfall distribution shifted toward higher values. Finally, an analysis of WRF and gridded radar reflectivity data via contoured frequency with altitude diagrams (CFADs) reveals notable variability amongst BMPSs, where better performing schemes favored lower graupel mixing ratios and better underlying aggregation assumptions.


Introduction
Bulk microphysical parameterization schemes (BMPSs) within modern numerical weather prediction models (e.g., the Weather Research and Forecasting model, WRF; Skamarock et al., 2008) have become increasingly complex and computationally expensive. Presently, WRF offers BMPS options varying from simplistic, warm-rain physics (Kessler, 1969) to multi-phase, six-class, two-moment microphysics (Morrison et al., 2009). Microphysics and cumulus parameterizations drive cloud and precipitation processes within WRF and similar models, which has consequences for radiation, moisture, aerosols, and other simulated meteorological processes. Tao et al. (2011) highlighted the importance of BMPSs in models by summarizing more than 36 published, microphysics-focused studies ranging from idealized simulations to hurricanes to mid-latitude convection. More recently, the observation-based studies of Stark (2012) and Ganetis and Colle (2015) investigated microphysical species variability within United States (USA) east coast wintertime cyclones (locally called "nor'easters") and called for further investigation into how BMPSs impact these cyclones, which is the motivation behind this nor'easter study.
A nor'easter is a large (∼ 2000 km), mid-latitude cyclone occurring from October to April and is capable of bringing punishing winds, copious precipitation, and potential coastal flooding to the northeastern USA (Kocin and Uccellini, 2004; Jacobs et al., 2005; Ashton et al., 2008). This region is home to over 65 million people and produces USD 16 billion of daily economic output (Morath, 2016). Given its high economic output, nor'easter-related damages and disruptions can be extreme. Just 10 strong December nor'easters between 1980 and 2011 produced USD 29.3 billion in associated damages (Smith and Katz, 2013).
Published by Copernicus Publications on behalf of the European Geosciences Union.
Recent nor'easter studies are scarce compared to the extensive research efforts of the 1980s. These historical studies addressed key environmental drivers including frontogenesis and baroclinicity (Bosart, 1981; Forbes et al., 1987; Stauffer and Warner, 1987), anticyclones (Uccellini and Kocin, 1987), latent heat release (Uccellini et al., 1987), and moisture transport by the low-level jet (Uccellini and Kocin, 1987; Mailhot and Chouinard, 1989). Despite extensive observational analyses, little attention has been given to the role of BMPSs in mid-latitude winter cyclones. Reisner et al. (1998) ran several Mesoscale Model version 5 simulations, with multiple BMPS options, of winter storms that impacted the Colorado Front Range during the Winter Icing and Storms Project. Double-moment BMPSs produced more accurate simulations of super-cooled water and ice mixing ratios than single-moment BMPSs. However, single-moment BMPS-based simulations vastly improved when the snow size distribution intercepts were derived from a diagnostic equation rather than from a fixed value. Wu and Petty (2010) investigated how five six-class BMPSs affected WRF simulations of four polar-low events (two over Japan, two over the Nordic Sea). Their simulations yielded nearly identical storm tracks, but notable cloud top temperature and precipitation errors. Overall, the WRF single-moment BMPS (Hong and Lim, 2006) produced marginally better cloud and precipitation process simulations than those from other BMPSs. For warmer, tropical cyclones, Tao et al. (2011) investigated how four six-class BMPSs impacted WRF simulations of Hurricane Katrina. They found BMPS choice minimally impacted storm track, yet sea-level pressure varied by up to 50 hPa. Shi et al. (2010) evaluated several WRF single-moment BMPSs during a lake-effect snow event. Validation of simulated radar reflectivity and cloud-top temperature revealed that WRF accurately simulated the onset, termination, cloud cover, and band extent of the lake-effect snow event; however, snowfall totals at fixed points were less accurate due to interpolation of the mesoscale grid. Inter-BMPS simulation differences were small because low temperatures and weak vertical velocities prevented graupel generation. Reeves and Dawson (2013) investigated WRF sensitivity to eight BMPSs during a December 2009 lake-effect snow event. Simulated precipitation rates and snowfall coverage were particularly sensitive to BMPSs because vertical velocities exceeded hydrometeor terminal fall speeds in half of their simulations. Vertical velocity differences were attributed to varying BMPS frozen hydrometeor assumptions concerning snow density values, temperature-dependent snow intercepts, and graupel generation terms.
This study will evaluate WRF nor'easter simulations and their sensitivity to six- and seven-class BMPSs with a focus on microphysical properties and precipitation. The remainder of this paper is divided into three sections. Section 2 explains the methodology and analysis methods. Section 3 shows the results. Finally, Sect. 4 describes the conclusions, their implications, and prospects for future research.
This study investigates the seven nor'easter cases described in Table 1 and shown in Fig. 1. These cases are identical to those in Nicholls and Decker (2015) and represent a small, diverse sample of nor'easter events of varying intensity and seasonal timing. In Table 1, the Northeast Snowfall Impact Scale (NESIS) value serves as a proxy for storm severity (1 = notable, 5 = extreme) and is based upon storm duration, population impacted, area affected, and snowfall severity (Kocin and Uccellini, 2004). Early and late season storms (cases 1, 2, and 7) did not have snow and thus lack a NESIS rating. Furthermore, 5-day WRF model simulations for this study were initialized 24 h prior to the first precipitation impacts in the highly populated mid-Atlantic region and prior to the onset of rapid, coastal cyclogenesis off the North Carolina coastline. This starting point provides sufficient time to establish mesoscale circulations, surface baroclinic zones, and sensible and latent heat fluxes (Bosart, 1981; Uccellini and Kocin, 1987; Kuo et al., 1991; Mote et al., 1997; Kocin and Uccellini, 2004; Yao et al., 2008; Kleczek et al., 2014).
The first nor'easter-associated precipitation impacts are defined as the first 0.5 mm (∼ 0.02 inch) precipitation reading from the New Jersey Weather and Climate Network (Robinson, 2005) related to the cyclone. A smaller threshold was not used to avoid capturing isolated showers occurring well ahead of the primary precipitation shield.
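The onset rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(datetime, mm)` reading format and the function names are hypothetical, the readings are assumed to be already restricted to cyclone-related reports, and "first 0.5 mm reading" is interpreted as the first reading at or above 0.5 mm.

```python
from datetime import datetime, timedelta

ONSET_THRESHOLD_MM = 0.5  # first-impact threshold stated in the text
LEAD_TIME_H = 24          # initialization lead time stated in the text


def first_impact_time(readings, threshold_mm=ONSET_THRESHOLD_MM):
    """Return the time of the earliest reading at or above the threshold.

    readings: iterable of (datetime, precipitation_mm) pairs (hypothetical
    format), assumed to contain only cyclone-related station reports.
    Returns None when no reading reaches the threshold.
    """
    for time, amount_mm in sorted(readings):
        if amount_mm >= threshold_mm:
            return time
    return None


def initialization_time(readings, lead_time_h=LEAD_TIME_H):
    """Return the model start time: lead_time_h hours before first impact."""
    onset = first_impact_time(readings)
    if onset is None:
        return None
    return onset - timedelta(hours=lead_time_h)


# Illustrative (made-up) station readings.
example = [
    (datetime(2010, 2, 5, 12), 0.2),
    (datetime(2010, 2, 5, 15), 0.6),
    (datetime(2010, 2, 5, 13), 0.0),
]
start = initialization_time(example)
```

Readings below the threshold (here 0.2 mm) are skipped, which mirrors the stated aim of ignoring isolated showers ahead of the primary precipitation shield.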

Evaluation and analysis techniques
Model evaluation efforts involved comparing WRF model output to GFS model operational analysis (GMA), Stage IV precipitation (StIV; Fulton et al., 1998; Lin and Mitchell, 2005), and Multi-Radar, Multi-Sensor (MRMS) three-dimensional (3-D) volume radar reflectivity (Zhang et al., 2016). GMA offers six-hourly, gridded dynamical fields, including water vapor, with global coverage. StIV is a six-hourly, 4 km resolution, gridded, combined radar and rain gauge precipitation product covering the USA. Finally, MRMS is a 2 min, 1.3 km resolution, gridded 3-D volume radar mosaic product derived from S- and C-band radars covering the USA and southern Canada (Zhang et al., 2016), and it is the operational successor to the National Mosaic and Multi-Sensor QPE (Quantitative Precipitation Estimation) (NMQ; Zhang et al., 2011) product. Both StIV and MRMS, however, are limited by the detection range of their surface-based assets. All cross comparisons between WRF and these evaluation data were conducted at identical grid resolution.
Analysis of WRF model microphysical, precipitation, and simulated radar output comprised three main parts: precipitable mixing ratios and domain-averaged mixing ratio profiles, simulated precipitation, and simulated radar reflectivity. Precipitable mixing ratios are calculated for all six microphysical species (vapor, cloud ice, cloud water, snow, rain, and graupel) using the equation for precipitable water:
$$\mathrm{PMR} = \frac{1}{\rho g} \int_{p_{\mathrm{top}}}^{p_{\mathrm{sfc}}} w \, \mathrm{d}p \qquad (1)$$
In Eq. (1), PMR is the precipitable mixing ratio in mm, ρ is the density of water (1000 kg m−3), g is the gravitational constant (9.8 m s−2), p_sfc is the surface pressure (Pa), p_top is the model top pressure (Pa), w is the mixing ratio (kg kg−1), and dp is the change in atmospheric pressure between model levels (Pa). Only water vapor PMRs are evaluated against GMA because GMA does not provide mixing ratios for the other species, and ground and space validation microphysical data are lacking, especially over the data-poor North Atlantic (Li et al., 2008; Lebsock and Su, 2014). Similarly, mixing ratio profiles will only be inter-compared amongst BMPSs because satellite-derived cloud-ice profile products (e.g., CloudSat 2C-ICE; Deng et al., 2013) do not directly overpass domain 4 during coastal cyclogenesis for any case. WRF-simulated precipitation fields and their distribution were evaluated against StIV, and simulation error was quantified via bias and threat score (critical success index; Wilks, 2011) values. Finally, contoured frequency with altitude diagrams (CFADs) were used to validate WRF-simulated radar reflectivity relative to MRMS, similar to the radar validation efforts of Yuter and Houze (1995) and Lang et al. (2011, 2014). A CFAD offers the advantage of preserving frequency distribution information, yet is insensitive to spatiotemporal errors. Additionally, CFAD-based scores were calculated for each height level and with time using Eq. (2).
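As a rough illustration of the column integration described for Eq. (1), the sketch below assumes the standard precipitable-water form, PMR = (1/ρg) ∫ w dp, evaluated with the trapezoidal rule between model levels. The function name, the list-based inputs, and the explicit factor of 1000 converting metres to millimetres are assumptions made for this example rather than details taken from the study.

```python
RHO_WATER = 1000.0  # density of water (kg m^-3), as given in the text
G = 9.8             # gravitational constant (m s^-2), as given in the text


def precipitable_mixing_ratio_mm(pressure_pa, mixing_ratio):
    """Integrate w dp from model top to surface and return a depth in mm.

    pressure_pa:  pressure at each model level (Pa), in any order
    mixing_ratio: species mixing ratio at each level (kg kg^-1)
    """
    if len(pressure_pa) != len(mixing_ratio):
        raise ValueError("pressure and mixing ratio need the same length")
    # Sort by pressure so the integral runs from p_top up to p_sfc.
    levels = sorted(zip(pressure_pa, mixing_ratio))
    integral = 0.0
    for (p_low, w_low), (p_high, w_high) in zip(levels, levels[1:]):
        # Trapezoidal rule: layer-mean mixing ratio times layer thickness dp.
        integral += 0.5 * (w_low + w_high) * (p_high - p_low)
    depth_m = integral / (RHO_WATER * G)  # SI units yield metres of water
    return depth_m * 1000.0               # assumed conversion from m to mm


# Idealized column: constant mixing ratio between 0 and 1000 hPa.
pmr = precipitable_mixing_ratio_mm([100000.0, 50000.0, 0.0],
                                   [0.0098, 0.0098, 0.0098])
```

The same routine can be applied to each species in turn (vapor, cloud water, rain, cloud ice, snow, graupel) by passing the corresponding mixing ratio profile.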
In Eq. (2), CS is the CFAD score, and PDF_m and PDF_o (%) are the probability density functions (PDFs) at constant height from WRF and MRMS, respectively. The CFAD score ranges from 0 (no PDF overlap) to 1 (identical PDFs).
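To make the CFAD score concrete, the sketch below builds a reflectivity PDF (in percent) for one height level and scores a model PDF against an observed one. The exact form of Eq. (2) is not reproduced in this text, so the overlap form used here, CS = 1 − Σ|PDF_m − PDF_o| / 200, is an assumption chosen only because it matches the stated bounds (0 for no overlap, 1 for identical PDFs when each PDF sums to 100 %). The function names and bin edges are likewise hypothetical.

```python
def reflectivity_pdf_percent(values_dbz, bin_edges):
    """Bin reflectivities from one height level into a PDF in percent.

    Bins are half-open [edge_i, edge_i+1), except the last bin, which also
    includes its upper edge. Values outside the edges are ignored.
    """
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    for value in values_dbz:
        for i in range(n_bins):
            in_bin = bin_edges[i] <= value < bin_edges[i + 1]
            on_top_edge = i == n_bins - 1 and value == bin_edges[-1]
            if in_bin or on_top_edge:
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [100.0 * count / total for count in counts]


def cfad_score(pdf_model, pdf_obs):
    """Assumed overlap score: 1 - sum(|PDF_m - PDF_o|) / 200."""
    if len(pdf_model) != len(pdf_obs):
        raise ValueError("PDFs must share the same reflectivity bins")
    total_diff = sum(abs(m - o) for m, o in zip(pdf_model, pdf_obs))
    return 1.0 - total_diff / 200.0


edges = [0.0, 10.0, 20.0, 30.0]  # hypothetical 10 dBZ bins
model_pdf = reflectivity_pdf_percent([5.0, 5.0, 15.0, 25.0], edges)
obs_pdf = reflectivity_pdf_percent([5.0, 15.0, 25.0, 25.0], edges)
score = cfad_score(model_pdf, obs_pdf)
```

Repeating the calculation for every height level and output time gives a height-time field of scores of the kind described in the text.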

Hydrometeor species analysis
Figure 2 displays six classes (water vapor, cloud water, graupel, cloud ice, rain, and snow) of precipitable mixing ratios (mm) from each WRF simulation and GMA, and Fig. 3 shows corresponding simulated radar reflectivity (no MRMS on this date) at 4000 m above mean sea level (a.m.s.l.) from case 5, domain 4 at 06:00 UTC, 6 February 2010. At this time, storm track errors are negligible, the cyclone is centralized within domain 4, and mixing ratio profiles (Fig. 4) show all hydrometeor species to coincide at 4000 m a.m.s.l. and that snow and graupel mixing ratios approach their maximum values at this height. Figure 5 shows the seven-case composite mixing ratios derived from hourly data during the residence time of each nor'easter case in domain 4 (24-30 h). This composite illustrates that mixing ratio profiles largely preserve their shape, maximum mixing ratio heights, and mixing ratio tendencies (i.e., higher snow-mixing ratios in GCE6 and GCE7), but hourly mixing ratio values themselves can be up to 3.5 times higher (QRAIN; WDM6) at a given height than in the seven-case composite (Fig. 5). Stronger nor'easter-related convection (reflectivity > 35 dBZ) in Fig. 3 best corresponds to precipitable rain and then graupel (Fig. 2) despite the near non-existence of the former at 4000 m a.m.s.l. (Fig. 4). This apparent discrepancy suggests localized enhancement of rain mixing ratios where stronger vertical velocities near convection likely drive the freezing level higher than Fig. 4 indicates. Within the broader precipitation shield (20-35 dBZ), radar reflectivity patterns best correspond to precipitable snow and then precipitable graupel (Fig. 2) for all BMPSs except for Lin6, where this trend is reversed. Although Fig. 4 shows that all five BMPSs loosely agree on the amount and height of maximum graupel at 4000 m a.m.s.l., Lin6 has little to no snow at this level, which likely explains the trend reversal. Inter-BMPS mixing ratio variability both at this level and throughout the troposphere is due to identifiable trends within the underlying assumptions made by BMPSs and will be explained in more detail below.
All evaluated BMPSs share a common heritage with the Lin scheme (note: Lin6 is a modified form of the original Lin scheme). Amongst the BMPSs, only WDM6 explicitly forecasts cloud condensation nuclei, rain, and cloud water number concentrations; the remaining schemes apply derivative equations for these quantities (Hong et al., 2010). Aside from the above, all five BMPSs differ primarily in their treatment of frozen hydrometeors, which is most evident from the nearly identical (exception: WDM6) rain mixing ratio profiles (Figs. 4 and 5) and precipitable water vapor (Fig. 2) and is a result consistent with Wu and Petty (2010). Comparing WSM6 to WDM6 reveals the second moment has little to no effect on precipitable rain coverage area (Fig. 2), yet precipitable rain is enhanced (Fig. 2) and rain mixing ratios drop sharply near the surface.
Similar to rain, precipitable cloud water extent (Fig. 2) and maximum cloud water height (Figs. 4 and 5) barely change, yet mixing ratio amounts (Figs. 2, 4, 5) did vary amongst the BMPSs. These cloud-water mixing ratio differences are likely associated with both varying ice supersaturation allowances, as described for the Goddard schemes by Chern et al. (2016) and for the WRF schemes by Hong et al. (2010), and assumed cloud water number concentrations (300 cm−3 for WSM6). Although WDM6 borrows much of its source code from WSM6, forecasts of cloud condensation nuclei and cloud water number concentrations alter inter-hydrometeor species interactions, which in turn alter cloud-water mixing ratios (Hong et al., 2010). The similarity between WSM6 and WDM6 in Figs. 2-4 indicates that forecasted cloud number concentrations for case 5 are likely close to the 300 cm−3 value assumed by WSM6. For the other cases, cloud-water mixing ratios did vary between WSM6 and WDM6, indicating that WDM6 cloud-water number concentrations did stray from 300 cm−3 and therefore caused the apparent differences in composite cloud water mixing ratios (Fig. 5).
Figures 2, 4, and 5 show that precipitable snow and snow-mixing ratios vary considerably amongst the BMPSs, with Lin6 and GCE6 having the smallest and largest snow amounts, respectively. Dudhia et al. (2008) and Tao et al. (2011) attributed the low snow-mixing ratios in Lin6 to its high rates of dry collection of snow by graupel, its low snow size distribution intercept (decreased surface area), and its auto-conversion of snow to either graupel or hail at high mixing ratios. GCE6 turns off dry collection of snow and ice by graupel, greatly increasing the snow-mixing ratios at the expense of graupel and reducing snow riming efficiency relative to Lin6 (Lang et al., 2007). Snow growth in GCE6 is further augmented by its assumption of water saturation for the vapor growth of cloud ice to snow (Reeves and Dawson, 2013; Lang et al., 2014). GCE7 addressed the vapor growth issue of GCE6 by introducing snow size and density mapping, snow breakup interactions, a relative humidity (RH)-based correction factor, and a new vertical-velocity-dependent ice supersaturation assumption (Lang et al., 2007, 2011, 2014; Chern et al., 2016; Tao et al., 2016). Despite the reduced efficiency of vapor growth of cloud ice to snow due to both the new RH correction factor and the ice supersaturation adjustment, the new snow mapping and enhanced cloud ice-to-snow auto-conversion in GCE7 offset this potential reduction, which kept GCE snowfall mixing ratios higher than those in non-GCE BMPSs. Unlike Lin6, WSM6 and WDM6 assume that grid cell graupel and snow fall speeds are identical (Dudhia et al., 2008) and that ice nuclei concentration is a function of temperature (Hong et al., 2008). These two aspects effectively eliminate the accretion of snow by graupel and increase snow-mixing ratios at lower temperatures (Dudhia et al., 2008; Hong et al., 2008). Figures 4 and 5 show the maximum snow-mixing ratio height is roughly conserved in all non-Lin6 BMPSs. Lin6's assumption of non-uniform graupel and snow fall speeds, together with its dry collection of snow by graupel, reduces snow-mixing ratios in the middle troposphere and raises its maximum snow-mixing ratio height. Compared to snow, graupel mixing ratios are generally smaller except for Lin6, where an unrealistically high rate of dry collection of snow by graupel dominates species growth (Stith et al., 2002). Graupel mixing ratios are the lowest in GCE7 due to the net effect of its additions, despite the inclusion of a new graupel size map. In particular, the combination of the new snow size map (which decreases snow size aloft, increases snow surface area, and enhances vapor growth), the addition of deposition conversion processes (graupel/hail particles experiencing deposition growth at lower temperatures are converted to snow), and a reduction in super-cooled droplets available for riming (cloud-ice generation is augmented; see below) all favor snow growth at the expense of graupel (Lang et al., 2014; Chern et al., 2016; Tao et al., 2016). Consistent with Reeves and Dawson (2013), WSM6 and WDM6 graupel-mixing ratio values are typically 30-50 % of their snow counterparts.
Although cloud-ice mixing ratios are nearly an order of magnitude smaller than those for snow (GCE6), these mixing ratios still vary greatly amongst the BMPSs as illustrated in Figs. 2, 4, and 5. Cloud-ice mixing ratios are the highest in GCE7 and lowest in Lin6. Wu and Petty (2010) similarly found low cloud-ice mixing ratios in Lin6 simulations and ascribed them to dry collection of cloud ice by graupel and its fixed cloud-ice size distribution. Similar to Lin6, GCE6 uses a monodispersed cloud-ice size distribution (20 µm diameter), but computes vapor growth of cloud ice to snow assuming water saturation conditions (which are supersaturated with respect to ice), leading to higher cloud-ice amounts and also increased cloud ice-to-snow conversion rates (Lang et al., 2011; Tao et al., 2016). GCE7 blunts cloud ice-to-snow conversion rates using an RH correction factor that is dependent upon ice supersaturation, which is itself dependent upon vertical velocity. Additionally, GCE7 includes contact and immersion freezing terms (Lang et al., 2011), makes the cloud-ice collection by snow efficiency a function of snow size (Lang et al., 2011, 2014), sets a maximum limit on cloud-ice particle size (Tao et al., 2016), makes ice nuclei concentrations follow the Cooper curve (Cooper, 1986; Tao et al., 2016), and allows cloud ice to persist in ice subsaturated conditions (i.e., RH for ice ≥ 70 %) (Lang et al., 2011, 2014). Despite the increased cloud ice-to-snow auto-conversion rates in GCE7 (Lang et al., 2014; Tao et al., 2016), precipitable cloud-ice amounts nearly doubled relative to GCE6 (see Fig. 2). Similar to GCE7, WSM6 generates larger cloud-ice mixing ratios than Lin6, which Wu and Petty (2010) attributed to excess cloud glaciation at temperatures between 0 and −20 °C and its usage of fixed cloud-ice size intercepts. Additionally, both WSM6 and WDM6 include ice sedimentation terms, which promote smaller cloud-ice amounts (Hong et al., 2008). Despite their varying assumptions, the maximum cloud-ice heights for both case 5 and overall (Figs. 4 and 5) are consistent between BMPSs.

Stage IV precipitation analysis
Excessive precipitation, whether frozen or not, is one of the most potentially crippling impacts of a nor'easter. Figures 6 and 7 show domain 3, accumulated precipitation, their difference from StIV, and the associated probability and cumulative distribution functions (PDF and CDF, respectively) for cases 5 and 7 based upon the 24-30 h residence period of a nor'easter within domain 4. Domain 3 serves as the focus for this section because most of domain 4 resides close to or outside the StIV data boundaries. Cases 5 and 7 are chosen because of their near-shore tracks (Fig. 1), which afford good StIV data coverage. Table 3 includes threat score and bias information from all seven cases and their associated standard deviation statistics. Both threat score and model bias assume the same 10 mm threshold value, which is approximately the 25th percentile of accumulated precipitation (Figs. 6 and 7).
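For readers unfamiliar with these two metrics, the sketch below computes them from paired grid-cell accumulations using the standard contingency-table definitions (threat score = hits / (hits + misses + false alarms); bias = forecast "yes" cells / observed "yes" cells). The function names and flat-list inputs are hypothetical, and treating a cell as "yes" when it is at or above the 10 mm threshold is an assumption about how the threshold is applied.

```python
def contingency_counts(forecast_mm, observed_mm, threshold_mm=10.0):
    """Count hits, misses, and false alarms over paired grid cells."""
    hits = misses = false_alarms = 0
    for forecast, observed in zip(forecast_mm, observed_mm):
        forecast_yes = forecast >= threshold_mm  # assumed ">=" convention
        observed_yes = observed >= threshold_mm
        if forecast_yes and observed_yes:
            hits += 1
        elif observed_yes:
            misses += 1
        elif forecast_yes:
            false_alarms += 1
    return hits, misses, false_alarms


def threat_score(hits, misses, false_alarms):
    """Critical success index; 1 is a perfect forecast, 0 has no skill."""
    total = hits + misses + false_alarms
    return hits / total if total else None


def frequency_bias(hits, misses, false_alarms):
    """Forecast-to-observed ratio of cells exceeding the threshold.

    Values above 1 mean the model covers too large an area with
    precipitation above the threshold.
    """
    observed_yes = hits + misses
    return (hits + false_alarms) / observed_yes if observed_yes else None


# Illustrative (made-up) accumulations in mm for five grid cells.
wrf = [12.0, 15.0, 3.0, 11.0, 2.0]
stiv = [11.0, 4.0, 12.0, 20.0, 1.0]
counts = contingency_counts(wrf, stiv)
ts = threat_score(*counts)
bias = frequency_bias(*counts)
```

Both fields must be on the same grid before the cell-by-cell comparison, consistent with the identical-resolution comparisons described in the evaluation section.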
The case 4 threat score and bias values (Table 3) are more than 2 standard deviations from the composite mean due to its non-coastal storm track (Fig. 1), and thus it is excluded from this analysis. The remaining six cases show WRF to have low-to-moderate forecast skill (threat score: 0.217 for Lin6; 0.414 for Lin6) and to cover too large an area with precipitation values greater than 10 mm (bias: 1.47 for Lin6, case 7; 4.05 for GCE7, case 3) relative to StIV. Inter-BMPS threat score and bias differences are an order of magnitude or more smaller than the values from which they are derived. Consistent with Hong et al. (2010), threat score and bias values from WSM6 are equal to or improved upon by WDM6 due to its inclusion of a cloud condensation nuclei feedback. Overall, WDM6 shows marginally better precipitation forecast skill than other BMPSs (highest threat score in four out of six cases and highest mean threat score: 0.322), yet Lin6 is the least biased (lowest bias score in four out of six cases and lowest mean bias: 2.55).
PDF and CDF plots from Figs. 6 and 7 show WRF to favor higher precipitation amounts, which is consistent with the positive bias scores in Table 3. Previous modeling studies of strong convection by Ridout et al. (2005) and Dravitzki and McGregor (2011) found that both GFS and the Coupled Ocean-Atmosphere Mesoscale Prediction System produced too much light precipitation and too little heavy precipitation, which contrasts with the above results. Unlike the events in these two studies, nor'easters track too far offshore to be fully sampled by rain gauge data and S-band weather radars. These two issues could lead to a low bias in StIV data, especially near the data boundaries, and suggest that WRF threat scores and biases are likely closer to observations than Table 3 indicates. Marginal changes in accumulated precipitation PDFs and CDFs and threat scores amongst BMPSs are consistent with the investigations of simulated precipitation during warm-season precipitation events and a quasi-stationary front by Fritsch and Carbone (2004) and Wang and Clark (2010), respectively.

MRMS and radar reflectivity analysis
... quality control measures for non-precipitating echoes tend to artificially curtail radar echoes at 5 dBZ, especially near the dataset edges (J. Zhang, NOAA, personal communication, 2016). Domain 4-based CFADs (not shown) depict little to no aggregation and are inconsistent with CFADs from previous convection (Lang et al., 2011; Min et al., 2015) and mid-latitude winter storm (Shi et al., 2010) studies. The larger spatial extent and better radar overlap in domain 3 lead to more realistic CFADs with aggregation. Case 4 data are shown in Fig. 8 because MRMS data were more readily available and apply the latest MRMS reprocessing algorithm.
Figure 8 shows that the MRMS-based CFAD has two distinct frequency maxima: one above and another below 6000 m a.m.s.l. Model simulations replicate the sub-6000 m a.m.s.l. frequency maximum with varying degrees of success. Below 2000 m (0 °C height), GCE7- and Lin6-based CFADs more closely match the MRMS radar reflectivity probability spectrum and correctly show its maximum to occur between 0 and 15 dBZ. Other schemes overly broaden this probability spectrum and shift its maximum toward higher reflectivity values. Despite this rightward shift, hydrometeor profiles below 2000 m a.m.s.l. (Fig. 4) are similar for all BMPSs, which suggests that factors including assumed or simulated (WDM6) droplet size distributions or aggregation assumptions are the probable causes.
Between 2000 and 6000 m, all non-GCE7 CFADs incorrectly shift toward higher reflectivity values with increasing height and favor values up to 10 dBZ higher (WSM6) than MRMS. Radar reflectivities at 3000 m a.m.s.l. on 26 January 2015 (Fig. 9) indeed show an overestimation of radar reflectivities in non-GCE7 BMPSs from regions of strong convection off the North Carolina and New Jersey coastlines near the cold front and warm front, respectively. This rightward bowing of CFADs above the melting layer was also reproduced in Shi et al. (2010) (GCE6) and Min et al. (2015) (WSM6 and WDM6). Similar to these studies, all non-GCE7 schemes seemingly produce too much graupel (Fig. 4), which has a stronger reflectivity signature (see Sect. 3.1). GCE7 has the least graupel as a consequence of its new snow size map, inclusion of deposition processes, reduced super-cooled cloud droplets, and improved aggregation physics.
Above 6000 m a.m.s.l., the WRF-based CFADs all collapse toward smaller reflectivity values. This collapse is well documented in the literature (Shi et al., 2010; Lang et al., 2011; Min et al., 2015) and occurs due to errors stemming from increased entrainment of ambient air near cloud top and underlying aggregation assumptions made by each BMPS. Although each scheme fully collapses by 7500 m a.m.s.l., the Goddard-based CFADs indicate a considerably steeper tilt in the maximum frequency core as compared to other schemes, which is a likely byproduct of their higher snowfall mixing ratios (Fig. 4). Once above 8000 m a.m.s.l., MRMS radar reflectivity values show a second frequency maximum above 15 dBZ, which is not replicated by WRF. Radar reflectivities at 9000 m a.m.s.l. on 26 January 2015 (Fig. 10) show precipitating echoes to occur offshore, where the non-precipitating echo filtering applied in MRMS removed weak reflectivities and artificially shifted the CFAD toward higher values.
Figure 9. MRMS radar reflectivity and WRF-simulated radar reflectivity (dBZ) at 3000 m above sea level at 18:00 UTC, 26 January 2015. Shown radar reflectivity differences are as indicated.
Finally, CFAD scores (Eq. 2) with height and time (Fig. 11) provide a means to evaluate hourly forecast skill at each height level relative to MRMS. Figure 11 shows Lin6 and GCE7 to have notably improved forecast skill, especially between 2000 and 4850 m a.m.s.l., where increased graupel mixing ratios and droplet sizes produced radar reflectivities higher than those from MRMS. Despite their similar CFAD scores, CFAD structures (Fig. 8) and 3000 m a.m.s.l. radar reflectivities (Fig. 9) do suggest that GCE7 produces more realistic results than Lin6, where the rate of dry collection of snow by graupel is unrealistically high. In short, Lin6 produces the right answer for the wrong reason, whereas GCE7 produces the correct answer with a more realistic solution. Between 6300 and 7000 m a.m.s.l., GCE7 CFAD scores fall below all other schemes as a consequence of overly small droplets from its aggregation simulations and cloud entrainment, which cut off cloud tops at lower heights. The other six cases produce similar tendencies in their CFADs and CFAD scores as noted above for case 4, except cloud heights become higher and CFADs become wider with the introduction of stronger convection in early and late season events.

Conclusions
The role and impact of five bulk microphysics schemes (BMPSs; Table 2) upon seven Weather Research and Forecasting model (WRF) wintertime cyclone ("nor'easter") simulations (Table 1) are investigated and validated against GFS model operational analysis (GMA), Stage IV rain gauge and radar estimated precipitation, and the radar-derived, Multi-Radar, Multi-Sensor (MRMS) 3-D volume radar reflectivity product. Tested BMPSs include three single-moment, six-class BMPSs (Lin6, GCE6, and WSM6), one single-moment, seven-class BMPS (GCE7), and one double-moment, six-class BMPS (WDM6). Simulated hydrometeor mixing ratios show general similarities for non-frozen hydrometeor species (cloud water and rain) due to their common Lin BMPS heritage. However, frozen hydrometeor species (snow, graupel, cloud ice) demonstrate considerably larger variability amongst BMPSs. This variability results from different assumptions concerning snow and graupel intercepts, degree of allowable ice supersaturation, snow and graupel density maps, and terminal velocities made by each BMPS. WRF-simulated precipitation fields exhibit similar coverage, but tend to favor higher precipitation amounts relative to Stage IV observations, resulting in low-to-moderate threat scores (0.217-0.414). Inter-model differences are an order of magnitude or more smaller than the threat score values, but WDM6 does demonstrate marginally better overall forecast skill. Finally, MRMS-based contoured frequency with altitude diagrams (CFADs) and CFAD scores show Lin6 and GCE7 are best in the lower half of the troposphere, where GCE7 most realistically reproduced the maximum frequency core between 5 and 15 dBZ due to its temperature and mixing-ratio-dependent aggregation and new snow map. However, the overly large growth of graupel by dry collection of snow does suggest that Lin6 obtains high CFAD scores with a less realistic solution than GCE7. Above 6300 m a.m.s.l., model simulations approach or exceed their cloud tops, where entrainment and hydrometeor size differences alter cloud top heights and reflectivity fields, and nonprecipitating echo filtering in MRMS data makes evaluations less meaningful with increasing height above cloud top.
This study has shown that although BMPS choice has minimal impact on the large-scale simulated environment, its effect upon microphysical and precipitation properties of a nor'easter is more profound. No single BMPS demonstrated consistently improved precipitation forecast skill as compared to other schemes, yet differences in their underlying microphysical assumptions do yield variable forecast skill of simulated radar reflectivity structures amongst the BMPSs when compared to MRMS observations. Follow-on studies could investigate additional nor'easter cases or simulate other weather phenomena (polar lows, monsoon rainfall, drizzle, etc.). Results covering multiple phenomena may provide guidance for model users in their selection of BMPS for a given computational cost. Additionally, potential studies could focus on key aspects of a nor'easter's structure (such as the low-level jet) or validation of model output against current and recently available satellite-based datasets from MODIS (Justice et al., 2008), CloudSat (Stephens et al., 2008), CERES, and GPM (Hou et al., 2014). Finally, other validation methods including object-oriented (Marzban and Sandgathe, 2006) or fuzzy verification (Ebert, 2008) could be implemented.

Figure 1. Nested WRF configuration used in simulations. The large panel shows the first three model domains (45, 15, 5 km grid spacing, respectively). The smaller panels show the location of domain 4 (1.667 km resolution) for each of the seven cases. The colored lines show the cyclone track as indicated by GMA for each nor'easter case.

Figure 2. Domain 4 (1.667 km grid spacing), precipitable mixing ratios (mm) at 06:00 UTC, 6 February 2010. Abbreviations for mixing ratios: QV is water vapor, QC is cloud water, QG is graupel, QI is cloud ice, QR is rain, and QS is snow.
Figures 4 and 5 also contain two black dashed lines denoting the 0 and −40 °C heights, which bound the region where super-cooled water may occur. Although both the super-cooled water fraction and these temperature heights vary hourly, the latter demonstrates little to no inter-BMPS variability. Comparing Figs. 2 and 3 reveals a strong correspondence between radar reflectivity signatures at 4000 m a.m.s.l. and precipitable hydrometeor species, especially rain, graupel, and snow. As seen in Fig. 4, all cloud water and rain above 3500 m a.m.s.l. is super-cooled. Stronger nor'easter-related convection (reflectivity > 35 dBZ) in Fig.

Figure 3. Simulated radar reflectivity (dBZ) at 4000 m above mean sea level and their difference at the same time as Fig. 2.

Figure 7. Case 7, 24 h precipitation accumulation and their differences (mm, small panels) and corresponding probability density and cumulative distribution functions (big panel) of these same data derived from Stage IV and WRF model output. Accumulation period is from 18:00 UTC, 12 March 2010 to 18:00 UTC, 13 March 2010. Shown differences are model minus Stage IV (StIV).

Figure 11. Domain 3 (5 km grid spacing), hourly CFAD scores (see Eq. 2) of radar reflectivity and indicated differences from case 4 starting 12:00 UTC, 26 January 2015 and ending on 12:00 UTC, 27 January 2015. The time period corresponds to the same time period as in Fig. 5. The y axis shows height above mean sea level (h.m.s.l.; m).

Table 1. Nor'easter case list. The Northeast Snowfall Impact Scale (NESIS) number is included for storm severity reference. Mean sea-level pressure (MSLP) indicates maximum cyclone intensity in GMA. The last two columns denote the first and last times for each model run. GMA storm tracks are displayed in Fig. 1.

Table 2. Applied bulk microphysics schemes and their characteristics. The table indicates simulated mixing ratio species and number of moments. Mixing ratio species: QV is water vapor, QC is cloud water, QH is hail, QI is cloud ice, QG is graupel, QR is rain, and QS is snow.

Table 3. Domain 3, Stage IV-relative, accumulated precipitation threat scores and biases assuming a threshold value of 10 mm (25th percentile of 24 h accumulated precipitation). Bolded values denote the model simulation with the threat score closest to 1 (perfect forecast) or a bias value closest to 1 (number of forecasted cells matches observations). The lower two panels indicate the number of standard deviations (SD) each threat score and bias value deviates from the composite (all models + all cases) mean.