Development of a numerical system to improve particulate matter forecasts

Introduction Conclusions References

uncertainties due to emission fluxes, meteorological fields, and chemical and physical parameterizations in the CTMs. For example, the Korean Ministry of Environment (MoE) has recently started to implement air quality forecasts for PM 10 over the Seoul metropolitan area (SMA), the largest metropolitan area in South Korea, and also plans to perform PM 2.5 and ozone forecasts in 2015. However, the forecasting accuracy in 20 the current system was low (< 60 %) in 2013. Thus, urgent improvements in the PM 10 predictions are necessary.
In this context, an improved short-term PM forecast system was developed and introduced, based on an analogy to the system of numerical weather prediction (NWP). Figure 1a presents a flow diagram of an NWP in which regional meteorological model-satellite-derived aerosol optical depths (AODs), is developed in this study. Similarly, the BCs for the CTM runs are obtained from global CTM simulations.
In the improved CWF system, AOD data retrieved from low Earth orbit (LEO) satellite sensors, such as Moderate Resolution Imaging Spectroradiometer (MODIS), Multiangle Imaging SpectroRadiometer (MISR), and Sea-viewing Wide Field-of view Sen-10 sor (SeaWiFs), can be used to set up the ICs for the short-term PM forecast (Benedetti et al., 2009;Liu et al., 2011;Saide et al., 2013). While these AOD data have an advantage in spatial coverage compared with those obtained from point stations, the use of the LEO satellite-derived AODs has another limitation in acquiring continuous observations over a certain area due to the capabilities of the LEO sensors in their orbital periods and viewing swath widths.
Such limitations in using LEO satellite observations can be overcome with the help of geostationary (GEO) satellite sensors providing semi-continuous observations over a specific part of the Earth during the day (Fishman et al., 2012;Lahoz et al., 2011;Zoogman et al., 2014). Recently, aerosol optical properties (AOPs) from the Geosta-Introduction

Meteorological and chemistry-transport modeling
The WRF model provided meteorological data with 15 km × 15 km horizontal grid spacing and 26 vertical layers extending up to 50 hPa. To obtain highly-resolved terrestrial input data, the topography height from NASA Shuttle Radar Topography Mission (SRTM) 3 arc sec database (http://dds.cr.usgs.gov/srtm/version2_1/SRTM3) and 5 the land use information provided by Environmental Geographic Information Service (EGIS; http://egis.me.go.kr) were used. Initial and boundary meteorological conditions for the WRF simulation were provided by the National Centers for Environmental Protection (NCEP) final operational global tropospheric analyses (http://rda.ucar.edu/ datasets/ds083.2). To improve 3-D temperature, winds and water vapor mixing, objec-10 tive analysis was employed by incorporating the NCEP ADP Global surface and upper air observation data. The meteorological fields were provided with 1 h temporal resolution, and were then converted into the input fields for the CMAQ model simulations by the Meteorology-Chemistry Interface Processor (MCIP; ver. 4.1) (Otte and Pleim, 2010). 15 The CMAQ model is a chemistry-transport model that simulates the chemical fates and transport of gaseous and particulate pollutants. In this study, the CMAQ modeling covered Northeast Asia, from 92 to 149 • E and 17 to 48 • N, using 15 km × 15 km horizontal grid spacing ( Fig. 2) with 14 terrain following σ-coordinates, from 1000 to 94 hPa. The configurations of the WRF model and CMAQ simulation used in this study 20 are described in Table 1.
Anthropogenic emission inputs were processed by Sparse Matrix Operator Kernel Emissions in Asia (SMOKE-Asia; ver. 1.2.1), which has been developed for processing anthropogenic emissions for Asia. Details of SMOKE-Asia were described in Woo et al. (2012). Biogenic emissions were prepared using the Model of Emission of Gases 25 and Aerosol from Nature (MEGAN; ver. 2.0.4) (Guenther et al., 2006) with the MODISderived leaf area index (Myneni et al., 2002), MODIS land-cover data sets (Friedl et al., 2002), and the metrological input data described above. For the consideration of biomass burning emissions, daily fire estimates provided by Fire Inventory from NCAR (FINN) were used (Wiedinmyer et al., 2011). Asian mineral dust emissions were not considered in this study. Thus, the periods for model evaluation were selected during periods when mineral dust events did not take place. To take full advantage of the AOD data sets intensively measured during the Distributed Regional Aerosol Gridded Observation Network in Asia (DRAGON-Asia) campaign, modeling episodes were chosen for the campaign period from 01 March to 31 May 2012. First, background CMAQ model simulations were conducted for the 3 month DRAGON period with 10 day spin-up modeling. After this, initial conditions were prepared using the ST-kriging method, observation operators and CVs via the 10 combination of GOCI AODs with the background modeling AOD. Analysis was carried out for 12 h from 12:00 LT on 10 selected high PM pollution days. Each hindcast hour is referred to be H + 0 to H + 12. In this study we paid more attention to the performance of the first 12 h PM 10 hindcast results, and the analysis of the hindcast results after 13 h is also discussed briefly in Sect. 3.3. 15 In the hindcast analysis, different hindcast runs with 12 combinations of different observation operators and CVs were conducted, as discussed in Sects. 2.4 and 2.5. We selected 28 March,8,9,14,17,and 23 April,and 6,13,15,and 16 May 2012 for the analysis associated with three criteria of: (i) on the selected days the average PM 10 from 12:00 to 18:00 LT was above 70 µg m −3 over SMA, (ii) on the selected days, the 20 daily coverage of the GOCI AOD data was at least 20 % over the GOCI domain, and (iii) on the selected days, dust events were not recorded over South Korea according to the Korea Meteorological Administration (KMA). In this study, we focused on SMA, because we were particularly interested in this area. However been difficult for most GEO satellite sensors to produce accurate AOPs, because they have only one or two visible channels. In contrast, the GOCI instrument has six visible and two near-infrared channels, and can produce multi-spectral images eight times per day with a spatial resolution of approximately 500 m × 500 m with a coverage of 2500 km × 2500 km, including part of Northeast China, the Korean peninsula, 10 and Japan (Fig. 2). Using the 1 h resolved multi-spectral radiance data from GOCI, the uncertainties of AOP retrievals can be dramatically reduced (M. E. . The GOCI AOPs were retrieved with Global Aerosol model version 2 (GloA2) multichannel algorithms that can provide hourly AOP data including AOD, FMF, and SSA at 550 nm. Compared with the algorithms from two previous studies (Lee et al., 2010, 15 2012), the GloA2 algorithm uses an improved lookup table for retrieving the AOPs, using extensive observations from Aerosol Robotic Network (AERONET) and monthly surface reflectance observed from GOCI, and provides 1 h resolved AOP data at eight fixed times per day (from 09:30 to 16:30 LT) with 6 km × 6 km spatial resolution. In this study, the AOD data from the GOCI AOPs were used (because the SSA and FMF data 20 need further improvements) and were compared with collection 5.1 MODIS aerosol products from the Aqua and Terra satellites (Levy et al., 2007;Remer et al., 2005). The AERONET AOD data were also used for assessing the relative accuracy of the GOCI AODs. Figure 3a and  and mean bias (MB = −0.19), compared with MODIS data (R = 0.88; RMSE = 0.15; MB = −0.02). This indicates that the GOCI AOD data not only have comparable quality to the MODIS AOD data, but also provide a higher number of data over the GOCI domain. In Fig. 3c, the daily spatial AOD percent coverages of the Terra-MODIS and GOCI sensors are compared. It was found that there are a large number of daily miss-5 ing pixels in the observations of both satellite sensors (the average percent coverages of TERRA-MODIS and GOCI AODs during the period were about 19 and 38 %, respectively).

Ground-based observations
AERONET is a global ground-based sunphotometer network managed by the NASA Goddard Space Flight Center, providing spectral AOPs including AOD, SSA, and particle size distributions, available at http://aeronet.gsfc.nasa.gov (Holben et al., 1998).
To match the wavelength of GOCI AOD with AERONET AOD, the AOD data at 550 nm were calculated via interpolation, using AODs and Ångström exponent data between 440 and 870 nm from the DRAGON-Asia level 2.0 data. AOD data from 29 AERONET 15 sites inside the GOCI domain were used for validating GOCI and ST-kriging AOD products, and those from six AERONET sites in SMA were selected for evaluating the performance of hindcast AODs. To analyze hindcast surface aerosol concentrations, the PM 10 observations provided by the National Ambient Air Monitoring System (NAMIS) network in South Korea were 20 used. The NAMIS network, operated by the MoE has collected air pollutant concentrations of PM 10 measured by an automatic β-ray absorption method with a detection limit of 2 µg m −3 at 5 min intervals. We selected 58 NAMIS sites in SMA, the locations of which are shown in Fig. 2, and used 1 h averaged data for the analysis during the selected episodes. 25 Ion concentrations of PM 2.5 were also measured using a particle-into-liquid sampler coupled with ion chromatography (PILS-IC) and a low air-volume sampler with a Teflon filter in Yongin City, located downwind of Seoul (Fig. 2) Lee et al. (2015) and are not repeated here. One-hour averaged sulfate (SO 2− 4 ), nitrate (NO − 3 ), and ammonium (NH + 4 ) concentrations, measured by the PILS-IC, and 24 h averaged SO 2− 4 , NO − 3 , NH + 4 , organic carbon (OC), and elementary carbon (EC), measured by the low air-volume sampler, were used for further comparison during the selected episodes (Sect. 3.4). The observed OC concentrations 5 were multiplied by a factor of 1.5, to estimate organic aerosols (OAs) concentrations (He et al., 2011;Huang et al., 2010).

Spatio-temporal kriging
Kriging is a geostatistical interpolation method to estimate unmeasured variables and their uncertainties, using correlation structure of measured variables. This method has mathematical linkages to data assimilation (DA) techniques, such as optimal interpolation (OI), variational methods, such as 3-DVAR and 4-DVAR, and ensemble Kalman filter (EnKF) (Wikle and Berliner, 2007). An atmospheric application study of the kriging method to estimating PM 10 exceedance days over Europe reported that ST-kriging showed comparable performances to those of the EnKF approach (Denby et al., 2008). 15 In this study, the ST-kriging method was used to fill out the missing pixels ( Fig. 3c) with the spatial and temporal GOCI AOD data. The AOD fields produced by ST-kriging can be prepared with a horizontal resolution of 15 km × 15 km from 10:00 to 16:00 LT over the GOCI domain. In this study, the AOD data at 12:00 LT (H + 0) during the selected episode days were used for preparing the initial conditions. The details and 20 general application of the ST-kriging method are presented in Appendix A. One advantage of using ST-kriging in this study framework is to use large numbers of observational data (GOCI AODs), compared with other methods. In fact, the GOCI AOD data are densely available temporally (with 1 h intervals) and spatially (compared with MODIS AODs; see Fig. 3a and b). This was the primary reason for using the ST-kriging 25 method in this study. For example, when initial AOD fields were prepared at a certain time (e.g., at noon, 12:00 LT: H + 0), the ST-kriging method uses not only GOCI AOD Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | data at 11:30 or 12:30 LT, but also GOCI AOD data at 09: 30, 10:30, and 01:30, un-like other methods. In the case of 04 April 2012 (a high PM pollution episode during the DRAGON-Asia campaign), other interpolation methods (e.g., Cressman, bilinear, and nearest-neighbor methods) or DA methods (e.g., OI and 3-DVAR) could only use the GOCI AOD data of ∼ 88 000 for the preparation of the initial AOD field at 12:00 LT, 5 whereas the ST-kriging method used the GOCI AOD data of ∼ 280 000 (3 times more AOD data).
If the observation data are densely available and the differences between the observations and model-simulated data are large (i.e., the model simulations include relatively large errors or uncertainties), there is less "practical need" to use the CTM-10 simulated data in the process of data assimilation. That is, it would be more desirable if the values of the unobserved (missing) pixels could be filled in based on "more reliable" observation data (here, GOCI AODs). This would be particularly true, when the CTMpredicted AODs are systematically underestimated compared with GOCI or AERONET AODs (as will be shown in Fig. 5a). Additionally, computation costs of the ST-kriging 15 method are so low that the ST-kriging AOD can be calculated rapidly. For example, the 1 day process for preparing the AOD fields over the GOCI domain takes only ∼ 20 min with two 3.47 GHz Xeon X5690 6-core processors and 32 gigabytes memory in the current application of the ST-kriging method. Thus, it can be applied directly to the daily CWF due to the relatively cheap computation cost. Again, computation time (rapid 20 calculation) is a central issue in daily (short-term) chemical weather forecasts. The calculation of daily three-dimensional semivariogram takes most of the computation time (regarding the details of calculation of the daily three-dimensional semivariogram, refer to Appendix A and Fig. A1).
Connected with these discussions, in the application of the ST-kriging method to the 25 GOCI AODs, the "optimal number" of observation data is necessary to balance the accuracy of the data and the computational speed. From many sensitivity tests (not shown here), the optimal number of observations for one white (missing) pixel is approximately 100. That is, the use of more observation data above this optimum number does not meaningfully enhance the accuracy of AODs of the missing pixels, but simply takes more computation time. This number of observation data is usually available for the most of the missing (white) pixels of the GOCI scenes from nearby grids both/either at the concurrent scene spatially within ∼ 100 km and/or at the temporally-close snapshots within 3 h. Based on these reasons, the ST-kriging method was chosen for this 5 study.

Observation operator
An observation operator (or forward operator) describes the relation between observation data and model parameters. For example, the observation operator in this study converts the aerosol composition into AODs (and vice versa). Based on the aerosol 10 composition and the relative humidity (RH) from the model simulations, simulated AODs at a wavelength of 550 nm (τ CMAQ ) were calculated with the following observation operator: where N and M denote the number of aerosol species (s) and model layer (l ), respec-  In this study, three observation operators were used for calculating AODs and updating initial PM composition for the hindcast studies. The differences in the observation operators are caused mainly by the differences in α s,dry and f s (RH l ) of Eq. (1). The first observation operator was selected from Goddard Chemistry Aerosol Radiation and Transport (GOCART) model (Chin et al., 2002;hereafter GOCART operator 4 , OC, BC, and sea-salt aerosols were considered separately in this operator. The second observation operator was from the GEOS-Chem model (the GEOS-Chem operator). The detailed aerosol speciation and MEE values were described in Martin et al. (2003). Final observation operator is based on the study of Malm and Hand (2007) (the IMPROVE operator). This observation operator 5 was based on the reconstruction method with the MEEs and hygroscopic enhancement factors at 550 nm for different types of aerosol species. Table 2 summarizes the characteristics of the three observation operators chosen in this study. To consistently consider the characteristics of the three observation operators, aerosol types (s in Eq. 1) were classified into seven groups: SO 2− 4 , NO − 3 , NH + 4 , OAs, BC, sea-salt, and others, 10 which mainly consist of PM 2.5 trace elements (Reff et al., 2009). In the classification, internal mixing states of SO 2− 4 , NO − 3 , and NH + 4 were assumed. It should also be noted that the consideration of NO − 3 is important to correctly estimate AOD and aerosol mass loading in East Asia (R. S. Park et al., 2011Song et al., 2008). Figure 4 shows the MEE values (product of α s,dry and f s (RH l ) in Eq. 1) calculated for SO 2− 4 , NO − 3 , and 15 NH + 4 at a wavelength of 550 nm as a function of RH, indicating that the three different operators can create large differences in the MEE values.

Selection of control variables
To prepare the distributions of the aerosol composition, the ST-kriging AOD fields should be converted into the 3-D aerosol composition. To do this, the differences 20 between the ST-kriging AODs and background AODs (often called "observational increments": ∆AOD k = AOD ST-kriging,k − AOD bg,k , k = grid cell) should be added to the background model-derived aerosol composition at each grid cell, in connection with the observation operators (Eq. 1). Which aerosol species is/are selected for allocating ∆AOD k ? We selected four types of control variables (CVs) of particulate species. First, 25 all the particulate species were selected as CVs. In this case, ∆AOD k was distributed to all the particulate species, with the particulate fractions calculated from the background  Park et al., 2011. This can be related to either (or both) the uncertainty in SO 2 emissions in East Asia or (and) the uncertainty 5 in the parameterizations of SO 2− 4 production in the CTM models (Lu et al., 2010;Smith et al., 2011). In addition, there is also large uncertainty in the levels of hydroxyl radicals (OH) due to uncertain daytime HONO chemistry, OH reactivation, in-plume process and others (Archibald et al., 2010;Han et al., 2015;Karamchandani et al., 2000;Kim et al., 2009;Kubistin et al., 2010;Lelieveld et al., 2008;Song et al., 2003Song et al., , 2010Sörgel 10 et al., 2011;Stemmler et al., 2006;Zhou et al., 2011). Obviously, these uncertainties can influence the levels of H 2 SO 4 and thus particulate sulfate concentrations in the atmosphere. In this case, aerosol mass concentrations (except for SO 2− 4 ) were the same as those of the background aerosol concentrations. Third, SO 2− 4 and OAs were chosen to be changed. Although OAs are one of the major particulate species, it is well-known 15 that OAs concentrations are also systematically underestimated due to two reasons: (i) the uncertainty in the parameterizations of the OA formation (Donahue et al., 2006(Donahue et al., , 2011Dzepina et al., 2009;Hodzic et al., 2010;Matsui et al., 2014;Slowik et al., 2010), and (ii) the uncertainty in emission inventories for anthropogenic and biogenic OA precursors (Guenther et al., 1999;Sakulyanontvittaya et al., 2008;Tsimpidi et al., 2010;20 Wyat Appel et al., 2008). In this case, the mass concentration of surface OAs is assumed to be equal to that of SO 2− 4 , based on the ground-based measurement studies over East Asia (Lee et al., 2009;Zhang et al., 2007Zhang et al., , 2012 (Bassett and Seinfeld, 1983;Saxena et al., 1986;Seinfeld and Pandis, 2012;Song and Carmichael, 1999;Stelson et al., GMDD 8, 5315-5366, 2015 Development of a numerical system to improve particulate matter forecasts S. Lee et al.  . In the four case studies, background modeling-derived vertical profiles and size distributions of aerosol species were used for converting 2-D AOD to 3-D aerosol mass concentrations. With the combinations of the three different observation operators and four choices of CVs (Table 3), 12 hindcast runs were made for high PM episodes during the DRAGON-Asia campaign.

3 Results and discussion
In Sect. 3, the performances of ST-kriging method are evaluated via comparisons with the AERONET AOD in the GOCI domain (Sect. 3.1). Sensitivity analyses were then conducted to examine the impacts of the observation operators and CVs on the accuracy of the hindcast runs (Sect. 3.2). After that, the overall performances of the hindcasts were evaluated with ground-based observations during the high PM 10 episodes over SMA (Sect. 3.3). A comparative analysis of the PM composition between hindcast results and observations was also conducted to further investigate/analyze the performance of the hindcast system (Sect. 3.4).

Evaluation of ST-kriging AODs
15 Figure 5a-c shows scatter plot analyses of background CMAQ-simulated AODs, spatial kriging AODs (i.e., kriging only with the GOCI AODs from one scene) and  Fig. 5c, it can be seen that the ST-kriging can effectively produce the AOD fields (also note the increase in N). Figure 5d and e shows the scatter plot analysis of the ST-kriging AOD products vs. the AERONET AOD data with kriging variances (KVs). It is found that the ST-kriging AOD data with KV ≤ 0.04 show similar scattering pattern and accuracy to those of 5 GOCI AOD. In contrast, some overestimated outliers from the ST-kriging AOD data in Fig. 5e (e.g., 1.0-2.0 in the x axis and 2.0-4.0 in the y axis) show different patterns than those from the GOCI AOD data. This may be explained by the relatively large KVs (> 0.04) of such overestimated outliers. The KV generally increases when the observations near a certain prediction point are not available or when nearby observations 10 have relatively large errors. Thus, when the GOCI observations are contaminated by optically thin clouds and they are not removed perfectly, this can increase the local variances due to their high cloud optical depth (COD). These factors can affect the quality of the ST-kriging AOD products. Collectively, it appears that the ST-kriging method is a reasonable tool for obtaining realistic AOD values at locations where the GOCI ob- 15 servations are not available.

Sensitivity of observation operators and control variables to AOD and PM 10 predictions
To investigate the best combination of the observation operators and CVs, the AOD and PM 10 hindcast runs and sensitivity analyses with the 12 different configurations 20 (Table 3) were performed. The hindcast AOD and PM 10 from 13:00 to 19:00 LT (H + 1 to H + 6) on 10 selected episode days were compared with the ground-measured AOD and surface PM 10 concentrations. The observations from the six AERONET sites and nearest NAMIS PM 10 stations within 10 km from the AERONET locations were selected for this comparison study (Fig. 2). The AOD values for the background CMAQ model simulations without the application of the ST-kriging method (noSTK) were also calculated with the GEOS-Chem observation operator.  Figure 6 shows the soccer plot analysis of the 13 hindcast AODs (left panel) and PM 10 (right panel) during the first 6 h of the short-term PM hindcasting on the 10 selected episode days. In the soccer plot, mean fractional bias (MFB) and mean fractional error (MFE) (described in Appendix B) are plotted on the x and y axes, respectively. Using this plot, the relative discrepancy can be presented by the distances from the 5 origin of the plot, and particular characteristic, such as systematic bias, can also be shown as a group of scatter points. Detailed statistical metric values are shown in Table 4. All the AODs and PM 10 with the application of the ST-kriging method (STK) are much better than those from the noSTK simulation, with reduced errors and biases. Percentage decreases in MFE with the STK hindcasts were found to be 60-67 % for 10 AOD and are 50-63 % for PM 10 . The MFB also decreased by 67-82 % for AOD and by 56-84 % for PM 10 . The noSTK case showed a strong negative bias (i.e., underprediction) and the 12 STK cases also showed less, yet still negative, biases. These negative biases are considered to be systematic, because of the negative bias of the GOCI AOD data (Fig. 6). Additionally, the negative biases are due to underestimation of CMAQ- On the other hand, there are relatively small differences in errors and biases among the 12 STK cases (Fig. 6). Several differences among the 12 sensitivity cases were 20 investigated further. First, the error and bias patterns for the AOD values were different from those for the PM 10 predictions, being associated with the different observation operators. For example, the STK cases with the IMPROVE observation operator (cases C1-C4) exhibited a relatively small bias for PM 10  , and OAs (i.e., A4, B4, and C4) showed better performances for both the AOD and PM 10 predictions. Figure 7 shows the performances of the short-term hindcast system with the 13 different configurations via comparisons between the hourly-averaged PM 10 observations and model PM 10 predictions at the six NAMIS sites, on 09 April, 06 and 16 May 2012, 5 respectively. Only 3 day and six site results were selected and presented here, and more comprehensive performance evaluations are presented in Sect. 3.3. While noSTK failed to reproduce the high PM pollutions, all the STK cases showed significant improvements in the surface PM 10 predictions. However, there was a tendency that the hourly peaks of PM 10 were not well captured by the STK cases.
Consequently, it can be concluded that the combination of GOCART observation operator and CVs of SO 2− 4 and OAs (represented by A3) leads to the best results in the current hindcast system (Table 4). The use of the GOCART observation operator and CVs of SO 2− 4 , NO − 3 , NH + 4 , and OAs (represented by A4) could also provide comparable performance to A3. However, it appears that the differences among the 12 STK cases 15 were relatively small.

Overall performance evaluation of PM 10 hindcast over SMA
In this section, PM 10 from the hindcast experiments were compared with the PM 10 observations from 58 NAMIS sites to evaluate the overall performance of the current hindcast system in SMA. Table 5 provides the statistical metrics that were calculated 20 separately from the first and the second 6 h hindcast results. The main characteristics of the statistical analysis in Table 5 are similar to those at the six sites discussed in the previous section. First, both errors and biases of PM 10 distributions were significantly reduced after the application of the ST-kriging method. The MFEs and MFBs in the 12 h STK simulations decreased by ∼ 40 and ∼ 80 %, respectively. 25 A distinctive difference was also found in the model performances for the first and the second 6 h runs. During the first 6 h, all the hindcast results showed negative biases, with the MFB of ∼ −100 % for the noSTK cases and ∼ −40 % for the STK cases. The performances of the A3 and A4 cases are somewhat better than those of the other STK cases (Table 5). Collectively, the MFEs and MFBs of the STK cases are a factor of 2-4 smaller than those of the noSTK cases during the first 6 h. Figure 8 shows a comparison between the noSTK case and the A3 case, in terms 5 of the PM 10 predictions, during the first and the next 6 h in SMA with the 6 h averaged NAMIS PM 10 observations. As shown, the A3 case produced better PM 10 predictions during the first and the next 6 h. In addition, the A4 case (not shown) also provided similar results to the A3 case, as discussed in Sect. 3.2. It can be confirmed again that the A3 and A4 cases are able to produce better PM 10 predictions against the PM 10 10 observations in SMA.
Hindcast performances from H + 13 to H + 24 were also evaluated with the groundmeasured NAMIS PM 10 data. In short, the differences between all the STK and noSTK cases became smaller than those during the first 12 h (approximate difference of 10 % was found at H + 24, i.e., 24 h after the hindcast actually began). Based on this, it ap- 15 pears that the effects of using the initial PM composition on the hindcast performances may effectively last during the first 12 h. After 12 h, the effects started to diminish. This is due to several facts: (i) the regions for applying the initial PM composition in this study were limited only within the GOCI domain (relatively small region), (ii) although the initial PM composition was used, its effects can be offset by uncertainties and er-20 rors in emissions as time progressed; and (iii) the large uncertainties associated with the formation of SO 2− 4 and OAs in the CTMs can also limit the effects of the initial PM composition. The second and the third are the reasons that there is strong necessity for both emissions and CTMs to be improved continuously, even though the initial PM composition is applied in the short-term forecast activities.

Evaluation of hindcast performance with observed PM composition
In the previous section, PM 10  Although the purpose of this study is to develop a better PM forecast system for accurately predicting "PM 10 mass" concentrations, it is still necessary to more carefully scrutinize the changes in the "PM composition" in accordance with the different selections of the CVs. During the DRAGON-Asia campaign, the PM 2.5 composition was measured for 5 SO 2− 4 , NO − 3 , and NH + 4 with 30 min intervals and for SO 2− 4 , NO − 3 , NH + 4 , OC and BC with 24 h intervals using PILS-IC instrument (semi-continuous measurements) and low airvolume sampler with a Teflon filter (off-line measurements), respectively, in Yongin City near SMA (Fig. 2). Thus, in this section, the selection of the CVs is further discussed with the observed PM 2.5 composition. Figure 9 shows the comparison between 1 h averaged SO 2− 4 , NO − 3 , and NH + 4 concentrations measured via the PILS-IC instrument and model-predicted concentrations during the selected days at the Yongin observation site. Only the STK cases with the GO-CART observation operator (i.e., A1-A4) were selected here. The STK cases showed significant changes in the PM composition with the selection of CVs. For example, 15 the A2 and A3 cases tended to overestimate the SO 2− 4 concentrations but underestimated the NO − 3 , and NH + 4 concentrations, whereas the A1 and A4 cases tended to relatively well capture the trend of the concentrations of the three particulate species. This phenomenon was driven by intra-particulate thermodynamics. That is, if larger amounts of SO 2− 4 are allocated into particles (like the cases of A2 and A3), then NO − 3 20 tends to be evaporated, because SO 2− 4 is more strongly associated with NH + 4 (Song and Carmichael, 1999). As shown in Fig. 9a and b, when the SO 2− 4 concentrations increases (as in case A2), the NO − 3 concentrations decrease accordingly, because NO − 3 is evaporated out of the particulate phase as a form of HNO 3 Carmichael, 1999, 2001). Collectively, the "best" results were produced from the case A4, as shown in Fig. 9a-c.
The 24 h averaged PM 2.5 compositions measured from the PILS-IC instrument and the low air-volume sampler with a Teflon filter during the campaign period are also  Fig. 9d. Again, the observations of the SO 2− 4 , NO − 3 , and NH + 4 concentrations were obtained from both the PILS-IC instrument and the low volume sampler, whereas the concentrations of OAs ( ∼ = [OC] × 1.5) and EC were only measured via the low air-volume sampler. As shown in Fig. 9d, the SO 2− 4 , NO − 3 , and NH + 4 concentrations from both samplers showed good agreements (see circles and crosses in Fig. 9d). The 5 A4 case (the red bars in Fig. 9d) again showed the best results in the comparison between the observed and predicted particulate composition, particularly in SO 2− 4 and OAs. In the previous discussion (see Sects. 3.2 and 3.3), the A3 and A4 cases showed the best performances for "predicting PM 10 mass concentrations" over SMA. This is somewhat consistent with our analysis in this section. However, in case of the A3, it can capture the PM mass behaviors (Sect. 3.3) but does not capture the changes in the PM composition well (this section). Based on this, it is concluded that the A4 case would be the best configuration for accurately predicting the PM composition as well as the PM mass. However, this PM composition analysis was conducted with only one site observations (in Yongin City) in this study. Thus, to reach a firmer conclusion, more 15 intensive analyses with multiple site observations are required in future.

Summary and conclusions
For the purpose of improving the performance of short-term PM forecast in Korea, an integrated air quality modeling system was developed with the application of the ST-kriging method using the geostationary satellite-derived AOD data over Northeast and OAs (case A3) was found to be the best one for the PM 10 mass prediction. The most remarkable result was that the mean fraction bias (MFB) of the PM 10 predictions was approximately five-fold reduced over SMA during high PM pollution periods with the case A3 combination. All the hindcast runs with the application of the ST-kriging, 5 however, generally showed negative biases (i.e. under-predictions). This was primarily due to the underestimation of the GOCI AOD. In addition, the combination of the GO-CART observation operator and the selection of CVs of SO 2− 4 , NO − 3 , NH + 4 , and OAs (Case A4) was found to give the "best" results for the prediction of particulate composition at one observation site. However, more intensive measurements of the PM 10 composition are needed for reaching a more solid conclusion.
The ST-kriging AODs used in the current study are expected to be used in other data assimilation methods. For example, in the 3-DVAR method, the observation error covariance matrix, which presents the degree of errors of the observations, has been usually assumed by linear equations or single constant value (Liu et al., 2011;Schwartz 15 et al., 2012;Shi et al., 2011). However, as discussed with KVs in Sect. 3.1, the error covariance of the AOD observations can be improved, and the use of the improved observation error covariance matrix can help to prepare more accurate AOD fields, for example, via a 3-DVAR method. This study is now underway.
In future, planned GEO satellite sensors will give other opportunities to use semi- instrument, it is being designed to provide backscattered UV/Vis radiances between 300 and 500 nm with a spatial resolution of 5 km × 5 km over a large part of Asia. Using advanced observations from the GEMS sensor, it is anticipated that the system developed here will be able to make significant contributions to further improvements in the performances of the PM forecasting system in Asia. This improved PM predictions and modeling framework can also be a core part for entire air quality forecasting system, a more comprehensive health impact assessments, and radiative forcing estimation over (East) Asia in future.
Appendix A: Spatio-temporal kriging method 5 The ST-kriging methods assume that measured variables in space and time (τ(s, t)) can be regarded as a random function, consisting of a trend component (m) and residual component ( ) of which the mean is zero: The unobserved value τ * (s, t) can be averaged with weight using measured values 10 from the surrounding: where n is the number of observations in local neighborhood and w i (s, t) is the kriging weight assigned to τ(s i , t i ). The kriging weight is determined by a theoretical semivariogram. 15 In case of spatial kriging (τ(s)), the semivariogram (γ) is the best fit to the semivariance (γ * ) as a function of spatial lag (h). Assuming the trend component m(s) in τ(s) is constant over the local domain (i.e., the ordinary kriging method), the semivariance is defined as: where N(h) is the number of paired observations at a spatial distance of h, and τ i (s i +h) is the i th observation (in this study, AOD) separated by h from the observation located at s i . The semivariogram is then depicted by a theoretical model which is the bestfitting curve to the semivariance by minimizing the least square error. For example, a spherical semivariogram (γ), which is commonly used in the theoretical models of 5 the atmospheric studies, is estimated by finding optimal three parameters: (i) nugget (c n ), (ii) range (a); and (iii) partial sill (σ 2 0 ): The range parameter indicates the maximum lag in which the variation of semivariogram is meaningful (Cressie, 1992).

10
To combine the spatial and temporal data for preparing the spatio-temporal semivariograms, the temporal information can be converted into the spatial information (Gräler et al., 2012). First, the spatial and temporal semivariograms are estimated independently using the spherical model from the daily GOCI AOD data. Second, the ratio of spatial range parameter (a s ) of the spatial semivariogram to temporal range parameter 15 (a t ) of the temporal semivariogram (i.e., spatio-temporal scale factor, km h −1 ) is used to convert the unit of temporal lag into the unit of spatial distance. Consequentially, the 3-D spatio-temporal AOD data are converted into the 2-D spatial AOD fields. After that, the spatio-temporal semivariogram is provided to predict the AOD fields with 15 km×15 km spatial resolution from 10:00 to 16:00 LT over the GOCI domain. For the ST-kriging 20 method, the "gstat" (Pebesma, 2004) and the "spacetime" (Pebesma, 2012) software packages in the "R" environment for statistical computing were used (R Development Core Team, 2011). Figure A1 presents an example of the 3-D semivariograms from the fitted model (left) and sample from the GOCI data on 08 April. The mean nugget (c n ), range (a), partial sill (σ 2 0 ) of the spatio-temporal model semivariogram were 0.025, 25 583 km, and 0.227, respectively, during the entire DRAGON-Asia campaign. The average spatio-temporal scale factor of ∼ 34 km h −1 was calculated indicating that the AODs observed before or after 1 h at certain location show a similar correlation pattern to those measured simultaneously at ∼ 34 km apart in the ST-kriging model.

Index of agreement
Root mean square error where N is the number of data and M i and O i are the model value and observation, respectively. The value highlighted by overbar means the arithmetic mean of the data.