Climate change inspector with intentionally biased bootstrapping (CCIIBB ver. 1.0) – methodology development

The outputs from general circulation models (GCMs) provide useful information about the rate and magnitude of future climate change. The temperature variable is more reliable than other variables in GCM outputs. However, hydrological variables (e.g., precipitation) from GCM outputs for future climate change possess an uncertainty that is too high for practical use. Therefore, a method called intentionally biased bootstrapping (IBB), which simulates the increase of the temperature variable by a certain level as ascertained from observed global warming data, is proposed. In addition, precipitation data were resampled by employing a block-wise sampling technique associated with the temperature simulation. In summary, a warming temperature scenario is simulated, along with the corresponding precipitation values whose time indices are the same as those of the simulated warming temperature scenario. The proposed method was validated with annual precipitation data by truncating the recent years of the record. The proposed model was also employed to assess the future changes in seasonal precipitation in South Korea within a global warming scenario as well as in weekly timescales. The results illustrate that the proposed method is a good alternative for assessing the variation of hydrological variables such as precipitation under the warming condition.


Introduction
The complex influence of human actions on the climate system is well represented through global climate models (GCMs). A number of GCMs demonstrate variations in the large-scale atmospheric circulation and related changes in hydrometeorological variables (Allen and Ingram, 2002; Held and Soden, 2006; Lenderink and Van Meijgaard, 2008). It has been generally accepted that quantifying the range of possible changes in the hydrological cycle (such as precipitation and evaporation) is harder than in temperature (Allen and Ingram, 2002). Furthermore, hydrological variables vary much more in space and time than temperature and are difficult to correctly simulate.
The relationship between temperature and precipitation has been studied in the literature in order to predict the future variations of precipitation under the global warming condition. From the Clausius-Clapeyron (C-C) relation, saturation vapor pressure increases by 6-7 % for each 1 °C increase in temperature, and rainfall intensity also increases at a similar rate with warming (Trenberth and Shea, 2005). Lenderink and Van Meijgaard (2008) showed that the intensity of hourly precipitation exhibits a C-C relation for summer while showing super C-C scaling for winter.
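The 6-7 % per 1 °C figure can be checked numerically. The sketch below is not part of the original study: it assumes a Magnus-type approximation for saturation vapor pressure over liquid water (coefficients 6.1094 hPa, 17.625, and 243.04 °C), and the function names are illustrative only.

```python
import math


def saturation_vapor_pressure_hpa(t_celsius: float) -> float:
    """Saturation vapor pressure (hPa) from an assumed Magnus-type formula."""
    return 6.1094 * math.exp(17.625 * t_celsius / (t_celsius + 243.04))


def fractional_increase_per_degree(t_celsius: float) -> float:
    """Relative change in saturation vapor pressure for a 1 degC warming."""
    e0 = saturation_vapor_pressure_hpa(t_celsius)
    e1 = saturation_vapor_pressure_hpa(t_celsius + 1.0)
    return e1 / e0 - 1.0


for t in (10.0, 15.0, 20.0, 25.0):
    pct = 100.0 * fractional_increase_per_degree(t)
    print(f"{t:4.1f} degC: {pct:.1f} % per degC")
```

Under this approximation the rate stays within the quoted 6-7 % band between 10 and 25 °C and is somewhat higher at colder temperatures.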
These relations focus only on very short timescales (not more than daily) or are generally retrieved from GCM outputs. The behavior of mean precipitation over long-term periods such as months and seasons is difficult to predict as temperature increases. It might be beneficial if one could derive the behavior of long-term mean precipitation under warming conditions or the range of possible changes (IPCC, 2013).
Therefore, a simple method that simulates temperature from observed data is proposed in the current study while increasing temperature up to a certain level as a warming scenario. In addition, precipitation is simulated by employing a block-wise resampling technique (Srinivas and Srinivasan, 2000) associated with the temperature simulation. The resampled covariate, precipitation, under a warming condition forced to a certain level is obtained from the simulation. The proposed approach allows for assessment of the impact on precipitation as temperature increases within a current climate horizon.
Published by Copernicus Publications on behalf of the European Geosciences Union.
T. Lee: Simulating climate warming scenarios
The paper is organized as follows. In the next section, the fundamental mathematical background related to biased bootstrapping modeling is presented. The employed data and application methodology are described in Sect. 3. The validation study of the proposed IBB approach is shown in Sect. 4. The results assessing the long-term evolution of seasonal precipitation with simulating weekly temperature and precipitation data are illustrated in Sect. 5. Finally, the summary and conclusions are presented in Sect. 6.

Methodology
In order to simulate the warming scenario, i.e., increasing mean temperature up to a certain level, the observed data must be sampled with a different combination. Intuitively, warmer temperature values are more likely to be resampled among the observations if the mean is increased. Therefore, the method proposed in the current study resamples the observed data with selection probabilities weighted by magnitude, so that the mean temperature of the resampled dataset increases by a fixed amount (see Fig. 1). In addition, block bootstrapping with precipitation was employed to assess the changes in precipitation as temperature increases.

Intentionally biased bootstrapping (IBB)
Bootstrapping (also known as resampling from observed data with replacement) is a statistical method for creating replica datasets from the original data to assess the variability of the quantities of interest without analytical calculation (Davison and Hinkley, 1997; Davison et al., 2003; Ouarda and Ashkar, 1995). This bootstrapping technique has been extended to simulate time series of hydrometeorological variables (Beersma and Buishand, 2003; Lall et al., 1996; Lall and Sharma, 1996; Lee and Ouarda, 2010, 2011; Mehrotra and Sharma, 2005). In the current study, the intentionally biased bootstrapping (IBB) technique is employed so that the mean of the resampled datasets is varied as needed to simulate a global warming scenario.
IBB was proposed by Hall and Presnell (1999) as a class of weighted bootstrapping techniques in order to reduce bias or variance, as well as to render some characteristic equal to a predetermined quantity. A good example of IBB is the adjustment of Nadaraya-Watson kernel estimators to make them competitive with local linear regression (Cai, 2001). In the current study, IBB was employed to simulate the temperature data from observation by bootstrapping under the constraint of increasing mean value, which indicates warming. The conceptual background of IBB has been employed to simulate future climates of weather analogs (Orlowsky et al., 2008, 2010). In the current study, an IBB method with easy manipulation to simulate increased temperature data is proposed. The mathematical description of the proposed IBB method is the following.
Among $n$ observations $x_i$, where $i = 1, \ldots, n$, assume that the observations are resampled with replacement (i.e., bootstrapped) so that the mean of the simulated data increases by as much as $\Delta\mu$; this implies that higher values have a higher probability of being resampled and lower values have a lower selection probability. This IBB can be achieved by assigning different weights $S_{i,n}$ according to the magnitudes of the observations as follows.
$$S_{i,n} = \frac{i}{n} \tag{1}$$
Note that this assigned weight $S_{i,n}$ plays a role in the selection probability for the observed data in the IBB procedure after scaling and adjusting it.
The mean of the resampled data is as follows:
$$\tilde{\mu} = \frac{1}{\psi} \sum_{i=1}^{n} S_{i,n}\, x_{(i)} \tag{2}$$
where $x_{(i)}$ represents the $i$th value in increasing order and $\psi = \sum_{i=1}^{n} S_{i,n}$. The amount of the mean increase $\delta_\mu$ is as follows.
$$\delta_\mu = \tilde{\mu} - \hat{\mu} = \frac{1}{\psi} \sum_{i=1}^{n} S_{i,n}\, x_{(i)} - \frac{1}{n} \sum_{j=1}^{n} x_j \tag{3}$$
To obtain different values of $\delta_\mu$, the weights can be generalized with the weight order ($r$) as follows:
$$\tilde{\mu}(r) = \frac{1}{\psi_r} \sum_{i=1}^{n} S^{r}_{i,n}\, x_{(i)} \tag{4}$$
where $\psi_r = \sum_{i=1}^{n} S^{r}_{i,n}$. The difference is as follows.
$$\delta_\mu(r) = \frac{1}{\psi_r} \sum_{i=1}^{n} S^{r}_{i,n}\, x_{(i)} - \frac{1}{n} \sum_{j=1}^{n} x_j \tag{5}$$
Once the magnitude of the mean increase is given (e.g., temperature increase) as $\Delta\mu$, the weight order $r$ is estimated accordingly. For example, when the temperature change is obtained from the GCM outputs and this change is supposed to be propagated into a specific location and a finer timescale, the selection of the weight order can be performed using a meta-heuristic optimization technique with the objective function as follows.
$$\text{Minimize } \left[ \delta_\mu(r) - \Delta\mu \right]^2 \tag{6}$$
In the current study, the harmony search (HS) was used for the meta-heuristic optimization. The performance of the HS in hydrological applications is well reviewed in the literature (Geem et al., 2001; Lee and Geem, 2004, 2005; Lee and Jeong, 2014a; Mahdavi et al., 2007; Yoon et al., 2013a). Note that if $r > 0$, then $\delta_\mu(r) > 0$, which implies a global warming scenario; if $r < 0$, then $\delta_\mu(r) < 0$, which implies a global cooling scenario. When $r < 0$, lower values are resampled more frequently than higher values, causing the mean of the resampled data to decrease. Furthermore, if $r$ goes to infinity, then the maximum of the observations is always selected, and if $r$ goes to negative infinity, only the minimum is chosen.
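The estimation of the weight order can be illustrated with a short sketch. It is not the author's harmony search: because the mean shift of a rank-weighted mean does not decrease as $r$ grows, plain bisection is swapped in here as a simpler root finder for the same objective. The rank weight $(i/n)^r$, the function names, and the sample temperatures are assumptions made for illustration.

```python
def ibb_mean_shift(sorted_x, r):
    """Mean shift delta_mu(r) of the weighted mean over ascending-sorted data."""
    n = len(sorted_x)
    weights = [((i + 1) / n) ** r for i in range(n)]  # assumed rank weights (i/n)^r
    psi_r = sum(weights)
    biased_mean = sum(w * x for w, x in zip(weights, sorted_x)) / psi_r
    return biased_mean - sum(sorted_x) / n


def estimate_weight_order(x, target_shift, lo=-50.0, hi=50.0, tol=1e-10):
    """Find r with delta_mu(r) close to target_shift by bisection (not harmony search)."""
    sorted_x = sorted(x)
    if not ibb_mean_shift(sorted_x, lo) <= target_shift <= ibb_mean_shift(sorted_x, hi):
        raise ValueError("target shift cannot be reached by resampling these data")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ibb_mean_shift(sorted_x, mid) < target_shift:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical annual mean temperatures (degC), for illustration only.
temps = [10.2, 11.5, 9.8, 12.1, 10.9, 11.0, 12.6, 9.5, 10.4, 11.8]
r_warm = estimate_weight_order(temps, 0.5)
print(r_warm, ibb_mean_shift(sorted(temps), r_warm))
```

A target shift beyond the gap between the observed mean and the observed maximum raises an error, which mirrors the point that resampling cannot exceed the observed range.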
In the IBB procedure, the adjusted scaled weight $\eta_i = S^{r}_{i,n} / \psi_r$, with $\psi_r = \sum_{j=1}^{n} S^{r}_{j,n}$, is the probability that the $i$th ordered data point is selected. In the case of $n = 30$, the weights for $i = 1, \ldots, n$ are shown in Fig. 2 with the weight order of $r = 0.5$. The figure shows that the probability of being selected (i.e., $\eta_i$) ranges from approximately 0.01 for the lowest values to approximately 0.05 for the highest values, which leads to a positive bias in the resampled data (e.g., a 1.0 °C increase). For example, if the number of draws is 100 and $\eta_i = 0.05$, then the data point is expected to be selected about 5 times. A different probability implies a different number of selections for each data point. Subsequently, a different number of selections may lead to changes in variance, called variance reduction or inflation. This issue is dealt with in the following section.
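The selection probabilities for the $n = 30$, $r = 0.5$ case can be computed directly. This is a minimal sketch assuming the rank-based weight $i/n$ raised to the power $r$, a form that reproduces the approximate end values of 0.01 and 0.05 quoted above; the function name is illustrative.

```python
def selection_probabilities(n, r):
    """Adjusted scaled weights eta_i for ranks i = 1..n, assuming weights (i/n)^r."""
    raw = [(i / n) ** r for i in range(1, n + 1)]
    psi_r = sum(raw)  # normalizing constant
    return [w / psi_r for w in raw]


eta = selection_probabilities(30, 0.5)
print(f"lowest rank: {eta[0]:.4f}, highest rank: {eta[-1]:.4f}, sum: {sum(eta):.4f}")
```

Setting `r = 0` gives equal probabilities of `1/n`, which is the ordinary unbiased bootstrap.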

Variance reduction and inflation
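Figure 2. Example of the adjusted scaled weights ($\eta_i$) vs. order numbers in the case of $n = 30$ and weight order $r = 0.5$. Note that $\eta_i$ is the probability of being selected and increases as the order is increased, so that higher values are subject to being selected more often than lower values, leading to a positive bias.

Because of the biased selection of higher values, the variance of the resampled data is reduced (Lee and Jeong, 2014). The variance of the resampled data is as follows:
$$\tilde{\sigma}^2(r) = \frac{1}{\psi_r} \sum_{i=1}^{n} S^{r}_{i,n}\, x_{(i)}^2 - \left[ \frac{1}{\psi_r} \sum_{i=1}^{n} S^{r}_{i,n}\, x_{(i)} \right]^2 \tag{7}$$
where $\psi_r = \sum_{i=1}^{n} S^{r}_{i,n}$ and $x_{(i)}$ is the $i$th value in increasing order. Note that the variance in Eq. (7) is based on $\sigma^2 = E(X^2) - (EX)^2$. The difference of the variance is as follows:
$$\delta_{\sigma^2} = \hat{\sigma}^2 - \tilde{\sigma}^2(r) \tag{8}$$
where $\hat{\sigma}^2$ is the sample variance of the observed data. To overcome the reduction of the variance in IBB, a random perturbation can be applied to the resampled data $X_R$ as follows:
$$Y = X_R + \sqrt{\delta_{\sigma^2}}\, \varepsilon \tag{9}$$
where $\varepsilon$ is a random variable with a normal distribution $N(0, 1)$. Subsequently, the mean and variance of the perturbed data are as follows:
$$E(Y) = E(X_R), \qquad \mathrm{Var}(Y) = \tilde{\sigma}^2(r) + \delta_{\sigma^2} = \hat{\sigma}^2$$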
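The perturbation step can be checked numerically. The sketch below assumes rank weights of the form $(i/n)^r$, uses the population form of the variance so that it matches $E(X^2) - (EX)^2$, and clips the variance difference at zero in case the biased variance happens to exceed the observed one. The function name and the sample values are hypothetical.

```python
import math
import random


def ibb_resample_with_inflation(x, r, size, rng):
    """Draw biased samples and add Gaussian noise that restores the observed variance."""
    sorted_x = sorted(x)
    n = len(sorted_x)
    raw = [((i + 1) / n) ** r for i in range(n)]  # assumed rank weights (i/n)^r
    psi_r = sum(raw)
    eta = [w / psi_r for w in raw]
    # Theoretical mean and variance of the biased resample.
    mean_biased = sum(p * v for p, v in zip(eta, sorted_x))
    var_biased = sum(p * v * v for p, v in zip(eta, sorted_x)) - mean_biased ** 2
    # Observed variance (1/n form, an assumption) and the variance difference.
    mean_obs = sum(x) / n
    var_obs = sum((v - mean_obs) ** 2 for v in x) / n
    delta_var = max(var_obs - var_biased, 0.0)
    draws = rng.choices(sorted_x, weights=eta, k=size)
    perturbed = [v + math.sqrt(delta_var) * rng.gauss(0.0, 1.0) for v in draws]
    return perturbed, delta_var


rng = random.Random(42)
temps = [10.2, 11.5, 9.8, 12.1, 10.9, 11.0, 12.6, 9.5, 10.4, 11.8]  # hypothetical
y, delta_var = ibb_resample_with_inflation(temps, 1.0, 20000, rng)
mean_y = sum(y) / len(y)
var_y = sum((v - mean_y) ** 2 for v in y) / len(y)
print(f"variance difference: {delta_var:.3f}, mean: {mean_y:.2f}, variance: {var_y:.3f}")
```

Because the added noise has zero mean, the shifted mean is kept while the variance returns to roughly the observed level.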

Block bootstrapping
Bootstrapping is random sampling with replacement, and block bootstrapping resamples blocks. Each block contains a set of predictor and predictand, as in a regression. Here, temperature and precipitation can be set as a block, and they act as predictor and predictand, respectively.
www.geosci-model-dev.net/10/525/2017/ Geosci. Model Dev., 10, 525-536, 2017
When the temperature is assumed to increase by a certain degree, it is of interest how the other weather variables vary. For example, if the temperature is increased by 1 °C, a major concern in climate research is how the precipitation will change.
To address this question, the block bootstrapping technique is adapted for the precipitation variable (Carlstein et al., 1998; Lee et al., 2010). Once the temperature is resampled from the observed data at certain times using IBB, the observed precipitation data from the same times are taken (see Fig. 2). Unlike for the case of temperature, no substantial variance reduction is expected in the resampled precipitation data because the precipitation data are not conditionally resampled. This block bootstrapping technique is popularly employed in multivariate weather simulations (Lee and Jeong, 2014b; Lee et al., 2012).
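The pairing by time index can be sketched as follows. Time indices are drawn with probabilities driven by the temperature rank, and each drawn index carries its own precipitation value along. The weight form $(k/n)^r$, the function name, and the sample values are assumptions for illustration, and the weight order is taken as already known.

```python
import random


def ibb_block_resample(temperature, precipitation, r, size, rng):
    """Resample (temperature, precipitation) blocks by biased draws of time indices."""
    n = len(temperature)
    # order[k] is the time index of the (k+1)th smallest temperature.
    order = sorted(range(n), key=lambda j: temperature[j])
    weights = [((k + 1) / n) ** r for k in range(n)]  # unnormalized selection weights
    picked = rng.choices(order, weights=weights, k=size)
    blocks = [(temperature[j], precipitation[j]) for j in picked]
    return blocks, picked


rng = random.Random(1)
temps = [10.2, 11.5, 9.8, 12.1, 10.9, 11.0, 12.6, 9.5, 10.4, 11.8]    # hypothetical
precip = [1210, 1340, 1105, 1420, 1280, 1190, 1510, 1075, 1230, 1365]  # hypothetical
blocks, picked = ibb_block_resample(temps, precip, 1.0, 5000, rng)
mean_t = sum(t for t, _ in blocks) / len(blocks)
mean_p = sum(p for _, p in blocks) / len(blocks)
print(f"resampled mean temperature: {mean_t:.2f}, mean precipitation: {mean_p:.1f}")
```

Only the temperature ranks enter the selection weights, so any change in the resampled precipitation mean comes from the observed pairing rather than from a constraint placed on precipitation.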

Overall simulation procedure
The overall simulation procedure of temperature and precipitation data is described in this section. A simple schematic presentation of the procedure is shown in Fig. 1. Let $x_i$ and $y_i$ ($i = 1, \ldots, n$) be the observed temperature and precipitation data, respectively. Suppose that the simulation length is the same as the record length (i.e., $n$) and 100 series need to be simulated.
a. Assume that the increase in the overall temperature mean is known as $\Delta\mu$.
b. Estimate the weight order ($r$) with a meta-heuristic algorithm (here, harmony search) using the objective function of Eq. (6) and the observed temperature data.
c. Resample the temperature data from the observations, selecting the $i$th ordered value $x_{(i)}$ ($i = 1, \ldots, n$) with a probability proportional to $S^{r}_{i,n}$.
d. Assume that the $k$th ordered temperature value $x_{(k)}$ is resampled in step (c) and that its corresponding time index is $j$. Note that $(k)$ indicates the rank in increasing order and $j$ indicates the time index. Then, the $j$th precipitation value, $y_j$, is resampled simultaneously.
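Steps (a) to (d) can be put together in one short sketch. It is a simplified stand-in, not the author's code: bisection replaces harmony search for the weight order, the rank weights are assumed to be $(k/n)^r$, variance inflation is left out, and the function names and synthetic data are hypothetical.

```python
import random


def mean_shift(sorted_t, r):
    """Mean shift of the rank-weighted mean relative to the plain mean."""
    n = len(sorted_t)
    w = [((k + 1) / n) ** r for k in range(n)]
    return sum(wk * tk for wk, tk in zip(w, sorted_t)) / sum(w) - sum(sorted_t) / n


def estimate_r(sorted_t, delta_mu, lo=-50.0, hi=50.0, tol=1e-10):
    """Step (b): bisection on r (a swap-in for the harmony search)."""
    if not mean_shift(sorted_t, lo) <= delta_mu <= mean_shift(sorted_t, hi):
        raise ValueError("requested mean change is outside the reachable range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_shift(sorted_t, mid) < delta_mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_ibb(temperature, precipitation, delta_mu, n_series=100, seed=0):
    """Steps (a)-(d): simulate n_series paired series of the record length."""
    rng = random.Random(seed)
    n = len(temperature)
    order = sorted(range(n), key=lambda j: temperature[j])  # time index of each rank
    r = estimate_r([temperature[j] for j in order], delta_mu)
    weights = [((k + 1) / n) ** r for k in range(n)]
    series = []
    for _ in range(n_series):
        picked = rng.choices(order, weights=weights, k=n)  # step (c)
        series.append([(temperature[j], precipitation[j]) for j in picked])  # step (d)
    return r, series


# Synthetic 30-value record, for illustration only.
temps = [10.0 + 0.1 * ((7 * i) % 30) for i in range(30)]
precip = [1000.0 + 20.0 * ((11 * i) % 30) for i in range(30)]
r, sims = simulate_ibb(temps, precip, delta_mu=0.5)
all_t = [t for s in sims for t, _ in s]
print(f"r = {r:.3f}, simulated mean shift = {sum(all_t) / len(all_t) - sum(temps) / 30:.3f}")
```

Fixing the seed makes the 100 series reproducible, which helps when the simulated and observed statistics are compared afterwards.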

Data description and application methodology
In the current study, 54 weather stations that record temperature and precipitation in South Korea with more than 30 years of record length, and which are managed by the Korea Meteorological Administration (KMA), were employed. South Korea is located in eastern Asia and has a mean annual precipitation of 1283 mm according to the KMA. This country is climatologically influenced by the Siberian air mass during winter and the Maritime Pacific High during summer. Most of the annual precipitation in South Korea falls during the rainy season from June to September due to the occurrence of tropical cyclones, extratropical cyclones, fronts, and other weather systems. Because the orographic area in South Korea is heterogeneous and large, the rainfall in South Korea has large spatial and temporal variability (Park et al., 2007; Yoon et al., 2013b). The water resource control system, including climate change, is an important aspect of this study due to the seasonal and spatial variability of rainfall in this country. Datasets shorter than 30 years of data were excluded, after which a total of 54 datasets were employed. The data were extracted from the KMA website (http://www.kma.go.kr/). Most of the time spans are approximately 33 years, from 1976 to 2008.
The validation study was performed with an annual dataset to present the performance of the proposed model, with the recent years 1994-2008 truncated. The truncated data were not used in simulation but employed in validation. Also, a case study was carried out with the weekly dataset of the 54 stations in South Korea. In the application study of the proposed IBB procedure in Sect. 5, (1) 0.5 and 1.0 °C increases in the mean weekly temperature were assumed; (2) weekly temperature datasets were simulated using the assumed temperature increase; and (3) weekly precipitation datasets were also simulated along with the weekly temperature dataset as a block. Note that the simulation does not include a gradual change, such as a trend, but only the overall mean change. We simulated the weekly timescale so that the data spanned a long enough period to provide a summary of weather statistics and a short enough period to reflect the temporal variability. Furthermore, the observed weekly datasets of temperature and precipitation were aggregated into seasonal timescale data, and the aggregated seasonal data were used to present the seasonal variations in precipitation as temperature increases.
Note that although we simulated the temperature with a specific condition of increase (e.g., +0.5 or +1.0 °C), no such restriction was placed on the precipitation, allowing one to determine whether there is any change in precipitation with the condition of increasing temperature. One hundred series were simulated with the same time span as the observations.

Validating IBB model with annual data
To further demonstrate the credibility of the proposed IBB model, we validated the model by truncating the last 15 years (1994-2008) of the annual mean temperature and precipitation data over South Korea. The last truncated 15 years were set as the validation period while the preceding years were set as the test period. The dataset of the test period was employed in the simulation while the dataset of the validation period was used only in comparison, to check how well the proposed model performs. Annual-scale data are employed to easily illustrate the performance of the proposed IBB model. First, some mathematical terms need to be defined to explain the validation procedure as follows.
$$D\mu^{\mathrm{obs}}_{p} = \mu^{p}_{V} - \mu^{p}_{T}$$
$$D\mu^{\mathrm{IBB}}_{p} = \mu^{p}_{\mathrm{IBB}} - \mu^{p}_{T}$$
where $\mu^{p}_{V}$ and $\mu^{p}_{T}$ are the mean annual precipitation over the validation years and over the test period, respectively, while $\mu^{p}_{\mathrm{IBB}}$ is the annual mean precipitation of the IBB-simulated data with the record length of the validation years.
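The two differences compared in the validation can be computed as in the sketch below. It assumes that each difference is taken relative to the test-period mean (validation mean minus test mean, and IBB-simulated mean minus test mean); the function name and the numbers are hypothetical.

```python
from statistics import mean


def validation_differences(precip_by_year, first_validation_year, simulated_series):
    """Return (observed difference, IBB difference) of mean annual precipitation."""
    test = [p for y, p in precip_by_year.items() if y < first_validation_year]
    valid = [p for y, p in precip_by_year.items() if y >= first_validation_year]
    mu_test, mu_valid = mean(test), mean(valid)
    # Average over the simulated series, each with the validation record length.
    mu_ibb = mean(mean(series) for series in simulated_series)
    return mu_valid - mu_test, mu_ibb - mu_test


# Hypothetical record and simulations (mm), for illustration only.
record = {1990: 100.0, 1991: 120.0, 1992: 110.0, 1993: 130.0, 1994: 150.0, 1995: 140.0}
sims = [[140.0, 150.0], [130.0, 140.0]]
d_obs, d_ibb = validation_differences(record, 1994, sims)
print(d_obs, d_ibb)
```

A close match between the two returned values at a station is what the comparison in Figs. 5 and 6 looks for.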
In Fig. 4, the annual mean precipitation of the observation over the validation period ($\mu^{p}_{V}$, filled blue circle) and the test period ($\mu^{p}_{T}$, filled red triangle) as well as the IBB simulation ($\mu^{p}_{\mathrm{IBB}}$, box plot) is illustrated. The result indicates that the observed mean precipitation over the validation period ($\mu^{p}_{V}$) is higher than the mean for the test period ($\mu^{p}_{T}$) at most of the stations. The IBB-simulated data reflect this tendency, showing higher mean precipitation than the mean precipitation of the test period, though the magnitude shows some difference. The mean of the observed annual precipitation for the validation period at each station and the mean of one hundred IBB-simulated datasets are shown in Fig. 5. The top panel shows that the simulated data reproduce fairly well the observed mean of annual precipitation for the validation period (1994-2008). The observed mean difference ($D\mu^{\mathrm{obs}}_{p}$) of the annual precipitation between the test period (1976-1993) and the validation period, shown in the bottom panel of Fig. 5, matches fairly well with that of the IBB-simulated data ($D\mu^{\mathrm{IBB}}_{p}$). Rather high variability in the difference is inevitable due to the relatively short record length of both the test period and the validation period. Overall, the validation study indicates that the proposed IBB approach can simulate the future evolution of annual precipitation over South Korea.
In Fig. 6, the spatial distribution of the differences for the annual mean precipitation is shown with the observed data (i.e., $D\mu^{\mathrm{obs}}_{p}$) and with the IBB-simulated data ($D\mu^{\mathrm{IBB}}_{p}$). A high increase of annual mean precipitation in the northern and southern parts of the country and a small increase and slight decrease in the southern part shown in the observed data (left panel) are well reflected in the IBB-simulated data (right panel), except that the increase shown from the IBB-simulated data (right panel) in the southwestern part of the country is not shown in the observed data. Overall, the figure indicates that the spatial pattern of the annual mean precipitation difference from the observed data (see the left panel) is similar to the one from the IBB-simulated data (see the right panel).

Precipitation changes according to assumed temperature increase
Figure 7 shows the results of the fitted IBB model for the Buan station, located at 35°44′ N and 126°43′ E. The top panel (Fig. 7a) shows the estimated weight order of each week for the mean temperature data, obtained with the HS meta-heuristic algorithm and the objective function of Eq. (6) while assuming a 0.5 °C increase. The estimated values of the weight order range from 0.2 to 1.3. The mean and standard deviation of the observed and theoretical results (see Eqs. 2 and 7) with a 0.5 °C mean increase are shown in Fig. 7b and c, respectively. The predominant annual cycle of the mean weekly temperature is seen in the mean statistics, as shown in Fig. 7b, while the annual cycle of the standard deviation (equivalent to the square root of variance) is not as prominent as the annual cycle of the mean (see Fig. 7c). Note that the weight order and the standard deviation (see Fig. 7a and c) are highly negatively correlated. In other words, when the standard deviation is small (e.g., at approximately the 23rd week), the weight order is high, and vice versa. This result is intuitive: if the variance is large, the temperature values differ greatly from each other, so the weights of the large values need not differ much from the weights of the low values to reach the target mean increase, which induces a low weight order. In Fig. 7c, the variance difference between the observed and theoretical data, as defined in Eq. (8), is shown with a dotted line. This variance difference is restored in the resampled data through the inflation in Eq. (9). This inflation procedure is optional in assessing the overall trend of annual mean precipitation data regarding climate warming scenarios. However, it might be helpful when the purpose of the study is to evaluate an overall variation of extreme precipitation statistics.
The statistics of the simulated data from IBB with the condition of a 0.5 °C mean temperature increase are shown as a box plot in Fig. 8; the statistics of the observed data are shown in the same figure with dotted lines and cross marks. The mean increases by 0.5 °C, as intended, and the standard deviation (square root of variance) is well preserved through the variance inflation process (see Eq. 9). The minima and maxima of the mean weekly temperatures are increased.
Shown in Fig. 9a are the mean differences between the simulated and observed weekly precipitation with the conditions of 0.5 and 1.0 °C increases at the Buan station. The differences are not significant at the 5 % level. However, the mean differences are continuously positive from the 30th to 40th week, which is during the summer season. This result suggests a seasonal effect on the precipitation change. Therefore, we also extended our study to a seasonal timescale. The mean precipitation differences of all 54 stations are shown for 0.5 and 1.0 °C increases in Fig. 9b and c, respectively. Both plots show a decrease in autumn and increases in the other seasons. For a 1.0 °C temperature increase, 61, 24, and 45 % of the employed stations show a significant increase in mean precipitation for the winter, spring, and summer seasons, respectively. In contrast, the mean precipitation decreases during the autumn season. Approximately 30 % of the stations experience a significant change in the mean precipitation at the 5 % level given a 1.0 °C temperature increase. The detailed information is provided in Table 1.
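The text does not state which test underlies the 5 % significance statements. One common choice, shown here purely as an assumption, is a normal-approximation interval of half-width $z_{0.975}\, s / \sqrt{n}$ around the observed mean, where $s$ is the standard deviation of the observations; a mean difference outside that interval is then flagged as significant. The function name and the values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_difference_is_significant(observed, simulated_means, level=0.05):
    """Flag a simulated-minus-observed mean difference under an assumed normal interval."""
    n = len(observed)
    z = NormalDist().inv_cdf(1.0 - level / 2.0)
    half_width = z * stdev(observed) / math.sqrt(n)
    diff = mean(simulated_means) - mean(observed)
    return diff, half_width, abs(diff) > half_width


# Hypothetical seasonal precipitation totals (mm) and simulated series means.
observed = [100.0, 120.0, 90.0, 110.0, 105.0, 95.0, 115.0, 100.0]
diff, half_width, significant = mean_difference_is_significant(observed, [112.0, 118.0, 121.0])
print(f"difference = {diff:.2f} mm, interval = +/-{half_width:.2f} mm, significant = {significant}")
```

Because the half-width depends on each station's variance, the interval differs from station to station under this assumption.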
The spatial distribution of seasonal mean precipitation differences is shown in Fig. 10 given the condition of a 1 °C temperature increase. An increasing pattern of precipitation during winter (see Fig. 10a) can be seen over the South Korean peninsula. Notably, the eastern and southern coastal areas undergo a significant increase with a 95 % confidence interval (±5.38). Note that the significance interval at each station is different because the variances between stations are different. The detailed significance interval for each station is provided in Table 2. During spring (see Fig. 10b), the northern part of the country shows an increasing pattern while the southwestern and southeastern parts show decreasing patterns, but their magnitudes are not significant (±15.04). The summer precipitation (see Fig. 10c) undergoes a significant increase in the southwestern area of the country (±29.94). In contrast to the other seasons, a significant decrease in mean precipitation occurs during autumn (see Fig. 10d) throughout the country, especially over the eastern coastal area. The same spatial pattern of seasonal mean precipitation can be observed given the condition of a 0.5 °C temperature increase, as in the case of a 1.0 °C temperature increase, with little significant change (see Fig. 11).
The spatial distributions of seasonal precipitation changes seem to be related to the flow direction of the seasonal air mass.In South Korea, winter is influenced primarily by the Siberian air mass with prevailing northwesterly winds, while summer is hot and humid with southeasterly winds.

Summary and conclusions
A simple method is proposed (1) to simulate precipitation given the condition of a mean temperature increase derived from the observations and (2) to address the problem of how the precipitation varies while the temperature is increased through global warming.The results illustrated that a simple IBB technique for the temperature variable, incorporating block sampling of precipitation, can achieve this objective.
The presented technique is valuable because hydrometeorological variables such as precipitation and discharge are difficult to model with current GCMs, while the temperature prediction is relatively accurate. The proposed method can be extended to other hydrometeorological variables as well as other applications, including studies at the global scale. A limitation of the proposed method is that the attainable temperature increase is bounded, since the resampled values come from observations. One possibility for allowing a greater temperature increase than the observations permit is to include neighboring, similar stations or seasons. The author believes that the proposed model can be a good surrogate or competitor in GCM-based climate change impact assessments of hydrometeorological variables.
The proposed IBB method is not a physically based method but a statistical simulation approach in which the physical mechanism of precipitation cannot be taken into consideration. Substantial modification might be required to accommodate this mechanism. The proposed IBB method is conditioned only on the mean temperature change. A further scheme can be developed to consider the changes of multiple variables by classifying the conditions of the variables of interest. Another possible extension of the current study is the analysis of the future variation of hydrological extreme events (e.g., extreme floods). When a long-term variation of hydrological extreme events is related to precipitation, the proposed IBB method can be used to derive the variation.

Code and data availability
All the employed code can be provided upon request to the author of the current study. The employed precipitation and temperature data over South Korea can be downloaded from the KMA website http://www.kma.go.kr/weather/climate/pastcal.jsp.
The Supplement related to this article is available online at doi:10.5194/gmd-10-525-2017-supplement.

Figure 1. Procedure for the proposed IBB simulation method of temperature and precipitation data.

Figure 7. (a) Estimated weight order from HS and weekly statistics of (b) mean and (c) variance for the observed temperature data (solid line) and the theoretical statistics (dashed line with cross) using Eqs. (2) and (7), for Buan station. The weekly difference in variance between observed and theoretical values (see Eq. 8) is shown in panel (c) by a dotted line.

Figure 8. The statistics of the observed (dotted line with cross) and generated (box plot) data for the weekly mean temperature using IBB, with a 0.5 °C temperature increase in Buan, South Korea. Boxes display the interquartile range (IQR), and whiskers extend to the extrema (i.e., maximum and minimum). The horizontal lines inside the boxes depict the median of the data. Note that the mean and maximum of the simulated data are increased significantly compared with the corresponding observed data, while the minimum of the simulated data is slightly increased and the standard deviation of the simulated data agrees with that of the observed data due to the variance inflation, as in Eq. (9).

Figure 9. The mean precipitation differences of the observed and simulated data (a) for the weekly precipitation in Buan with a 0.5 °C mean temperature increase, (b) for the seasonal precipitation of all 54 stations with a 0.5 °C mean temperature increase, and (c) for a 1.0 °C mean temperature increase. Note that the mean of the simulated precipitation data is indicated for weekly (a) or seasonal (b, c) time frames.

Figure 10. Spatial distributions in South Korea of the mean difference in seasonal precipitation (mm) with a 1.0 °C increase in mean temperature. Note that the scale for the summer distribution is different from the other seasons, the 95 % significance intervals are different at each station, and the mean values of the significance intervals are ±5.38, ±15.04, ±29.94, and ±4.84 for winter (December, January, February), spring (March, April, May), summer (June, July, August), and autumn (September, October, November), respectively.

Figure 11. Spatial distribution of the mean difference in seasonal precipitation (mm) with a 0.5 °C increase in mean temperature in South Korea. Note that the scale for summer is different from the other seasons, the 95 % significance intervals are different at each station, and the mean values of the significance intervals are ±5.38, ±15.04, ±29.94, and ±4.84 for winter (December, January, February), spring (March, April, May), summer (June, July, August), and autumn (September, October, November), respectively.

Table 1. Mean precipitation difference of the observed and simulated data for seasonal data over all the employed stations in South Korea in the case of a +1.0 °C mean temperature increase.

Table 2. Confidence interval for mean precipitation difference of the observed and simulated data for seasonal data.