A Hydrological Emulator for Global Applications –

While global hydrological models (GHMs) are very useful in exploring water resources and interactions between the Earth and human systems, their use often requires numerous model inputs, complex model calibration, and high computation costs. To overcome these challenges, we construct an efficient open-source and ready-to-use hydrologic emulator (HE) that can mimic complex GHMs at a range of spatial scales (e.g., basin, region, globe). We constructed both a lumped and a distributed scheme of the HE based on the monthly abcd model, for users' choice – minimal computational cost with reasonable model fidelity, or heavier computational load with better predictability. Model predictability and computational efficiency were evaluated in simulating global runoff from 1971–2010 with both the lumped and distributed schemes. The results are compared against the runoff product from the widely-used Variable Infiltration Capacity (VIC) model. Our evaluation indicates that the lumped and distributed schemes present comparable results regarding annual total quantity, spatial pattern and temporal variation of the major water fluxes (e.g., total runoff, evapotranspiration) across the global 235 basins (e.g., correlation coefficient r between the annual total runoff from either of these two schemes and the VIC is >0.96), except for several cold (e.g., Arctic, Interior Tibet), dry (e.g., North Africa) and mountainous (e.g., Argentina) regions. Compared against the monthly total runoff product from the VIC (aggregated from daily runoff), the global mean Kling-Gupta efficiencies are 0.75 and 0.79 for the lumped and distributed schemes, respectively, with the distributed one better capturing spatial heterogeneity. Notably, the computation efficiency of the lumped scheme is two orders of magnitude higher than the distributed one, and seven orders more efficient than the VIC model. Our results suggest that the revised lumped abcd model can serve as an efficient and acceptable HE for complex GHMs and is suitable for broad practical use, and the distributed scheme is also an efficient alternative if spatial heterogeneity is of more interest.

extraordinary advantage in computational efficiency.Our results suggest that the revised lumped "abcd" model can serve as an efficient and acceptable HE for complex GHMs and is suitable for broad practical use, and the distributed scheme is also an efficient alternative if spatial heterogeneity is of more interest.

Introduction
A global hydrological model (GHM) is an effective tool to understand how water moves between soil, plants and the atmosphere.In terms of spatial discretization, hydrological models can be classified into: 1) lumped models treating one basin as a homogeneous whole and disregarding spatial variations, such as the Sacramento Soil Moisture Accounting Model (Burnash et al., 1973); and 2) distributed models where the entire basin is divided into small spatial units (e.g., square cells or triangulated irregular network) to capture spatial variability, such as the PCRaster Global Water Balance (Van Beek and Bierkens, 2009) and the WASMOD-M (Widén-Nilsson et al., 2007).For simplicity, models with division of one basin into separate areas or sub-basins are also categorized as distributed ones here.The corresponding predictability and computational efficiency of GHMs may vary from model to model, due to difference in complexity and structure.Recent years have seen rapid progress in GHMs.They are widely used in assessing the impacts of climate change and land surface changes on the water cycle (Alcamo and Henrichs, 2002;Arnell and Gosling, 2013;Liu et al., 2013;Liu et al., 2014;Nijssen et al., 2001a), exploring spatial and temporal distribution of water resources (Abdulla et al., 1996;Alkama et al., 2010;Bierkens and Van Beek, 2009;Gerten et al., 2005;Tang et al., 2010), examining how human activities alter water demand and water resources (De Graaf et al., 2014;Döll et al., 2009;Hanasaki et al., 2008;Liu et al., 2015;Rost et al., 2008;Vörösmarty et al., 2000), and investigating the interactions between human activities and water availability by incorporating GHM with integrated assessment models (Kim et al., 2016).
Applying GHMs usually requires miscellaneous inputs, high computational costs, and a complex calibration process.These challenges stand out in practical situations, especially when the computational resources are limited.For instance, sensitivity analysis and uncertainty quantification are often needed for decision making, but the users usually cannot afford to run a large number of simulations with many GHMs like the VIC (also categorized as land surface model (LSM)) due to their high computational expense (Oubeidillah et al., 2014).Another situation is when the users seek reasonable estimates of water resources with minimal efforts rather than acquiring highly accurate estimates through expensive inputs of time and efforts.For example, when users seek to explore the hydroclimatology of a region and its long-term water balance (Sankarasubramanian and Vogel, 2002), then GHMs with fine spatial (e.g., 1/8 degree) and temporal resolution (e.g., hourly) are not necessarily needed.In this case, simple models that possess reasonable predictability and are computationally efficient tend to be more suitable.
In addition, some studies have shown that GHMs/LSMs are sometimes outperformed by simple empirical statistical models (Abramowitz, 2005;Abramowitz et al., 2008;Best et al., 2015), suggesting that some GHMs/LSMs may underutilize the information in their climate inputs and that model complexity may undermine accurate prediction.This also indicates the potential advantages of simple model over complex GHMs/LSMs.Thus, constructing simple models that can emulate the dynamics of more complex and computational expensive models (e.g., GHMs/LSMs) is warranted.
The motivation of this work arises from the need to construct a hydrological emulator (HE) that can efficiently mimic the complex GHMs to address the abovementioned issues for practical use, which provides the opportunity of speeding up simulations at the cost of introducing some simplification.We develop a HE that is ready-to-use and efficient for any interested groups or individuals to assess water cycle at basin/regional/global scales.This HE possesses the following features: 1) minimum number of parameters; 2) minimal climate input that is easy to acquire; 3) simple model structure; 4) reasonable model fidelity that captures both the spatial and temporal variability; 5) high computational efficiency; 6) applicable in a range of spatial scales; and 7) open-source and well-documented.
To achieve our goal of identifying a suitable HE, we have explored many hydrological models to find one that may meet our needs.We start with a simple baseline model characterized by mean seasonal cycle; i.e., the inter-annual mean value for every calendar day (Schaefli & Gupta, 2007).Among others, we also explore the "abcd" model because: 1) it is widely-used and proven to have reasonable predictability (Fernandez et al., 2000;Martinez and Gupta, 2010;Sankarasubramanian and Vogel, 2002;Sankarasubramanian and Vogel, 2003;Thomas, 1981;Vandewiele and Xu, 1992;Vogel and Sankarasubramanian, 2003); 2) it uses a monthly time step and requires less computational cost than daily or hourly models; 3) it has solid physical basis hence has potential to be extended to other temporal scales (Wang and Tang, 2014); 4) it requires minimal and easily-available inputs; 5) it only involves 4-7 parameters; and 6) it can simulate variables of interest such as recharge, direct runoff and baseflow that many other simple models can't simulate (Vörösmarty et al., 1998) .This study marks the first time that the "abcd" based model is applied globally, and also the first time the predictability and computational efficiency for both the lumped and distributed schemes are evaluated.Below we describe the baseline and the "abcd" models and data in Section 2; and we present the evaluation of the two models, discuss their appropriateness of serving as a HE in Section 3; finally, in Section 4 we summarize this work with concluding remarks.

Model description
We examine two simple models -baseline and the "abcd" model (both lumped and distributed scheme) in order to identify a suitable one for serving as a HE.

Baseline model
Following the work of Schaefli & Gupta (2007), we explore a baseline model characterized by the inter-annual mean value for every calendar day, i.e., climatology.In this study, we adapt the baseline model to monthly scale by first calculating inter-annual mean value for every calendar day from daily runoff of the benchmark product during 1971-2010 (see Section 2.3.2), and then aggregating daily runoff to monthly runoff.The model uses climatology for prediction, for example, if the inter-annual mean runoff for July in the Amazon basin is 100 mm mon -1 , then the prediction of total runoff for July of every year is 100 mm mon -1 .

The "abcd" model
The monthly "abcd" model was first introduced by Thomas (1981) to improve the national water assessment for the U.S., with a simple analytical framework using only a few descriptive parameters.It has been widely used across the world, especially for the U.S. (Martinez and Gupta, 2010;Sankarasubramanian and Vogel, 2002;Sankarasubramanian and Vogel, 2003).The model uses potential evapotranspiration (PET) and precipitation (P) as input.The model defines four parameters a, b, c, and d that reflect regime characteristics (Sankarasubramanian and Vogel, 2002;Thomas, 1981) to simulate water fluxes (e.g., evapotranspiration, runoff, groundwater recharge) and pools (e.g., soil moisture, groundwater).The parameters a and b pertain to runoff characteristics, and c and d relate to groundwater.Specifically, the parameter a reflects the propensity of runoff to occur before the soil is fully saturated.The parameter b is an upper limit on the sum of evapotranspiration (ET) and soil moisture storage.The parameter c indicates the degree of recharge to groundwater and is related to the fraction of mean runoff that arises from groundwater discharge.The parameter d is the release rate of groundwater to baseflow, and thus the reciprocal of d is the groundwater residence time.Snow is not part of the original "abcd" model, which may result in poor performance of the model in cold regions where snow significantly affects the hydrological cycle.The work of Martinez and Gupta (2010) has added snow processes into the original "abcd" model, where the snowpack accumulation and snow melt are estimated based on air temperature.Their work indicated that incorporation of the snow processes in the monthly "abcd" model has significantly improved model performance in snowcovered area in the conterminous United States (see Figure 4 in Martinez and Gupta (2010)).
In this study, we adopt the "abcd" framework from Martinez and Gupta (2010) (Fig. 1); meanwhile, we make three modifications to suit the needs of a HE for global applications.First, in order to enhance the model efficiency with as least necessary parameters as possible, instead of involving three tunable snow-related parameters in the calibration process, we set the values for two of the parameters (i.e., temperature threshold above or below which all precipitation falls as rainfall or snow) from literature (Wen et al., 2013) and only keep one tunable parameter msnow melt coefficient (0 < m < 1).Second, we introduce the baseflow index (BFI) into the calibration process to improve the partition of total runoff between the direct runoff and baseflow (see Section 2.4).Third, other than the lumped scheme as previous studies used, we first explore the values of model application in distributed scheme with a grid resolution of 0.5 degree.The detailed model descriptions and equations are presented in the Appendix A, and the descriptions and ranges of model parameters are listed in Table 1.

Model structure
In terms of the "abcd" model, we evaluate both the lumped and distributed model schemes, although most previous applications of the model are conducted in a lumped scheme (Bai et al., 2015;Fernandez et al., 2000;Martinez and Gupta, 2010;Sankarasubramanian and Vogel, 2002;Sankarasubramanian and Vogel, 2003;Vandewiele and Xu, 1992;Vogel and Sankarasubramanian, 2003).In the lumped scheme, each of the 235 river basins is lumped as a single unit, and each of the climate input (see Section 2.3.1) is the lumped average across the entire basin, and thus all the model outputs are lumped as well.In terms of the distributed one, however, each 0.5-degree grid cell has its own climate inputs, and likewise, the model outputs are simulated at the grid-level.Although the two schemes differ in the spatial resolution of their inputs and outputs, their within-basin parameters are uniform.We use basin-uniform rather than grid-specific parameters for the distributed scheme for two reasons: 1) to enhance computational efficiency; and 2) to avoid drastically different parameters for neighboring grid cells that may be unrealistic.Note that lateral flows between grid cells and basins are not included at this stage for the "abcd" model.For the baseline model, as it is derived from the benchmark product (see Section 2.3.2),which presents runoff estimates in a spatial resolution of 0.5-degree, and thus every grid cell of each basin has its own inter-annual mean monthly runoff estimates.

Climate data
The climate data needed for the "abcd" model only involve monthly total precipitation, monthly mean, maximum and minimum air temperature.The data we use is obtained from WATCH (Weedon et al., 2011), spanning the period of 1971-2010, and it is 0.5-degree gridded global monthly data.The climate data is used for model simulation over the global 235 major river basins (Kim et al., 2016).Additionally, we use the Hargreaves-Samani method (Hargreaves and Samani, 1982) to estimate potential evapotranspiration (PET), which is a required input for the "abcd" model, and it needs climate data of mean, maximum and minimum temperatures for the calculation.

Benchmark runoff product
In this study, the "abcd" model is tested for its ability to emulate the naturalized hydrological processes of a reference model since the "true" naturalized hydrological processes are unknown.The "perfect model" approach is well adopted in climate modeling studies where one model is treated as "observations" while the others are tested for their ability to reproduce "observations" (Murphy et al., 2004;Tebaldi and Knutti, 2007).Here, we use the process-based VIC model as the "perfect model", which was also driven by the WATCH climate forcing.
The VIC runoff product here is a global simulation with a daily time step and spatial resolution of 0.5 degree for the period of 1971-2010, and the VIC daily runoff is aggregated to monthly data to be consistent with the temporal scale of the "abcd" model.The VIC model settings used in this study are based on the University of Washington VIC Global applications (http://www.hydro.washington.edu/Lettenmaier/Models/VIC/Datasets/Datasets.shtml).The subgrid variability of soil, vegetation and terrain characteristics are represented in sub-grid areaspecific parameter classifications.Soil texture and bulk densities are derived by combining the World Inventory of Soil Emission Potentials database (Batjes, 1995) and the 5-min digital soil map of the world from the Food and Agricultural Organization (FAO, 1998).Based on the work of (Cosby et al., 1984), the remaining soil properties (e.g.porosity, saturated hydraulic conductivity and unsaturated hydraulic conductivity) are derived.Vegetation type data are obtained from the global land classification of (Hansen et al., 2000).Parameters including the infiltration parameter, soil layer depths and those governing the baseflow function were calibrated for major global river basins and transferred to the global domain as documented in (Nijssen et al., 2001b), based on which Zhang et al. (2014) and Leng et al. (2015) conducted additional calibrations in the China domain.In this study, the VIC model was forced by WATCH climate forcing at the daily time step (Weedon et al., 2011), based on the calibrated parameters from Nijssen et al. (2001b), (Zhang et al., 2014) and (Leng et al., 2015).The simulated runoff used in this study has recently been validated globally within the framework of the Inter-Sectoral Impact Model Intercomparison Project and shows reasonable performance compared to other hydrological models (Hattermann et al., 2017;Krysanova and Hattermann, 2017).
The VIC runoff product (Hattermann et al., 2017;Leng et al., 2015) is then used as a benchmark for calibrating and validating the "abcd" model due to two reasons.First, VIC runoff has been evaluated across many regions of the globe and is proved to be reasonably well (Abdulla et al., 1996;Hattermann et al., 2017;Maurer et al., 2001;Nijssen et al., 1997;Nijssen et al., 2001b).Second, since we have not involved river routing, reservoir regulations and upstream water withdrawals in the "abcd" model, the simulated monthly runoff is more representative of "natural conditions", and as such it tends to be more reasonable to compare the simulated runoff against the VIC runoff product rather than observed streamflow data from stream gauges (Dai et al., 2009;Wilkinson et al., 2014).Despite potential bias in the VIC runoff product, using it as a benchmark here is to demonstrate the capability of the HE developed in this work to mimic complex GHMs.Furthermore, the application of the HE is not tied to the VIC model and should be able to emulate other GHMs.
The VIC runoff product also compares well to other products (see Fig. S1, S2), including the UNH/GRDC runoff product (Fekete and Vorosmarty, 2011;Fekete et al., 2002) and the global streamflow product (Dai et al., 2009).The scatterplot pattern of the VIC long-term annual runoff product vs. the streamflow product matches well with that of the UNH/GRDC runoff vs. the streamflow product (streamflow is transferred to the same unit as runoff via dividing by the basin area).Further, the correlation coefficient of the VIC and the UNH/GRDC long-term annual runoff is as high as 0.83 across the global 235 basins.This suggests the reasonability of VIC runoff product, because the UNH/GRDC runoff is calibrated with the GRDC observations.At the same time, the discrepancies between the VIC runoff products and the streamflow products (Fig. S2) may be attributed to human activities, such as reservoir regulations and upstream water withdrawals, which are not embedded in the runoff but reflected in the streamflow.

Model calibration
Typically, most applications of the "abcd" model utilize single-objective optimization for total runoff (or streamflow) during the calibration process to minimize the difference between measured and simulated streamflow (Bai et al., 2015;Martinez and Gupta, 2010;Sankarasubramanian and Vogel, 2002).While this may lead to a good fit for simulated total runoff, however, it may result in inappropriate partition of total runoff between direct runoff and baseflow.To improve the accuracy of the simulated total runoff and the partition between direct runoff and baseflow, we introduce the baseflow index (BFI) into the objective function.
Unlike the baseline model, the "abcd" model requires a calibration step for reasonable parameterization so as to enable good prediction.As mentioned above, we incorporate BFI into the objective function during the calibration process.On one side, we maximize Kling-Gupta efficiency (KGE) (Gupta et al., 2009), which is used as a metric to measure the accuracy of the simulated total runoff relative to the VIC benchmark runoff.The KGE is defined as the difference of unity and the Euclidian distance (ED) from the ideal point, thus we maximize KGE through minimizing the ED.The KGE and ED are calculated as follows (Gupta et al., 2009): where r, α, β, and Covso are relative variability, bias, correlation coefficient, and covariance between the simulated and observed values (here we treat the VIC runoff as the observed), respectively; µ and σ represent the mean and standard deviation (subscript "s" and "o" stand for simulated and observed values).On the other side, we also nudge the simulated BFI towards the benchmark BFI (here we treat the benchmark BFI as the observed) -the mean BFI of the four products from (Beck et al., 2013).Then, the objective function is as follows: where min stands for minimizing the value in the parenthesis, abs represents absolute value, ED is the Euclidian distance between the simulated and observed total runoff (Gupta et al., 2009), BFIobs and BFIsim are the observed and simulated BFI, respectively.Here we treat the benchmark runoff from the VIC and BFI from Beck et al. (2013) as observed values.We then minimize the objective function for parameter optimization by utilizing a Genetic Algorithm (GA) routine (Deb et al., 2002).Note that for the distributed model scheme, we aggregate the grid-level total runoff estimates to basin-level and then nudge it toward basin-level benchmark total runoff during the calibration process.

Model simulations
To evaluate the predictability and efficiency of the baseline and the "abcd" model so as to identify a suitable one to serve as a HE, we have conducted a series of simulations.
Specifically, for the baseline model, no simulations are needed as it uses inter-annual mean value for each month -12 monthly values -as prediction, so we just replicate the 12 monthly runoff for 1971-2010 and for each of the global 235 basins, and then compare against the benchmark runoff product.For the "abcd" model, two sets of model simulations across the global 235 basins are conducted, with one set for calibration and the other one for validation, for both the lumped and distributed model schemes.For the first set, we run the model for each basin for the period of 1971-1990 to get basin-specific parameters by using the GA approach (see Section 2.4).For the second set, using the parameters identified in the first set of simulation, we run the model for the period of 1991-2010 to validate the model predictability and also evaluate the computational efficiency.Model inputs and outputs in the distributed scheme are at a spatial resolution of 0.5degree, whereas those in the lumped scheme are all in lumped single unit for each basin.All model simulations are conducted in a monthly time step.Note that broad users can run the identified HE for global 235 basins, or for as many basins as they want for either scheme, as all the related basin-specific input data and calibrated parameters for both schemes are open-source.

Comparison of performances between the baseline and the "abcd" model
Generally, we find baseline model performs worse than the "abcd" model (Fig. 2).The baseline model exhibits a lower global mean KGE value (0.61) than the lumped and distributed schemes of the "abcd" model (0.75 and 0.79, respectively).In addition, our analysis indicates that the incorporation of BFI into the objective function leads to significant improvement in the partition of total runoff between direct runoff and baseflow (Fig. S4), without compromising predictability for total runoff, i.e., the global mean KGE values for modeled total runoff with or without the incorporation of BFI are almost the same (0.75 vs 0.76).Specifically, for the case of involving both the total runoff and BFI in the objective function, the correlation efficiencies (r) between the long-term annual benchmark and modeled direct runoff, and between benchmark and baseflow from the lumped scheme across global basins are 0.97 and 0.96, respectively, which are much higher than those of 0.86 and 0.72 in the case of only involving the total runoff in the objective function (Fig. S4).Given the superiority of the "abcd" model over the baseline model, we focus in the following sections on evaluating the predictability and computational efficiency of the "abcd" model and its potential to serve as a HE.

Evaluation of model predictability
In terms of total runoff, we find the lumped and distributed schemes are comparably capable in simulating long-term mean annual quantity, temporal variations and spatial patterns for the vast majority of river basins globally (Fig. 3-5).Estimates of long-term mean annual total runoff from both the lumped and distributed schemes match very well with that of VIC total runoff across the 235 basins, with a correlation coefficient (r) of higher than 0.96, for both the calibration and validation period (Fig. 3).Similarly, the basin-level estimates of long-term mean annual direct runoff and baseflow also match well with those of the VIC across the globe, for both schemes and both periods (Fig. 3).This suggests both schemes possess the capability in partitioning total runoff.Furthermore, both schemes display good capability in capturing the seasonal signals of the total runoff (Fig. 4).Meanwhile, although the spatial patterns of annual total runoff from the lumped scheme present a general match with that of the VIC, it does not reflect the spatial variations inside a basin that is however captured by the distributed scheme (Fig. 5).Therefore, the distributed scheme provides overall slightly higher KGE (Fig. 6), with a global mean KGE value of 0.79 as compared to 0.75 for the lumped scheme (Fig. 2).
To ensure good model predictability for the major water fluxes, we also evaluate the modelled ET estimates.The modelled ET compares reasonably well with the VIC ET product as well as with the mean synthesis of the LandFlux-EVAL ET product (Mueller et al., 2013), displaying similar spatial variations (Fig. S5).Likewise, the distributed "abcd" scheme tends to have better capability in presenting spatial heterogeneity than the lumped one.Further, the good predictability of seasonality in runoff as illustrated in Fig. 4 also reflects similar performance for ET, given the runoff and ET are the two major water fluxes in the water mass balance and the soil moisture changes are negligible over long-term.
The distributed scheme appears to outperform the lumped scheme in term of goodnessof-fit, especially in some cold (e.g., Arctic, Northern European, Interior Tibet) and in some dry (e.g., North Africa) regions (Fig. 6).This is possibly because distributed inputs can reflect basinlevel heterogeneity, and thus better capture the characteristic of the hydrological conditions in those regions.However, both schemes do not perform well in the southern end of the Andes Mountains (Fig. 6).This may be attributed to the complex land surface characteristics in that mountainous area, which cannot be resolved due to the coarse spatial resolution.Moreover, the distributed scheme seems not performing very well in some cold regions (Fig. 6), which is possibly due to lack of representation for permafrost in the model.
The good agreement between our modelled water fluxes, including total runoff, direct runoff, baseflow and ET, and the benchmark products provides confidence in the capability of both the lumped and distributed schemes in estimating temporal and spatial variations in major water fluxes across the globe.In addition, to identify a suitable HE, the required computation cost is another key factor as detailed below.

Evaluation of computational efficiency
While the performance of model predictability is comparable for the lumped and distributed schemes as elucidated above, great disparity exists for runtime of the two schemes and the VIC model (Table S1).Take the Amazon basin that covers a total number of 1990 0.5degree grid cells as an example, it takes 11.05 minutes for model calibration via the GA method in the distributed scheme but only 0.16 minute for the lumped one.Similar disparity is also found for model simulation with calibrated parameters, with runtime of 0.03 and 3.20 seconds for a 1000-year simulation of the Amazon basin for the lumped and distributed schemes, respectively.However, according to the authors' experience, it will take ~1 week for the VIC model to accomplish the same job, which is far more computationally expensive.In general, the computational efficiency of the lumped scheme is two orders of magnitudes higher than the distributed one, although that of the distributed one is still much higher than the VIC (~five orders of magnitude) and many other GHMs and LSMs.Note that all of the simulations here are conducted on the Pacific Northwest National Laboratory (PNNL)'s Institutional Computing (PIC) Constance cluster using 1 core (Intel Xeon 2.3 GHz CPU) with the same configuration.

Potential application of the "abcd" model as a hydrological emulator
The good predictability and computational efficiency of both the distributed or lumped schemes as elucidated in Sections 3.2 and 3.3 suggest its suitability for serving as HEs that can efficiently emulate complex GHMs (e.g., the VIC or others).The source codes, input data, basinspecific parameters across the globe for both the lumped and distributed schemes are opensource and well-documented, which will make the HE ready to use and facilitate their wide and easy use with minimal efforts.
The choice of either the distributed or lumped scheme as HE depends on the user's specific needs.There is a tradeoff between the model predictability and computational efficiency.While the distributed scheme tends to better capture the spatial heterogeneity of water fluxes and can produce grid-level outputs that lumped scheme cannot, it incurs heavier computational cost than the lumped scheme.For applications that aim to strike a balance between predictability and computation cost, such as practical assessment of water resources, or estimation of water supply for integrated assessment models (IAMs), or quantification of uncertainty and sensitivity analyses, it would be reasonable to employ the lumped scheme as a HE.The lumped scheme is especially advantageous due to its minimal calibration and computational cost, parsimonious efforts for model implementation, and reasonable fidelity in estimating major water fluxes (e.g., runoff, ET).For users from the IAM community, the lumped scheme might be sufficiently suitable for their needs since 1) the lumped scheme can operate at the same spatial resolution at which IAMs typically balance water demands and supplies (Edmonds et al., 1997;Kim et al., 2006;Kim et al., 2016), and 2) the inherent uncertainty of the lumped scheme is likely comparable or even overshadowed by the intrinsic uncertainty of IAMs (Kraucunas et al., 2015;O'Neill et al., 2014).Similarly, for users who aim to conduct uncertainty and sensitivity analyses, the high computational efficiency of the lumped scheme allow the users to emulate the hydrological model of interest (e.g., GHMs, LSMs) and then run a large number of simulations to conduct their uncertainty and sensitivity analysis (Scott et al., 2016).Therefore, the high computational efficiency makes the lumped scheme more appealing as a HE in these cases.However, if the research questions hinge on the gridded estimates, or emphasize the spatial heterogeneity of the water fluxes or pools, it would be more desirable to deploy the distributed scheme as a HE instead.
Based upon our open-source HE and the validated basin-specific parameters across the globe, researchers can easily investigate the variations in water budgets at the basin/ regional/global scale of interest, with minimum requirements of input data, efficient computation performance and reasonable model fidelity.Likewise, researchers can utilize the framework of the HE with any alternative input data, or recalibrate the HE to emulate other complex GHMs or LSMs of interest, to meet their own needs.

Case study for uncertainty analysis
To demonstrate the capability of the examined "abcd" model serving as a HE, we use the lumped scheme to conduct parameter-induced uncertainty analysis for the runoff simulation at the world's sixteen river basins with top annual flow (Dai et al. 2009).Specifically, for each of the sixteen basins, we first apply ±10% change to each of the five calibrated parameters (a, b, c, d, m) to compose varying ranges; note that we just truncate the range to those valid in Table 1 if the ±10% change exceeds the valid range.Then we randomly sample the five parameters from corresponding ranges for 100,000 times (i.e., 100,000 combinations of parameters).After that, we run the lumped scheme 100,000 times for each basin with the 100,000 combinations of parameters to examine the parameter-induced uncertainty in total runoff.The uncertainty analysis indicates that most basins are robust to changes in parameters, other than the Tocantins, Congo and La Plata (Fig. 7).In other words, for basins Congo and La Plata, slight changes in parameters may lead to large changes in runoff estimates.Then the uncertainty in the calibrated parameters for the two basins may lead to large bias in the simulated runoff, which may more or less explain why modelled runoff for the two basins tend to have higher biases than other basins (Fig. 4).Notably, the 100,000 times of simulations only takes ~80 seconds on a Dell Workstation T5810 with one Intel Xeon 3.5 GHz CPU, which demonstrates the extraordinary computational efficiency of the lumped scheme and its advantage for serving as a HE.

Conclusions
Toward addressing the issue that many global hydrological models (GHMs) are computationally expensive and thus users cannot afford to conduct a large number of simulations for various tasks, we firstly construct a hydrological emulator (HE) that possesses both reasonable predictability and computation efficiency for global applications in this work.Built upon the widely-used "abcd" model, we have adopted two snow-related parameters from literature rather than tuning them for parameter parsimony, and also have improved the partition of total runoff between the direct runoff and baseflow by introducing baseflow index into the objective function of the parameter optimization.We then evaluate the appropriateness of the model serving as an emulator for a complex GHM -the VIC, for both the lumped and distributed model schemes, by examining their predictability and computational efficiency.
In general, both distributed and lumped schemes have comparably good capability in simulating spatial and temporal variations of the water balance components (i.e., total runoff, direct runoff, baseflow, evapotranspiration).Meanwhile, the distributed scheme has slightly better performance than the lumped one (e.g., capturing spatial heterogeneity), with mean Kling-Gupta efficiency of 0.79 vs. 0.75 across global 235 basins, and also it provides grid-level estimates that the lumped one incapable of.Additionally, the distributed scheme performs better in extreme climate regimes (e.g., Arctic, North Africa) and Europe.However, the distributed one incurs two more orders of magnitudes of computation cost than the lumped one.A case study of uncertainty analysis with 100, 000 simulations for each of the world's sixteen basins with top annual streamflow further demonstrates the lumped scheme's extraordinary advantage in terms of computational efficiency.Therefore, the lumped scheme could be an appropriate HEreasonable predictability and high computational efficiency.At the same time, the distributed scheme could be a suitable alternative for research questions that hinge on grid-level spatial heterogeneity.Finally, upon open-sourcing and well-documentation, the HE is ready to use and it provides researchers an easy way to investigate the variations in water budgets at a variety of spatial scales of interest (e.g., basin, region or globe), with minimum requirements of efforts, reasonable model predictability and appealing computational efficiency.

Appendix A: Descriptions and equations of the "abcd" model
The abcd model was first introduced by (Thomas, 1981), and Martinez and Gupta (Martinez and Gupta, 2010) added snow processes into the model.In this work, we adopted the snow scheme in Martinez and Gupta (2010): The model defines two state variables "available water" and "evapotranspiration opportunity", denoted as Wi and Yi, respectively.The Wi is defined as: where a and b are parameters detailed in Section 2.1.2.In the model framework, ii WY  is the sum of the groundwater recharge (RE) and direct runoff (Qd), and the allocation is determined by the parameter c:

Allocation of Wi between
(1 ) ( ) The baseflow from the groundwater (GW) pool is modeled as: where d is a parameter reflecting the release rate of groundwater to baseflow.Then the total runoff ( t Q ) is the sum of the direct runoff and baseflow: The i GW is the sum of groundwater storage at the end of last time step and the groundwater recharge minus the baseflow, and i GW is derived as: Then, all the water fluxes and pools are solved.Snow melt coefficient 0-1 (Wen et al., 2013) by assuming that the loss of soil moisture by ET will be proportional to PET as:

Figure Caption Figure 1
Figure Caption

Figure 2
Figure 2 Kling-Gupta efficiency of the simulated basin-level total runoff across the global 235

Figure 3
Figure 3 Comparison of basin-specific long-term annual total runoff, direct runoff and baseflow

Figure 4
Figure 4 Time series of basin-specific total runoff (Qtotal) from the VIC product, the lumped and

Figure 5
Figure 5 Spatial patterns of long-term annual total runoff (mm yr -1 ) across global 235 basins: a)

Figure 6
Figure 6The spatial pattern of Kling-Gupta efficiency (KGE) for the total runoff estimates of

Figure 7
Figure 7 Parameter-induced uncertainty in total runoff for the world's sixteen river basins with Figure 3 SM is soil moisture at the end of time step i.Further, Yi has a non-linear relationship with Wi as: i