A globally calibrated scheme for generating daily meteorology from monthly statistics: Global-WGEN (GWGEN) v1.0
Philipp S. Sommer (philipp.sommer@unil.ch, https://orcid.org/0000-0001-6171-7716) and Jed O. Kaplan (https://orcid.org/0000-0001-9919-7613)
Institute of Earth Surface Dynamics, University of Lausanne, Géopolis, 1015 Lausanne, Switzerland
Max Planck Institute for the Science of Human History, 07745 Jena, Germany
Correspondence: Philipp S. Sommer (philipp.sommer@unil.ch)
Geoscientific Model Development (Geosci. Model Dev., GMD), volume 10, issue 10, pages 3771–3791, ISSN 1991-9603, Copernicus Publications, Göttingen, Germany, DOI 10.5194/gmd-10-3771-2017
Published: 16 October 2017. Manuscript history dates: 20 February 2017, 14 March 2017, 29 August 2017, 7 September 2017.
This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this licence, visit https://creativecommons.org/licenses/by/3.0/
This article is available from https://gmd.copernicus.org/articles/10/3771/2017/gmd-10-3771-2017.html
The full text article is available as a PDF file from https://gmd.copernicus.org/articles/10/3771/2017/gmd-10-3771-2017.pdf
While a wide range of Earth system processes occur at daily and even
subdaily timescales, many global vegetation and other terrestrial dynamics
models historically used monthly meteorological forcing both to reduce
computational demand and because global datasets were lacking. Recently,
dynamic land surface modeling has moved towards resolving daily and subdaily
processes, and global datasets containing daily and subdaily meteorology have
become available. These meteorological datasets, however, cover only the
instrumental era of the last approximately 120 years at best, are subject to
considerable uncertainty, and represent extremely large data files with
associated computational costs of data input/output and file transfer. For
periods before the recent past or in the future, global meteorological
forcing can be provided by climate model output, but the quality of these
data at high temporal resolution is low, particularly for daily precipitation
frequency and amount. Here, we present GWGEN, a globally applicable
statistical weather generator for the temporal downscaling of monthly
climatology to daily meteorology. Our weather generator is parameterized
using a global meteorological database and simulates daily values of five
common variables: minimum and maximum temperature, precipitation, cloud
cover, and wind speed. GWGEN is lightweight, modular, and requires a minimal
set of monthly mean variables as input. The weather generator may be used in
a range of applications, for example, in global vegetation, crop, soil
erosion, or hydrological models. While GWGEN does not currently perform
spatially autocorrelated multi-point downscaling of daily weather, this
additional functionality could be implemented in future versions.
Introduction
The development of the first global vegetation models in the 1970s
brought about the demand for meteorological
forcing datasets with global extent and relatively high spatial resolution,
e.g., 1° × 1°. While a global weather-station-based
monthly climate dataset was available at this time,
limitations in computers and storage allowed only the simplest treatment of
these data. The first global simulations of the net primary productivity of
the terrestrial biosphere thus used rasterized polygons of
annual meteorological variables that had been crudely interpolated from the
station-based climatology. The next decade saw the development of better
computers and more sophisticated global vegetation models
that recognized the need
for forcing at a subannual time step, and these models were developed
in parallel with the first global, gridded high-resolution (0.5°)
monthly climatology. At the time, monthly
meteorological data were the only feasible global data that could be produced
in terms of the raw station data available to feed the interpolation process,
the processing time required to produce gridded maps, and the data storage
and transfer capabilities of contemporary computer systems and networks.
Global gridded monthly climate data thus became the standard not only for
large-extent vegetation modeling but also for a wide
range of studies on biodiversity and species distribution,
vegetation trace gas emissions, and even the geographic
distribution of human diseases.
Over subsequent years, the global gridded monthly climate datasets were
improved, developed with
very high spatial resolution, and
expanded beyond climatological mean climate to cover continuous time series
over decades. The
latter was an essential requirement for forcing dynamic global vegetation
models (DGVMs). However, despite
increasing quality, spatial resolution, and temporal extent in these
datasets, the basic time step remained monthly, partly for legacy reasons
(models had been developed in an earlier era subject to computational
limitations and therefore used a monthly time step for efficiency even if this
was no longer strictly a constraint) and partly because developing a
global, high-resolution climate dataset with a daily or
shorter time step still presented a major data management challenge.
On the other hand, there was increasing awareness that accurate simulation of
many Earth surface processes required representation of processes at a
shorter-than-monthly time step. Global simulation of surface hydrology,
crop growth, or biogeophysical processes
needed submonthly forcing to
produce reliable results. To address this need for better forcing data, two
main approaches were taken: either monthly climate data were downscaled
online using a stochastic weather generator,
or a subdaily, high-resolution,
gridded climate time series was generated directly by merging
high-temporal-resolution reanalysis data (e.g., NCEP, 6 h, 2.5°) with
high-spatial-resolution monthly climate data (e.g., CRU, 0.5°). The
latter process resulted in the CRUNCEP dataset,
which, while global, is
large even by modern standards (approximately 350 GB), is not available at a
spatial resolution finer than 0.5°, and covers only the period 1901–2014.
Forcing data for global vegetation and other models with shorter-than-monthly
resolution at spatial resolutions finer than 0.5°, or for any period other
than the last approximately 120 years, e.g., for the future or the more distant
past, may therefore only be available through downscaling techniques. One
approach to overcome the limitations of currently available datasets could be
to use general circulation model (GCM) output directly; however, most GCM
output currently available does not have a spatial resolution finer than
0.5°, with the current generation of GCMs typically approaching 1° × 1°.
Furthermore, daily meteorology produced by GCMs is generally observed to be
unrealistic, particularly for precipitation. An
alternative approach is, therefore, to perform temporal downscaling on
monthly meteorological data using a statistical weather generator.
Statistical weather generators were first developed primarily for crop and
hydrological modeling at the field to catchment scale.
These weather generators were parameterized using daily meteorological
observations at one or
more weather stations close to the area of interest, although some attempts
were made to generalize the parameterization over larger, subcontinental
regions. Locally
parameterized weather generators have been applied to a very wide range of
studies and enhanced to include additional
meteorological variables beyond the original precipitation, temperature, and
solar radiation. Application of weather
generators at continental to global scales was still limited, however, because
of the need to perform local parameterization.
The need to simulate daily meteorology in regions of the world with short,
unreliable, or unavailable daily meteorological time series brought about the
realization that certain features of weather generator parameterization might
be generalized across a range of climates. This ultimately led to the
development of globally applicable weather generators and
their incorporation in DGVMs.
The original global parameterization of these
weather generators was, however, limited to seven weather stations, mostly in
the temperate latitudes. One of these studies does not publish the parameters
used in its global weather generator, but we assume these were the same as
those of the original models.
Given the availability of (1) large datasets of daily meteorology and
(2) computers powerful enough to process these data, we therefore decided that it
would be valuable to revisit these parameterizations, perform a systematic
and quantitative evaluation of the resulting downscaled meteorology, and
potentially improve our ability to perform monthly to daily downscaling of
common meteorological variables with a single, globally applicable
parameterization.
In the following sections, we describe Global-WGEN (GWGEN), a weather
generator parameterized using more than 50 million daily weather observations
from all continents and latitudes. We demonstrate how updated schemes for
simulating precipitation occurrence and amount, and for bias correcting wind
speed, further improve the quality of the model simulations. We perform an
extensive model evaluation and parameter uncertainty analysis in order to
settle on a parameter set that provides the most accurate, globally
applicable results. We comment on the limitations of the model and priorities
for future research. GWGEN is an open-source, stand-alone model that may be
incorporated into any number of models designed to work at global scale,
including, e.g., vegetation, hydrology, climatology, and animal distribution models.
Schematic workflow of GWGEN. After smoothing the monthly input, the
Markov chain is used to decide whether it is a dry or a wet day. If it is a
wet day, we draw a random number from the gamma–GP (generalized Pareto)
distribution. Furthermore, the means of the other variables
($\bar{T}_{\min/\max}$, $\bar{c}$, $\bar{w}$) are adjusted and their daily
values are calculated using the estimated standard deviations and residuals.
Wind speed additionally undergoes a square-root transformation before the
cross correlation is applied and is finally corrected using the bias
correction. A final quality check restricts the model to be within a 5 %
range of the observed total precipitation and to replicate the number of wet
days from the input.
Model description
GWGEN requires the following six monthly summary values as input: (1) total
monthly precipitation, (2) the number of days in the month with measurable
precipitation (i.e., wet days), (3, 4) monthly mean daily minimum and maximum
temperature, (5) mean cloud fraction, and (6) wind speed. The model outputs are
the same variables at daily resolution. This section summarizes the basic
workflow in the model which is also shown schematically in Fig.
and Algorithm 1.
The first approximation of the daily variables comes from smoothing the
monthly time series using a mean-preserving algorithm .
For precipitation, we then first use the Markov chain approach (Sect. )
to decide the wet/dry state of the day. If it is a wet day,
we calculate the gamma parameters using Eqs. ()
and (). The resulting distribution allows us to draw a
random number – the precipitation amount of the currently simulated day. If we
are above the threshold μ, we draw a second random number from the generalized Pareto (GP)
distribution parameterized via Eq. () and the chosen GP shape.
The next step modifies the means of temperature, wind speed, and cloud
fraction depending on the wet/dry state of the day (lines 11
and 15 in Algorithm 1). After that, we use the cross-correlation approach described
in (lines 18–20
and Sect. ) and calculate the daily values of these variables.
Finally, we use the quantile-based bias correction described in
Sect. to correct the simulated wind speed.
We restrict the weather generator to reproduce the exact number of wet days (±1)
as the input and to be within a 5 % range of the total monthly
precipitation (with a maximum allowed deviation of 0.5 mm). If the
program cannot produce these results, the procedure described above is
repeated (see line 4).
Model development
GWGEN is based on the WGEN weather generator , using
the method of defining the model parameters based on monthly summaries
described by and . GWGEN
diverges from the original WGEN by using a hybrid-order Markov chain to
simulate precipitation occurrence and a hybrid gamma–GP
distribution to estimate
precipitation amount. Temperature, cloud cover, and wind speed are calculated
following , using cross correlation and depending on
the wet/dry state of the day. We further add a quantile-based bias correction
for wind speed and minimum temperature, which improves the simulation results
significantly.
In the following subsections, we first describe the global weather station
database used to develop and evaluate the model, then describe the underlying
relationships that we use to define GWGEN's parameters.
Weather stations used for parameterization and evaluation of the
weather generator. The uppermost panel shows the locations of the stations
used for parameterizing precipitation and temperature; the middle panel shows
the stations for cloud fraction and wind speed, as well as for calculating
the cross correlations between temperature, cloud fraction, and wind speed.
The lower plot shows the location of the stations used to evaluate the model,
which were excluded from the parameterization stations.
Development of a global weather station database
To parameterize GWGEN, we assembled a global dataset of daily meteorological
observations. Precipitation and minimum and maximum daily temperature come
from the daily Global Historical Climatology Network (GHCN-Daily) database
. The GHCN-Daily consists of
observations collected at approximately 100 000 weather stations on all continents and
many oceanic islands. As the GHCN-Daily stations are highly concentrated in
some parts of the world, particularly in the conterminous United States, we
selected stations for our study using a geographic anti-aliasing filter to
avoid an especially strong geographic bias in the generation of the model
parameters. Dividing the world up into a 0.5° grid, we selected the
single station with the longest record in each cell, if one was present.
While the GHCN-Daily units for precipitation have a nominal precision of 0.1 mm,
several of the stations in the US reported precipitation in
fractions of an inch, which were later converted to mm. To ensure uniform
precision across all of our calibration stations (this was particularly
important when generating the probability density functions for precipitation
amount), we selected only those GHCN-Daily stations where all precipitation
amounts between 0.1 and 1.0 mm day⁻¹ were reported in the record. This
resulted in 9508 stations covering all continents, although the distribution
was strongly heterogeneous, with the majority of the stations in North America,
despite our geographic filter (Fig. , top panel). For
cloud cover, wind speed, and calculation of cross correlations between
temperature, cloud cover, and wind speed, we used the Extended Edited Cloud
Report Archive (EECRA) database . The geographic
distribution of the 6978 EECRA stations we selected differs from that of the
GHCN-Daily stations, with more stations in Europe (Fig. , middle
panel), but overall a relatively similar number of stations were used from
both datasets. For the observations from both GHCN-Daily and EECRA, we applied
one additional filtering step, selecting only complete months, i.e., months
with no days having missing observations, for further processing. In total,
our database of daily meteorological observations used in the model
parameterization contains approximately 69 million individual records.
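
The geographic filter described above (one station per 0.5° cell, chosen by record length) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Station` record, its field names, and the three example stations are hypothetical and do not reflect the GHCN-Daily schema.

```python
import math
from typing import NamedTuple


class Station(NamedTuple):
    """Hypothetical minimal station record (not the GHCN-Daily schema)."""
    station_id: str
    lat: float    # degrees north
    lon: float    # degrees east
    n_days: int   # length of the daily record


def grid_cell(lat, lon, res=0.5):
    """Return the (row, column) index of the res-degree cell containing a point."""
    return math.floor((lat + 90.0) / res), math.floor((lon + 180.0) / res)


def longest_record_per_cell(stations, res=0.5):
    """Keep only the station with the longest record in each grid cell."""
    best = {}
    for st in stations:
        cell = grid_cell(st.lat, st.lon, res)
        if cell not in best or st.n_days > best[cell].n_days:
            best[cell] = st
    return list(best.values())


# Made-up example stations; A and B fall into the same 0.5-degree cell.
stations = [
    Station("A", 46.52, 6.63, 12000),
    Station("B", 46.60, 6.70, 30000),
    Station("C", 50.93, 11.59, 8000),
]
selected = longest_record_per_cell(stations)
```

Cells without any station simply never appear in the dictionary, which matches the "if one was present" condition.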
Finally, we reserved some weather station records for model evaluation that
were not used for model parameterization. These were individual stations or
two stations separated by a maximum distance of 1 km, where all of the daily
meteorological variables that GWGEN simulates
(P, Tmin, Tmax, c, w) were recorded on the same
dates in the EECRA database. This merged selection from EECRA and GHCN
resulted in a set of 921 stations representing approximately 15 million daily records,
with observations on all continents, although the geographic distribution is
once again highly heterogeneous, with a particularly high density of stations
in Japan and Germany (Fig. , bottom panel).
Transition probabilities vs. wet fraction. The red density plot in
the background shows the density of the observations, and the blue lines
indicate the linear regression line of the probability against the wet fraction. The
fit for the p11 transition probability was forced to the point (1, 1);
the others were forced to (0, 0). The underlying data for the fits correspond
to the means of the multi-year series for each month for each
station.
Parameterization
Precipitation occurrence
Following , we expect to find a good relationship
between the fraction of days in a month with measurable precipitation and the
probability that any given day will be wet. Following , we use
a hybrid-order model that retains first-order Markov dependence for wet
spells but allows second-order dependence for dry sequences; this
hybrid-order scheme has been shown to be a good compromise between
performance and simplicity. To parameterize the precipitation occurrence part
of the model, we thus calculated transition probabilities for a wet day being
followed by a wet day (p11), for a wet day being followed by a dry day
being followed by a wet day (p101), and for two dry days being followed
by a wet day (p001). We perform this analysis on a station- and
month-wise basis: we first extract each of the (complete) Januaries,
Februaries, etc. for a given station and then merge all of the Januaries
(Februaries, Marches, etc.) for this station into a single series
representing each month. Merging months over several years is particularly
important for stations that have relatively little precipitation in a given
month; for example, it could take several years of observations to observe a
single p101 event. The final transition probabilities were then
regressed against the fraction of days in the month with precipitation, which
show the characteristic linear relationship described by (Fig. ).
Because the transition probabilities (p001 and p101) must be zero
by definition when the fraction of wet days (fwet) is zero, i.e.,
a completely dry month, we force the linear regression between these
quantities to pass through the origin. Likewise, we require the regression
line for p11 to equal 1 when fwet is 1. Note, however, that fixing the
intercept in this way artificially increases the R2 coefficient
of the fit.
The analysis results in the following relationships:
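\[
\begin{aligned}
p_{11} &= 0.2549 + 0.7451 \cdot f_\mathrm{wet} \\
p_{101} &= 0.8463 \cdot f_\mathrm{wet} \\
p_{001} &= 0.7240 \cdot f_\mathrm{wet}.
\end{aligned}
\]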
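
A regression line forced through a fixed point has a closed-form least-squares slope once the data are shifted so that the point becomes the origin. The sketch below illustrates this for the two constraints used here, (0, 0) and (1, 1). The helper `slope_through_point` is a hypothetical name, and the input data are synthetic, noise-free values generated from the published coefficients, not station observations.

```python
import numpy as np


def slope_through_point(x, y, x0, y0):
    """Least-squares slope of a line constrained to pass through (x0, y0)."""
    dx = np.asarray(x, dtype=float) - x0
    dy = np.asarray(y, dtype=float) - y0
    return float(np.sum(dx * dy) / np.sum(dx * dx))


# Synthetic, noise-free illustration data (not observations).
f_wet = np.linspace(0.05, 0.95, 10)
p001_obs = 0.7240 * f_wet
p11_obs = 0.2549 + 0.7451 * f_wet

b001 = slope_through_point(f_wet, p001_obs, 0.0, 0.0)  # forced through (0, 0)
b11 = slope_through_point(f_wet, p11_obs, 1.0, 1.0)    # forced through (1, 1)
a11 = 1.0 - b11                                        # intercept implied by (1, 1)
```

Because the p11 line must pass through (1, 1), its intercept and slope sum to 1, which is why the published pair 0.2549 and 0.7451 adds up to exactly 1.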
In the weather generator (see line 6 in Algorithm 1),
we determine whether any given day will have precipitation by
calculating the appropriate transition probability, selected from
Eqs. ()–() on the basis of the precipitation
state of the previous day (or two). Comparing the calculated probability from
the selected equation with a random number u∈ [0, 1], a precipitation day
is simulated if u is less than its corresponding probability.
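
The occurrence step can be sketched with the three fitted relationships for p11, p101, and p001. This is a minimal illustration under stated assumptions, not the GWGEN code: the function names are hypothetical, the month is assumed to start after two dry days unless told otherwise, and a day is marked wet when u < p so that wet days occur with probability p.

```python
import random


def wet_probability(f_wet, prev_wet, prev2_wet):
    """Probability of a wet day given the states of the two preceding days."""
    if prev_wet:                    # wet -> wet (first order): p11
        return 0.2549 + 0.7451 * f_wet
    if prev2_wet:                   # wet, dry -> wet: p101
        return 0.8463 * f_wet
    return 0.7240 * f_wet           # dry, dry -> wet: p001


def simulate_occurrence(f_wet, n_days, rng, prev_wet=False, prev2_wet=False):
    """Simulate a wet/dry sequence; True marks a wet day."""
    states = []
    for _ in range(n_days):
        p = wet_probability(f_wet, prev_wet, prev2_wet)
        wet = rng.random() < p      # wet with probability p
        states.append(wet)
        prev2_wet, prev_wet = prev_wet, wet
    return states


rng = random.Random(42)
january = simulate_occurrence(f_wet=10 / 31, n_days=31, rng=rng)
```

With f_wet = 0 all three probabilities vanish, so a completely dry month stays dry, consistent with forcing the p001 and p101 regressions through the origin.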
Mean precipitation–gamma scale relationship. The blue line
represents the best fit line of the mean precipitation on wet days to the
estimated gamma scale parameter of the corresponding distribution. Each data
point corresponds to one multi-year series of 1 month for one
station.
Precipitation amount
Following the original WGEN , GWGEN disaggregates
precipitation amount using a statistical distribution. A number of different
probability density functions have been used to estimate precipitation amount
in weather generators including, e.g., single exponential or mixed
exponential, one- or two-parameter gamma, or Weibull distributions.
The strong relationship between the gamma scale
parameter and the mean precipitation on wet days
makes generation of precipitation amounts with
only monthly input data feasible. It is based upon the fact that the expected
value of a gamma random variable equals the product of its two parameters,
i.e., E(Γ)=αθ. The gamma distribution, however, performs poorly
in simulating high-precipitation events consistent with
observations. Previous work
suggests that a hybrid probability density function, based on both gamma and
the GP distribution, has superior accuracy in simulating
extreme precipitation events when compared to gamma alone. Because of its
superior accuracy and ease of implementation, we therefore adopt the hybrid
gamma–GP distribution for simulating precipitation amount in GWGEN.
The probability density function (pdf) of the gamma distribution is defined as
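\[
f(x) =
\begin{cases}
\dfrac{x^{\alpha-1} e^{-x/\theta}}{\theta^{\alpha}\,\Gamma(\alpha)} & \text{for } x > 0 \\
0 & \text{for } x = 0,
\end{cases}
\]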
where α> 0 is the shape and θ> 0 the scale parameter. The
pdf of the GP distribution is defined via
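\[
g(x) =
\begin{cases}
\dfrac{1}{\sigma}\left(1 + \dfrac{\xi (x-\mu)}{\sigma}\right)^{-\frac{1}{\xi}-1} & \text{for } \xi \neq 0 \\
\dfrac{1}{\sigma}\, e^{-\frac{x-\mu}{\sigma}} & \text{for } \xi = 0,
\end{cases}
\]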
with σ> 0 being the scale parameter and ξ∈R the
shape parameter. μ is the location parameter.
Following , we define the hybrid gamma–GP pdf as
\[
h(x) =
\begin{cases}
f(x) & \text{for } x \le \mu \\
(1 - F(\mu))\, g(x) & \text{for } x > \mu,
\end{cases}
\]
where F(μ) describes the cumulative gamma distribution function at the
threshold μ. In our weather generator, however, we first draw a random
number from the gamma distribution and, if we are above the threshold, we
draw another random number from the GP distribution. Thus, the frequency of
precipitation events larger than μ is determined by the gamma
distribution, but the actual amount of precipitation simulated when above the
threshold μ is determined by the GP distribution .
To determine the parameters of the hybrid distribution for precipitation, we
started with the simple strategy by . As above,
when calculating the Markov chain parameters, we created multi-year series
for each of the parameterization stations for each month and extracted the
days with precipitation. If a series contained more than 100 entries, we fit
a gamma distribution to it using maximum likelihood in order to estimate the
α and θ parameters.
Following , we then fit a regression line of the
gamma scale parameter against the mean precipitation on wet days p‾d
(see Fig. ) and found the relationship
\[
\theta = 1.262\,\bar{p}_d.
\]
As proposed by , we use this relationship in our
model to estimate the scale parameter of the distribution. Using this
approach, the gamma shape parameter α is a constant, given via
\[
\alpha = \frac{\bar{p}_d}{\theta} = \frac{1}{1.262}.
\]
The GP scale parameter σ on the other hand is calculated during the
simulation following via
\[
\sigma = \frac{1 - F(\mu)}{f(\mu)}.
\]
The other parameters of the GP distribution are obtained through a
sensitivity analysis described in Sect. .
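
The two-step draw from the hybrid gamma–GP distribution can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the GWGEN implementation: all function names are hypothetical, the gamma cumulative distribution is evaluated with the textbook power series of the regularized lower incomplete gamma function, the GP draw uses inverse-CDF sampling, and the defaults μ = 5.0 mm and ξ = 1.5 are the values quoted in the Q–Q plot captions of this paper.

```python
import math
import random

GAMMA_SCALE_FACTOR = 1.262              # theta = 1.262 * mean wet-day precipitation
GAMMA_SHAPE = 1.0 / GAMMA_SCALE_FACTOR  # alpha, constant by construction


def gamma_pdf(x, shape, scale):
    """Gamma probability density f(x) for x > 0."""
    log_f = ((shape - 1.0) * math.log(x) - x / scale
             - shape * math.log(scale) - math.lgamma(shape))
    return math.exp(log_f)


def gamma_cdf(x, shape, scale, tol=1e-15, max_terms=10_000):
    """Gamma CDF F(x) for x > 0 via the series z^a e^-z sum_n z^n / Gamma(a+n+1)."""
    z = x / scale
    term = math.exp(-math.lgamma(shape + 1.0))  # n = 0 term: 1 / Gamma(a + 1)
    total = term
    for n in range(1, max_terms):
        term *= z / (shape + n)
        total += term
        if term < tol * total:
            break
    return math.exp(shape * math.log(z) - z) * total


def gp_scale(threshold, shape, scale):
    """GP scale sigma = (1 - F(mu)) / f(mu), evaluated at the threshold mu."""
    return (1.0 - gamma_cdf(threshold, shape, scale)) / gamma_pdf(threshold, shape, scale)


def gp_sample(rng, threshold, sigma, xi):
    """Draw from the generalized Pareto distribution by inverting its CDF."""
    u = rng.random()
    if xi == 0.0:
        return threshold - sigma * math.log(1.0 - u)
    return threshold + sigma / xi * ((1.0 - u) ** (-xi) - 1.0)


def wet_day_amount(rng, mean_wet_precip, threshold=5.0, xi=1.5):
    """Precipitation amount (mm) for one wet day."""
    theta = GAMMA_SCALE_FACTOR * mean_wet_precip
    amount = rng.gammavariate(GAMMA_SHAPE, theta)   # first draw: gamma
    if amount > threshold:                          # second draw: GP tail
        sigma = gp_scale(threshold, GAMMA_SHAPE, theta)
        amount = gp_sample(rng, threshold, sigma, xi)
    return amount


rng = random.Random(0)
amounts = [wet_day_amount(rng, mean_wet_precip=6.0) for _ in range(1000)]
```

Because the gamma draw decides whether the threshold is exceeded, the frequency of large events follows the gamma distribution while their magnitude follows the GP tail, as described above.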
Temperature
Following the standard WGEN methodology ,
daily temperature is determined through two processes: first, the wet/dry state of the day and then the cross
correlation (Sect. ).
In the weather generator, we know from the Markov chain
(Sect. ) whether the current simulated day is a wet or dry
day. Based upon the simple linear relationships
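\[
\begin{aligned}
\bar{x}_\mathrm{wet} &= c_{0,x,\mathrm{wet}} + c_{1,x,\mathrm{wet}} \cdot \bar{x} \\
\bar{x}_\mathrm{dry} &= c_{0,x,\mathrm{dry}} + c_{1,x,\mathrm{dry}} \cdot \bar{x},
\end{aligned}
\]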
we adjust the monthly mean x‾ of the variable x∈{Tmin, Tmax}.
To estimate the values of the parameters c0 and c1 in the above
equations, we follow the same procedure as for the parameters of the Markov
chain (Sect. ). We extracted the complete months
for Tmin and Tmax from the GHCN-Daily dataset and created
a multi-year series for each month and station. We then regressed the mean on
wet and dry days separately against the overall mean of each month (Figs.
and ). Through this procedure, we estimate the
parameters necessary for Eq. () (see Table ).
Fit results of temperature correlation for wet and dry days for
Figs. , , , and .
The coefficients c0 to c3 correspond to the coefficients used in
Eqs. () and ().
To estimate residual noise, we also need an estimate of the standard
deviation of the variable (see Sect. ). Figure
shows the correlation between standard deviation on wet and dry days and the
corresponding mean. The means of the standard deviations (black bars in
Fig. ) indicate a strong but nonlinear relationship
between the standard deviation and the corresponding mean. The correlation
changes particularly at 0 °C. We therefore use two different
polynomials of order 5 for the values below and above the freezing point.
Furthermore, to account for the sparse data below -40 °C and above
25 °C for minimum temperature (or -30 and 35 °C
for maximum temperature), we use an extrapolation for the extremes as
indicated by the blue and violet lines in Fig. . The
formulae for the standard deviations σ of minimum and maximum
temperature are therefore a combination of four polynomials:
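\[
\sigma_{T_{\min},\mathrm{wet/dry}} =
\begin{cases}
p_1\left(\bar{T}_{\min,\mathrm{wet/dry}}\right) & \text{for } \bar{T}_{\min,\mathrm{wet/dry}} \le -40\,^{\circ}\mathrm{C} \\
p_5\left(\bar{T}_{\min,\mathrm{wet/dry}}\right) & \text{for } -40\,^{\circ}\mathrm{C} < \bar{T}_{\min,\mathrm{wet/dry}} \le 0\,^{\circ}\mathrm{C} \\
p_5\left(\bar{T}_{\min,\mathrm{wet/dry}}\right) & \text{for } 0\,^{\circ}\mathrm{C} < \bar{T}_{\min,\mathrm{wet/dry}} \le 25\,^{\circ}\mathrm{C} \\
p_1\left(\bar{T}_{\min,\mathrm{wet/dry}}\right) & \text{for } 25\,^{\circ}\mathrm{C} < \bar{T}_{\min,\mathrm{wet/dry}}
\end{cases}
\]
\[
\sigma_{T_{\max},\mathrm{wet/dry}} =
\begin{cases}
p_1\left(\bar{T}_{\max,\mathrm{wet/dry}}\right) & \text{for } \bar{T}_{\max,\mathrm{wet/dry}} \le -30\,^{\circ}\mathrm{C} \\
p_5\left(\bar{T}_{\max,\mathrm{wet/dry}}\right) & \text{for } -30\,^{\circ}\mathrm{C} < \bar{T}_{\max,\mathrm{wet/dry}} \le 0\,^{\circ}\mathrm{C} \\
p_5\left(\bar{T}_{\max,\mathrm{wet/dry}}\right) & \text{for } 0\,^{\circ}\mathrm{C} < \bar{T}_{\max,\mathrm{wet/dry}} \le 35\,^{\circ}\mathrm{C} \\
p_1\left(\bar{T}_{\max,\mathrm{wet/dry}}\right) & \text{for } 35\,^{\circ}\mathrm{C} < \bar{T}_{\max,\mathrm{wet/dry}}.
\end{cases}
\]
p1 in Eq. () denotes a polynomial of order 1; p5 a
polynomial of order 5. The coefficients of the different polynomials are
shown in Table .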
Fit results of the correlation of temperature standard deviation
with the corresponding mean on wet/dry days for Fig. . The
underlying equations are shown in
Eq. ().
Correlation of minimum temperature on wet and dry days to the
monthly mean. The y axes show the mean minimum temperature on wet or dry
days, respectively; the blue line corresponds to the best fit line. Parameters
of the fits are also shown in Table .
Correlation of maximum temperature on wet and dry days to the
monthly mean. The y axes show the mean maximum temperature on wet or dry
days, respectively; the blue line corresponds to the best fit line. Parameters
of the fits are also shown in Table .
Correlation of standard deviation of the minimum and maximum
temperature on wet and dry days to the monthly mean. The y axes show the
standard deviation; the x axes the mean on wet or dry days, respectively. The
bars have a width of 0.1 °C (the data accuracy) and indicate the
mean standard deviation for a given mean minimum temperature in 1 month.
The lines are fitted to these bars where the green and red polynomials of
order 5 use all the data below or above 0 °C, respectively,
and the blue and violet lines indicate a linear extrapolation of the data below
-40 °C (or -30 °C for Tmax) or above
25 °C (or 35 °C), respectively. The red density plot in the
background indicates the spread of the data. The bars and the density plot
are based on the single month for each station (i.e., not the multi-year
monthly series as for, e.g., mean temperature; Figs.
and ). Parameters of the fits are also shown in
Table .
These coefficients are based on the means of the standard deviation (black
bars in Fig. ). We chose this procedure to give the same
weight to all temperatures. Otherwise, the fit would be dominated by the
temperature values around the freezing point.
Correlation of cloud fraction on wet and dry days to the monthly
mean. The y axes show the mean cloud fraction on wet or dry days,
respectively; the blue line corresponds to the best fit line. Parameters of
the fits are also shown in Table .
Correlation of standard deviation of the cloud fraction on wet and
dry days to the corresponding monthly mean. The y axes show the standard
deviation; the x axes the mean on wet or dry days, respectively. The blue
line corresponds to the best fit line. Parameters of the fits are also shown
in Table .
Cloud fraction
Monthly mean cloud fraction is disaggregated, as for temperature, using the
standard WGEN procedure of adding statistical noise to a wet- or dry-day mean
and accounting for cross correlation among the different weather variables.
For the parameterization of the cloud fraction equations, we used the EECRA
dataset. The original dataset contains eight measurements per day of the
total cloud cover in units of octas, i.e., values ranging from 0 (clear sky)
to 8 (overcast). Hence, to calculate the daily cloud fraction, those values
were averaged and divided by 8 to produce a daily mean.
To adjust the monthly mean depending on the wet/dry state of the day, we
could not use a simple linear relationship as we used for temperature because
cloud fraction is bounded by a lower limit of 0 and an upper limit of 1.
Furthermore, we observed that cloud cover on wet days is usually greater than or
equal to the monthly mean cloud cover, whereas the cloud cover on dry days is
usually less than or equal to the monthly mean cloud cover. This results in a
concave curve for the wet case and a convex curve for dry days. We used a
qualitative graphical analysis to develop “best guess” equations that had the
desired shape and propose the following formulae for the regression linking
cloud cover on wet or dry days to the overall mean:
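\[
\begin{aligned}
\bar{c}_\mathrm{wet} &= \frac{-a_{c,\mathrm{wet}} - 1}{a_{c,\mathrm{wet}}^2 \cdot \bar{c} - a_{c,\mathrm{wet}}^2 - a_{c,\mathrm{wet}}} - \frac{1}{a_{c,\mathrm{wet}}} \\
\bar{c}_\mathrm{dry} &= \frac{-a_{c,\mathrm{dry}} - 1}{a_{c,\mathrm{dry}}^2 \cdot \bar{c} - a_{c,\mathrm{dry}}^2 - a_{c,\mathrm{dry}}} - \frac{1}{a_{c,\mathrm{dry}}},
\end{aligned}
\]
with $a_{c,\mathrm{wet}} < 0$ and $a_{c,\mathrm{dry}} > 0$.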
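The standard deviation of cloud cover fraction becomes 0 when the mean
monthly cloud fraction reaches either the minimum or the maximum limit of 0 and 1.
Hence, for csd,dry and csd,wet we have a concave
parabola with the formula
\[
\begin{aligned}
\sigma_{c,\mathrm{wet}} &= a_{c,\mathrm{wet}}^2 \cdot \bar{c}_\mathrm{wet} \cdot \left(1 - \bar{c}_\mathrm{wet}\right) \\
\sigma_{c,\mathrm{dry}} &= a_{c,\mathrm{dry}}^2 \cdot \bar{c}_\mathrm{dry} \cdot \left(1 - \bar{c}_\mathrm{dry}\right),
\end{aligned}
\]
with $a_{c,\mathrm{wet}}, a_{c,\mathrm{dry}} \ge 0$. These parameters are
fitted separately from those in the equations for the wet- and dry-day means.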
Results of the fits can be seen in Figs. and
and the parameters in Table .
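
The cloud-fraction relationships can be checked numerically with the fitted values of a reported in the fit table of this section (cwet, cdry, csd,wet, csd,dry). The sketch below is illustrative only, with hypothetical function and constant names. It assumes the mean relationship reads (-a - 1) / (a² c - a² - a) - 1/a, which maps a monthly mean of 0 to 0 and of 1 to 1 for any admissible a.

```python
A_MEAN_WET = -0.7376   # a for c_wet
A_MEAN_DRY = 0.4302    # a for c_dry
A_SD_WET = 0.9881      # a for c_sd,wet
A_SD_DRY = 1.0448      # a for c_sd,dry


def cloud_mean(c_bar, a):
    """Mean cloud fraction on wet (a < 0) or dry (a > 0) days from the monthly mean."""
    return (-a - 1.0) / (a * a * c_bar - a * a - a) - 1.0 / a


def cloud_sd(c_state_mean, a):
    """Standard deviation of cloud fraction: a parabola that vanishes at 0 and 1."""
    return a * a * c_state_mean * (1.0 - c_state_mean)


c_bar = 0.5                                  # example monthly mean cloud fraction
c_wet = cloud_mean(c_bar, A_MEAN_WET)        # lies above the monthly mean
c_dry = cloud_mean(c_bar, A_MEAN_DRY)        # lies below the monthly mean
sd_wet = cloud_sd(c_wet, A_SD_WET)
sd_dry = cloud_sd(c_dry, A_SD_DRY)
```

The sign of a controls on which side of the 1:1 line the curve lies, which is how a single functional form yields both the concave wet-day curve and the convex dry-day curve.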
Correlation of wind speed on wet and dry days to the monthly mean.
The y axes show the mean wind speed on wet or dry days, respectively;
the blue line corresponds to the best fit line. Parameters of the fits are
also shown in Table .
Correlation of standard deviation of wind speed on wet and dry days
to the corresponding monthly mean. The y axes show the standard deviation;
the x axes the mean on wet or dry days, respectively. The blue line
corresponds to the best fit line, a third-order polynomial; the red density
plot shows the underlying data. The black bars have a width of 0.1 m s⁻¹, the
accuracy of the input data, and indicate the mean standard deviations for the
given interval range. Parameters of the fits are also shown in
Table .
Wind speed
The parameterization of the mean wind speed is based upon the same linear
Eq. () as temperature. For the standard deviation,
however, we use a third-order polynomial that is forced through the
origin, given via
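\[
\begin{aligned}
\sigma_{w,\mathrm{wet}}\left(\bar{w}_\mathrm{wet}\right) &= c_{1,w,\mathrm{wet}}\,\bar{w}_\mathrm{wet} + c_{2,w,\mathrm{wet}}\,\bar{w}_\mathrm{wet}^2 + c_{3,w,\mathrm{wet}}\,\bar{w}_\mathrm{wet}^3 \\
\sigma_{w,\mathrm{dry}}\left(\bar{w}_\mathrm{dry}\right) &= c_{1,w,\mathrm{dry}}\,\bar{w}_\mathrm{dry} + c_{2,w,\mathrm{dry}}\,\bar{w}_\mathrm{dry}^2 + c_{3,w,\mathrm{dry}}\,\bar{w}_\mathrm{dry}^3.
\end{aligned}
\]
This better resolves the complex behavior close to
0 m s⁻¹ compared to a linear fit. The plots are shown in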
Figs. and and the parameters for the
fits are shown in Table .
Fit results of cloud correlation for wet and dry days for
Fig. . SE indicates standard error.

Plot variable    a          SE of a    R2
cdry              0.4302    0.0013     0.8745
cwet             -0.7376    0.0006     0.3881
csd,dry           1.0448    0.0004     0.2803
csd,wet           0.9881    0.0006     0.0802

Cross correlation
We use cross correlation to add
residual noise to the simulated meteorological variables, which makes the
daily weather more realistic. This methodology
preserves the serial and the cross correlation between the
simulated variables. It implies that the serial correlation of each variable
may be described by a first-order linear autoregressive model.
Given the cross-correlation matrix $M_0 \in \mathbb{R}^{4\times 4}$ and
the lag-1 correlation matrix $M_1 \in \mathbb{R}^{4\times 4}$, we calculate
\[
A = M_1 M_0^{-1}, \qquad B B^T = M_0 - M_1 M_0^{-1} M_1^T.
\]
The matrices A, B, M0, and M1 are
calculated using the stations from the EECRA database in Fig. . The results are
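\[
M_0 = \begin{pmatrix}
1 & 0.565 & 0.041 & 0.035 \\
0.565 & 1 & -0.089 & -0.043 \\
0.041 & -0.089 & 1 & 0.114 \\
0.035 & -0.043 & 0.114 & 1
\end{pmatrix},
\qquad
M_1 = \begin{pmatrix}
0.933 & 0.55 & 0.016 & 0.03 \\
0.557 & 0.417 & -0.066 & -0.043 \\
0.004 & -0.095 & 0.599 & 0.093 \\
0.011 & -0.063 & 0.061 & 0.672
\end{pmatrix},
\]
leading to
\[
A = \begin{pmatrix}
0.916 & 0.031 & -0.018 & 0.001 \\
0.485 & 0.135 & -0.069 & -0.047 \\
0.004 & -0.043 & 0.592 & 0.023 \\
0.012 & -0.043 & -0.02 & 0.672
\end{pmatrix},
\qquad
B = \begin{pmatrix}
0.358 & 0 & 0 & 0 \\
0.112 & 0.809 & 0 & 0 \\
0.142 & -0.06 & 0.785 & 0 \\
0.077 & -0.016 & 0.061 & 0.733
\end{pmatrix}.
\]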
The columns and rows in the two matrices correspond to minimum and maximum temperature,
cloud fraction, and square root of wind speed, respectively.
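The relation between the four matrices can be checked numerically. The sketch below recomputes A and B from the rounded values of M0 and M1 given above using NumPy. Because the equation for B only fixes the product of B with its transpose, the sketch assumes that B is the lower-triangular Cholesky factor, which is consistent with the zero pattern of the reported B but is an assumption on our part.

```python
import numpy as np

# Rounded matrices as reported in the text
# (order: Tmin, Tmax, cloud fraction, square root of wind speed).
M0 = np.array([
    [1.000,  0.565,  0.041,  0.035],
    [0.565,  1.000, -0.089, -0.043],
    [0.041, -0.089,  1.000,  0.114],
    [0.035, -0.043,  0.114,  1.000],
])
M1 = np.array([
    [0.933,  0.550,  0.016,  0.030],
    [0.557,  0.417, -0.066, -0.043],
    [0.004, -0.095,  0.599,  0.093],
    [0.011, -0.063,  0.061,  0.672],
])

# A = M1 M0^{-1}: solve the transposed system instead of forming the inverse.
A = np.linalg.solve(M0.T, M1.T).T

# B B^T = M0 - M1 M0^{-1} M1^T = M0 - A M1^T
BBt = M0 - A @ M1.T
BBt = 0.5 * (BBt + BBt.T)  # remove tiny asymmetries caused by rounding

# Assumption: B is taken as the lower-triangular Cholesky factor of B B^T.
B = np.linalg.cholesky(BBt)

print(np.round(A, 3))
print(np.round(B, 3))
```

Since the inputs are rounded to three decimals, the recomputed entries can be expected to agree with the reported A and B only up to small rounding differences.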
Q–Q plots for all variables with all quantiles (1, 5, 10, 25, 50,
75, 90, 95, and 99) for μ= 5.0 mm and ξ= 1.5. The blue lines
indicate linear regression from simulation to observation. The red line shows the
ideal fit (the identity line). Blue shaded areas represent the 95 %
confidence interval. The plots compare each simulated quantile (from the list
above) of one year of one station to the corresponding observed quantile of
the same year and station. The plot for wind speed used the bias
correction from Sect. .
In the weather generator, the variables $T_{\min}$, $T_{\max}$,
$c$, and $w$ are then calculated using a combination of residual noise $\chi_i$
(where $i$ denotes the current simulated day) and the mean of the variables.
$\chi_i$ is determined by the other variables and the previous day using $A$
and $B$ from above . Hence, $\chi_i$ is given via
$$
\chi_i = \begin{pmatrix} \chi_{T_{\min}} \\ \chi_{T_{\max}} \\ \chi_c \\ \chi_w \end{pmatrix} = A \chi_{i-1} + B \epsilon \in \mathbb{R}^4.
$$
The daily values for the variables are then calculated via
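$$
\begin{aligned}
T_{\min,i} &= \chi_{T_{\min}} \cdot \sigma_{T_{\min},\mathrm{wet/dry}} + \overline{T}_{\min,\mathrm{wet/dry}} \\
T_{\max,i} &= \chi_{T_{\max}} \cdot \sigma_{T_{\max},\mathrm{wet/dry}} + \overline{T}_{\max,\mathrm{wet/dry}} \\
c_i &= \chi_c \cdot \sigma_{c,\mathrm{wet/dry}} + \overline{c}_{\mathrm{wet/dry}} \\
w_i &= \left(\chi_w \cdot \sigma_{w,\mathrm{wet/dry}} + \overline{w}_{\mathrm{wet/dry}}\right)^2,
\end{aligned}
$$
with $\sigma_{T_{\min},\mathrm{wet/dry}}$,
$\sigma_{T_{\max},\mathrm{wet/dry}}$ from Eq. (),
$\sigma_{c,\mathrm{wet/dry}}$ from Eq. (),
$\sigma_{w,\mathrm{wet/dry}}$ from Eq. (),
$\overline{T}_{\min,\mathrm{wet/dry}}$, $\overline{T}_{\max,\mathrm{wet/dry}}$,
$\overline{w}_{\mathrm{wet/dry}}$ from Eq. (), and
$\overline{c}_{\mathrm{wet/dry}}$ from Eq. ().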
Since this procedure always requires the residuals from the previous day,
$\chi_{i-1}$, we initialize $\chi_0$ with 0, simulate the month, and then
simulate it again.
Note that, through the entire procedure, wind speed is subject to a
square-root transformation (also when calculating M0 and M1) to account
for the fact that it is not normally distributed.
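The residual recursion and the two-pass initialization can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FORTRAN implementation: we assume that ε is a vector of four independent standard normal random numbers, that the second pass starts from the last residual of the first pass, and the monthly mean and standard deviation passed to `to_daily` are made-up values. The wind speed transformation is left out of the sketch.

```python
import numpy as np

# A and B as reported in the text
# (order: Tmin, Tmax, cloud fraction, square root of wind speed).
A = np.array([
    [0.916,  0.031, -0.018,  0.001],
    [0.485,  0.135, -0.069, -0.047],
    [0.004, -0.043,  0.592,  0.023],
    [0.012, -0.043, -0.020,  0.672],
])
B = np.array([
    [0.358,  0.000, 0.000, 0.000],
    [0.112,  0.809, 0.000, 0.000],
    [0.142, -0.060, 0.785, 0.000],
    [0.077, -0.016, 0.061, 0.733],
])


def simulate_residuals(n_days, rng, chi_start=None):
    """Return an (n_days, 4) array with chi_i = A chi_{i-1} + B eps_i."""
    chi = np.zeros(4) if chi_start is None else np.asarray(chi_start, dtype=float)
    residuals = np.empty((n_days, 4))
    for i in range(n_days):
        eps = rng.standard_normal(4)  # assumed: independent N(0, 1) components
        chi = A @ chi + B @ eps
        residuals[i] = chi
    return residuals


def simulate_month(n_days, rng):
    """First pass starts from chi_0 = 0; the second pass reuses its last residual."""
    first_pass = simulate_residuals(n_days, rng)
    return simulate_residuals(n_days, rng, chi_start=first_pass[-1])


def to_daily(chi, mean, sd):
    """Daily value = residual * standard deviation + mean (temperature, cloud)."""
    return chi * sd + mean


rng = np.random.default_rng(seed=0)
chi = simulate_month(30, rng)
# Hypothetical monthly mean and standard deviation of Tmin on one day type.
tmin = to_daily(chi[:, 0], mean=2.0, sd=3.5)
```

In a full generator the mean and standard deviation would be switched between their wet-day and dry-day values according to the simulated precipitation occurrence on each day.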
Q–Q plot for different quantiles for precipitation for
μ= 5.0 mm and ξ= 1.5. The blue lines indicate linear regression
from simulation to observation. The red line shows the ideal fit (the
identity line). Blue shaded areas represent the 95 % confidence interval.
The plots compare the simulated quantile of one year of one station to the
corresponding observed quantile of the same year and
station.
Model evaluation
To evaluate GWGEN, we started with the daily meteorology at the evaluation
stations described above and calculated monthly summaries. We used these
monthly data to drive the model and simulate daily meteorology. The resulting
daily series now has the same length as the observed meteorology from the
GHCN and EECRA databases. Because we cannot expect the weather generator to
reproduce the weather exactly as observed (for example, the number of rainy
days in a month may be the same as observed but they may not occur in
precisely the same order), our evaluation is restricted to comparing the
statistical properties of the input observed versus the output simulated
daily meteorology.
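The two building blocks of this evaluation, aggregating daily observations to monthly summaries and comparing percentiles of daily values, can be sketched with the Python standard library. The function names, the set of summary variables, and the wet-day threshold are illustrative assumptions and are not taken from the GWGEN utilities.

```python
from statistics import mean, quantiles

# Percentiles used in the quantile comparison.
PERCENTILES = (1, 5, 10, 25, 50, 75, 90, 95, 99)


def monthly_summary(daily_precip_mm, daily_tmin, daily_tmax, wet_threshold_mm=0.0):
    """Aggregate one month of daily values to monthly statistics.

    wet_threshold_mm is a hypothetical parameter: days with more
    precipitation than this value are counted as wet.
    """
    wet_days = sum(1 for p in daily_precip_mm if p > wet_threshold_mm)
    return {
        "precip_total": sum(daily_precip_mm),
        "wet_days": wet_days,
        "tmin_mean": mean(daily_tmin),
        "tmax_mean": mean(daily_tmax),
    }


def percentile_values(values, percentiles=PERCENTILES):
    """Return the requested percentiles of a sample of daily values."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    cuts = quantiles(values, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in percentiles}


summary = monthly_summary(
    daily_precip_mm=[0.0, 0.0, 2.5, 0.0, 1.5],
    daily_tmin=[1.0, 2.0, 3.0, 4.0, 5.0],
    daily_tmax=[6.0, 7.0, 8.0, 9.0, 10.0],
)
observed_percentiles = percentile_values(list(range(101)))
```

Applying `percentile_values` to the simulated and to the observed daily series of the same station and year yields the pairs of values that are compared in a quantile plot.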
Figure shows the comparison of simulated versus observed
values for each of the five meteorological variables handled by GWGEN. For
temperature, wind, and cloud fraction, the model does an excellent job of
downscaling monthly input to daily resolution.
Note that the plot for
wind speed has been bias corrected using the approach in
Sect. .
The comparison between precipitation amounts looks good
when considering all of the data; however, a closer look into the results
(Fig. ) shows that while the higher-precipitation
percentiles are well captured using the hybrid gamma–GP distribution, the
lower percentiles show somewhat worse results. This observation of poor
performance for very low values also holds true for wind speed (not shown
here). The lower values of the two variables, however, are very close to the
precision of the observation (0.1 mm for precipitation and 0.1 m s$^{-1}$
for wind speed). Very small precipitation amounts
and low wind speeds are also less biophysically and ecologically important
compared to the higher percentiles. We therefore consider the results of the
evaluation largely acceptable.
In Table , we also compare the simulated versus the observed
frequencies for very light rain (≤ 1 mm), light rain (1–10 mm), heavy rain
(10–20 mm), and very heavy rain (> 20 mm). As we can see, our model
underestimates the occurrence of very light rain events (28.6 % instead of
36.4 %) and overestimates the light rain events (58.3 % instead of
48.6 %) but generally performs much better than GCMs ,
especially when it comes to the heavy rain events.
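The frequencies per precipitation range can be computed as the number of precipitation occurrences in each range divided by the total number of occurrences. The following Python sketch shows one way to do this; the function name is hypothetical, and treating every day with more than 0 mm as a precipitation occurrence is an assumption.

```python
from bisect import bisect_left

# Right-closed ranges: (0, 1], (1, 10], (10, 20], and above 20 mm.
UPPER_EDGES_MM = (1.0, 10.0, 20.0)
LABELS = ("(0, 1]", "(1, 10]", "(10, 20]", "(20, inf)")


def precip_frequencies(daily_precip_mm):
    """Fraction of precipitation occurrences that fall into each range."""
    wet = [p for p in daily_precip_mm if p > 0.0]  # assumed definition of occurrence
    if not wet:
        raise ValueError("no precipitation occurrences in the sample")
    counts = [0] * len(LABELS)
    for p in wet:
        # bisect_left puts a value equal to an edge into the lower range,
        # which makes the ranges closed on the right.
        counts[bisect_left(UPPER_EDGES_MM, p)] += 1
    return {label: n / len(wet) for label, n in zip(LABELS, counts)}


freq = precip_frequencies([0.0, 0.5, 1.0, 3.0, 10.0, 15.0, 25.0, 0.0])
```

In this toy sample, six of the eight days have precipitation, so each range receives a multiple of one sixth.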
Bias correction
After evaluating the results of GWGEN for wind speed for the different
quantiles (see Sect. ), we found a strong, systematic
bias between the simulated and the observed values. This observation led us
to adopt a further measure to improve the quality of the model output by
implementing a quantile-based bias correction.
Basis for the wind bias correction. For the left plot, each data
point corresponds to the difference of a simulated percentile to the observed
percentile. For the right plot (wind speed), each data point corresponds to
the fraction of simulated to the observed wind speed for a given percentile.
The random number on the x axis represents the residual value from a normal
distribution centered at 0 with standard deviation of unity, as it is used in
the cross-correlation approach .
Simulated and observed precipitation frequencies for certain ranges.
The frequency is defined as the number of precipitation occurrences in the
specified range divided by the total number of precipitation
occurrences.

| Precip. range (mm) | Simulated | Observed |
|---|---|---|
| (0, 1] | 0.285688 | 0.364014 |
| (1, 10] | 0.583330 | 0.486415 |
| (10, 20] | 0.074063 | 0.090178 |
| (20, ∞) | 0.056920 | 0.059392 |

We use an empirical distribution correction approach (quantile mapping)
to a posteriori correct the simulated data.
In the quantile evaluation (Sect. ), we saw that the
simulated wind speed is a linear function of the observed wind speed,
i.e., $w_{\mathrm{sim}} = \mathrm{intercept} + \mathrm{slope} \cdot w_{\mathrm{obs}}$ (best fit line in
Fig. ). Therefore, we correct in two steps: one for
the difference between simulation and observation (ideally 0) and one
for the ratio of simulation to observation (ideally 1). The first
corresponds to the intercept with the y axis in Fig. ,
the second to the slope of the best fit line. The analysis is based on
every second percentile between 1 and 100 (i.e., 1, 3, 5, and so on), each mapped
to its corresponding random number $u \in \mathbb{R}$ from a normal
distribution as it is used for the cross correlation in the weather generator
(Sect. ; x axis in Fig. and ).
Regarding the intercept (Fig. , left panel), we see that it closely
follows an exponential function given by
$$
f_{\exp}(u) = e^{a u + b}, \qquad a, b, u \in \mathbb{R}.
$$
The slope (Fig. , right panel), on the other hand, can be described by
a simple third-order polynomial given by
$$
p_3(u) = c_0 + c_1 u + c_2 u^2 + c_3 u^3, \qquad c_0, c_1, c_2, c_3, u \in \mathbb{R}.
$$
Hence, given the best fit lines in Fig. , the simulated wind
speed is corrected via
$$
w_{\mathrm{sim}}' = \frac{w_{\mathrm{sim}} - f_{\exp}(u)}{p_3(u)},
$$
with $a = 1.1582$, $b = -1.3359$, $c_0 = 0.9954$, $c_1 = 0.8508$,
$c_2 = 0.0278$, and $c_3 = -0.0671$.
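The correction can be written compactly in Python using the coefficients given above. The sketch is illustrative only: the function names are hypothetical, and mapping the percentiles to u with the inverse standard normal distribution function is our reading of the procedure described in the text.

```python
import math
from statistics import NormalDist

# Coefficients of the best fit lines as given in the text.
A_EXP, B_EXP = 1.1582, -1.3359
C0, C1, C2, C3 = 0.9954, 0.8508, 0.0278, -0.0671


def intercept(u):
    """Exponential fit of the intercept, f_exp(u) = exp(a*u + b)."""
    return math.exp(A_EXP * u + B_EXP)


def slope(u):
    """Third-order polynomial fit of the slope, p3(u)."""
    return C0 + C1 * u + C2 * u**2 + C3 * u**3


def correct_wind(w_sim, u):
    """Invert w_sim = intercept + slope * w_obs for the residual value u."""
    return (w_sim - intercept(u)) / slope(u)


# Every second percentile (1, 3, ..., 99) mapped to a standard normal value u.
ODD_PERCENTILES = list(range(1, 100, 2))
U_VALUES = [NormalDist().inv_cdf(p / 100) for p in ODD_PERCENTILES]

# Round trip: a wind speed biased with the fitted line is recovered exactly.
w_obs = 4.0
w_biased = intercept(0.5) + slope(0.5) * w_obs
w_corrected = correct_wind(w_biased, 0.5)
```

With these coefficients the polynomial $p_3(u)$ becomes very small for $u$ around $-1.5$, so a practical implementation would presumably need to restrict the range over which the division is applied.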
Sensitivity analysis
The generalized Pareto (GP) part of the hybrid gamma–GP distribution, which we
used to simulate precipitation amount, has two parameters: the GP shape and
the threshold parameter. Unlike the gamma parameters, we were unable to
relate these GP parameters to any of the monthly summary data we use as input
to GWGEN. Hence, we decided to set fixed values for these parameters, and
determine them through a sensitivity analysis.
To select the “best” values of the GP parameters, we compared simulated with
observed precipitation amounts, running GWGEN with a wide range of realistic
parameter values. To quantitatively assess the model performance, we used two
metrics: (1) direct comparison of the quantiles (see previous section) and
(2) a Kolmogorov–Smirnov (KS) test that evaluates whether two data samples come
from significantly different distributions. Our criteria were
(1) the $R^2$ correlation coefficient between simulated and observed quantiles;
(2) the fraction $\frac{\text{simulated precipitation}}{\text{observed precipitation}}$
from the slopes in Fig. and its deviation from unity;
(3) the fraction of simulated (station-specific) years that are significantly
different (KS test) from the observation; and
(4) the mean of the above values.
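The station-year criterion based on the KS test can be illustrated with a small, self-contained Python sketch. It computes the two-sample KS statistic directly and compares it with the usual large-sample critical value; this hand-written approximation and the 5 % significance level are assumptions for illustration, and a library routine such as `scipy.stats.ks_2samp` could be used instead.

```python
import math
from bisect import bisect_right


def ks_statistic(x, y):
    """Largest absolute difference between the two empirical distribution functions."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    # Both step functions only change at data points, so it is
    # sufficient to evaluate the difference there.
    for v in xs + ys:
        fx = bisect_right(xs, v) / n
        fy = bisect_right(ys, v) / m
        d = max(d, abs(fx - fy))
    return d


def differs_significantly(x, y, alpha=0.05):
    """Large-sample KS decision: reject if D > c(alpha) * sqrt((n + m) / (n * m))."""
    n, m = len(x), len(y)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))  # about 1.36 for alpha = 0.05
    return ks_statistic(x, y) > c_alpha * math.sqrt((n + m) / (n * m))


def fraction_differing(station_years, alpha=0.05):
    """Fraction of (simulated, observed) station years that differ significantly."""
    flags = [differs_significantly(sim, obs, alpha) for sim, obs in station_years]
    return sum(flags) / len(flags)


same = list(range(100))
shifted = [v + 1000 for v in range(100)]
frac = fraction_differing([(same, same), (same, shifted)])
```

In the toy example one of the two station years is identical to its observation and the other is completely separated from it, so half of the station years differ significantly.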
We tried two different approaches to select the gamma–GP crossover threshold:
first, we tried a fixed crossover point; second, we used a quantile-based
crossover point. For the latter, the model chooses to use the GP distribution
if the quantile of the random number drawn from the gamma distribution is
above a certain quantile threshold. This introduces a flexible crossover
point in our hybrid distribution which, however, did not improve the results
significantly. We therefore show here only the results using the fixed crossover point.
The values of the crossover point for our sensitivity analysis were 2, 2.5,
3, 4, and from 5 to 20 in steps of 2.5, and 20 to 100 in steps of 5.
Furthermore, we varied the GP shape parameter from 0.1 to 3 in steps of 0.1
(810 experiments in total). The results of this sensitivity analysis are
shown in the Supplement (Fig. ).
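The size of the experiment grid follows directly from the two parameter lists. The short Python sketch below enumerates them; the helper `frange` is a hypothetical convenience function, and the value 20, which appears in both threshold sequences, is counted once.

```python
from itertools import product


def frange(start, stop, step):
    """Inclusive range of floats, rounded to suppress accumulated binary noise."""
    n = round((stop - start) / step)
    return [round(start + i * step, 10) for i in range(n + 1)]


# Crossover thresholds: 2, 2.5, 3, 4, then 5 to 20 in steps of 2.5,
# then 20 to 100 in steps of 5 (the duplicate 20 is removed by the set).
thresholds = sorted(
    set([2.0, 2.5, 3.0, 4.0] + frange(5.0, 20.0, 2.5) + frange(20.0, 100.0, 5.0))
)

# GP shape parameter: 0.1 to 3 in steps of 0.1.
shapes = frange(0.1, 3.0, 0.1)

experiments = list(product(thresholds, shapes))
print(len(thresholds), len(shapes), len(experiments))
```

This gives 27 thresholds and 30 shape values, i.e., the 810 experiments stated above.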
In general, we found that the three criteria (1–3) could not be optimized
simultaneously. The $R^2$ is best for high thresholds and low GP
shape parameters, the slope is best for low to intermediate thresholds and a
low GP shape, and the KS statistic is best for low thresholds and intermediate
GP shape parameters.
However, $R^2$ varied only slightly (from 0.68 to 0.74), and from a visual
evaluation of the corresponding quantile plots we saw that the higher
quantiles (> 90) were much better represented when the KS result was better. Hence, we
chose to follow the KS test criterion, which is also the strictest of our
evaluation methods, but again compared the different quantile plots to ensure
good results for the higher quantiles. Finally, we chose a threshold of
5 mm and a GP shape parameter of 1.5. For this setting, 81.7 %
of the simulated years do not show a significant difference compared to the
observation, the mean $R^2$ of the plots in Fig.
is 0.81, and the mean deviation of the slope from unity is 0.10 overall and
0.017 for the upper quantiles (90 to 100).
Nevertheless, the results are fairly insensitive to the two
parameters: the fraction of years without significant differences
varies only from 73 % to 83 %. The hybrid distribution is, however, better than the gamma
distribution alone, for which 78.6 % of station years still do not differ
significantly but the slope deviation from unity for the upper quantiles
is 0.16. Thus, using the hybrid gamma–GP distribution improves the
simulation of high-amount precipitation events by roughly a factor of 10 compared
to a standard gamma approach.
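The factor quoted here can be checked from the two slope deviations for the upper quantiles, 0.16 for the gamma distribution alone and 0.017 for the hybrid gamma–GP distribution:
$$
\frac{0.16}{0.017} \approx 9.4,
$$
which is close to one order of magnitude.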
Limitations
As demonstrated above, GWGEN successfully downscales monthly to daily
meteorology with good correlation and low bias when compared to observations.
However, there are a few limitations of the model as currently described that
should be noted. Importantly, this version of GWGEN neither downscales all
conceivable meteorological variables, nor does it provide a mechanism for
generating daily meteorological time series across multiple points that are
spatially autocorrelated. Concerning the former point, while GWGEN simulates
daily precipitation, temperature, cloud cover, and wind speed, it does not
currently handle other variables that might be important in land surface
modeling, such as humidity or wind direction. On the latter point, the lack
of explicit simulation of spatial autocorrelation may make GWGEN unsuitable
for certain applications, e.g., regional high-resolution hydrological
modeling in small catchments (< ∼2500 km$^2$), where having the capability
to simulate flood and other extremes is important. This is because the
weather generator could, e.g., simulate rainfall on different days in
different parts of the catchment, whereas in reality storm events would be
highly autocorrelated in space and controlled by mesoscale meteorological conditions.
Discussion and outlook
GWGEN successfully downscales monthly to daily meteorology for any point on
the globe, in any climate, in any season, and at any time in recent Earth
history and in the near future (e.g., next century). It extends the
original Richardson-type weather generators to simulate wind speed along with
precipitation, temperature, and cloud cover. The model requires only monthly
values of the meteorological variables to be downscaled and does not rely on
any other spatial information, e.g., whether or not the location is in the tropics.
In general, the results of our downscaled meteorology are excellent, with all
simulated variables showing both very high correlation and limited bias when
compared to observations. We improved the simulation of daily precipitation
amount by replacing the gamma distribution used in the original
Richardson-type weather generators with a hybrid gamma–GP distribution, which
results in the improved simulation of heavy precipitation events. The GP
distribution is based upon a globally fixed shape and location parameter,
which may be an oversimplification, but it is still roughly 10 times more accurate
for the upper quantiles than
traditional methods that used gamma alone. Our extensive sensitivity analysis
to determine the best coefficients for the shape and location parameters of
the GP distribution suggests that further improvements might come through
correlating the GP parameters to geographic region and/or seasonality
or by introducing a
dynamical location parameter . Finally, we
introduced a step to correct for systematic bias in the downscaling of
temperature and wind speed.
Despite the limitations noted above, GWGEN will be useful in a wide range of
applications, from global vegetation and crop modeling to large-scale
hydrologic analyses, to understanding animal behavior, to forecasting of
fire, insect outbreaks, and other ecosystem disturbances. GWGEN may even be
envisaged as a potential replacement for very large and cumbersome gridded
datasets of high-temporal-resolution meteorology such as CRUNCEP
, especially for models that use meteorological forcing
at a daily time step. The weather generator is particularly suited for
incorporation into models that run on a spatial grid; for example, GWGEN can
readily be incorporated into existing DGVMs such as LPJ-LMfire
or LPJ-ML
that already rely on a weather generator to provide daily meteorology for
certain processes.
While GWGEN does not handle spatial autocorrelation, in most DGVMs there is
no lateral connection between grid cells, and therefore an explicit
representation of spatial autocorrelation in the driving daily meteorological
data would have no effect on the model output. We further note that if the
monthly data used to drive the model are spatially autocorrelated (this
would be the case when using gridded climatology, for example) then the
result of the weather generator will also preserve this autocorrelation, at
least when the model results are analyzed on monthly or longer timescales.
The limitations present in this version of GWGEN could be addressed in future
versions. Methods for simultaneous multi-site weather generation exist
and could be adapted to GWGEN.
However, even simpler methods to approximate spatial autocorrelation could be
possible. Running GWGEN with gridded monthly meteorology (this is the
primary application we foresee for the current version of the model) means
that the input variables are already highly correlated in space, i.e., the
monthly climate in one grid cell generally closely resembles neighboring
cells outside of complex terrain containing sharp, monotonic climate
gradients, e.g., rain shadows. Thus, one simple way of achieving a measure of
spatial autocorrelation in GWGEN would be to impose a spatial autocorrelation
field on the sequence of random numbers used to impose stochastic noise in
the downscaling functions. If the random number sequence is similar between
grid cells, then, e.g., rain is likely to fall on the same day, given that the
transition probabilities will likely also be similar. Over moderate
distances (< 50 km), it might even be sufficient to use the same
random seed across all grid cells in a neighborhood. This would have the
effect of producing strongly autocorrelated daily meteorology in space, with
the only variations being imposed by the underlying input monthly climatology.
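The effect of a shared random seed can be illustrated with a toy occurrence model. The Python sketch below uses a first-order Markov chain for wet and dry days with made-up transition probabilities; it is not part of GWGEN v1.0 and only demonstrates the idea suggested above.

```python
import random


def simulate_wet_days(n_days, p_wet_after_dry, p_wet_after_wet, seed):
    """First-order Markov chain of wet (True) and dry (False) days."""
    rng = random.Random(seed)
    wet = False  # assumed: the sequence starts after a dry day
    sequence = []
    for _ in range(n_days):
        p_wet = p_wet_after_wet if wet else p_wet_after_dry
        wet = rng.random() < p_wet
        sequence.append(wet)
    return sequence


# Two neighboring grid cells with slightly different (hypothetical)
# transition probabilities but the same random seed.
cell_a = simulate_wet_days(365, p_wet_after_dry=0.20, p_wet_after_wet=0.60, seed=42)
cell_b = simulate_wet_days(365, p_wet_after_dry=0.25, p_wet_after_wet=0.65, seed=42)

# Share of days on which both cells agree on wet or dry.
agreement = sum(a == b for a, b in zip(cell_a, cell_b)) / len(cell_a)
```

Because both cells draw the same random numbers and cell B has the larger probabilities, every wet day in cell A is also wet in cell B; the sequences differ only where the transition probabilities differ.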
Furthermore, it would be straightforward to include additional meteorological
variables in the model framework, handling, e.g., humidity in the same way
that temperatures, cloud cover, and wind speed are disaggregated. Other
variables, such as pressure and wind direction, might be more difficult using
the basic GWGEN structure because of the importance of autocorrelation,
particularly at high spatial resolution, and might benefit from a different
approach towards weather generation. Finally, GWGEN only downscales
meteorology from monthly to daily values; for models that require an even
shorter time step, e.g., 6 hourly, some extension of the model functionality
would be required. For certain variables, e.g., temperatures, subdaily
downscaling could be easily implemented;
for other variables, such as precipitation, a large literature on downscaling
methods exists, and global
datasets of hourly meteorology for model calibration are available
(e.g., the Integrated Surface Database).
Conclusions
Compiling a global database of daily precipitation, temperature, cloud cover,
and wind speed measurements, we explored the relationship between daily
meteorology and monthly summaries first described in the context of weather
downscaling by . Our analysis of more than 50 million
individual records showed that daily to monthly relationships are relatively
stable in space and time, and constant across a very wide range of stations
from all latitudes and climate zones. With the resulting relationships, we
parameterized a WGEN/SIMMETEO-type weather generator, with the intention of
creating a generic scheme that could be applied anywhere over the Earth's
land surface for the past, present, and (near) future.
GWGEN is open-source software, and the code, utility
programs for parameterization, evaluation, and manipulation of the raw weather
station data, along with complete documentation, are available in
. The original weather station database can be made
available upon request to the authors or downloaded from
and . The weather
generator module is programmed in FORTRAN; the parameterization, evaluation,
and other supplementary tools are written in Python mainly using the
numerical python libraries NumPy and SciPy
, StatsModels
, as well as Matplotlib and
psyplot for the visualization. Detailed installation
instructions can be found in the user manual (https://arve-research.github.io/gwgen/).
Sensitivity analysis
Results of the sensitivity analysis for the (a) correlation
coefficient R2, (b) deviation from a slope of unity,
(c) the fraction of significantly different station years, and
(d) the mean of (a)–(c). For the plots
in panels (a) and (b), we used the means of the 25th, 50th, 75th,
90th, 95th, and 99th percentiles. In general, 1 (dark green) is best;
0 (white) is worst. The dark red fields indicate experiments that failed
because of a too-low threshold and too-high GP shape parameter. Note the
logarithmic scale on the y axis.
JOK conceived the model and analyses, wrote the prototype code, and performed
preliminary analyses; PS developed and documented the final version of the
code (including parameterization and evaluation), performed all of the final
analyses, and created the graphical output. Both authors contributed to the
writing of the paper.
The authors declare that they have no conflict of interest.
Acknowledgements
This work was supported by the European Research Council (COEVOLVE, 313797)
and the Swiss National Science Foundation (ACACIA, CR10I2_146314). We thank
Shawn Koppenhoefer for assistance compiling and querying the weather
databases and Alexis Berne and Grégoire Mariéthoz for helpful
suggestions on the analyses. We are grateful to NOAA NCDC and the University
of Washington for providing (free of charge) the GHCN-Daily and EECRA
databases, respectively.
Edited by: Chiel van Heerwaarden
Reviewed by: two anonymous referees
References
Bennett, J. C., Robertson, D. E., Ward, P. G., Hapuarachchi, H. P., and Wang, Q.: Calibrating hourly rainfall–runoff models with daily forcings for streamflow forecasting applications in meso-scale catchments, Environ. Model. Softw., 76, 20–36, 10.1016/j.envsoft.2015.11.006, 2016.
Bhatt, S., Gething, P. W., Brady, O. J., Messina, J. P., Farlow, A. W., Moyes, C. L., Drake, J. M., Brownstein, J. S., Hoen, A. G., Sankoh, O., Myers, M. F., George, D. B., Jaenisch, T., Wint, G. R. W., Simmons, C. P., Scott, T. W., Farrar, J. J., and Hay, S. I.: The global distribution and burden of dengue, Nature, 496, 504–507, 10.1038/nature12060, 2013.
Bondeau, A., Smith, P. C., Zaehle, S., Schaphoff, S., Lucht, W., Cramer, W., Gerten, D., Lotze-Campen, H., Müller, C., Reichstein, M., and Smith, B.: Modelling the role of agriculture for the 20th century global terrestrial carbon balance, Global Change Biol., 13, 679–706, 10.1111/j.1365-2486.2006.01305.x, 2007.
Cesaraccio, C., Spano, D., Duce, P., and Snyder, R. L.: An improved model for determining degree-day values from daily temperature data, Int. J. Biometeorol., 45, 161–169, 10.1007/s004840100104, 2001.
Dai, A.: Precipitation Characteristics in Eighteen Coupled Climate Models, J. Climate, 19, 4605–4630, 10.1175/jcli3884.1, 2006.
Elith, J., Graham, C. H., Anderson, R. P., Dudík, M., Ferrier, S., Guisan, A., Hijmans, R. J., Huettmann, F., Leathwick, J., Lehmann, A., Li, J., Lohmann, L. G., Loiselle, B. A., Manion, G., Moritz, C., Nakamura, M., Nakazawa, Y., McC. M. Overton, J., Townsend Peterson, A., J. Phillips, S., Richardson, K., Scachetti-Pereira, R., E. Schapire, R., Soberón, J., Williams, S., S. Wisz, M., and E. Zimmermann, N.: Novel methods improve prediction of species' distributions from occurrence data, Ecography, 29, 129–151, 10.1111/j.2006.0906-7590.04596.x, 2006.
Friend, A. D.: Parameterisation of a global daily weather generator for terrestrial ecosystem modelling, Ecol. Model., 109, 121–140, 10.1016/S0304-3800(98)00036-2, 1998.
Frigessi, A., Haug, O., and Rue, H.: A Dynamic Mixture Model for Unsupervised Tail Estimation without Threshold Selection, Extremes, 5, 219–235, 10.1023/A:1024072610684, 2002.
Furrer, E. M. and Katz, R. W.: Improving the simulation of extreme precipitation events by stochastic weather generators, Water Resour. Res., 44, n/a–n/a, 10.1029/2008wr007316, 2008.
Geng, S. and Auburn, J. S.: Weather simulation models based on summaries of long-term data, International Rice Research Institute, Los Baños, Philippines, 237–254, 1987.
Geng, S., Devries, F. W. T. P., and Supit, I.: A Simple Method for Generating Daily Rainfall Data, Agr. Forest Meteorol., 36, 363–376, 10.1016/0168-1923(86)90014-6, 1986.
Gerten, D., Schaphoff, S., Haberlandt, U., Lucht, W., and Sitch, S.: Terrestrial vegetation and water balance – hydrological evaluation of a dynamic global vegetation model, J. Hydrol., 286, 249–270, 10.1016/j.jhydrol.2003.09.029, 2004.
Gordon, H. A.: Errors in Computer Packages. Least Squares Regression Through the Origin, J. Roy. Stat. Soc. Ser. D, 30, 23–29, 1981.
Guenther, A., Hewitt, C. N., Erickson, D., Fall, R., Geron, C., Graedel, T., Harley, P., Klinger, L., Lerdau, M., Mckay, W. A., Pierce, T., Scholes, B., Steinbrecher, R., Tallamraju, R., Taylor, J., and Zimmerman, P.: A Global-Model of Natural Volatile Organic-Compound Emissions, J. Geophys. Res.-Atmos., 100, 8873–8892, 10.1029/94jd02950, 1995.
Hahn, C. and Warren, S.: Extended Edited Synoptic Cloud Reports from Ships and Land Stations Over the Globe, 1952–1996 (with Ship data updated through 2008), Carbon Dioxide Information Analysis Center, Oak Ridge, Tennessee, 10.3334/CDIAC/cli.ndp026c, 1999.
Harris, I., Jones, P. D., Osborn, T. J., and Lister, D. H.: Updated high-resolution grids of monthly climatic observations – the CRU TS3.10 Dataset, Int. J. Climatol., 34, 623–642, 10.1002/joc.3711, 2014.
Haxeltine, A. and Prentice, I. C.: BIOME3: An equilibrium terrestrial biosphere model based on ecophysiological constraints, resource availability, and competition among plant functional types, Global Biogeochem. Cy., 10, 693–709, 10.1029/96gb02344, 1996.
Haxeltine, A., Prentice, I. C., and Creswell, I. D.: A coupled carbon and water flux model to predict vegetation structure, J. Veg. Sci., 7, 651–666, 10.2307/3236377, 1996.
Hijmans, R. J., Cameron, S. E., Parra, J. L., Jones, P. G., and Jarvis, A.: Very high resolution interpolated climate surfaces for global land areas, Int. J. Climatol., 25, 1965–1978, 10.1002/joc.1276, 2005.
Hunter, J. D.: Matplotlib: A 2D Graphics Environment, Comput. Sci. Eng., 9, 90–95, 10.1109/MCSE.2007.55, 2007.
Jones, E., Oliphant, T., Peterson, P., et al.: SciPy: Open source scientific tools for Python, http://www.scipy.org/ (last access: 18 February 2017), 2001.
Kaplan, J. O., Bigelow, N. H., Prentice, I. C., Harrison, S. P., Bartlein, P. J., Christensen, T. R., Cramer, W., Matveyeva, N. V., McGuire, A. D., Murray, D. F., Razzhivin, V. Y., Smith, B., Walker, D. A., Anderson, P. M., Andreev, A. A., Brubaker, L. B., Edwards, M. E., and Lozhkin, A. V.: Climate change and Arctic ecosystems: 2. Modeling, paleodata-model comparisons, and future projections, J. Geophys. Res.-Atmos., 108, 8171, 10.1029/2002jd002559, 2003.
Krinner, G., Viovy, N., de Noblet-Ducoudré, N., Ogée, J., Polcher, J., Friedlingstein, P., Ciais, P., Sitch, S., and Prentice, I. C.: A dynamic global vegetation model for studies of the coupled atmosphere–biosphere system, Global Biogeochem. Cy., 19, GB1015, 10.1029/2003gb002199, 2005.
Kucharik, C. J., Foley, J. A., Delire, C., Fisher, V. A., Coe, M. T., Lenters, J. D., Young-Molling, C., Ramankutty, N., Norman, J. M., and Gower, S. T.: Testing the performance of a dynamic global ecosystem model: Water balance, carbon balance, and vegetation structure, Global Biogeochem. Cy., 14, 795–825, 10.1029/1999GB001138, 2000.
Lafon, T., Dadson, S., Buys, G., and Prudhomme, C.: Bias correction of daily precipitation simulated by a regional climate model: a comparison of methods, Int. J. Climatol., 33, 1367–1381, 10.1002/joc.3518, 2012.
Leemans, R. and Cramer, W. P.: The IIASA database for mean monthly values of temperature, precipitation, and cloudiness on a global terrestrial grid, International Institute for Applied Systems Analysis, Laxenburg, Austria, 1991.
Lieth, H.: Modeling the Primary Productivity of the World, Springer, Berlin, Heidelberg, 237–263, 10.1007/978-3-642-80913-2_12, 1975.
Maraun, D., Rust, H. W., and Osborn, T. J.: The annual cycle of heavy precipitation across the United Kingdom: a model based on extreme value statistics, Int. J. Climatol., 29, 1731–1744, 10.1002/joc.1811, 2009.
Matalas, N. C.: Mathematical assessment of synthetic hydrology, Water Resour. Res., 3, 937–945, 10.1029/WR003i004p00937, 1967.
Menne, M. J., Durre, I., Korzeniewski, B., McNeill, S., Thomas, K., Yin, X., Anthony, S., Ray, R., Vose, R. S., Gleason, B. E., and Houston, T. G.: Global Historical Climatology Network – Daily (GHCN-Daily), Version 3.22, NOAA National Climatic Data Center, 10.7289/V5D21VHZ, 2012a.
Menne, M. J., Durre, I., Vose, R. S., Gleason, B. E., and Houston, T. G.: An Overview of the Global Historical Climatology Network-Daily Database, J. Atmos. Ocean. Tech., 29, 897–910, 10.1175/jtech-d-11-00103.1, 2012b.
Mitchell, T. D. and Jones, P. D.: An improved method of constructing a database of monthly climate observations and associated high-resolution grids, Int. J. Climatol., 25, 693–712, 10.1002/joc.1181, 2005.
New, M., Hulme, M., and Jones, P.: Representing twentieth-century space–time climate variability. Part I: Development of a 1961–90 mean monthly terrestrial climatology, J. Climate, 12, 829–856, 10.1175/1520-0442(1999)012<0829:Rtcstc>2.0.Co;2, 1999.
New, M., Hulme, M., and Jones, P.: Representing twentieth-century space–time climate variability. Part II: Development of 1901–96 monthly grids of terrestrial surface climate, J. Climate, 13, 2217–2238, 10.1175/1520-0442(2000)013<2217:Rtcstc>2.0.Co;2, 2000.
New, M., Lister, D., Hulme, M., and Makin, I.: A high-resolution data set of surface climate over global land areas, Clim. Res., 21, 1–25, 10.3354/cr021001, 2002.
Neykov, N. M., Neytchev, P. N., and Zucchini, W.: Stochastic daily precipitation model with a heavy-tailed component, Nat. Hazards Earth Syst. Sci., 14, 2321–2335, 10.5194/nhess-14-2321-2014, 2014.
Parlange, M. B. and Katz, R. W.: An Extended Version of the Richardson Model for Simulating Daily Weather Variables, J. Appl. Meteorol., 39, 610–622, 10.1175/1520-0450-39.5.610, 2000.
Pfeiffer, M., Spessa, A., and Kaplan, J. O.: A model for global biomass burning in preindustrial time: LPJ-LMfire (v1.0), Geosci. Model Dev., 6, 643–685, 10.5194/gmd-6-643-2013, 2013.
Prentice, I.: Developing a Global Vegetation Dynamics Model: Results of an IIASA Summer Workshop, Iiasa research report, IIASA, Laxenburg, Austria, http://pure.iiasa.ac.at/3223/ (last access: 15 February 2017), 1989.
Prentice, I. C., Cramer, W., Harrison, S. P., Leemans, R., Monserud, R. A., and Solomon, A. M.: A Global Biome Model Based on Plant Physiology and Dominance, Soil Properties and Climate, J. Biogeogr., 19, 117–134, 10.2307/2845499, 1992.
Richardson, C. W.: Stochastic simulation of daily precipitation, temperature, and solar radiation, Water Resour. Res., 17, 182–190, 10.1029/WR017i001p00182, 1981.
Rust, H. W., Maraun, D., and Osborn, T. J.: Modelling seasonality in extreme precipitation, Eur. Phys. J. Spec. Top., 174, 99–111, 10.1140/epjst/e2009-01093-7, 2009.
Rymes, M. and Myers, D.: Mean preserving algorithm for smoothly interpolating averaged data, Solar Energy, 71, 225–231, 10.1016/s0038-092x(01)00052-4, 2001.
Seabold, S. and Perktold, J.: Statsmodels: Econometric and Statistical Modeling with Python, in: Proceedings of the 9th Python in Science Conference, edited by: van der Walt, S. and Millman, J., 57–61, 2010.
Sitch, S., Smith, B., Prentice, I. C., Arneth, A., Bondeau, A., Cramer, W., Kaplan, J. O., Levis, S., Lucht, W., Sykes, M. T., Thonicke, K., and Venevsky, S.: Evaluation of ecosystem dynamics, plant geography and terrestrial carbon cycling in the LPJ dynamic global vegetation model, Global Change Biol., 9, 161–185, 10.1046/j.1365-2486.2003.00569.x, 2003.
Smith, A., Lott, N., and Vose, R.: The Integrated Surface Database: Recent Developments and Partnerships, B. Am. Meteorol. Soc., 92, 704–708, 10.1175/2011BAMS3015.1, 2011.
Sommer, P. S.: The psyplot interactive visualization framework, J. Open Source Softw., 2, 10.21105/joss.00363, 2017.
Sommer, P. S. and Kaplan, J. O.: GWGEN v1.0.2: A global weather generator for daily data, 10.5281/zenodo.889213, 2017.
Stephens, G. L., L'Ecuyer, T., Forbes, R., Gettelman, A., Golaz, J.-C., Bodas-Salcedo, A., Suzuki, K., Gabriel, P., and Haynes, J.: Dreary state of precipitation in global models, J. Geophys. Res.-Atmos., 115, D24211, 10.1029/2010JD014532, 2010.
Sun, Y., Solomon, S., Dai, A., and Portmann, R. W.: How Often Does It Rain?, J. Climate, 19, 916–934, 10.1175/jcli3672.1, 2006.
Viovy, N. and Ciais, P.: A combined dataset for ecosystem modelling, available at: https://vesg.ipsl.upmc.fr/thredds/catalog/store/p529viov/cruncep/catalog.html (last access: 11 October 2017), 2016.
Walter, H. and Lieth, H.: Climate diagram world atlas, VEB Gustav Fischer Verlag, Jena, 1967.
Wei, Y., Liu, S., Huntzinger, D. N., Michalak, A. M., Viovy, N., Post, W. M., Schwalm, C. R., Schaefer, K., Jacobson, A. R., Lu, C., Tian, H., Ricciuto, D. M., Cook, R. B., Mao, J., and Shi, X.: The North American Carbon Program Multi-scale Synthesis and Terrestrial Model Intercomparison Project – Part 2: Environmental driver data, Geosci. Model Dev., 7, 2875–2893, 10.5194/gmd-7-2875-2014, 2014.
Wilks, D. S.: Multisite generalization of a daily stochastic precipitation generation model, J. Hydrol., 210, 178–191, 10.1016/S0022-1694(98)00186-3, 1998.
Wilks, D. S.: Interannual variability and extreme-value characteristics of several stochastic daily precipitation models, Agr. Forest Meteorol., 93, 153–169, 10.1016/S0168-1923(98)00125-7, 1999a.
Wilks, D. S.: Multisite downscaling of daily precipitation with a stochastic weather generator, Clim. Res., 11, 125–136, 10.3354/cr011125, 1999b.
Wilks, D. S.: Simultaneous stochastic simulation of daily precipitation, temperature and solar radiation at multiple sites in complex terrain, Agr. Forest Meteorol., 96, 85–101, 10.1016/S0168-1923(99)00037-4, 1999c.
Wilks, D. S.: Use of stochastic weather generators for precipitation downscaling, Wiley Interdisciplinary Reviews: Climate Change, 1, 898–907, 10.1002/wcc.85, 2010.
Wilks, D. S. and Wilby, R. L.: The weather generation game: a review of stochastic weather models, Prog. Phys. Geogr., 23, 329–357, 10.1177/030913339902300302, 1999.
Woodward, F. I., Smith, T. M., and Emanuel, W. R.: A global land primary productivity and phytogeography model, Global Biogeochem. Cy., 9, 471–490, 10.1029/95GB02432, 1995.
Woolhiser, D. A. and Pegram, G. G. S.: Maximum Likelihood Estimation of Fourier Coefficients to Describe Seasonal-Variations of Parameters in Stochastic Daily Precipitation Models, J. Appl. Meteorol., 18, 34–42, 10.1175/1520-0450(1979)018<0034:Mleofc>2.0.Co;2, 1979.
Woolhiser, D. A. and Roldan, J.: Stochastic Daily Precipitation Models: 2. A Comparison of Distributions of Amounts, Water Resour. Res., 18, 1461–1468, 10.1029/WR018i005p01461, 1982.
Woolhiser, D. A. and Roldán, J.: Seasonal and Regional Variability of Parameters for Stochastic Daily Precipitation Models: South Dakota, U.S.A, Water Resour. Res., 22, 965–978, 10.1029/WR022i006p00965, 1986.