GLEAM v3: satellite-based land evaporation and root-zone soil moisture

The Global Land Evaporation Amsterdam Model (GLEAM) is a set of algorithms dedicated to the estimation of terrestrial evaporation and root-zone soil moisture from satellite data. Ever since its development in 2011, the model has been regularly revised, aiming at the optimal incorporation of new satellite-observed geophysical variables, and improving the representation of physical processes. In this study, the next version of this model (v3) is presented. Key changes relative to the previous version include (1) a revised formulation of the evaporative stress, (2) an optimized drainage algorithm, and (3) a new soil moisture data assimilation system. GLEAM v3 is used to produce three new data sets of terrestrial evaporation and root-zone soil moisture, including a 36-year data set spanning 1980–2015, referred to as v3a (based on satellite-observed soil moisture, vegetation optical depth and snow-water equivalent, reanalysis air temperature and radiation, and a multi-source precipitation product), and two satellite-based data sets. The latter share most of their forcing, except for the vegetation optical depth and soil moisture, which are based on observations from different passive and active Cand L-band microwave sensors (European Space Agency Climate Change Initiative, ESA CCI) for the v3b data set (spanning 2003–2015) and observations from the Soil Moisture and Ocean Salinity (SMOS) satellite in the v3c data set (spanning 2011–2015). Here, these three data sets are described in detail, compared against analogous data sets generated using the previous version of GLEAM (v2), and validated against measurements from 91 eddycovariance towers and 2325 soil moisture sensors across a broad range of ecosystems. Results indicate that the quality of the v3 soil moisture is consistently better than the one from v2: average correlations against in situ surface soil moisture measurements increase from 0.61 to 0.64 in the case of the v3a data set and the representation of soil moisture in the second layer improves as well, with correlations increasing from 0.47 to 0.53. Similar improvements are observed for the v3b and c data sets. Despite regional differences, the quality of the evaporation fluxes remains overall similar to the one obtained using the previous version of GLEAM, with average correlations against eddy-covariance measurements ranging between 0.78 and 0.81 for the different data sets. These global data sets of terrestrial evaporation and root-zone soil moisture are now openly available at www.GLEAM.eu and may be used for large-scale hydrological applications, climate studies, or research on land–atmosphere feedbacks. Published by Copernicus Publications on behalf of the European Geosciences Union. 1904 B. Martens et al.: GLEAM v3


Introduction
Climate change alters the complex interplay between land and atmosphere, significantly impacting different processes in the global hydrological cycle (Huntington, 2006;Wild et al., 2008;Miralles et al., 2014b). Analysing these impacts requires long-term, observational, and consistent data sets of essential hydrological variables, such as soil moisture, precipitation, and terrestrial evaporation (or "evapotranspiration"). Unfortunately, the large-scale observation of terrestrial evaporation is hampered by the inability to sense this flux directly from satellites. Consequently, this crucial return flow of water from land into the atmosphere remains one of the most elusive and uncertain components of the global hydrological cycle (Dolman et al., 2014;Miralles et al., 2016b;Fisher et al., 2017).
However, the climate community is becoming increasingly aware of the crucial role that terrestrial evaporation plays in the Earth system, acting as a link in hydrological and biogeochemical cycles, and being a driver of air humidity, cloud formation, temperature, or precipitation Taylor et al., 2012;Miralles et al., 2012). Consequently, past decades have seen significant efforts to enhance our understanding of the global magnitude and variability of this flux. Some of these efforts have mainly concentrated on routinely measuring evaporation in the field (Wang and Dickinson (2012), and references therein), resulting in the increased availability of in situ observations from different climatic regions across the globe (Baldocchi et al., 2001;Jung et al., 2009). In addition, acknowledging the sparseness of current in situ networks and their inability to meet the spatiotemporal requirements for climatological studies, the potential of satellite remote sensing to enhance our understanding of terrestrial evaporation dynamics has been intensively explored. While in the near future, evaporation will remain undetectable from space, several models that combine remotely observable drivers of this flux (e.g. radiation, air temperature, soil moisture) have been developed and are being intensively used in recent years (Wang and Dickinson, 2012;McCabe et al., 2016).
Existing algorithms share the overarching objective of producing consistent, long-term, global data sets of terrestrial evaporation, but the methods and input data sets used in these models differ markedly (e.g. Mu et al., 2007;Fisher et al., 2008;Zhang et al., 2010;Miralles et al., 2011;Loew et al., 2016). Recently, McCabe et al. (2016), Michel et al. (2016), and Miralles et al. (2016a) evaluated the relative value of four of these evaporation models using standardized satellite and in situ based forcing data sets. As expected, results highlighted substantial differences in model performance, especially under conditions of water stress. In addition, these studies found pronounced deficiencies in the way evaporation is partitioned into its different components (i.e. transpiration, bare soil evaporation, open-water evaporation, interception loss, and sublimation). Miralles et al. (2016a) and Fisher et al. (2017) also highlighted the importance of advancing the physical representation of evaporation in these simple models, and the need to incorporate new technological advances in remote sensing science.
The Global Land Evaporation Amsterdam Model -GLEAM (Miralles et al., 2011) -is arguably the only one of these global evaporation models that is designed to be driven by remote sensing observations only. These observations are primarily derived from microwave sensors, including soil moisture and vegetation optical depth used in GLEAM to constrain the potential evaporation rates. Another key feature of the approach is the independent and detailed modelling of forest interception loss based on Gash's analytical model (Gash, 1979;Valente et al., 1997;Miralles et al., 2010b). Evaporation and root-zone soil moisture data sets from GLEAM have been widely used in the past to study spatial variability and trends in the water cycle (e.g. Jasechko et al., 2013;Greve et al., 2014;Miralles et al., 2014a;Zhang et al., 2016), as well as land-atmosphere feedbacks (e.g. Miralles et al., 2014b;Guillod et al., 2015). The first version (v1) of the model was developed by Miralles et al. (2011) and further refined (v2) by Miralles et al. (2014b); the present paper presents the third version (v3) of the methodology. In this new version, most components of GLEAM have been updated, except for the interception loss algorithm and the potential evaporation module. First, aiming at a more realistic representation of evaporative stress, observations of microwave VOD and root-zone soil moisture have been combined to represent the non-linear response of soil and vegetation to the drying of land (e.g. Colello et al., 1998;Serraj et al., 1999;Ronda et al., 2002;Combe et al., 2016). Second, the soil module has been adapted to represent the continuous drainage of precipitation through the vertical profile. Finally, the soil moisture data assimilation system -recently updated and validated for Australia  -has been optimized to work at the global scale and to integrate different data sets of satellite soil moisture observations. These changes have respected the minimalistic approach of GLEAM of targeting only the fundamental processes controlling large-scale evaporation rates while keeping the overall simplicity and observational nature of the model.
The main goal of this study is to present the new version of GLEAM and the resulting evaporation and root-zone soil moisture data sets, including a global validation using a large database of soil moisture measurements from 2325 in situ sensors, and evaporation measurements from 91 eddycovariance towers. In addition, the quality of these data sets is compared against analogous data sets generated using the former version of GLEAM, allowing us to evaluate the added value of the new formulations. The paper is organized as follows: Sect. 2 describes the new algorithms, highlighting the main changes upon the previous version. The forcing data and the in situ validation data are described in Sect. 3. Section 4 analyses the quality of the GLEAM data sets and dis-

Potential evaporation module
Priestley and Taylor equation driven by observed surface meteorology

Soil module
Multi-layer soil model driven by observed precipitation

Stress module
Semi-empirical relationship to root-zone soil moisture and vegetation optical depth  (2014) is used. The evaporative flux is calculated for each of these fractions separately and then aggregated to the scale of the pixel based on the fractional cover of each landcover type. First, the Priestley and Taylor (1972) equation is used to calculate the cover-dependent potential evaporation rate E p (mm day −1 ) based on air temperature and net radiation:

Rainfall interception model
where λ (MJ kg −1 ) is the latent heat of vaporization and (kPa K −1 ) is the slope of the saturated water vapourtemperature curve. Both variables can be estimated using empirical relationships with the air temperature (Henderson- Sellers, 1984;Maidment, 1993). ψ (kPa K −1 ) is the psychometric constant, α (-) is the Priestley and Taylor coefficient, R n (W m −2 ) is the net radiation, and G (W m −2 ) is the ground heat flux. G is calculated as a constant fraction of R n depending on the cover type (Miralles et al., 2011). For α, a value of 1.26 has been reported by Priestley and Taylor (1972) for well-watered grasslands, and has been used in numerous studies for a variety of ecosystems. However, empirical studies have highlighted the more conservative nature of tree stomata, often resulting in lower rates of potential evaporation in forested areas (Shuttleworth and Calder, 1979;Kelliher et al., 1993;Teuling et al., 2010). Therefore, the α for tall vegetation is defined following the findings by McNaughton and Black (1973), Shuttleworth et al. (1984), Viswanadham et al. (1991), Diawara et al. (1991), and Eaton et al. (2001), which report an average value of 0.97 (with a 0.08 standard deviation over the different studies) for various forests during unstressed and precipitation-free periods (i.e. no rainfall interception).
Estimates of E p are converted into actual transpiration or bare soil evaporation (depending on the land-cover type), using a cover-dependent, multiplicative stress factor S (-) ranging from 1 to 0. S is calculated as a function of microwave VOD and root-zone soil moisture (see Sect. 2.2.3). The latter is calculated using a multi-layer water-balance algorithm considering net precipitation (precipitation minus interception loss) and snowmelt as inputs, and evaporation and drainage as outputs (Miralles et al., 2011). The depth of the root zone is a function of the land-cover type and comprises three model layers for the fraction of tall vegetation (0-10, 10-100, and 100-250 cm), two for the fraction of low vegetation (0-10, 10-100 cm), and only one for the fraction of bare soil (0-10 cm). Forest rainfall interception loss is estimated independently using the analytical model introduced by Gash (1979) and further refined by Valente et al. (1997), forced with precipitation and considering both the characteristics of precipitation and vegetation (Miralles et al., 2010b). In the next section, we focus on the changes relative to the previous model version (Miralles et al., 2014b), and we refer to Miralles et al. (2011Miralles et al. ( , 2014b and  for more detailed descriptions of the model baseline.

Recent advances in GLEAM
2.2.1 Soil module Figure 2 shows a schematic of the conceptual root zone for the fraction of tall vegetation in a pixel. Each soil layer is subdivided into three different compartments. The first compartment (bottom) represents the water retained below the wilting point, w wp (m 3 m −3 ), and which is not available for root uptake; for the bare soil fraction, the residual soil moisture, w r (m 3 m −3 ), is used instead. The second compartment of the layer is bounded by w wp and the porosity of the soil matrix, w p (m 3 m −3 ), and represents the maximum volume of water available for evaporation. Finally, the third compartment represents the solid phase of the soil column and thus cannot hold any water. The soil properties used in GLEAM come from the database of Global Gridded Surfaces of Selected Soil Characteristics generated by the International Geosphere-Biosphere Programme Data and Information System (IGBP-DIS, Global Soil Data Task Group, 2000).
At every daily time step i, the state of any layer l is characterized by its water content w (l) i (m 3 m −3 ), which is updated using where w (l) i−1 is the volumetric soil moisture content of layer l at the previous time step (i − 1), F (l−1) s,i (mm day −1 ) is the volume of water slowly draining into the layer (slow draining flux), F (l−1) f,i (mm day −1 ) is the volume of water directly reaching the layer (fast draining flux), E (l) i−1 (mm day −1 ) is the evaporative flux from the previous day, F (l) s,i (mm day −1 ) is the slow drainage of water out of the reservoir, t is the temporal resolution (1 day), and z (l) (mm) is the depth of the soil layer. Note that for the first layer (l = 1), only F 0 f,i is considered as an input, as there is no draining layer on top.
In previous model versions, the entire volume of net precipitation (i.e. precipitation minus interception loss, plus snowmelt) was first stored in the top layer, which subsequently drained to field capacity into the next soil layer (Miralles et al., 2011(Miralles et al., , 2014b; the same process was used to calculate the vertical flow from the remaining layers. As a result, the soil moisture could not exceed field capacity, nor was drainage allowed to occur below that threshold. In GLEAM v3, net precipitation is first partitioned between the different soil layers based on the relative saturation at the beginning of the daily time step, in order to estimate the fast draining flux F (l) f,i . Next, the volume of water that slowly drains to the next layer (F (l) s,i ) is estimated using a simplified representation of Darcy's law, in which a fraction of the available water above the wilting point is drained to the next layer based on (1) the relative saturation of each layer and (2) the difference in soil moisture content between both layers.
The rationale behind this simple drainage algorithm is that the downward flux of water is expected to increase if (1) the relative soil moisture content is higher (physically resulting in increased hydraulic conductivities), and (2) the difference in soil moisture between source and sink is larger (resulting in higher differences in soil-water potential). This empirical drainage algorithm is preferred over wellknown alternatives such as the Richards equation (Richards, 1931), Brooks-Corey (Brooks and Corey, 1964) or Clapp-Hornberger (Clapp and Hornberger, 1978), due to its simplicity and the fact that it does not require the use of additional largely unconstrained ancillary data on soil properties at the global scale.

Data assimilation system
The original Kalman filter approach to assimilate microwave soil moisture observations -typically sensitive to the first few centimetres of the soil -into GLEAM was replaced in favour of a simple Newtonian nudging algorithm in v2 (Miralles et al., 2014b), which was recently further optimized . This Newtonian nudging scheme minimizes the computational demands and fits well within the rationale of GLEAM of keeping the model as simple and observation-driven as possible. While more complex algorithms like the ensemble Kalman filter have also been applied in GLEAM, the added value has been shown to be marginal (Miralles et al., 2014b). Therefore, in this new version, we adopt a similar approach to the Newtonian nudging scheme implemented by : where w (1)+ i is the a posteriori soil moisture state at the first model layer (i.e. after application of the data assimilation algorithm), w (1)− i is the a priori soil moisture state at the same layer (i.e. before assimilation of the observed soil moisture), K (-) is the nudging factor (a value of 1 is used to maximize the impact of the assimilation algorithm as in , γ (-) is the quality factor, andŵ o i (m 3 m −3 ) and w (1)− i (m 3 m −3 ) are the observed and modelled soil moisture anomalies, respectively. The latter represent deviations relative to the seasonal climatology of soil moisture -calculated as in De Lannoy and Reichle (2016) and Lievens et al. (2017) -as opposed to the absolute values of soil moisture assimilated by .
As most assimilation algorithms require bias-free observations in reference to the modelled states, a bias removal algorithm prior to (or during) the assimilation step has to be applied. However, no standard procedure exists to correct these constant or seasonally varying biases (Lievens et al., 2015;De Lannoy and Reichle, 2016); thus, the choice of the bias-removal algorithm remains to some degree subjective. As indicated by , the use of a classical CDF-matching approach prior to the assimilation step clearly introduced seasonal biases in the GLEAM soil moisture and evaporative fluxes. As a result, in GLEAM v3, soil moisture anomalies are assimilated instead, as this approach allows the correction of potential seasonal biases between the modelled and observed soil moisture states (De Lannoy and Reichle, 2016). As a triple collocation analysis (TCA, Scipal et al. (2008); Miralles et al. (2010a); Gruber et al. (2016)) is applied here to obtain the observation and model errors, the anomaly time series of the observations are scaled towards the modelled soil moisture anomalies using a linear regression model prior to the assimilation (Yilmaz and Crow, 2013). We note that for applying a TCA, a third independent data set of the same geophysical variable is required. For this purpose, soil moisture fields from the Noah model in the Global Land Data Assimilation System (GLDAS) (Rodell et al., 2004) are used. The three independent and rescaled anomaly time series of surface soil moisture are used in the TCA to estimate both the model and observation errors on a yearly basis. The latter two are then adopted to calculate the quality factor (γ ) as in : where σ (1) mod (m 3 m −3 ) and σ obs (m 3 m −3 ) are the standard deviations of the random model and observation errors, respectively.
Finally, in contrast to the assimilation of soil moisture observations in all model layers in GLEAM v2 (Miralles et al., 2014b;, only the first model layer is updated in the new version. The latter choice is motivated by the slower dynamics of the deeper model layers, which are strongly perturbed when soil moisture observations are directly assimilated into these layers using the simple Newtonian nudging scheme. The impact of the soil moisture update in this GLEAM v3 is thus propagated towards deeper layers by drainage processes only, which ensures a smooth transition of water through the vertical profile.

Stress module
Water availability, heat stress, or phenological constraints acting on evaporation are generally combined in a single empirical stress factor accounting for the decrease in potential evaporation (Sellers et al., 2007). In GLEAM, a multiplicative stress factor S ranging between 0 (maximum stress and thus no evaporation) and 1 (no stress and thus potential evaporation) is defined. In the first version (Miralles et al., 2011), S was parameterized separately for the fractions of tall and short vegetation using non-linear relationships between S and the soil moisture of the wettest layer. To account for changes in vegetation phenology of short vegetation, the VOD was also used in the calculation of S for this fraction. These functions were linearized in the second version, and the VOD was also introduced for the calculation of the stress for the fraction of tall vegetation (Miralles et al., 2014b). However, based on experimental evidence, a nonlinear response of S to soil moisture is expected for most vegetation types (e.g. Colello et al., 1998;Serraj et al., 1999;Ronda et al., 2002;Combe et al., 2016). As a consequence, a non-linear stress function for both tall and short vegetation is re-introduced in GLEAM v3: where VOD max (-) is the maximum VOD for a specific pixel, w c (m 3 m −3 ) is the critical soil moisture, and w (w) (m 3 m −3 ) is the soil moisture content of the wettest layer, assuming that plants withdraw water from the layer in which it is more easily accessible. As soil moisture decreases, S decreases (i.e. increased evaporative stress), since water becomes less easily available for the roots. As vegetation phenology is not explicitly accounted for, the VOD -closely linked to the vegetation water content (Liu et al., 2013) -is used to account for the effect of (seasonal or occasional) phenological constraints on evaporation (e.g. leaf-out, fires, pests), with decreasing VOD resulting in lower values for S and thus higher evaporative stress. As seen from Eq. (5) and Fig. 3, the stress function is thus defined by both the soil moisture content in the wettest soil layer and the VOD. If w (w) reaches w wp , Eq. (5) implies that the vegetation is incapable of retrieving water from the soil and S equals zero (and so does actual transpiration). On the other hand, for soil moisture values exceeding w c and the VOD reaching its maximum value, it is assumed that the vegetation is unstressed (i.e. S = 1; thus, transpiration equals potential transpiration). Figure 3 illustrates the shape of the stress function in Eq. (5) for a pixel dominated by a strong seasonality in VOD (left-hand side) and a site with a limited variability in VOD (right-hand side). For illustrative purposes only, it is assumed that the soil properties (w c and w wp ) are the same for both sites. As can be seen, where the range in VOD is low given the absence of a marked seasonality, S mainly depends on soil moisture. Conversely, if a large seasonality in the VOD is present (see the left-hand side figure), the VOD becomes more important for the calculation of S.
Finally, for the bare soil fraction, S is linearly related to the soil moisture state using the critical and residual soil moisture content as upper and lower boundary conditions, respectively: Since only the top layer is considered for the fraction of bare soil, S is fully driven by surface soil moisture (w (1) ).
3 Data 3.1 Input data sets Table 1 gives an overview of the selected forcing data sets in GLEAM v3. All input data sets have a daily resolution and are linearly re-sampled from their original spatial resolution to a common 0.25 • global grid. Given the aim to extract all valuable information on terrestrial evaporation from existing satellite records, forcing data sets are preferentially derived from satellite observations. However, since a key application of the GLEAM data sets is to analyse the impact of climate change on terrestrial hydrology, we also explore the use of alternative forcing data sets, such as reanalysis data, to yield multi-decadal records of terrestrial evaporation and root-zone soil moisture. Radiation inputs are based on measurements from the Clouds and Earth's Radiant Energy System (CERES) onboard Terra and Aqua (Wielicki, 1996), which are available globally from the year 2001 onwards on a 1 • regular grid. Additionally, radiation fluxes from the current reanalysis of the European Centre for Medium-Range Weather Forecasts (ECMWF), ERA-Interim (Dee et al., 2011), are also processed. ERA-Interim data are available globally from 1979 to present, with a temporal resolution of 3 h and a spatial resolution of approximately 0.75 • . For the precipitation forcing, the Tropical Rainfall Measurement Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) 3B42v7 product (Huffman et al., 2007) and the Multi-Source Weighted-Ensemble Precipitation (MSWEP) data set (Beck et al., 2016) are selected. The TMPA 3B42v7 data set combines measurements from several satellites and is bias corrected on a monthly timescale using ground-based measurements of precipitation. The product is available for the period 1998-2015 and covers 50 • N-50 • S based on a 0.25 • grid. MSWEP on the other hand is based on a merger of selected satellite-, reanalysis-, and gauge-based products, and is available from 1979 until 2015 at a 0.25 • spatial resolution. Air temperatures are derived from measurements of the Atmospheric Infrared Sounder (AIRS, Aumann et al., 2003), which are available from 2003 onwards on a global 1 • regular grid. Air temperature estimates from ERA-Interim are also selected for the long-term GLEAM data set. As for the radiation, data are available globally from 1979 until near present at 3-hourly intervals. To estimate sublimation, observations of snow-water equivalent from the European Space Agency (ESA) GLOBSNOW product (Luojus et al., 2013) are used. This data set is mainly based on retrievals from the Scanning Multichannel Microwave Radiometer (SMMR), Special Sensor Microwave/Imager (SSM/I), and Advanced Microwave Scanning Radiometer (AMSR-E), and is available from 1980 onwards. The GLOBSNOW product only covers the Northern Hemisphere and is therefore merged with the National Snow and Ice Data Center (NSIDC) monthly snowwater equivalent climatology product (Armstrong et al., 2005) for the Southern Hemisphere. The latter is also based on measurements from SSMR and SSM/I. As discussed in Sect. 2.2.3, the phenological controls on transpiration are derived from observations of microwave VOD. Here, the 0.25 • product from Liu et al. (2011) is used, which is based on retrievals from several passive microwave sensors using the Land Parameter Retrieval Model (LPRM, Owe et al. (2008)). The product is available at the global scale and spans the period 1980-2012; in order to cover the period 1980-2015, it is merged with LPRM-based VOD retrievals from SMOS (van der Schalie et al., 2015, 2016) using a similar CDFmatching approach to the one used by Liu et al. (2011). The resulting data set contains gaps due to the repeating cycle of the satellites, the requirement of non-frozen conditions for parameter retrieval, and the presence of radio frequency interference (RFI). In order to obtain smooth and continuous time series, the VOD data set is gap-filled using a moving average filter with a 7-day window. Remaining gaps, generally occurring in winter time due to freezing temperatures and snow covers, are linearly interpolated between the last and next available retrieval. We note however that in periods for which the land is covered by snow, the VOD is not used as the entire evaporation flux is assumed to be sublimation. Finally, if any gaps remain, these are filled using nearest neighbour interpolation. It should be noted here that microwave sensors operating at different frequencies might be sensitive to diverse components of vegetation, varying at different timescales (e.g. Guglielmetti et al., 2007;Liu et al., 2011). Despite the CDF-matching, which corrects for differences in long-term statistics, the use of different microwave sensors might impact the temporal dynamics of the VOD data set used here. Finally, for the assimilation of microwave surface soil moisture, the SMOS Level 3 soil moisture product (Jacquette et al., 2010) and the ESA Climate Change Initiative soil moisture (ESA CCI SM v2.3) data set (Liu et al., 2012;Wagner et al., 2012) are selected. The latter is a blended product of soil moisture retrievals from several active and passive microwave sensors, available for the period 1978-2015 at the global scale. In addition, surface soil moisture fields from the Noah model in GLDAS (Rodell et al., 2004) are used as a third independent data set in the TCA (see Sect. 2.2.2). Despite fundamental differences between GLEAM and Noah, some degree of dependency between their soil moisture estimates might be present due to the presence of common precipitation observations embedded within MSWEP, TMPA 3B42, and the forcing of GLDAS Noah. However, the merging schemes used to produce the precipitation data sets are ultimately different (Rodell et al., 2004;Huffman et al., 2007;Beck et al., 2016). Such a dependency could penalize the satellite-based soil moisture in the TCA (Yilmaz and Crow, 2014), which would result in a lower Table 1. List of the selected forcing data sets together with their references, the original spatial resolution, and period of availability. The first column indicates the use of these data in GLEAM.  Mach et al. (2007) quality factor γ (see Eq. 4) applied in the data assimilation system and, subsequently, in a more conservative soil moisture update.
As discussed in Sect. 2, GLEAM also requires several static data sets describing the soil properties, land cover, and average rainfall climatology. For the land cover fractions, the MODIS Global Vegetation Continuous Fields product (MOD44B) is selected (Hansen et al., 2005). The highresolution product at 250 m is up-scaled to the required grid size of 0.25 • (note that in previous model versions, the low-resolution 0.25 • product produced by the MODIS team was used instead). Note that for the fraction of open water, the product produced by Tuanmu and Jetz (2014) is combined with the MODIS-based product. Soil properties such as wilting point, soil porosity, field capacity, and critical soil moisture are derived from the database of Global Gridded Surfaces of Selected Soil Characteristics, IGBP-DIS (Global Soil Data Task Group, 2000). Finally, as in Miralles et al. (2010b), a monthly rainfall intensity climatology is inferred from the Combined Global Lightning Flash Rate Density monthly product (Mach et al., 2007) produced by the National Aeronautics and Space Administration (NASA) agency.
Using various combinations of the forcing data, three different data sets of terrestrial evaporation and root-zone soil moisture are produced using GLEAM v3 (see also Table 1). The inputs of snow-water equivalent, the third independent data set used in the TCA, and the static fields are shared by all data sets. The first GLEAM data set (hereafter referred to as v3a) is a 36-year data set  covering the entire globe and is based on satellite-observed soil moisture, vegetation optical depth and snow-water equivalent, reanalysis air temperature and radiation, and the MSWEP data set for precipitation. Given the multi-decadal coverage of this data set, it is intended to foster climatological research. The remain-ing data sets (v3b and v3c) are driven by satellite-based data only, and span a shorter period. In addition, these data sets only cover 50 • N-50 • S due to the use of the TMPA 3B42v7 product. The differences between both satellite-based data sets are the VOD and soil moisture forcing, which are retrieved from SMOS only in the v3c data set, and from multiple active and passive microwave sensors in the v3b data set. This also implies different record lengths of 13 (2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015) and 5 years (2011)(2012)(2013)(2014)(2015) for the v3b and v3c data sets, respectively.

Validation data sets
For validation purposes, in situ soil moisture and evaporation measurements from different global networks are processed. Soil moisture measurements are sourced from the database of the International Soil Moisture Network (ISMN, Dorigo et al. (2011Dorigo et al. ( , 2013), whereas the FLUXNET 2015 synthesis data set (http://fluxnet.fluxdata.org/) is used to obtain the in situ measurements of evaporation (see Table A1 for an overview of the selected sites). Note that several studies have already highlighted the lack of closure in the energy balance at eddy-covariance sites and a consequential tendency to underestimate the latent heat flux (Wilson et al., 2002). All available measurements for 1980-2015 are considered for inclusion in the validation set. Measurements are masked using the quality flags provided in the corresponding data set archives and aggregated from their native temporal resolution (generally 30 min or 1 h) to the required daily scale. For the evaporation data sets, only days with less than 25 % missing data are processed. As in , the resulting daily time series are screened for extreme outliers and repetitive recorded values. Soil moisture measurements are subsequently masked for snow and air temperatures below 0 • C using the snow-water equivalent from GLOBSNOW and the Table 2. Average validation statistics for the different soil moisture data sets (v3a, v3b, and v3c) and for the first two model layers (w (1) and w (2) ) against in situ measurements: ubRMSD is the unbiased root mean square difference, R is the correlation, and N is the number of sites included in the sample. The first part of the table reports the averaged statistics over all available sites and the entire study period, while the second part shows the same statistics for a common sample of sites, and an overlapping study period (2011)(2012)(2013)(2014)(2015) for the three data sets. The same statistics for the data sets produced using GLEAM v2 are reported between brackets.

Data set Layer
Complete record Overlap period air temperature data sets, respectively (see Table 1). As eddycovariance measurements are generally less reliable during precipitation, rainy intervals are masked from the data sets of in situ evaporation. Finally, only sites with at least 365 daily measurements after masking are included in the validation data set. This yields a total of 91 quality-checked eddycovariance sites (see Table A1) and a total of 2325 soil moisture sensors covering various ecosystems across the globe. Note that the soil moisture sensors are installed at different depths below the soil surface and used to validate both the first (0-10 cm, 1119 sensors) and second (10-100 cm, 1216 sensors) model layers, depending on the installation depth. Sensors located in the same GLEAM grid cell (horizontally or vertically) are not combined, but are treated separately in the validation to avoid problems with potential artifacts resulting from merging sensors with different absolute values and gaps in their record. For the location of the in situ sites selected in this study, we refer to Figs. 4 and 7 (see Sect. 4).

Results and discussion
4.1 Validation of soil moisture 4.1.1 Accuracy of soil moisture estimates Table 2 summarizes the average correlation (R) and unbiased root mean square difference (ubRMSD) for the v3 soil moisture data sets against the in situ measurements. Validation statistics are calculated using all available in situ measurements within the spatio-temporal domain of each of the data sets, as well as using a common set of soil moisture sites and an overlapping period for the three data sets (i.e. 2011-2015). Statistics for analogous data sets obtained using GLEAM v2 (same input data, except for the MODIS land cover fractions) are shown between brackets for comparison. Differences in correlations for the three products are statistically tested for significance using a Student's t-test (at the 10 % level), af-ter applying a Fisher Z transformation to the time series. The autocorrelation of the daily time series was taken into account by reducing the degrees of freedom using an effective sampling size (De Lannoy and Reichle, 2016; Lievens et al., 2017). Note that the first year of each of the data sets is not taken into account for this validation exercise to avoid the effects of model initialization on validation statistics. As indicated by the statistics in Table 2, all data sets compare reasonably well against the in situ measured soil moisture, with correlations for the first model layer (w (1) ) of 0.64, 0.61, and 0.63 for the v3a, v3b, and v3c data sets, respectively. For the second model layer (w (2) ), correlations are lower (ranging from 0.49 to 0.53), which can be expected given the lower representativeness of a single in situ measurement over the thicker model layer (10-100 cm). The ubRMSD yields mean values of approximately 0.06 and 0.05 m 3 m −3 for the first and second model layers, respectively. Overall, the validation statistics shown in Table 2 point to a higher quality of the v3a soil moisture data set compared to v3b and v3c. This is also confirmed by the statistics obtained for the common validation period: for both model layers and in terms of correlations, the v3a soil moisture is superior, with correlation coefficients for w (1) being significantly higher at approximately 20 % of the sites (the opposite is true at 2 % of the sites only). Due to the high autocorrelation in the second layer's soil moisture -strongly reducing the degrees of freedom in the statistical test -correlations for the v3a data set are only higher at approximately 3 % of the individual sites. Permutations of the precipitation forcing amongst the different data sets indicate that the higher quality of the soil moisture in v3a is primarily due to the precipitation forcing used in each data set (not shown), and suggests an overall high accuracy of the MSWEP data as indicated by Beck et al. (2016). We note, however, that more than 75 % of the soil moisture probes are located in the CONUS (Continental United States), where gauge-based precipitation data sets are known to over-perform satellite-based products  Table 3. Average anomaly correlations for different soil moisture data sets (v3a and v3b) and for the first two model layers (w (1) and w (2) ) against in situ measurements: R an is the anomaly correlation and N is the number of sites included in the sample. The first part of the table reports the averaged statistics over all available sites and the entire study period, while the second part shows the same statistics for a common sample of sites, and an overlapping study period (2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015) for the two data sets. The same statistics for the data sets produced using GLEAM v2 are reported between brackets.

Data set Layer
Complete record Overlap period  (Beck et al., 2016). These findings should thus not be extrapolated to other regions. Finally, the difference in quality between v3b and v3c is relatively small, with slightly better statistics for the v3c data set, which integrates SMOS data. For comparison, Table 2 also reports the validation statistics for the same data sets obtained using GLEAM v2. Both the ubRMSD and the correlations suggest that the v3 soil moisture data sets have a higher quality: at 26 , 19 and 12 % of the in situ soil moisture sites, the correlations significantly improve in the case of the v3a, v3b, and v3c data sets, respectively (statistically significant deterioration only occurs at a small number of sites). Although for the second layer significant improvements of R only occur at around 8 % of the individual sites and for all three data sets, overall differences in R are more pronounced for this layer, mainly as a result of both the improved drainage formulation and the optimized data assimilation algorithm. Figure 4 shows maps of the difference in R against in situ measured surface soil moisture for the v3 and v2 data sets. Since most in situ sites are located in the CONUS domain, a detailed view of the results over this area is also presented. As illustrated in these maps, the quality of the soil moisture data sets improves in most regions and for the majority of sites (blue colour). It could be argued that in the Great Plains, the performance of GLEAM v3 is lower than for v2, yet only a limited number of sites are available in this area.
Finally, to better evaluate the skill of GLEAM v3 in capturing the effect of specific precipitation events on the estimated soil moisture -without the influence of the seasonal cyclecorrelations between the anomaly time series of GLEAM soil moisture and the anomaly time series of in situ soil moisture are also calculated (R an in Table 3). Note that to calculate a robust climatology, only in situ sites with at least 5 years of data were used, resulting in a lower subset of stations and no anomaly based evaluation for the v3c data set. As expected, correlations decay after the removal of the seasonal cycle, but remain in the range of 0.48-0.54 for the first layer and 0.43-0.49 for the second layer. In addition, results shown in Table 3 confirm the high accuracy of the v3a soil moisture as compared to the v3b data set (significantly better correlations are obtained at 55 % of the individual sites), and the higher performance of GLEAM v3 over the previous version (v2), with the number of significantly better correlations being similar to those for the regular correlations.

Impact of the data assimilation system
The left-hand side panel in Fig. 5 shows the differences in the correlations against the in situ measurements when data assimilation of satellite soil moisture is included in GLEAM v3 versus when it is not (i.e. model open loop). As more than 75 % of soil moisture validation sites are located in the CONUS, only the results for this region are shown here. For the v3a data set, the assimilation of the CCI soil moisture has a rather neutral to negative impact on the modelled soil moisture states of the first model layer. Generally, correlations are decreasing (red colour) after assimilation in very dry (e.g. western coast of the CONUS) and forested regions (e.g. eastern coast of the CONUS). This decrease is statistically significant at about 10 % of the in situ soil moisture sites, while a neutral effect is obtained at the majority of the sites (89 %). In regions of limited topography and dominated by sparse vegetation (e.g. Great Plains), the quality of the modelled soil moisture is slightly improving (blue colour). For the v3b and v3c data sets, the assimilation of satellitederived soil moisture (ESA CCI v2.3 in v3b and SMOS L3 in v3c) has -in general -a more pronounced and positive impact (blue colour) on the modelled soil moisture, especially in areas such as the Great Plains. The latter can be expected given the higher quality of microwave soil moisture retrievals in regions with low vegetation cover (Dorigo et al., 2015).
The negative impact of assimilating satellite observations of surface soil moisture in the v3a data set is partly ex- such as the eastern and western coasts of the CONUS (red colour), and similar patterns are obtained for the ubRMSD (not shown). In those regions, the assimilation of satellite soil moisture may decrease model performance, especially in the case of the v3a, where differences are more pronounced (see the correspondence to the left panel in Fig. 5). For the v3b and v3c data sets, the difference in quality between satellite soil moisture and open loop is lower (the model open loop is significantly better at 55 and 25 % of the in situ sites for the v3b and v3c data sets, respectively). Moreover, the difference in correlations even becomes positive in regions with low vegetation (see the central panels in Fig. 5), pointing to the higher quality of the satellite-based soil moisture observations as compared to the model estimates in those areas (e.g. the Great Plains). These maps point again to the above-mentioned lower quality of the v3b and v3c precipitation forcing in those regions (TMPA 3B42v7 compared to MSWEP). The subtle differences between the validation results for the v3b and v3c data sets relate to the different quality of the CCI and SMOS soil moisture observations, respectively, bearing in mind the different study period and the number of in situ stations used in their validation. Analogous results for anomaly time series are summarized in Fig. 6 and point to the same conclusions as drawn from Fig. 5.
Finally, it may be argued that differences in quality between the satellite-derived and modelled soil moisture should be reflected in the TCA-based quality factor (γ ) used in the data assimilation algorithm (see Sect. 2.2.2). As outlined in Sect. 2.2.2, the quality factor used in the Newtonian nudging algorithm is estimated on a yearly basis by applying a TCA to the soil moisture anomalies of three independent data sets. Based on Eq. 4 it can be seen that values of γ below (above) 0.5 point to a lower (higher) model error relative to the observation error. The multi-year average quality factor for each of the three data sets is shown in the right-hand side panel in Fig. 5. Spatial patterns in these maps agree well with the ones observed in the central maps, reflecting the ability of the TCA to capture the relative errors of modelled and observed surface soil moisture. Nonetheless, despite the overall low quality factors for the v3a data set (i.e. γ rarely exceeds 0.3) -which reflects the higher error of the observed soil moisture relative to the model open loop -a decrease in quality is often observed when this soil moisture data set is assimilated into GLEAM v3a (see the above discussion). As expected, the quality factors for the v3b and v3c data sets are higher and exceed 0.5 in some low-vegetated regions, indicating again the higher quality of the satellite-based soil moisture observations as compared to the model open loop in these areas. Nevertheless, our simple Newtonian nudging data assimilation system is still assumed to correct for random forcing errors, and other potential effects such as irrigation, that are not explicitly modelled in GLEAM. Table 4 lists the validation statistics for the different evaporation data sets. In contrast to the results of the soil moisture validation exercise (Tables 2 and 3), differences between the three data sets are less pronounced. For the overlapping period 2011-2015 and the common sample of sites, an average correlation of approximately 0.78 and a similar ubRMSD of 0.71 mm day −1 are obtained for all three data sets. In addition, differences are only significant at the 10 % significance level at two in situ stations. Analogous statistical inferences for the validation of GLEAM v2 are shown between brackets and differ only slightly from the ones calculated for the data sets obtained using the new model version. At the majority of sites, no statistically significant difference in R is obtained. Figure 7 shows maps of the differences in correlation against the in situ measurements for the v3 and v2 data sets. Given the low number of in situ sites, no clear conclusions on geographical patterns can be drawn. Over continental Australia, GLEAM v3 performs generally better, except for the v3c data set, where for some sites a deterioration of the results is shown when compared to the corresponding v2c. However, as the validation database for the latter contains a significantly lower number of measurements, due to the shorter time period, it may be less representative of the overall quality of the data set. Correlations for the anomaly time series are listed in Table 5 and confirm the above conclusions.

Accuracy of evaporation estimates
As an example, Fig. 8 shows time series of GLEAM and in situ measured evaporation for two validation sites, i.e. US-Ne3 (Great Plains; see Table A1) on the left-hand side and AU-ASM (central Australia; see Table A1) on the right-hand side. While for the first site the performance of GLEAM v3 tends to be lower, statistics are improving for the second site with respect to the previous version of the model. Overall, time series show a good correspondence between model and in situ measurements. For US-Ne3, correlations drop from 0.82 (v2a) and 0.83 (v2b) to 0.78 (v3a) and 0.77 (v3b); on the other hand, for the SMOS-based data sets (v2c and v3c), correlations increase from 0.73 to 0.76. Analogous differences are obtained in terms of ubRMSD. Despite the apparent decrease in quality for the v3 data sets, the time series shown for US-Ne3 illustrate that the estimates of evaporation are realistic and have no systematic errors. For AU-ASM, correlations consistently improve for all three data sets from 0.84 (v2a), 0.84 (v2b), and 0.78 (v2c) to 0.88 (v3a), 0.88 (v3b), and 0.84 (v3c), and similar improvements are obtained for the ubRMSD. Time series on the right-hand side of Fig. 8 show that the better results are mainly explained by the improved estimates of the evaporative flux during the dry season. For these periods, GLEAM v3 estimates lower volumes of evaporation, resulting in a closer agreement with the in situ measurements, although most of these differ-  (OL, in situ)). Maps in the central panel show the difference in anomaly correlations against in situ measurements for the satellite-derived soil moisture data sets and the GLEAM v3 soil moisture data sets without data assimilation (R an (SAT, in situ) − R an (OL, in situ)). Maps on the right show the quality factor γ calculated in the data assimilation system. The latter balances both the model and observation errors, with values above (below) 0.5 indicating a lower (higher) error in the observations relative to GLEAM. Table 4. Average validation statistics for the different evaporation data sets (v3a, v3b, and v3c) against in situ measurements: ubRMSD is the unbiased root mean square difference, R is the correlation, and N is the number of sites included in the sample. The first part of the table reports the averaged statistics over all available sites and the entire study period, while the second part shows the same statistics for a common sample of sites, and an overlapping study period (2011)(2012)(2013)(2014)(2015) for the three data sets. The same statistics for the data sets produced using GLEAM v2 are reported between brackets.

Data set
Complete record Overlap period ences are not statistically significant. This is mainly related to the new drainage formulation, which allows a faster dry-out during precipitation-free periods, leading to an increase in evaporative stress. Additionally, the new drainage algorithm also yields less extreme evaporation peaks after precipitation events, since the faster drainage implies that the soil profile requires stronger precipitation events to saturate. Results for AU-ASM indicate that these evaporation patterns are realistic under conditions of water stress, yet caution may be taken when extrapolating these findings to other climatic and ecological regimes.

Global magnitude and variability of terrestrial evaporation
The top row in Fig. 9 presents the mean annual evaporation from v3a (left) and a difference map with v2a (right). Analogous results are obtained for the v3b and v3c data sets, but are excluded for simplicity. As expected, the general climatic patterns of evaporation appear realistic, and are comparable to those reported by Miralles et al. (2016a) Figure 7. Difference in quality between the v3 and v2 data sets of terrestrial evaporation (R(GLEAM v3, in situ) − R(GLEAM v2, in situ)). Colours relate to the difference in correlations against in situ measurements for the v3 and v2 evaporation data sets. Statistics are calculated based on all available sites reporting measurements falling within the spatio-temporal domain of the different data sets.
(2016), based on a range of models and different forcing data. Differences in the annual totals between v3a and v2a amount to 100 mm yr −1 in several regions, with overall less evaporation in areas covered by short vegetation and more evaporation in deserts and tropical regions. The total continental evaporation (excluding inland water bodies) amounts to 66 × 10 3 km 3 (v3a) versus the 68 × 10 3 km 3 from the previous version (v2a); these numbers agree well with previously reported values from a range of independent sources -see Miralles et al. (2016a), and references therein.
The remaining maps in Fig. 9 show the partitioning of GLEAM evaporation into its different components, i.e. forest interception loss, transpiration, and bare soil evaporation. Note that for illustrative purposes only and to ease comparison to previous literature (Miralles et al., 2016a), the estimated sublimation is added to the bare soil flux and the evaporation from inland waters (open-water evaporation) is not considered here. Averaged over the entire land surface, approximately 74 % of the total flux of water from land into the atmosphere comes from transpiration, 15 % comes from bare soil evaporation, and about 11 % is the result of interception loss; for the v2a data set, 80, 8, and 12 % are obtained, respectively. These discrepancies are also evident in the difference maps shown in the right-hand side panel in Fig. 9. It can be seen that almost across the entire globe the bare soil evaporation is higher in the v3a data set; only for some drier regions such as the Namibian desert, central Australia, and parts of Chile is the bare soil evaporation decreased. In contrast, transpiration typically increases in these areas. As shown, the total flux of interception loss is generally lower in the new version, except for some parts of Amazonia, eastern China, and the CONUS, where a clear increase may be observed. All these differences are the result of the modified stress functions, but -more importantly -of the new (highresolution) land cover fractions used in GLEAM v3 which report an overall larger fraction of bare soils over the continents. The higher contribution of bare soil evaporation and the lower volumes of transpiration, especially in semi-arid regions like the Sahel, result in closer agreement with the partitioning obtained from other data sets (Wang et al., 2014;Schlesinger and Jasechko, 2014;Miralles et al., 2016a;Good et al., 2015). Nonetheless, Miralles et al. (2016a) recently raised awareness about the use of satellite-based evaporation algorithms to assess the contribution from different evaporation components, and suggested avoiding the use of any single model in isolation due to the large differences found in inter-model comparisons.

Conclusions
The available range of satellite-observable geophysical variables that relate to the process of evaporation -such as soil moisture, air temperature, and net radiation -is continuously growing and the quality of these data sets is constantly improving. As a result, models aiming at the accurate estimation of terrestrial evaporation from satellite observations need to be updated to optimally incorporate these new data. Concurrently, as our knowledge of the relevant physical processes advances based on new experimental evidence, these simple retrieval models should aim to increase their realism. With the overarching goal of improving our understanding of continental evaporation, a next version of the Global Land Evaporation Amsterdam Model (GLEAM v3) -a set of algorithms dedicated to the estimation of global terrestrial evaporation from satellite data -is presented in this paper. Three major modifications are included: (1) a revised representation of the evaporative stress, (2) an optimized water-balance module, and (3) a new soil moisture data assimilation strategy. Using GLEAM v3, three novel data sets of root-zone soil moisture and terrestrial evapo- ration are presented. The first data set (v3a) spans the 36year period 1980-2015, has a global coverage, and is produced using satellite-observed soil moisture, vegetation optical depth and snow-water equivalent, reanalysis air temperature and radiation, and a multi-source precipitation product. The remaining two data sets (v3b and v3c) are produced using satellite-based forcing only, with their difference being the use of SMOS-based VOD and soil moisture (v3c), as opposed to the corresponding CCI forcing (v3b). Both data sets are quasi-global (50 • N-50 • S) and span 2003-2015 for v3b and 2011-2015 for v3c. Results based on the validation of these three data sets against an extensive set of in situ measured evaporation and soil moisture point to a slightly higher quality of the v3a soil moisture data set as compared to the other two data sets, while the quality of the modelled evaporation is rather similar across all three. The higher accuracy of the v3a soil moisture is explained by the high quality of the MSWEP precipitation forcing over the regions where soil moisture probes are located, compared to the satellite-based forcing in the v3b and v3c data sets. Results, however, might be biased given that the vast majority (i.e. more than 75 %) of the in situ soil moisture sites are located in the CONUS, where gauge-based precipitation products are known to outperform satellite prod-ucts (Beck et al., 2016). Finally, the quality of the new v3 data sets is also compared to analogous data sets obtained using GLEAM v2. For the soil moisture, the modifications in GLEAM result in a consistent improvement across the vertical profile. These improvements mainly relate to the optimized drainage algorithm and the new data assimilation system, which allow a more realistic representation of the downward flux of water through the soil profile. On the other hand, the increased quality of the evaporation data is not revealed unambiguously by the in situ validation, likely hampered by the low availability of validation sites. It is illustrated that, on average, the performance of GLEAM v3 is comparable to that of the former version. The partitioning of terrestrial evaporation into its different components shows an increase in bare soil evaporation almost in every continental region, while interception loss generally decreases, and transpiration increases for some dry regions such as the Namibian desert and central Australia. These results are related to the static data set describing the land cover fractions per pixel, which is also updated in GLEAM v3.
Based on the results in this study, it can be concluded that the modifications in GLEAM have led to a more realistic representation of physical processes and an overall increased quality of the data sets, particularly in the case of the root- (e) (f) (g) (h) Figure 9. Global maps of terrestrial evaporation (top row) and the partitioning into its different components, i.e. forest interception loss (second row), transpiration (third row), and bare soil evaporation (bottom row) for the v3a data set. On top, the multi-annual total flux of evaporation for the v3a data set (left) and the difference with the v2a data set (right) are shown. The other maps show the percentage of the total flux in the v3a data set coming from the different components (left) and the difference with the same maps for the v2a data set (right).
zone soil moisture. Following the advances in satellite technology and the increased availability of new data, GLEAM will be further optimized in coming years. Current activities concentrate on the incorporation of new constraints on evaporation, the application of GLEAM to higher resolutions and near-real time, and the improved partitioning of evaporation into its different components. Meanwhile, the data sets of terrestrial evaporation and root-zone soil moisture presented in this study have been made available for studies of hydrological cycle dynamics and climate model benchmarking using www.GLEAM.eu as a gateway.
Code and data availability. The model code of GLEAM v3 is available upon request from the corresponding author. Data sets described in this paper can be freely accessed from www.GLEAM.eu.