Improving predictive power of physically based rainfall-induced shallow landslide models: a probabilistic approach

Distributed models to forecast the spatial and temporal occurrence of rainfall-induced shallow landslides are based on deterministic laws. These models extend spatially the static stability models adopted in geotechnical engineering, and adopt an infinite-slope geometry to balance the resisting and the driving forces acting on the sliding mass. An infiltration model is used to determine how rainfall changes pore-water conditions, modulating the local stability/instability conditions. A problem with the operation of the existing models lays in the difficulty in obtaining accurate values for the several variables that describe the material properties of the slopes. The problem is particularly severe when the models are applied over large areas, for which sufficient information on the geotechnical and hydrological conditions of the slopes is not generally available. To help solve the problem, we propose a probabilistic Monte Carlo approach to the distributed modeling of rainfall-induced shallow landslides. For the purpose, we have modified the Transient Rainfall Infiltration and Grid-Based Regional Slope-Stability Analysis (TRIGRS) code. The new code (TRIGRS-P) adopts a probabilistic approach to compute, on a cell-by-cell basis, transient pore-pressure changes and related changes in the factor of safety due to rainfall infiltration. Infiltration is modeled using analytical solutions of partial differential equations describing one-dimensional vertical flow in isotropic, homogeneous materials. Both saturated and unsaturated soil conditions can be considered. TRIGRS-P copes with the natural variability inherent to the mechanical and hydrological properties of the slope materials by allowing values of the TRIGRS model input parameters to be sampled randomly from a given probability distribution. [..]


I. INTRODUCTION
Rainfall is a primary trigger of landslides, and rainfallinduced landslides are common in many physiographical environments, e.g. [7]. Prediction of the location and time of occurrence of shallow rainfall-induced landslides remains a difficult task, which can be accomplished adopting empirical [1,10,27,28,47], statistical (Soeters and Van Westen, 1996;Guzzetti et al., 1999Guzzetti et al., , 2005Guzzetti et al., , 2006a; Committee on the Review of the National Landslide Hazards Mitigation Strategy, 2004); or process based [3,5,6,11,21,37,46,52,54] approaches, or a combination of them [16,22]. Inspection of the literature, reveals that process based (deterministic, physically based) models are preferred to forecast the spatial and the temporal occurrence of shallow landslides triggered * corresponding author: M. Alvioli, alvioli@pg.infn.it by individual rainfall events over a given area. Process based models rely upon the understanding of the physical laws controlling slope instability with a certain degree of modeling and simplification. Due to lack of information and the poor understanding of the physical laws controlling landslide initiation, only simplified, conceptual models are (currently) possible. These models extend spatially the simplified stability models widely adopted in geotechnical engineering, e.g. [51,57,58], and calculate the stability of a slope using parameters such as normal stress, angle of internal friction, cohesion, porewater pressure, root strength, seismic acceleration, external weights. Computation results in the factor of safety, an index expressing the ratio between the local resisting (R) and driving (S) forces, F S = R/S. Values of the index smaller than 1, corresponding to R < S, denotes instability, on a cell-by-cell basis, according to the adopted model. In order to calculate the resisting and the driving forces, the geometry of the sliding mass must be defined, including the geometry of the topographic surface and the location of the (hypothetical) slip surface. Most commonly, an infinite-slope approximation is adopted [51,57] and it is also used in the US Geological Survey Transient Rainfall Induced and Grid-Based Regional Slope-Stability Model (TRIGRS) model [3,5], within each user-defined cell. Within the infinite-slope approximation, in each cell the slip surface is assumed to be of infinite extent, planar, at a fixed depth, and parallel to the topographic surface. Forces acting on the sides of the sliding mass are neglected.
Modeling of shallow landslides (Fig. 1a) triggered by rainfall adopting the infinite-slope approach, requires time-invariant and time-dependent information. Timeinvariant information includes the mechanical and hydrological properties of the slope material (e.g. unit weight γ, cohesion c, angle of internal friction φ, water content θ, saturated hydraulic conductivity K s ), and the geometrical characteristics of the sliding mass (e.g. gradient of the slope and the sliding plane δ, depth to the sliding plane d fp ). The fact that these parameters are constant in time is an assumption of the model. Time-dependent information consists of the pressure head ψ, i.e. the pressure exerted by water on the sliding mass, a function of the amount (depth, d w ) of water in the terrain [17]. Determining the pressure head, and its spatial and temporal variations, requires understanding how rainfall infiltrates and water moves into the ground. This is described by the Richards equation [38]. This non-linear partial differential equation does not have a closed-form analytical solution, and approximate solutions are used for saturated [e.g. 32] and unsaturated [e.g. 42,43,49] conditions.
The numerical implementation of one such model has been accomplished by Baum et al. [3] in TRIGRS. The program calculates the stability conditions of individual grid cells in a given area, and models infiltration adopting the approach proposed by Iverson [32], for onedimensional vertical flow in isotropic, homogeneous materials, and for saturated conditions. In the code, the forces acting on each individual grid cell are balanced in the centre of mass of each cell, and all interactions with the neighboring grid cells are neglected.
In a second release of TRIGRS, Baum et al. [5] have extended the code to include unsaturated soil conditions, including the presence of a capillary fringe above the water table. TRIGRS can be used for modeling and forecasting the timing and spatial distribution of shallow, rainfall-induced landslides in a given area [3,5,6]. A problem when using TRIGRS, and similar computer codes (e.g. Shalstab, Montgomery and Dietrich, 1994; GEOtop-SF, Simoni et al., 2008), for the modeling of shallow rainfall-induced landslides over large areas resides in the difficulty (or operational impossibility) of obtaining sufficient, spatially distributed information on the mechanical and hydrological properties of the terrain. Adoption of a particular (single) value to describe the mechanical (unit weight γ s , cohesion c, angle of internal friction φ) and the hydrological (water content θ, satu-rated hydraulic conductivity K s ) properties of the terrain may result in unrealistic or inappropriate representations of the stability conditions of individual or multiple grid cells.
In this work, we propose a probabilistic, Monte Carlo approach in an attempt to overcome the problem of poor knowledge of terrain characteristics over large study areas. We obtain the input values for the parameters for the individual runs of TRIGRS using (known, inferred, or hypothesized) probability distributions. Multiple simulations are performed with different sets of randomly chosen input parameters and multiple sets of outputs of the model are obtained. We denote the newly developed code TRIGRS-Probabilistic, or TRIGRS-P. The different outputs are then analysed jointly to infer local stability or instability conditions as a function of the random variability of the input parameters and statistical significance of the multiple outputs is determined.
The paper is organized as follows. First, we summarize the model adopted in the software code TRIGRS, version 2.0 [5], and we introduce our probabilistic extension (Sect. II) implemented in the new code TRIGRS-P. Next, we present a comparison of the performance of the original and the probabilistic simulations for two study areas: Mukilteo, USA, and Frontignano, Italy (Sect. III). Results are discussed in Sect. IV, which focuses on the analysis of the performance of the geographical prediction of the shallow landslides, and on the potential application of the new stochastic code for modeling shallow rainfall-induced landslides over large areas (> 100 km 2 ).

II. OVERVIEW OF THE MODEL
Both TRIGRS and TRIGRS-P frameworks are pixelbased and adopt the same geometrical scheme, the same subdivision of the geographical domain and accept the same inputs; an additional set of parameters is used in TRIGRS-P to specify the variability of the characteristics of the terrain. Within each pixel, slopes are modeled as a two-layer system consisting of a lower saturated zone with a capillary fringe above the water table, overlain by an unsaturated zone that extends to the ground surface. The water table and the (hypothetical) sliding surface are planar and parallel to the topographic surface. The geographical domain represented by an array of grid cells, coincides with the elements of a digital elevation model (DEM) used to describe the topography of the study area (Fig. 1b).

A. Deterministic approach: TRIGRS code
In the original approach coded in TRIGRS [5], the stability of an individual grid cell is determined adopting the one-dimensional infinite-slope model Taylor [51]. The model assumes that failure of a grid cell occurs when the resisting forces R acting on the sliding surface are less than the driving forces S [57,58]. The ratio of the resisting R and the driving S forces gives by the factor of safety F S , where the internal friction angle φ, the cohesion c, and the soil unit weight γ s describe the material properties, γ w is the groundwater unit weight, δ is the angle of the (infinite) planar slope, and ψ is the pressure head (Fig. 1b, see Table IX). Failure occurs when F S < 1. Solution of Eq. (1) requires the computation of the pressure head ψ, which is governed by the Richards [38] equation: where, z is the slope-normal coordinate, t is the time, K z is the vertical hydraulic conductivity that depends on the pressure head ψ, and θ is the volumetric water content (Fig. 1b). Equation (2) is solved in TRIGRS adopting the modeling scheme proposed by Baum et al. [5]. For saturated conditions, TRIGRS uses a modified version of the analytical solutions of Eq. (2) proposed by Iverson [32], for short term and for long-term rainfall periods. Again, the modification consists chiefly in the possibility of using a complex rainfall history [5]. To linearize Eq. (2), Iverson [32] adopted a normalization criterion using a length scale ratio as follows: where D 0 is the maximum hydraulic diffusivity, A is the contributing area that affects hydraulic pressure at the potential failure plane depth d fp , and d 2 fp /D 0 and A/D 0 are the minimum time required for slope-normal (d 2 fp /D 0 ) and for slope-lateral (A/D 0 ) pore pressure transmission (see Table IX). Under the condition ε 1, simplification of Eq. (2) gives [32]: and For unsaturated conditions, the code uses a modified version of the analytical solution of Eq. (2) proposed by Srivastava and Yeh [49], for the case of one-dimensional, transient, vertical infiltration. The modification consists in the use of a variable rainfall history (intensity, duration), allowing modeling of complex rainfall patterns [5]. Equation (2) was linearized in Srivastava and Yeh [49], who adopted the following exponential model [19]: where K s is the saturated hydraulic conductivity, θ r is the residual water content, θ s is the saturated water content, and ψ = ψ − ψ 0 , ψ 0 = −1/α is a constant, with α the inverse of the vertical height of the capillary fringe above the water table [42,43]. Substitution of Eq. (7) into Eq. (2) leads to the partial differential equation: Equation (8) is a linear diffusion equation for which analytical solutions can be obtained using the Laplace, the Fourier, or the Green's function methods [33], once boundary conditions are specified, e.g.: where I ZLT is the steady (initial) surface flux, which can be approximated by the average precipitation rate necessary to maintain the initial conditions in the days to months preceding an event [6]. When a solution of Eq. (8) is obtained, the pore pressure head ψ can be calculated by inversion of Eq. (2). Solutions of Eq. (8) with the boundary conditions listed in Eq. (10) are given in Appendix A1. TRIGRS implements a simple surface runoff routing scheme to disperse the excess water from the grid cells where rainfall intensity exceeds the local infiltration capacity [5,31].

B. Probabilistic approach: TRIGRS-P code
In our extension of the TRIGRS code, we use the same model and equations as in the original code. The innovation consists of using probability distributions to model the slope material and hydrological properties, i.e. the values of the input parameters. The geometry of the slope (δ) and the position of the sliding plane (d fp ) remain unchanged. The model parameters appearing in the equations described in Sect. II A are replaced by functions of random numbers, i.e.: c = c(ξ c ), cohesion; φ = φ(ξ φ ), angle of internal friction; γ s = γ(ξ γ ), soil unit weight; D 0 = D 0 (ξ D0 ), hydraulic diffusivity; K s = K s (ξ Ks ), saturated hydraulic conductivity; θ r = θ r (ξ θr ), residual water content; θ s = θ s (ξ θs ), saturated water content; α = α(ξ α ), inverse height of capillary fringe (11) where ξ i is a random number, with the subscript i used to specify a different parameter, ξ c for cohesion, ξ φ for friction, etc., so that the parameters can be varied independently from each other. Replacing the parameters listed in Eq. (11) into Eqs. (1), (2), (4), (5), and (8), we obtain a system of equations that are initialized with a different, randomly chosen set of parameters at each run of TRIGRS-P. The solution of the various scenarios for saturated or unsaturated cases are performed in the very same way as in TRIGRS. The depth to the potential sliding plane d fp was assumed to coincide with the soil depth, and was estimated by Godt et al. [21] and Baum et al. [6] using variations of the models proposed by DeRose [12] and by Salciarini et al. [41]. Additional choices for initial conditions and corresponding sources of uncertainties will be discussed in the following.
We have implemented two probability density functions (pdf) for generating the modeling parameters: (i) the normal distribution function N , and (ii) the uniform distribution function U. If ξ is a standard normally distributed variable N (0,1) with meanξ = 0 and standard deviation σ = 1, the variable x = x + σ x ξ is normally distributed with meanx and standard deviation σ x , N (x, σ x ). Similarly, if ξ is standard uniformly distributed U(0, 1), the variable y = y a +(y b −y a )ξ is uniformly distributed in the range [y a , y b ], U(y a , y b ). The advantage of using these expansions is that their deterministic limits are obtained for σ x → 0 and for λ ≡ y b −y a → 0, respectively.
In this work, we calculated the stability conditions in the modeling domain for a given set of variables describing the slope materials properties (φ, c, γ s , K s , D 0 , θ r , θ s ) obtained by sampling randomly from the uniform distribution only. There is a conceptual difference between the two distributions for distributed landslide stochastic modeling. Adoption of the Gaussian distribution requires that the investigator has determined (e.g. through sufficient field tests or laboratory experiments) the uncertainty and measuring errors associated with the parameters. The mean and the standard deviation of the Gaussian distribution define unambiguously the uncertainty. Use of the uniform distribution implies that the investigator only knows the possible (or probable) range of variation of the parameters, ignoring the internal structure of the uncertainty. We consider the Gaussian distribution more appropriate to predict rainfall-induced landslides in small areas where sufficient field and laboratory tests were performed to characterize the physical properties of the geological materials, and the uniform distribution best suited in the investigation of large areas where information on the geo-hydrological properties is limited. Further, we consider use of the Gaussian distribution best suited to investigate how errors in the parameters propagate and affect the modeling results, provided that the errors are known. Conversely, use of the uniform distribution allows for investigating how the uncertainty in the model parameters affects the model results. The sensitivity of the extended model to the random variation of model parameters has been explored by running 16 independent simulations, each with a different set of input parameters while keeping unchanged, and equal to the run performed with the original fixed-input TRIGRS model, the terrain morphology (δ) and rainfall history.

III. DETERMINISTIC VS. PROBABILISTIC APPROACH
We tested the performance of the new stochastic version of the numerical code, TRIGRS-P 2.0, against the original TRIGRS code, version 2.0 [5], in two study areas. The first test was conducted in the Mukilteo study area, near Seattle, WA, USA (Fig. 2). This is the same geographical area where Godt et al. [21] and Baum et al. [6] demonstrated the use of TRIGRS in a broad geographical setting. The second test was performed in the Frontignano study area, Perugia, Italy (Fig. 7). This is part of the Collazzone geographical area where Guzzetti et al. [25,26] have investigated the hazard posed by shallow landslides using multivariate classification methods.

A. Mukilteo study area
The three square kilometre study area is located along the eastern side of the Puget Sound, about 15 km north of Seattle, WA, USA (Fig. 2). In this area, rainfall is the primary trigger of landslides. Slope failures are typically shallow (less than three meters thick), and involve the sandy colluvium and the weathered glacial deposits mantling the coastal bluffs [2,18]. The climate of the Seattle area is characterized by a pronounced seasonal precipitation regime with a winter maximum, and 3/4 of the annual precipitation falling from November to April [9]. Storms that trigger shallow landslides in Seattle are generally of long duration (more than 24 h) and of moderate intensity [20]. Three geological units crop out in the area [36] (Fig. 3a) including, from older to younger: (i) transition sediments, comprising the Lawton Clay (Qtb), (ii) advance outwash sand (Qva), and (iii) glacial till (Qvt). The mechanical and hydrological properties of the materials in the three geological units are known through field tests and laboratory experiments [20,21,35], and are summarized in Table I.

Predictions with the deterministic approach
For modeling purposes, the topography of the area was described by a 6 f t×6 f t (1.83 m×1.83 m) DEM obtained through airborne laser-swath mapping [30]. Initial conditions for infiltration were prescribed as zero pressure head at the depth of the lower boundary of colluvium. This is in agreement with field observations [4,21,44]. A constant rainfall intensity I = 4.5 mm h −1 for a period of 28 h was used to force slope instability, for a cumulative event rainfall E = 126 mm. The adopted rainfall history represents a limit case of the rainfall intensity-duration conditions that have resulted in landslides in the Mukilteo area in the winter 1996-1997 [6,21]. Figure 3 shows the results of the runs with deterministic input, for saturated (Eq. 4, Fig. 3b) and for unsaturated (Eq. 5, Fig. 3c) conditions. For the mechanical and hydrological properties of the geological materials (φ, c, γ s , K s , D 0 , θ r , θ s ) we considered the values listed in Table I.
In order to test the model prediction skills, i.e. the ability of the model to forecast the known distribution of rainfall-induced landslides [25], the two geographical distributions of the factor of safety F S were compared to a landslide inventory showing slope failures triggered by rainfall in the winter 1996-1997 [2,21], displayed by black lines in Fig. 3. For the comparison, all grid cells with F S < 1 were considered unstable (i.e. potential landslide) cells. Four-fold plots and maps showing the geographical distribution of the correct assignments and the model errors (Fig. 3e, f) are used to summarize and display the comparison. Four-fold plots are graphical representations of contingency tables (or confusion matrices), and show the fraction (or number) of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) [13,40]. In our analysis TP is the percentage of cells with observed landslides, which are predicted as unstable by the model; similarly, TN is the percentage of cells without landslides predicted as stable by the model. Correspondingly, FP (FN) are the percentage of predicted unstable (stable) cells without (with) observed landslides. We will refer to both TP and TN as correct assignment in the following, while FP and FN are model errors.
To further quantify the performance of the deterministic forecasts, different metrics were computed (Table II)

Predictions with the probabilistic approach
Based on the comparison of the results discussed in the previous section, for the stochastic modeling we used only the unsaturated soil conditions, and we exploited the same geomorphological information (i.e. the same DEM) and the same rainfall forcing input (i.e. 4.5 mm h −1 of rain for a 28-h period) used for the previous runs. For the mechanical and hydrological properties of the geological materials (φ, c, γ s , K s , D 0 , θ r , θ s ) we considered the values listed in Table I, used in the previous paragraph as fixed inputs of the model, as mean values of uniformly distributed variables U(y a , y b ), where y a and y b are the lower and upper limits of the uniform distribution determining the range of variation of each parameter. In our simulations, the range of variation of the individual parameters has been chosen as a fraction of the mean value of each variable. A range of variation λ = 0.01, 0.10 and 1.00 correspond to a variation of 1 %, 10 % and 100 % around the mean value of the variable, respectively. Note that the case with λ = 0.01 allows the various input parameters to vary in a very limited range, and it can be seen as a test of our code: the original TRIGRS results with fixed input parameters should be obtained.
We performed two sets of runs. In the first set, the mean values of the mechanical and hydrological parameters (Table I) were kept constant, and the range of variation of the individual parameters was modulated using λ = 0.01, 0.1, 0.5, 1.0. In the second set, a fixed range of variation for the individual parameters was selected, λ = 1.0, and the mean value of the parameters was modified (shifted) by ν = 0.2, 0.4, . . . , 1.0, 2.0. Note that when ν = 1.0, no shift of the mean value is performed. In each test, the same range of variation λ and the same shift of the mean value ν were applied to all the parameters. The simplification was adopted to reduce the time required to perform multiple runs. The results are shown in Fig. 4: (i) for the first set of runs, i.e. for fixed mean values of the model parameters and changing ranges of variation of the individual parameters, λ = 0.01 (Fig. 4a), λ = 0.5 (Fig. 4b), and λ = 1.0 (Fig. 4c), and (ii) for the second set of runs, i.e. for a fixed range of variation λ = 1.0, and shifting the mean value of the model parameters by ν = 0.8 ( Fig. 4d), ν = 0.9 (Fig. 4e), and ν = 1.1 (Fig. 4e). For the second set of runs, results obtained for ν < 0.8 and for ν > 1.1 are not shown in Fig. 4. For ν < 0.8 the number of unconditionally unstable cells was unrealistically large, and for ν > 1.1 the model performance decreased rapidly (see next paragraph). We used 16 runs for each set, resulting in 16 different maps of the factor of safety, which were used to evaluate the performance of the probabilistic approach. The results are shown in Fig. 5; for the same runs of Fig. 4, the maps show the geographical distribution of the correct assignments (TP, TN), the model errors (FP, FN), and the corresponding four-fold plots. Tables III and IV list metrics that quantify the performance of the probabilistic approach.

Analysis and discussion
Inspection of the results of the deterministic (Fig. 3) and the stochastic (Figs. 4 and 5) models, and of their forecasting skills (Fig. 6, Tables II-IV), allows for general considerations. Figure 6 shows a Receiver Operating Characteristics (ROC) plot [13], defined by the false alarm rate FPR and the hit rate TPR, plotted on the x-and y-axes, respectively. In the ROC space, a point located in the upper left corner represents a perfect prediction (TPR = 1 and FPR = 0), and points along the diagonal line for which TPR = FPR represent random predictions. An acceptable prediction requires TPR/FPR > 1 [13]. In Fig. 6, two separate points show the predictive performance of the two runs with deterministic inputs, for saturated ( Fig. 3b) and for unsaturated (Fig. 3c) conditions each of them producing a single pair FPR-TPR and a unique geographical distribution of the factor of safety F S . Analysis of Figs. 3 and 6, and of Table II indicates that the model prepared considering the soil unsaturated conditions (Fig. 3c) performed better than the model prepared considering saturated conditions (Fig. 3b). The larger value of the TPR/FPR ratio is a measure of the better predicting performance of the unsaturated model (TPR/FPR = 3.46, Fig. 6), compared to the saturated model (TPR/FPR = 1.87, Fig. 3), despite a lower TPR value (TPR = 0.42 vs. TPR = 0.71, Table II). This is in agreement with previous work of Godt et al. [21] and Baum et al. [6].
Within the two deterministic models, the one using the unsaturated soil conditions (Fig. 3c, f) performed better than the model that used the saturated soil conditions (Fig. 3b, e). The saturated model predicted a significantly larger fraction of the study area as unstable, mainly where terrain gradient exceeded 15 o . This resulted in a considerably larger number of true positives (TP, 7.1 % vs. 4.1 %), but also a significantly larger number of false positives (FP, 33.8 % vs. 10.4 %) and a correspondingly significantly lower number of true negatives (TN, 56.1 % vs. 79.5 %). In other words, the saturated deterministic model (Fig. 3b) was more pessimistic than the unsaturated deterministic model (Fig. 3c). This is well represented in Fig. 6, where a reduction of the false positive rate from 0.38 to 0.12 results in a reduction of the hit rate from 0.71 to 0.41 (Table II). The subsequent runs with probabilistic input were obtained assuming unsaturated soil water conditions. The results of the unsaturated stochastic models (Fig. 4) were similar to the results of the corresponding unsaturated deterministic model (Fig. 3c). This is a significant result, confirming that treating the uncertainty associated with the model parameters with a probabilistic approach has not significantly changed the model results, which have remained consistent. Availability of multiple model outputs for each run allowed preparing ROC curves to measure quantitatively the predictive performance of the stochastic models [13]. Since multiple values of F S are available for each pixel in the modeling domain, we can calculate the frequency of stability condition of each pixel. We attribute to this frequency the meaning of a probability and compare it with a given threshold. Modulation of the classification threshold allows us to obtain different FPR and TPR values, which can be used to construct a ROC curve [13]. In Fig. 6 two sets of ROC curves are shown using different colors. The red curves show the performances of the first set of runs, for λ = 0.01, λ = 0.5, and λ = 1.0, with ν = 1.0, and the blue curves show the performances of the second set of runs, for ν = 0.8, ν = 0.9, and ν = 1.1, with λ = 0.5. To construct the ROC curves, several probability thresholds were used, from 0.1 to 0.9 by 0.1 steps. The area under the ROC curve AUC is taken as a quantitative measure of the performance of the classification. If AUC = 0.5, a classification is poor and indistinguishable from a random classification, whereas a perfect classification has AUC = 1 [13,40].
Inspection of Figs. 6 and 5, and of Table III, suggests that an increase in the range of variation of the model parameters (from λ = 0.01 to λ = 1.0), corresponding to a significantly larger degree of uncertainty in the parameters, resulted in similar individual performance indices, but significantly larger values of the area under the ROC curve, AUC. In our experiment, the increase in the range of variation changed the performance index from AUC = 0.65 (for λ = 0.01) to AUC = 0.73 (for λ = 0.5), with an increase of performance of 16 %. A further increase of the range of variation to λ = 1.0, a possibly unrealistic range of variation for some of the modeling parameters, has resulted in a value of AUC = 0.67, decreasing the model performance. Modulation of the mean value of the parameters, using ν = 0.8, ν = 0.9, and ν = 1.1, resulted in better results (larger AUC values) for λ = 0.5 than for λ = 0.01. Moreover, the TPR, FPR, PPV and ACC metrics did not change significantly when the range of variation λ of the model parameters were modified, and remained similar to the values obtained with the deterministic models, for λ ≤ 0.5. We conclude that, in the Mukilteo study area, these metrics are not sensitive to introduction of the probabilistic determination of the model parameters. Second, the AUC showed a positive correlation with the range of variation in the model parameters.
In the probabilistic runs, a positive correlation was observed between the range of variation λ and the fraction of unconditionally unstable cells, i.e. the grid cells that have F S < 1 even in dry conditions when no rainfall is increasing pore pressure and slope instability. For the first set of runs, the fraction of unconditionally unstable cells was 0 % for λ = 0.01, 0.3 % for λ = 0.5, and 0.7 % for λ = 1.0. Moreover, a negative correlation was observed between ν, the width of shift in the mean value of the modeling parameters, and the fraction of unconditionally unstable cells. For the second set of runs, the fraction of unconditionally unstable cells was less than 5.0 % for ν ≥ 0.8, and was 0 % for ν > 1.0, independent of the range of variation of the parameters.

B. Frontignano study area
The Frontignano area is located in central Umbria, Italy, about 25 km south of Perugia, in the Collazzone area (Fig. 7). In this area, landslides are caused primarily by rainfall and rapid snowmelt [8,15,25,26]. Multiple deep-seated and shallow slides were identified in the area through the visual interpretation of multiple sets of aerial photographs and very-high resolution satellite images, and field surveys.
The shallow failures are typically less than three meters thick, and involve the soil and the colluvium mantling the slope. Soils range in thickness from a few decimetres to more than one meter; they have a fine to medium texture, and exhibit a xeric moisture regime, typical of the Mediterranean climate. In central Umbria, precipitation is most abundant in October and November, with a mean annual rainfall in the period 1921-2001 exceeding 850 mm. In the study area, terrain is hilly, and the lithology and the attitude of bedding planes control the morphology of the slopes. Gravel, sand, clay, travertine, layered sandstone and marl, and thinly layered limestone, crop out in the area [8,25,26].

Predictions with the deterministic approach
For modeling purposes, the topography of the Frontignano study area was described by a 5 m × 5 m DEM obtained interpolating 5-m contour lines shown on 1 : 10 000 scale topographic base maps [25,26]. Slope in the area ranges from 0 o to 62 o , with an average value of 10 o and a standard deviation of 5.6 o (Fig. 8d). The mechanical and hydrological properties of the five soil types cropping out in the area (Fig. 8a) were determined through laboratory tests and searching the literature [see, e.g. 14, 34, 45, and references therein] on the geotechnical properties (φ, c, γ s , K s , D 0 , θ r , θ s ) of the same or similar sediments in Umbria, Italy (listed in Table V). As for the Mukilteo area, the depth to the hypothetical sliding plane d fp was assumed to coincide with the soil depth, which was estimated using the model proposed by DeRose [12]. To calibrate the soil depth model, we exploited field observations indicating that the depth of the shallow landslides in the study area is d fp < 3 m, and that shallow landslides are most abundant where terrain gradient is in the range 7 o ≤ δ ≤ 20 o . Initial depth to the water table was set to a fraction of the depth to the failure plane, d w = 0.85d fp . We tested different rainfall histories, and adopted a forcing rainfall that produced shallow landslides in the area in the periods January-May 2004, October-December 2004, and October-December 2005 [15,29]. Specifically, we used a rainfall history composed of a 4-week initial rainfall period characterized by a constant mean rainfall intensity I = 0.36 mm h −1 , for a cumulative rainfall E = 242 mm, followed by a 60-min rainfall period characterized by a high rainfall intensity I = 90 mm h −1 , for a cumulative rainfall E = 90 mm. Results for the saturated (Eq. 4) and the unsaturated (Eq. 5) modeling conditions are shown in Figs. 8b, c, respectively.
In order to test the model performance, the geographical distributions of the factor of safety F S predicted by the TRIGRS model were compared to the known distribution of rainfall-induced landslides mapped in the same area in the periods January to May 2004, October to December 2004, and October to December 2005. The landslides were mapped through reconnaissance fieldwork and the visual interpretation of high-resolution satellite images [15,29], and are shown with black lines in Fig. 8. For the comparison, all grid cells with F S < 1 were considered unstable (i.e. landslide) cells. As for the Mukilteo test case, four-fold plots (Fig. 8e, f) and derived metrics (Table VI), ROC plots (Fig. 11), and maps showing the geographical distribution of the correct assignments and the model errors (Fig. 8e, f) were used to summarize and measure the comparison. Figs. 8 and 11, and analysis of Table VI, suggests that the saturated and the unsaturated models produce very similar results. This is different from the result obtained in the Mukilteo area, where the unsaturated model performed better than the saturated model. In the Frontignano area, the unsaturated model (Fig. 8c) resulted in a better forecasting accuracy (ACC, 0.86 vs. 0.75), but in a reduced TPR to FPR ratio (1.4 vs. 1.7). We maintain that the model prepared considering the saturated conditions (Fig. 8b) performed slightly better than the model obtained considering the unsaturated conditions (Fig. 8c).

Predictions with the probabilistic approach
The mechanical and hydrological properties of the geological materials (φ, c, γ s , K s , D 0 , θ r , θ s ) in the study area were chosen as listed in Table V (also used as input of the original TRIGRS model, in the previous paragraph) as mean values of uniformly distributed variables U(y a , y b ). To be consistent with the approach adopted in Mukilteo, we performed two sets of parametric analyses, varying the range (λ) and the mean value (ν) of the model parameters. The maps in Fig. 9 show the factor of safety F S calculated for: (i) fixed mean values of the model parameters ν = 1.0, and changing ranges of variation of the individual parameters, λ = 0.01 (Fig. 9a), λ = 0.75 (Fig. 9b), and λ = 1.0 (Fig. 9c), and (ii) a fixed range of variation λ = 0.75, and shifting the mean value of the model parameters by ν = 0.8 (Fig. 9d), ν = 0.9 (Fig. 9e), and ν = 1.1 (Fig. 9f). As in the previous case, λ = 0.01 corresponds to a very small range of variability of the parameters, and gives back the previous results; for ν = 1.0 no shift in the mean values of the model parameters is performed. The degree of accuracy of the two sets of runs for the Frontignano area is shown in Fig. 10, for the same (average) models shown in Fig. 9. The maps show the geographical distribution of the correct assignments (TP, TN), the model errors (FP, FN), and the corresponding four-fold plots. Tables VII and VIII list metrics that quantify the performance of the runs. The performance of the stochastic models is further analysed in Fig. 11 by two sets of ROC curves, shown using different colours; red curves for the case of variable range λ, and blue curves for the case of a variable mean ν. In the same plot, the grey circle shows the predicting performance of the saturated model (Fig. 8b), and the grey square the performance of the unsaturated model (Fig. 8c) both run with fixed input parameters.

Analysis and discussion
Inspection of the results of the fixed input runs (Fig. 8), the runs with input parameter sampled from a suitable probability distribution, (Figs. 9 and 10), and of their ability to forecast the spatial distribution of known land-slides (Fig. 11, Tables VI-VIII), allows for general considerations that are similar to those discussed for the Mukilteo study area (see Sect. 3.1.3), with a few differences. In the Frontignano area, the saturated and the unsaturated models resulted in nearly equivalent results, with the saturated model considered marginally superior primarily because of the reduced value of the TPR to FPR ratio. From a statistical point of view, given the reduced fraction of landslide area in Frontignano (1.5 %) compared to Mukilteo (4.2 %), the spatial prediction of landslides in Frontignano was more difficult than in Mukilteo. From a physical point of view, modeling the stability conditions in low gradient terrain is very sensitive to the initial conditions, which are uncertain and difficult to determine spatially. The runs with variable input parameters confirm the slightly poorer geographical predictive performance of the adopted physical framework in Frontignano, compared to Mukilteo Tables III and IV vs. Tables VII and VIII). Taking the area under the ROC curve (AUC) as the metric to compare the models, one can readily see that runs for the Mukilteo area resulted in 0.65 ≤ AUC ≤ 0.73, and for the Frontignano area exhibited 0.59 ≤ AUC ≤ 0.65. In other words, the "worst" result for Mukilteo (AUC = 0.65, for ν = 1.0 and λ = 0.01) has the same overall spatial predictive performance of the "best" result for Frontignano (AUC = 0.65, for ν = 0.8 or 0.9 and λ = 0.75). In the Frontignano area, despite a lower "absolute" performance (i.e. when compared to Mukilteo), adoption of a probabilistic approach improved the spatial forecasting skills. Again, taking AUC as a metric to compare the models, values of this metric increased from AUC = 0.59 (for ν = 1.0 and λ = 0.01), to AUC = 0.65 (for ν = 0.8 or 0.9 and λ = 0.75). This is a non-negligible improvement of about 10 %. The result confirms that adoption of a stochastic framework to the distributed modeling of shallow landslides results in improved spatial forecasts. The result further corroborates the finding that modeling the natural uncertainty (and poor understanding) of the mechanical and hydrological variables results in better spatial landslide predictions of the locations of rainfallinduced landslides (see insets in Fig. 4). First, the TPR, FPR, PPV and ACC metrics did not change significantly when the range of variation λ of the model parameters was changed. These metrics remained similar to the values obtained with the fixed input model, confirming that they are not sensitive to differences between stochastic framework runs with random variations of parameters and runs with fixed parameters. Second, the area under the ROC curve AUC confirmed its positive correlation with the range of variation in the model parameters λ, in support of the probabilistic approach. Third, the positive correlation between the range of variation λ and the fraction of unconditionally unstable cells, and the negative correlation between the shift in the mean value of the modeling parameters ν and the fraction of unconditionally unstable cells, were both confirmed.

IV. DISCUSSION
Our new probabilistic approach to the distributed modeling of shallow landslides proved effective in the two study areas where it was tested (Figs. 2 and 7). In both areas, the maps showing the geographical distribution of the (average) factor of safety F S obtained using TRIGRS-P were better predictors of the distributions of known rainfall-induced landslides than the corresponding maps obtained adopting the original TRIGRS approach. This conclusion is supported by the indices used to measure the forecasting skills of the different models, and particularly the area under the ROC, AUC (Tables II-IV for  Mukilteo, and Table VI, VII, VIII for Frontignano). The runs in which we allowed a large variability of the input parameters (e.g. λ = 0.50 or λ = 0.75) were better predictors of the geographical distribution of known landslides than the models prepared using a reduced variability in the model parameters (e.g. λ = 0.1) [25,40]. This is shown in the insets in Fig. 4, where a portion of the results for the Mukilteo study area is shown at a larger scale. The variability of the geographical distribution of the F S is also shown in Fig. 12 where we have plotted the minimum, the maximum, and the standard deviation of the computed F S values. In particular, the map of the standard deviation provides quantitative and spatially distributed evidence of the uncertainty associated with the distributed modeling of landslide instability.
We studied the variation of the computed factor of safety. Figure 13 shows histograms for the distribution of the values of the factor of safety F S in selected grid cells in the Mukilteo (Fig. 13a-c) and the Frontignano (Fig. 13d-f) study areas. For simplicity, in the Figure we show the results obtained for a single lithological type, i.e. the transition sediments (Qtb, indicated as unit 1 in Table I) in the Mukilteo area (Fig. 3a), and the sand-siltclay (unit 5 in Table V) in the Frontignano area (Fig. 8a). Results for other lithological types in the two study areas are similar. We adopted the following procedure to obtain the histograms. First, we performed 100 stochastic simulations to obtain a large set of values of the factor of safety F S , and we computed the average value of the factor of safety, F S for each grid cell in the two modeling domains. For both study areas, a value of λ = 0.50 (and ν = 1.0) was used for the variability of the geotechnical parameters. Next, we selected three subsets of 1000 grid cells, with 0 < F S ≤ 1.5, 1.5 < F S ≤ 3.0, and F S > 3, respectively. Finally, we used all the computed values of the F S in each subset to construct the histograms. Inspection of the histograms reveals that for F S > 3 (Fig. 13c, f) the distribution of the predicted factor of safety does not show a predominant value, and is almost uniform. Instead, for F S < 1.5 the distribution of the predicted factors of safety peaks at F S ≈ 1.0 (Fig. 13a, d). For 1.5 < F S ≤ 3.0, results are intermediate (Fig. 13b, e).
In conclusion, the probabilistic approach results in a number (we have performed 16 runs) of model outputs, each representing the geographical distribution of the F S values. Availability of multiple results allows for the analysis of the sensitivity of the model to variations in the input parameters controlling the stability conditions. Variability depends on multiple causes, including: (i) the natural variability in the geotechnical and hydrological properties of the soils; (ii) the inability of determining accurate values for the geotechnical and hydrological parameters, and (iii) the fact that the models are simplified and do not represent the natural (physical) conditions in the study area.
The probabilistic approach allowed the investigation of the combined effects of the natural variability inherent in the model parameters, and of the uncertainty associated with their definition over large areas. However, the approach cannot separate the two causes for the variability. Also, the probabilistic approach cannot validate the physics in the model better than the deterministic approach. It should be noted that in our runs with probabilistic input parameters, the geotechnical and hydrological properties were treated explicitly as independent (uncorrelated) variables. This was a simplification. In reality, some dependence (correlation) exists between the different geo-hydrological properties. As an example, the saturated water content θ s affects the saturated hydraulic conductivity K s and the hydraulic diffusivity D 0 . However, selection of values for the different properties based on field tests, laboratory experiments, or through a literature search resulted in values for the considered properties that were implicitly dependent. This is because, e.g. cohesion, angle of internal friction, soil unit weight, and hydraulic conductivity depend one upon the other. Furthermore, no spatial correlation of the individual variables was considered in the modeling. This was also a simplification, because spatial correlation exists between the geo-hydrological properties [e.g. 39,56]. Adoption of the uniform distribution to determine the possible range of variation of the individual parameters, combined with the accepted modeling simplifications, has resulted in more "extreme" results, but not in unrealistic results.
Results of our approach were obtained adopting the uniform distribution to describe the uncertainty associated with the geo-hydrological parameters. TRIGRS-P allows for the use of the Gaussian and the uniform distributions. In the runs presented in this work, we explored only part of the variability associated with the physical model describing slope instability forced by rainfall infiltration (Fig. 1b), and specifically the variability associated with the mechanical and hydrological parameters of the materials involved in the hypothetical landslides. We did not consider the local morphological variability, e.g. the uncertainty in the description of the terrain given by the DEMs. Terrain gradient is an important parameter for the computation of the factor of safety F S . Inspection of Eq. (1) shows that variability in the terrain gradient δ results in variability in the local stability conditions, measured by F S . Furthermore, in our runs soil depth was a (non-linear) function of the local slope [12,41]. Varia-tions in the slope will result in variations in soil thickness, and in the local stability conditions. Preliminary results obtained adding a uniform random perturbation to the DEM for the Frontignano area confirmed the (large) sensitivity of the physically based models to the topographic information [37,50,53].
Rainfall history and geographical pattern also control the local stability conditions, and their temporal and spatial variations. For Mukilteo, we used the measured rainfall history that triggered shallow landslides in the winter 1996-1997. For Frontignano, we used the rainfall history that has resulted in shallow landslides in the winter 2004-2005. However, sensitivity of the models to the temporal and spatial variation of rainfall was not investigated. The rainfall data used in the two runs were obtained from rain gauges located in the vicinity of the study areas. The rainfall measurements may not represent the exact amount of rainfall at each grid cell in the modeling domain. We further assumed a uniformly distributed rainfall in the geographical modeling domains. Runs performed in the Frontignano area adopting different rainfall histories (e.g. (i) a uniform rainfall rate of 0.36 mm h −1 for a 4-week period, for a cumulated rainfall E = 242 mm, (ii) a single rainfall event with 5 mm h −1 for 24 h, E = 121 mm, and (iii) intermittent 3-day rainfall periods with I = 1.0 mm h −1 separated by 4-day dry periods, for a 4-week period, E = 288 mm) revealed that the geographical distributions of the F S obtained with the different rainfall histories were similar. However, the local instability conditions (F S ≤ 1) were reached at different times. The difference may be significant if the model results are used in a landslide early warning system [1,20]. We did not evaluate the sensitivity of the model parameters to the different rainfall histories.
Eventually, it should be noted that the probabilistic approach of TRIGRS-P could be used to infer reasonable values of the parameters describing terrain characteristics, where they are largely unknown, by exploring a large parameter space in a random way and comparing with known distributions of landslides.
Adoption of a probabilistic approach with multiple runs using randomly generated different set of input parameters results in longer computer processing times. The time required for a single TRIGRS-P simulation is only slightly longer than the time needed for the corresponding TRIGRS simulation, since the random variables were computed before running the slope stability and infiltration model. The time for this initial step depends on the size (in grid cells) and complexity of the modeling domain. The processing time of the multiple runs required by the TRIGRS-P approach to have a statistical significance may be easily reduced by exploiting the multi-core architecture of modern CPUs, just running simultaneously multiple instances of the TRIGRS-P code initialised with different sets of parameters. Since our aim is to eventually use the TRIGRS as a region-wide and possibly nation-wide early warning system, we give an estimate of the computing resources required. Us-ing the same spatial resolution, a larger area will require a larger processing time, with the time increasing linearly with the number of grid cells. The time required for a simulation depends also on rainfall history. A more complex history (i.e. a shorter step between two subsequent inputs of rainfall intensity) will result in a longer processing time, with time increasing with the square of the time steps. Finally, processing time depends on the type of hydrological model used, with the saturated model requiring roughly half the time of the unsaturated model.
When using the probabilistic approach, we adopted a strategy based on a convergence level, η. First, we computed two stochastic sets with n and m > n simulations. Next, for the two independent sets and for each grid cell, we computed the mean of the factor of safety F S . Then, we obtained the difference of the mean values of the factor of safety ∆F S for each cell, and we identified the maximum value of max(∆F S ) in the modeling domain. If max(∆F S ) ≤ η, the convergence level was reached and no additional simulations were performed. Instead, if max(∆F S ) > η convergence was not reached, a larger stochastic set was prepared, and the test repeated. In our two study areas 16 simulations were sufficient to obtain a convergence level η = 0.05. This level was considered adequate for the two study areas. This may not be the case in other areas, in significantly large areas, or in areas characterized by a larger physiographical variability. For simulations covering large areas, we hypothesized areas extending between 10 1 and 10 5 km 2 with grids of resolution from 1 m × 1 m to 30 m × 30 m, and computed the memory usage and execution time for (i) a single deterministic simulation adopting a saturated soil model (Fig. 14a), and (ii) a stochastic set of 16 simulations using an unsaturated soil model (Fig. 14b).
Since the TRIGRS (and TRIGRS-P) model uses a cellby-cell description of the study area, and the equations describing the stability of each cell are independent from the neighboring cells behavior, the code is most suited for a parallel implementation using MPI libraries. We have performed preliminary simulations, showing that a significant speedup ( 1/N , with N the number of processing elements used) can be obtained for the computingintensive portions of the code. One problem associated with significatly large areas is the utilization of memory. In a truly parallel implementation of the code, each computing element or core should load into memory only the portion of data relevant to its task, which is currently not implemented and this kind of improvement is beyond the scope of this work.

V. CONCLUSIONS
We prepared a probabilistic version of the Transient Rainfall Infiltration and Grid-Based Regional Slope-Stability Analysis code, TRIGRS [3,5], and tested the new code TRIGRS-P in two study areas: Mukilteo, near Seattle, USA, and Frontignano, near Perugia, Italy. The tests suggest that the runs initialized with random values of the input parameters generated according to proper probability distribution functions, were better predictors of the spatial location of rainfall induced shallow landslides than the corresponding original TRIGRS runs. This was measured by different metrics used to evaluate the comparison of the spatial forecasts of the instability conditions (F S values) against maps showing recent rainfall induced landslides, in the two study areas. Adoption of a probabilistic-initiated framework allowed the investigation of the sensitivity of the model used to determine the stability conditions to the geotechnical and hydrological properties of the terrains where landslides can develop. The observed sensitivity was attributed to the combined effect of the natural variability inherent to the geotechnical and hydrological properties of the slope materials, and to the fact that the numerical model is an approximate representation of the complex processes controlling rainfall induced slope instability in an area. However, the stochastic approach cannot separate the two causes of variability. Stochastic modeling of rainfall induced shallow landslides requires longer processing times, when compared to the corresponding deterministic modeling. A parametric study proved that the approach is computationally feasible even for very large areas (10 4 km 2 , 10 8 grid cells) if a computer grid is used, and a parallel computing strategy is adopted. We expect the probabilistic approach to improve the current capability to forecast the occurrence of rainfall induced shallow landslides, and to facilitate the investigation of the variability of slope material properties over large areas.
Appendix A: Formulation of the model In this appendix we summarize the solutions of Eq. (2) implemented in deterministic code TRIGRS [6]. Approximations are given for: (i) unsaturated soil conditions [49], (ii) saturated soil conditions [32], and (iii) a twolayer soil model [6] represented schematically in Fig. 1b.

Unsaturated soil
In their model for an unsaturated soil, Srivastava and Yeh [49] use relation (Eq. 2) to linearize Eq. (2). The explicit solution for the hydraulic conductivity (Eq. 7), subject to the initial and boundary conditions given by Eq. (8), is the following: where α 1 = α cos 2 δ, I ZLT is the initial surface flux (where the subscript LT indicates the long term infiltration rate), I Z is the surface flux of a given intensity for the considered time interval, D ψ = α 1 K s /(θ s − θ r ) and Λ m are the positive roots of the pseudoperiodic characteristic equation tan (Λα 1 d w ) + 2Λ = 0. The pressure head ψ (Z, t) in the unsaturated zone is obtained by inversion of Gardner [19] equation, Eq. (7):

Saturated soil
For wet initial conditions, Iverson [32] gave an explicit solution of the linearized Richards equation, for long term and for short term behavior. The long term represents the steady component: where z = Zcosδ and d w is the depth to the water table (see Fig. 1b). The short term represents the transient component: where T is the rainfall duration, D = 4D 0 cos 2 δ is an effective hydraulic diffusivity, and erfc is the complementary error function:

Two-layer soil model
The linearized Richards equation allows for the superposition of solutions. Baum et al. [3] have extended the Iverson [32] and the Srivastava and Yeh [49] solutions to the case of a time-varying sequence of surface fluxes with variable intensity and duration. They also considered an unsaturated layer of depth d and depth to the top of the capillary fringe d u (see Fig. 1b). Solution of Eq. (A1) was generalized as follows: where I nZ is the surface flux of a given intensity for the n-th time interval, and H(t − t n ) is the Heaviside step function. The Iverson [32] solutions of Eqs. (A4) and (A5) are generalized as:   (Fig. 3a). c, cohesion; D0, hydraulic diffusivity; Ks, saturated hydraulic conductivity; θs, saturated water content; θr, residual water content; α, inverse of capillary fringe. The friction angle φ has a common value of 33 1. (A) Example of a rainfall-induced shallow landslide of the soil slide type in the Collazzone area, Umbria, Italy (Fig. 7).
(B) Schematic representation of the slope-infinite model showing the coordinate system and variables used in the deterministic and stochastic models. See Table IX for the symbols description.    Table I 6. The results of simulations for the Mukilteo study area, presented using ROC curves. The grey square and circle represent the results obtained using the original TRIGRS code with saturated and unsaturated initial conditions, respectively (Fig. 3b, c); the curves correspond to the results obtained with the TRIGRS-P code, using the variability of input parameters shown in the inset as described in the text (Figs. 4 and 5).   Table V); clay (2 in Table V); flysch deposits (3 in Table V); gravel, sand, silt, and clay (4 in Table V); sand, silt, and clay (5 in Table V) (Fig. 8b, c) and with the probabilistic approach with random input parameters (Figs. 9 and 10).