Automatic delineation of geomorphological slope-units with r . slopeunits v 1 . 0 and their optimization for landslide susceptibility modelling

Automatic subdivision of landscapes into terrain units remains a challenge. Slope-units are terrain units bounded by drainage and divide lines, but their use in hydrological and geomorphological studies is limited because of the lack of reliable software for their automatic delineation. We present the r.slopeunits software for the automatic delineation of slope-units, given a digital elevation model and a few input parameters. We further propose an approach for the selection of optimal parameters controlling the terrain subdivision for landslide susceptibility modelling. We tested the software and 5 the optimization approach in Central Italy, where terrain, landslide, and geo-environmental information was available. The software was capable of capturing the variability of the landscape, and to partition the study area into slope-units suited for landslide susceptibility modelling and zonation. We expect r.slopeunits to be used in different physiographical settings for the production of reliable and reproducible landslide susceptibility zonations.


Introduction
The automatic subdivision of large and complex geographical areas, or even entire landscapes, into reproducible, geomorphologically coherent terrain units remains a conceptual problem and an operational challenge.Terrain units (TUs) are subdivisions of the terrain that maximize the within-unit (internal) homogeneity and the between-unit (external) heterogeneity across distinct physical or geographical boundaries (Guzzetti et al., 1999;Guzzetti, 2006;Komac, 2006Komac, , 2012;;Saito et al., 2011;Sharma and Mehta, 2012;Fall et al., 2006;Li et al., 2012;Erener and Düzgün, 2012;Schaetzl et al., 2013;Mashimbye et al., 2014).A slope unit (SU) is a type of morphological TU bounded by drainage and divide lines (Carrara, 1988;Carrara et al., 1991Carrara et al., , 1995;;Guzzetti et al., 1999), and corresponds to what a geomorphologist or an hydrologist would recognize as a single slope, a combination of adjacent slopes, or a small catchment.This makes SUs easily recognizable in the field, and in topographic base maps.Compared to other terrain subdivisions, including grid cells or unique-condition units (Guzzetti et al., 1999;Guzzetti, 2006), SUs are related to the hydrological and geomorphological conditions and processes that shape natural landscapes.For this reason, SUs are well suited for hydrological and geomorphological studies, and for landslide susceptibility (LS) modeling and zonation (Carrara et al., 1991(Carrara et al., , 1995;;Guzzetti et al., 1999;Guzzetti, 2006).
SUs can be drawn manually from topographic maps of adequate scale and quality (Carrara, 1988).However, the manual delineation of SUs is time-consuming and error-prone, limiting the applicability to very small areas.Manual delineation of SUs is also intrinsically subjective.This reduces the reproducibility -and hence the usefulness -of the terrain subdivision.Alternatively, SUs can be delineated automatically using specialized software.The latter exploits digital representations of the terrain, typically in the form of a digital elevation model (DEM) (Carrara, 1988), or adopts image segmentation approaches (Flanders et al., 2003;Dragut and Blaschke, 2006;Aplin and Smith, 2008;Zhao et al., 2012).In both cases, the result is a geomorphological subdivision of the terrain into mapping units bounded by drainage and di-Published by Copernicus Publications on behalf of the European Geosciences Union.
vide lines, which can be represented by polygons (in vector format) or groups of grid cells (in raster format).
Large and complex geographical areas or landscapes can be partitioned by different SU subdivisions.Unique (i.e., universal) subdivisions do not exist, and optimal (best) terrain subdivisions depend on multiple factors, including the size and complexity of the study area, the quality and resolution of the available terrain elevation data, and -most importantly -the purpose of the terrain subdivision (e.g., geomorphological or hydrological modeling, landslide detection from remote-sensing images, landslide susceptibility, hazard or risk modeling).An open problem is that an optimal SU subdivision for LS modeling cannot be decided unequivocally, a priori, or in an objective way, and the quality and usefulness of a LS zonation depends on the SU subdivision (Carrara et al., 1995).
In this work, we propose an innovative modeling framework to determine an optimal terrain subdivision based on SUs best suited for LS modeling.For the purpose, we also present the r.slopeunits software for the automatic delineation of SUs, and we propose a method to optimize the terrain subdivision in SUs performed by the software.The r.slopeunits software is written in Python for the GRASS GIS (Neteler and Mitasova, 2007), and automates the delineation of SUs, given a DEM and a set of user-defined input parameters.We tested the r.slopeunits software and the proposed optimization procedure for LS modeling in a large area in central Italy, where sufficient landslide and thematic information was available to us (Cardinali et al., 2001(Cardinali et al., , 2002)).
The paper is organized as follows.First, we present the proposed approach for the delineation of an optimal terrain subdivision into SUs best suited for LS modeling, based on an optimization method (Sect.2).Next (Sect.3), we present the method for the automatic delineation of SUs, which we have implemented in the r.slopeunits software for the GRASS GIS, and (in Sect.4) we describe a segmentation metric useful for the evaluation of the SU internal homogeneity.Next, we introduce landslide susceptibility modeling (Sect.5) and our optimization approach to the SU partitioning (Sect.6).This is followed (in Sect.7) by a description of the study area, in central Italy, and of the data used for LS modeling.In Sect.8, we present the results obtained in our study area in central Italy; then we summarize the obtained results (Sect.9), and outline possible uses of the r.slopeunits software and of the optimization procedure (Sect.10).

Modeling framework
We propose a new modeling framework for the parametric delineation of SUs and their optimization, as a function of a few input parameters, for the specific purpose of determining landslide susceptibility (LS) adopting statistically based classification methods (Guzzetti et al., 1999(Guzzetti et al., , 2005(Guzzetti et al., , 2006;;Guzzetti, 2006;Rossi et al., 2010).The framework is exemplified in Fig. 1, and consists of the following steps: 1. First, the r.slopeunits SU delineation softwaredescribed in Sect.3, and whose flowchart is shown in Fig. 2 -is run multiple times with different combinations of the input modeling parameters.Each software run results in a different subdivision of the landscape into a different set of SUs.In each run, the number and size of the SUs depend on the input modeling parameters.
2a.Second, the internal homogeneity and external inhomogeneity of each SU subdivision -required by any meaningful terrain subdivision -are defined in terms of terrain aspect, measured by the circular variance of the unit vectors perpendicular to the local topography represented by all grid cells in a slope unit.For each set of SUs obtained in step 1 using different input parameters, the quality of the aspect segmentation is evaluated adopting a general-purpose segmentation objective function, presented in Sect. 4. 2b.At the same time, for each set of SUs obtained in step 1, a LS model is calibrated adopting a logistic regression model (LRM) for SU classification (Sect.5).In the LRM, each SU is classified as stable (i.e., free of landslides) or unstable (i.e., having landslides) depending on a (in our case, linear) combination of the local terrain conditions (i.e., the geo-environmental variables).The performance of the model calibration is evaluated using the area under the curve (AUC) of receiver operating characteristic (ROC), referred to as the AUC ROC metric, a standard and objective metric commonly adopted in the literature to evaluate the performance of LS models (Rossi et al., 2010).
3. Lastly, an overall (combined) objective function is defined by properly combining the segmentation (step 2a) and the AUC ROC (step 2b) objective functions, as described in Sect.6. Maximization of this quantity allows to single out the optimal set of SUs (i.e., the optimal terrain subdivision) that, simultaneously, (i) provides a good aspect segmentation and (ii) results in an effective calibration of the LS model.
4. Maximization of the combined objective function allows selecting objectively the optimal combination of the input terrain modeling parameters best suited for LS modeling (step 4 in Fig. 1) and the corresponding SU subdivision.
In summary, the proposed modelling framework relies on an optimization procedure that maximizes a proper, specific function that contains information on (i) the morphology of the study area, represented by the aspect segmentation metric Figure 1.Logical framework for the proposed method for (1) the parametric delineation of SUs, (2a) the assessment of the quality of terrain aspect segmentation using of a proper segmentation objective function, (2b) the calculation and assessment of the quality of the LS modeling using a standard AUC ROC metric, and (3) the definition and maximization of a combined objective function for (4) the determination of optimal parameters for the delineations of SUs best suited for LS modeling and zonation.
(step 2a), and on (ii) the specific landslide processes under investigation (in our case, slow-to very slow-moving shallow slides, deep-seated slides, and earth flows), represented by the LS model performance and the associated AUC ROC metric (step 2b).The optimization approach removes subjectivity from the SU delineation algorithm, and produces a result that is objective and completely reproducible.This is a significant advantage over manual methods (Carrara, 1988), or specialized software that needs multiple parameters and specific calibration procedures.

Automatic delineation of slope units
Automatic delineation of SUs can be performed adopting two strategies.The first strategy defines a large number of small homogeneous areas, and enlarges or aggregates them progressively, maximizing the aspect homogeneity of the SUs (Zhao et al., 2012).Following this approach, the size of the initial polygons representing the small homogeneous areas is significantly smaller than the size of the desired (final) SUs, which results from the aggregation of multiple areas per-formed maximizing an objective function (Espindola et al., 2006).The second strategy defines an initial small number of large or very large areas, and progressively reduces their size until a satisfactory result is obtained (Carrara, 1988;Carrara et al., 1991Carrara et al., , 1995)).In the second strategy, the study area is subdivided into large subcatchments, which can be further subdivided into left and right sides (looking downstream with respect to the main drainage), with the resulting two sides named half basins (HBs).The size of the initial HB is much larger than the desired size for the SUs.
For both strategies, the final subdivision of the landscape into SUs does not maintain memory of the terrain partitioning represented by the initial areas or HBs.In both strategies, deciding when to stop the aggregation or the partitioning to obtain a terrain subdivision suitable for a specific use (in our case, LS modeling) is critical.Both strategies are subject to the selection of user-defined modeling parameters, which introduce subjectivity and reduce the reproducibility of the results.These are conceptual and operational problems that hamper the design and the implementation of an automatic procedure for the effective delineation of terrain subdivisions based on SUs (Carrara et al., 1995;Espindola et al., 2006;Dragut et al., 2010Dragut et al., , 2014)).

Slope unit delineation algorithm
For the delineation of the SUs, we adopt the second strategy outlined above, i.e., we start from a relatively small number of large HBs, and we gradually reduce their size by subdividing the HBs into smaller TUs.Hydrological conditions and terrain aspect requirements control the subdivision of the large HBs into smaller TUs.The approach is adaptive, and it results in a geomorphological subdivision of the terrain based on SUs of different shapes and sizes that capture the real (natural) subdivisions of the landscape.
We implemented the approach to the delineation of SUs in a specific algorithm, coded in the r.slopeunits software (Fig. 2).The algorithm (and the software) uses the hydrological module r.watershed (Metz et al., 2011) available in the GRASS GIS.Using a DEM to represent terrain morphology, r.watershed produces a map of HBs adopting an advanced flow accumulation (FA) area analysis.Each grid cell in the DEM is attributed the total contributing area FA, based on the number of cells that drain into it.The FA values are low along the divides and increase downstream along the drainage lines.This information is used to single out streams and divides, i.e., the main elements bounding a SU (Carrara, 1988).The r.watershed module can use single flow direction (SFD) or multiple flow direction (MFD) strategies.In the SFD strategy, water is routed to the single neighboring cell with the lowest elevation, and in the MFD strategy water is distributed to all the cells lower in elevation, proportionally to the terrain slope in each direction.The r.slopeunits software adopts the MFD strategy to distribute water to the neighboring cells.map showing plain areas to be excluded from the processing; DEM, (demmap) digital elevation model; t, (thresh) initial FA threshold area; a, (areamin) minimum area; c, (cvmin) circular variance; r, (rf) reduction factor; maxarea, maximum SU area; cleansize, size of candidate SU to be removed; see text for detailed explanation.(b) r.watershed processing in GRASS GIS (Neteler and Mitasova, 2007).(c) Tests if the produced HBs are in a plain area.(d) The average area of each new HB child in each HB parent is checked against a.(e) HB child is individually checked against a and c requirements.(f) The process proceeds to iteration i + 1 for each and every HB child that still does not meet the requirements, with an updated t i+1 = t i −t i /r FA threshold.(g) Small polygons are removed from the candidate SU set.
The r.slopeunits software requires a DEM, demmap, accepts an optional layer showing alluvial plains (APs), plainsmap, and the following user-defined numerical parameters (Fig. 2a): (i) the flow accumulation area (FA) threshold, thresh; (ii) the minimum surface area for the SU, areamin (in square meters); (iii) the minimum circular variance (Nichols, 2009) of terrain aspect within a slope unit, cvmin; (iv) a reduction factor, rf; (v) the maximum surface area for the SU, maxarea (in square meters, optional); and (vi) a threshold value for the cleaning procedures, cleansize (in square meters, optional; see Sect.3.2).For simplicity, in the following we refer to the numerical values of thresh, areamin, cvmin, and rf as t, a, c, and r, respectively.
The software adopts an iterative approach to partition a landscape into SUs.In the first iteration, r.watershed uses the threshold t for controlling the partitioning into HBs.The parameter t has to be smaller than the maximum surface accumulation area (FA) for the study area to allow for the delineation of at least two HBs that are large enough to be further subdivided.Grid cells with FA > t are recognized as drainage lines (i.e., streams), and used to delineate the river network by r.watershed, grouping all grid cells in the DEM that drain into a given stream segment.These cells collectively represent the catchment drained by the stream segment, which can be further subdivided into left and right HBs (Fig. 2b).A low value of the contributing (FA) area t results in a dense hydrological network draining a large number of small HBs, and a large value of t results in a reduced number of streams draining a smaller number of relatively large HBs.
At each iteration, where a GIS layer showing APs is available, r.slopeunits identifies the grid cells in the APs and excludes them from the analysis (Fig. 2c).When the SUs are exploited for the analysis of the processes causing slope instability, the exclusion is justified by the empirical observation that landslides do not occur in plain areas.The value of the areamin parameter a defines the smallest possible (planimetric) area for a SU.The circular variance is defined as 1 − |R|/N v , in the range between 0 and 1, where N v is the number of grid cells in each HB and |R| is the magnitude of the vector R that results from the sum of all unit vectors describing the orientation of each grid cell.As an example, Fig. 3a shows that a group of unit vectors dispersed 23 • apart, on average, is characterized by a circular variance of 0.1, and Fig. 3b shows that a group of unit vectors dispersed 62 • apart, on average, is characterized by a circular variance of 0.6.Thus, a value of 0.1 of the circular variance represents grid cells all facing nearly in the same direction, whereas a value of 0.6 represents more dispersed grid cells.In the algorithm, the circular variance is controlled by the value of the cvmin parameter c, affecting the homogeneity of terrain aspect in the HBs.Small values of c result in more uniform HBs, and large values of c in less uniform HBs, in terms of terrain aspect.
At each iteration, r.watershed splits each existing parent half basin, HB parent (Fig. 2d) into nested child half basins, HB child .The average area of the HB child defined for each HB parent is checked against a.When the average area is smaller than a, the subdivision is rejected and the HB parent is selected as a candidate SU.When the average area is larger than a, the procedure keeps the HB child for the next step of the analysis.The rejection procedure prevents very small HBs from being selected as a candidate SU.The next step of the procedure consists in comparing the size and circular variance of each HB child with the userdefined values a and c.Where the circular variance is smaller than c, or the size is smaller than a, the HB child is taken as a candidate SU (Fig. 2e).If the user defines the optional input parameter maxarea, the control on the circular variance is not performed for the HB child having a size larger than maxarea.This imposes a constraint on the maximum size of the candidate SU (not shown in Fig. 2e).Each and every remaining HB child is fed to the next iteration of r.watershed, which is initialized using a smaller value of t.At the ith iteration, the value of t depends on the value of the reduction factor r, according to t i+1 = t i − t i /r (Fig. 2f).The decrease of t i is faster for small values of r ≥ 2. We checked empirically that values of r > 10 lead to visually better results, at least in our study area.This is due to the fact that a slow decrease of t i allows for a better (finer) control on the subdivision of the parent half basin, HB parent .On the other hand, a fast decrease obtained with r = 2 prevents the algorithm from checking if intermediate values of t i produce a candidate SU.Thus, a slow decrease in t i results in a larger number of iterations, which produce better results at the expenses of a longer computing time required to complete the iterations.
At each iteration, each HB child is processed again by r.watershed as HB parent .The iterative procedure ends when the entire study area is subdivided into candidate SUs that match the user requirements, in terms of minimum area (a) and circular variance (c).In the resulting map, the size and shape of the candidate SUs can be determined by the constraint of minimum surface area, or by the constraint of the minimum circular variance of the terrain aspect.The final terrain subdivision contains candidate SUs whose minimum size approximates a.
The final SU partitioning is obtained after an additional (cleaning) step intended to identify and process candidate SUs exhibiting unrealistic or unacceptable size or shape (Fig. 2g).This is discussed in the next section.

Slope units cleaning procedure
The r.slopeunits software may produce locally unrealistic candidate SUs which are too small, too large, or oddly shaped.As an example, in large open valleys where terrain is flat or multiple channels join in a small area, unrealistically small subdivisions can be produced.Another example is represented by unrealistically elongated or large candidate SUs found along regular and planar slopes.These terrain subdivisions, although legitimate from the algorithm hydrological perspective, are problematic for practical applications and should be revised and removed eventually.Very small candidate SUs consisting of a few grid cells are often the result of artifacts (errors) in the DEM.These candidate SUs should also be removed.To remove candidate SUs with unrealistic or unwanted sizes (Fig. 2g), we have implemented three distinct software tools based on three different methods.The three methods require the user to set the cleansize parameter that imposes a strict constraint on the minimum possible size (planimetric area) for a slope unit.
The first method simply removes all candidate SUs smaller than cleansize.The adjacent candidate SUs are enlarged to fill the area left by the removed units, using the r.grow GRASS GIS module.The second method, in addition to removing all candidate SUs smaller than cleansize, also removes odd-shaped polygons from the set of the candidate SUs.Enabled via the -m flag in combination with cleansize, the second method removes markedly elongated candidate SUs with a width smaller than two grid cells.The third method merges small candidate SUs with neighboring ones based on the average terrain aspect.Enabled via the -n flag in combination with cleansize, the third method calculates the average terrain aspect of all the grid cells in each small candidate SU.The average aspects of the neighboring SUs are compared, and the two adjacent units exhibiting the smallest difference in average terrain aspect are merged, provided that the SU that is removed shares a significant part of its boundary with the neighboring unit.The final result is a terrain subdivision in which the vast majority of the SU has an area larger than cleansize.A drawback of the third method is that it is significantly more computer intensive than the other two methods, requiring longer processing times.

Segmentation metric
To evaluate the terrain partitioning into SUs, we use a simple metric originally proposed for the evaluation of the quality of a segmentation result (Espindola et al., 2006).In digital image processing, segmentation consists in the process of partitioning an image into sets of pixels, such that pixels within the same set share certain common characteristics.Here, we consider the terrain aspect grid map as an image to be segmented, and we assume that the segmentation metric proposed by Espindola et al. (2006) is appropriate to evaluate the terrain partition into SUs.We base the assumption on the observation that the metric makes a straightforward evaluation of the internal homogeneity of the SUs using the local aspect variance V , and the external heterogeneity of the SUs using the autocorrelation index, I .The two quantities are defined as follows: and where n labels all the N SUs in a given partition, s n is the surface area of the nth SU, c n is the circular variance of the aspect in the nth SU, α n is the average aspect of the nth SU, α is the average aspect of the entire terrain aspect map, and w nl is an indicator for spatial proximity, equal to unity if SU n and l are adjacent, zero otherwise.The local variance V , defined in Eq. ( 1), assigns more importance to large SUs, avoiding numerical instabilities produced by small SUs.The autocorrelation index I of Eq. ( 2) has minima for partitions that exhibit well-defined boundaries between different SUs.The optimal selection of the input parameters is the one that combines small V and small I .This is quantified by the following objective function (Espindola et al., 2006): where V min(max) and I min(max) are the minimum (maximum) values of the quantities in Eqs. ( 1) and (2) as a function of the input parameters.To calculate I , we rewrite Eq. ( 2) to make it consistent with the terrain aspect map.The aspect map contains values in degrees, and the average values and products cannot be taken straightforwardly, and the following definitions have to be considered.The average values of the angles are a vectorial sum of unit vectors, so that where the index j runs over all the grid cells in the terrain aspect map.A similar definition holds for α i , the average aspect inside the ith slope unit, if the sum in Eq. ( 4) is limited to the grid cells belonging to the ith SU.The difference (α i − ᾱ) should also be intended vectorially, as follows: Lastly, the product at the numerator of Eq. ( 2) is taken as the scalar product of the vectors (α i − ᾱ) and (α j − ᾱ), as follows: where Care must be taken in expressing angles and arcs consistently in degrees or radians.
The segmentation metric F (V , I ) is a measure of the degree of fulfillment of the SU internal homogeneity requirement, and it is related only to the SU geometrical and morphological delineation.The metric can be used to define the optimal (best) partition of the territory in terms of aspect segmentation by maximization of F (V , I ) = F (V (a, c), I (a, c)) as a function of the a and c user-defined modeling parameters.

Landslide susceptibility modeling
Landslide susceptibility (LS) is the likelihood of landslide occurrence in an area, given the local terrain conditions, including topography, morphology, hydrology, lithology, and land use (Brabb, 1984;Guzzetti et al., 1999;Guzzetti, 2006).Various types of LS modeling approaches are available in the literature.The approaches differ -among other thingson the type of the TU used to partition the landscape and to ascertain LS (Guzzetti et al., 1999;Huabin et al., 2005;Guzzetti, 2006;Pardeshi et al., 2013).Interestingly, instead of trying to define where landslides are not expected, (e.g., Marchesini et al., 2014), over the last decades a lot of efforts have been spent on the prediction of the spatial probability of the slope failures.Among the many available types of TUs (Guzzetti, 2006), SUs have proved to be effective terrain subdivisions for LS modeling.
To model LS, we considered slow-to very slow-moving shallow slides, deep-seated slides, and earth flows, and we excluded rapid to fast-moving landslides, including debris flows and rock falls.The r.slopeunits software performs the delineation of the SUs, and does not perform the LS modeling.For the latter, we exploit specific modeling software (Rossi et al., 2010;Rossi and Reichenbach, 2016).
In this work, we prepare LS models adopting a single multivariate statistical classification model.For the purpose, we use a logistic regression model (LRM) to quantify the relationship between dependent (landslide presence/absence) and independent (geo-environmental) variables.We use the presence/absence of landslides in each SU as the grouping (i.e., dependent) variable.Adopting a consolidated approach in our study area (Carrara et al., 1991(Carrara et al., , 1995;;Guzzetti et al., 1999), SUs with 2 % or more of their area occupied by landslides are considered unstable (having landslides), and SUs with less than 2 % of the area occupied by landslides are considered stable (free of landslides).The 2 % threshold value depends on the accuracy of a typical landslide inventory map (Carrara et al., 1991(Carrara et al., , 1995;;Guzzetti et al., 1999;Santangelo et al., 2015).Users of the r.slopeunits software may select a different value for the landslide presence/absence threshold in the LRM (or any other classification model, where considered more appropriate, in different study areas, and with different input data.Numerical (i.e., terrain elevation, slope, curvature, and other variables derived from the DEM), and categorical (i.e., lithology and land use) variables are used as explanatory (i.e., independent) variables in the LS modeling.Each SU is characterized by (i) statistics calculated for all the numerical variables, and (ii) percentages of all classes for the categorical variables.
The LS evaluation is repeated many times using different SU terrain subdivisions obtained changing the r.slopeunits (a, c) modeling parameters, and cleaned from small areas using the first (and simplest) method described in Sect.3.2.We evaluate the performance skills of the different LS models calculating the AUC ROC (Rossi et al., 2010).ROC curves show the performance of a binary classifier system when different discrimination probability thresholds are chosen (Fawcett, 2006).A ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) for the different thresholds.The area under the ROC curve, AUC ROC , ranging from 0 to 1, is used to measure the performance of a model classifier that, in our case, corresponds to the LS model.
Hereafter, we quantify the AUC ROC metric with the R(a, c) function, for each value of the a (surface area) and c (circular variance) input parameters of the r.slopeunits software.We stress that we use the AUC ROC metric to evaluate the fitting performance of the LS model, i.e., to evaluate the ability of the LS model to fit the same landslide set used to construct the LS model (the landslide training set), and not an independent landslide validation set (Guzzetti et al., 2006;Rossi et al., 2010).This is done purposely, because the purpose of the procedure is not to evaluate the performance of the LS classification but to help determine an optimal terrain subdivision for LS modeling, and thus before any LS model is available for proper validation.When an optimal SU subdivision is obtained (see Sect. 6), and a corresponding LS model is prepared, the prediction skills of the model can be evaluated using independent landslide information (where this is available) (Guzzetti et al., 2006;Rossi et al., 2010).
In addition to the AUC ROC , which is a direct measure of the fitting/prediction performance of a binary classifier, the performance of the LRM model can be analyzed in terms of how the different input variables (both numerical and categorical) contribute to the final result (Budimir et al., 2015).In the model, a p value can be associated to each variable, and used to establish the significance of the variable in the LRM.The fraction of significant variables used by the LRM can be used to qualitatively understand the behavior of the classification model as a function of the r.slopeunits software input parameters or, in turn, as a function of the average size of the SUs.

Optimization of SU partitioning for LS zonation
Once, for all the SU sets computed using different modeling parameters (i.e., different a (surface area) and c (circular variance) values), (i) the SU terrain aspect segmentation metric F (a, c) -that assesses how well the requirements of internal homogeneity/external heterogeneity are fulfilled -has been established (2a in Fig. 1 and Sect. 4;Espindola et al., 2006); and (ii) the performance of the individual LS models are estimated using the AUC ROC metric R(a, c) -that assesses the calibration skills of the LRM used for LS modeling -has also been established (2b in Fig. 1 and Sect.5; Rossi et al., 2010), a proper objective function S that combines the aspect segmentation metric (F (a, c)) and the AUC ROC calibration metric (R(a, c)) is established.Maximization of S(a, c) as a function of the a (the minimum surface area of the slope unit) and c (the slope unit circular variance) modeling parameters allows to single out the optimal SU terrain subdivision for LS modeling in the given study area.
The reason for proposing a combination of the aspect segmentation and the AUC ROC metrics in the search for an optimal terrain subdivision for LS modeling is the following.The single segmentation metric F (a, c) is a measure of the degree of fulfillment of inter-unit homogeneity and intra-unit heterogeneity for a SU delineation obtained with given (a, c) values.As such, F (a, c) is related only to the geometrical delineation of the SUs, and does not consider the subsequent application of a LS model, the presence/absence of landslides, or any quantity other than terrain aspect.
To combine the two functions R(a, c) and F (a, c) into a single objective function, we normalized them to [0, 1] as follows: The functions R o (a, c) and F o (a, c) are then multiplied to obtain the final objective function S(a, c): which embodies information on the quality of the terrain aspect map segmentation and on the performance of the LS model in a consistent, objective, and reproducible way.The function S(a, c) assumes values in the range [0, 1]: the larger the value the better the SU partitioning in terms of (i) SU internal homogeneity and external heterogeneity, and (ii) suitability of the subdivision for LS zonation in our study area.

Test area
We tested our proposed modeling framework for the delineation of SUs, and for the selection of the optimal modeling parameters for LS assessment, in a portion of the upper www.geosci-model-dev.net/9/3975/2016/Geosci.Model Dev., 9, 3975-3991, 2016 To model LS, we use a digital representation of the terrain elevation, an inventory of known landslides, and relevant geo-environmental information.We use a DEM with a ground resolution of 25 m × 25 m obtained through linear interpolation of elevation data along contour lines shown on 1 : 25 000 topographic base maps (Cardinali et al., 2001) (Fig. 4a).The landslide inventory (Fig. 4b) was obtained from the visual interpretation of multiple sets of aerial pho-tographs flown between 1954 and 1977, aided by field surveys, review of historical and bibliographical data, and geological, geomorphological, and other available landslide maps (Cardinali et al., 2001;Galli et al., 2008;Guzzetti et al., 2008;Alvioli et al., 2014).To prepare the LS model, we considered only the shallow slides, the deep-seated slides, and the earth flows.These landslides are (i) slow-to very slowmoving failures, and (ii) they typically remain in the slope (i.e., the slope unit) where they occur.For the deep-seated landslides and the earth flows, the landslide source (depletion) area and the deposit were considered together.This is a standard approach in modeling LS for these types of landslides.We excluded from the LS modeling all the rapid to fast-moving landslides, including debris flows and rock falls that may travel outside the slope (i.e., the slope unit) where they form.
We obtained lithological information (Fig. 5a) from available geological maps, at 1 : 10 000 scale (Regione Umbria, 1995-2015), prepared in the framework of the Italian national geological mapping project CARG and in other regional geological mapping projects.The original maps, in vector format, were edited to eliminate and reclassify polygons coded as landslide deposits or debris flow deposits, and to reclassify the original 186 geologic formations and 20 cover types in five lithological complexes or lithological domains (Table 1).Definition of the five lithological complexes was made on the basis of the characteristics of the rock types (carbonate, terrigenous, volcanic, post-orogenic sediments) and the degree of competence or composition of the differ- ent geological units (massive, laminated sandstone/peliticrock ratio).Applying these criteria, the original 206 geological units were grouped in 17 lithological units.We obtained information on land cover from a map, at 1 : 10 000 scale (Regione Umbria, 1995-2015), prepared through the visual interpretation of color aerial photography at 1 : 13 000 scale, acquired in 1977 (Fig. 5b).The map contains 13 land cover classes, of which the most common are forest (42 %), arable land (31 %), meadow and pasture (10 %), built up areas (10 %), and vineyards, live trees, and orchards (6 %).
The study area (Fig. 4) corresponds to an alert zone used by the Italian National Department for Civil Protection to issue landslide (and flood) regional warnings.The boundary of the alert zone is partly administrative, and does not correspond locally to drainage and divides lines.As a result, our SU partitioning intersects locally the boundary of the alert zone.For convenience, for our analysis we considered only SUs that fall entirely within the alert zone.As a drawback, the extent of the study area varies slightly depending on the combination of the selected a and c parameters.We maintain that this has a negligible effect on the final modeling results.

Slope units delineation
In the study area, we ran the r.slopeunits software for 99 different combinations of user-defined input parameters, resulting in 99 different terrain subdivisions.In the itera-Table 1. Lithological codes used in Fig. 5. SD is sandstone, P is pelitic rock, CO is conglomerate, S is sand, G is clay (Regione Umbria, 1995Umbria, -2015)).tive procedure (Fig. 2), the initial FA threshold (t) area and the reduction factor (r) control the numerical convergence, and do not have an explicit geomorphological meaning.On the other hand, the minimum area (a) and the circular variance (c) determine the size and control the aspect of the SUs.For the analysis, we selected a large value for the FA area (t = 5 × 10 6 m 2 ) keeping it constant for all the different model runs.The large value was chosen to obtain large initial HBs that could be further subdivided into smaller SUs by the iterative procedure (see Sect. 4).The value is also consistent with (i.e., substantially larger than) (i) the size of the average SU used to partition the same area in a different LS modeling effort (Cardinali et al., 2001(Cardinali et al., , 2002)), and (ii) the average area of the considered landslides (shallow slides, the deep-seated slides, and the earth flows) in the study area (Guzzetti et al., 2008).Selection of the reduction factor (r = 10) was heuristic and motivated by the fact that this figure provided stable results compared to those obtained using larger values.The two most relevant parameters, the minimum area a and the circular variance c, were selected from broad ranges: a = 10 000, 25 000, 50 000, 75 000, 100 000, 125 000, 150 000, 200 000, and 300 000 m 2 , and 0.1 < c < 0.6, at evenly spaced values with an increment of 0.05.For all the r.slopeunits model runs, we used the same value for cleansize = 20 000 m 2 , to remove candidate SUs with area < 20 000 m 2 using the first (and simplest) of the three methods described in Sect.3.2.Figure 6 shows 9 of the 99 results of the terrain subdivisions obtained using different combinations of the a and c modeling parameters.The map in the upper left (lower right) corner shows the finest (coarsest) SU partitioning, determined using small (large) values of a and c.The map in the center was obtained using intermediate values for the two user-defined modeling parameters.For a limited portion of the study area, Fig. 7 shows different partitioning results.In particular, Fig. 7a and b show the overlay of the three different partitions with the landslide inventory map (Fig. 4), using a two-and three-dimensional visualization, respectively, and a shaded image relief of the terrain, whereas Fig. 7c, d, and e show separately the same partitions using a three-dimensional representation.Visual inspection of Fig. 7a, b, and c reveals that the coarser subdivision (c = 0.60 and a = 0.3 km 2 ), shown in blue, defines large SUs characterized by heterogeneous orientation (aspect) values.On the other hand, the finest SU subdivision (shown in green in Fig. 7a, b and e) obtained using c = 0.10 and a = 0.01 km 2 , is too small to completely include many of the landslides (shown in yellow).The combination that uses intermediate values c = 0.35 and a = 0.15 km 2 resulted in the SU subdivision shown in red in Fig. 7a, b, and d.
Figure 8 shows the effect of different combinations of a and c on the average SU size.The average area of the SU increases significantly with c (less homogeneous, more irregular slope), and it is less sensitive to the increment of a (Fig. 8a).The effect of c becomes predominant in Fig. 8b.The standard deviation of the area of the SU varies significantly with c, highlighting that larger values of c increase the average and the variability of the SU size (Fig. 8a and b, respectively).

Segmentation metric
For each of the 99 terrain subdivisions obtained using the procedure described above, we calculated the segmentation objective function value given by Eq. (3).The segmentation metric F (a, c) (Fig. 9) is a measure of the performance of our SU delineation algorithm, and an assessment of how well the requirements of internal homogeneity/external heterogeneity are fulfilled by the procedure, as a function of a and c (2a in Fig. 1, Sect. 4).Where the SUs are relatively small, their de- gree of internal homogeneity is large, but the requested heterogeneity between adjacent units is not completely fulfilled.
Where the SUs are large, the requested internal homogeneity is not fulfilled entirely, because in each SU the aspect variability is large.The analysis of the F (a, c) values in Fig. 9 suggests that SU subdivisions obtained using c smaller than about 0.2 and a smaller than about 50 000 m 2 , or c larger than about 0.5, should not be considered in the analysis because they are too small or too large to satisfy the aspect variability requirement.

Landslide susceptibility modeling
For each of the 99 SU delineations, we prepared a different LS zonation using a LRM (Sect.6) adopting the modeling scheme described by Rossi et al. (2010); Rossi and Reichenbach (2016).Figure 10 shows the results obtained for 9 (out of 99) combinations of the a and c parameters.For each of the 99 susceptibility assessments, we evaluated the fitting (calibration) performance of the models computing AUC ROC (Rossi et al., 2010), and Fig. 11 shows the obtained AUC ROC as a function of the a and c parameters (Sect.5).
The larger values of AUC ROC were obtained for LS zonations based on SU partitions resulting from large values of c and a.Such combinations may locally result in SUs with extremely large internal heterogeneity.Signatures of the het- erogeneity within very large slope units can be found both in the segmentation metric and in the LRM model results.
In the segmentation metric case, this is very straightforward since the values of F (a, c), which is a direct measure of the slope aspect homogeneity within SUs, clearly decrease with increasing values of a and c, as shown in Fig. 9. Very large SUs, however, are not only heterogeneous in terms of terrain aspect, but also in terms of the morphometric and thematic variables used as input of the LRM.This is reflected in the number of input variables that significantly contribute to the susceptibility model results as a function of a and c.We computed the number of statistically significant variables in each realization of the LRM (i.e., the variables with a p value < 0.05).Figure 12 shows that the number of significant variables ranges from 5 % (of the 50 morphometric and thematic variables) for large SUs, to 35 % for small SUs.From this analysis we observe that for very large slope units, only very few variables are effectively used by the LRM.A more detailed analysis reveals that, in our test case, lithological variables significantly control the results, whereas other local settings (e.g., terrain slope) are neglected.The relevant variables are typically the terrigenous sediments and carbonate lithological complexes, where landslides are expected and not expected, respectively.As a result, in the region of large (a, c) parameters, even if we obtain high values of AUC ROC , the LRM can be replaced by a simple heuristic analysis of the lithological map.The fine details of the remaining input variables are lost and a multivariate statistical approach is of little use.We clarify that to evaluate the model performance, any statistical metric based on the comparison of observed and predicted data (e.g., confusion matrices and derived indexes), would exhibit the same or similar trend as the AUC ROC as a function of the SU size.

Discussion
We have run the r.slopeunits software with a significant number of combinations (99) of the (a, c) input parameters, and a corresponding number of realizations of the LS model.Results showed that new r.slopeunits software was capable of capturing the morphological variability of the landscape and partitioning the study area into SU subdivisions of different shapes and sizes well suited for LS modeling and zonation.As a matter of fact, depending on the type of landslides, the scale of the available DEM, the morphological variability of the landscape, and the purpose of the zonation, the detail of the terrain subdivision may vary.A detailed terrain partitioning, with many small SUs, is required to capture the complex morphology of badlands, or to model the susceptibility to small and very small landslides (i.e., soil slips).A coarse terrain subdivision is best suited for modeling the susceptibility of very old and very large, deep-seated, complex and compound landslides.Coarse subdivisions can also be used to model the susceptibility to channeled debris flows that travel long distances from the source areas to the depositional areas.Subdivisions of intermediate size may be required for medium to large slides and earth flows (Carrara et al., 1995).By tuning the set of user-defined model parameters, r.slopeunits can prepare SU terrain subdivisions for LS modeling in different geomorphological settings.
Concerning the LS model, we acknowledge that our selection of the 2 % presence/absence threshold may influence the production of the appropriate SU subdivision, and may affect the results of the LS zonation.Examination of different thresholds is not investigated in the present work, because it is not an input parameter of the r.slopeunits software and does not change the logic of the approach or the rationale behind our optimization procedure.
We clarify that the subdivisions produced by r.slopeunits using different (a, c) parameters are nested, i.e., the boundaries of a coarse resolution subdivision encompass the boundaries of intermediate and finer subdivisions (see Fig. 7c, d, e).This is a significant operational advantage where landslides of different sizes and types coexist, posing different threats and requiring multiple and combined susceptibility assessments, each characterized by a different terrain subdivision (Carrara et al., 1995).Optimal values of the (a, c) parameters have to be determined to obtain the best SU subdivision for a particular goal, in our case, LS modeling.We defined a custom objective function S(a, c) to determine such optimal values.S(a, c) is the product of a segmentation quality measure, F (a, c), and LS model performance in calibration, R(a, c).If only the F (a, c) metric is used to select a particular set of modeling parameters, the resulting optimal (best) set of SUs has the only meaning of "best partition of the territory in terms of aspect segmentation".Similarly, the R(a, c) metric considers solely the classification results of the LS model and not the geometry of the single SU, some of which may be inadequate (e.g., too large, too irregular, too small) for the scope of the terrain zonation.Values of F (a, c) indicate that there are combinations of the c and a parameters that result in SU subdivisions that do not satisfy the user requirements in terms of SU internal homogeneity and external heterogeneity (Fig. 8) (Sect.8.2).On the other hand, the AUC ROC metric increases with the average size of the SU (Fig. 11) (Sect.8.3).To select the optimal terrain partitioning for LS zonation in our study area, we exploit the objective function S(a, c), which simultaneously quantifies (Sect.6): (i) the SU internal homogeneity and external heterogeneity (Fig. 13a), and (ii) the (fitting) performance of the LS model (Fig. 13a).Maximization of S(a, c) (Fig. 13b) provides the best combination of the (a, c) modeling parameters for a terrain subdivision optimal for LS modeling in our study area.
In addition to the AUC ROC , we have analyzed the performance of the LRM model by studying the fraction of significant variables used by the LRM, to qualitatively understand the behavior of the classification model as a function of the r.slopeunits software input parameters or, in turn, as a function of the average size of the SU.The LRM is expected to use the input data less efficiently when the average SU size grows, resulting in a smaller number of significant input variables, which is indeed what we observed.This is due to the LRM inability to discriminate between input variables when the SUs are too large, since each unit usually contains all the possible values of the variables.Using www.geosci-model-dev.net/9/3975/2016/Geosci.Model Dev., 9, 3975-3991, 2016 less data makes it easier for the LRM model to produce a high AUC ROC result, which does not necessarily correspond to the optimal SU set.The complication is removed using the S(a, c) = F o (a, c)R o (a, c) function, whose F (a, c) component prevents unrealistic SU sets (both large and small) to have a high overall score.
The function S(a, c) (Fig. 13b), calculated in our test case for different combinations of the a and c modeling parameters, has a maximum value at a = 150 000 m 2 and c = 0.35.The set of SUs that corresponds to the optimal combination of the modeling parameters can be singled out as our optimal (best) result.The LS results corresponding to the optimal combination were already shown in the central box of Fig. 10, and the optimal SU map is presented in Fig. 14.

Conclusions
Despite the clear advantages of SUs over competing mapping units for LS modeling (Guzzetti, 2006), inspection of the literature reveals that only a small proportion (8 %) of the LS zonations prepared in the last three decades worldwide was performed using SUs (Malamud et al., 2014).The limited use of SUs for LS modeling and zonation is due to (among other factors) the unavailability of readily available, easy-touse software for the accurate and automatic delineation of SUs, and to the intrinsic difficulty in selecting a priori the appropriate size of the SUs for proper terrain partitioning in a given area.
To contribute to filling this gap, we developed new software for the automatic delineation of SUs in large and complex geographical areas based on terrain elevation data (i.e., a DEM) and a small number of user-defined parameters.We further proposed and tested a procedure for the optimal selection of the user parameters in a 2000 km 2 area in Umbria, central Italy.
We expect that the r.slopeunits software will be used to prepare terrain subdivisions in different morphological settings, contributing to the preparation of reliable and robust LS models and associated zonations.We acknowledge that further work is required to investigate the optimization of SU partitions for different statistically based tools used in the literature for LS modeling and zonation (e.g., discriminant analysis, neural network).Guzzetti et al. (2012) have argued that lack of standards hampers landslide studies.This is also the case for the production of landslide susceptibility models and associated maps.We expect that systematic use of the modeling framework proposed in this work (Fig. 1, Sect. 2) and of the r.slopeunits software for the objective selection of the user-defined modeling parameters, will contribute to the production of more reliable landslide susceptibility models.It will also facilitate the meaningful comparison of landslide susceptibility models produced, e.g., in the same area using different modeling tools, or in different and distant areas using the same or different modeling tools.
Finally, we argue that the proposed modeling framework and the r.slopeunits software are general and not siteor process-specific, and can be used to prepare terrain subdivisions for scopes different from landslide susceptibility mapping, including, e.g., definition of rainfall thresholds for possible landslide initiation, distributed hydrological modelling, statistically based inundation mapping, and the detection and mapping of landslides and other instability processes from satellite imagery.

Code availability
The code r.slopeunits is a free software under the GNU General Public License (v2 and higher).Details about the use and redistribution of the software can be found in the file https://grass.osgeo.org/home/copyright/that comes with GRASS GIS.The software and a short user manual can be downloaded at http://geomorphology.irpi.cnr.it/tools/slope-units (Geomorphology Research Group, 2016).
The Supplement related to this article is available online at doi:10.5194/gmd-9-3975-2016-supplement.

Figure 3 .Figure 3 .
Figure 3. Graphical representation of the circular variance of terrain aspect, 1 − |R|/Nv.Two groups of unit vectors are shown.The unit vectors represent the local direction of terrain aspect for each grid-cell, resulting in circular variance (A) of 0.1 and (B) of 0.6.Unit vectors are perpendicular to the local topography, represented by the grid-cell.

Figure 4 .
Figure 4. (a) Shaded relief image of the study area located on the upper Tiber River basin, Umbria, central Italy.Shades of green to brown show increasing elevation.Inset shows location of the study area in Italy.(b) Landslide inventory map(Cardinali et al., 2001).Inset shows the detail of the landslide mapping.Landslides (shown in red) were used to prepare the landslide susceptibility zonations shown in Fig.10.The maps are in the UTM zone 32, datum ED50 (EPSG:23032) reference system.

Figure 6 .
Figure 6.A group of 9 (out of 99) SU terrain subdivisions for a portion of the study area.The SUs were obtained changing the a and c parameters used by the r.slopeunit software.The maps are in the UTM zone 32, datum ED50 (EPSG:23032) reference system.

Figure 7 .
Figure 7. Example of subdivisions into SUs for a portion of the study area.Legend: blue, red, and green lines show boundaries of SUs of increasing density and corresponding decreasing average size.Yellow areas are landslides.The five maps show the same area in plan view (a) and in perspective view (b, c, d, e).

Figure 8 .
Figure 8.(a) Average and (b) standard deviation of the size of the SUs, for different combinations of a and c parameters.The minimum size of the SUs is fixed by the a parameter, and the maximum size of the SUs is independent of a.

Figure 9 .
Figure 9. Segmentation objective function values (Eq. 3) calculated for 99 SU partitions obtained using different combinations of the a and c parameters.

Figure 10 .
Figure 10.The group of 9 (out of 99) LS maps obtained with different SU partitions resulting from different combinations of the a and c parameters.The maps are in the UTM zone 32, datum ED50 (EPSG:23032) reference system.

Figure 11 .Figure 12 .
Figure11.Values of the AUC ROC calculated for LS maps estimated using SU partitions derived for different combination of the a and c user-defined modeling parameters.

Figure 14 .
Figure 14.The SU subdivision corresponding to the best parameters selected by our optimization procedure.The values of the parameters are a = 150 000 m 2 , a = 0.35.The associated LS result is shown in Fig. 10, in the central box, corresponding to the optimal parameters values.