System for Automated Geoscientific Analyses (SAGA) v. 2.1.4

The System for Automated Geoscientific Analyses (SAGA) is an open source geographic information system (GIS), mainly licensed under the GNU General Public License. Since its first release in 2004, SAGA has rapidly developed from a specialized tool for digital terrain analysis to a comprehensive and globally established GIS platform for scientific analysis and modeling. SAGA is coded in C++ in an object-oriented design and runs under several operating systems including Windows and Linux. Key functional features of the modular software architecture comprise an application programming interface for the development and implementation of new geoscientific methods, a user-friendly graphical user interface with many visualization options, a command line interpreter, and interfaces to interpreted languages like R and Python. The current version 2.1.4 offers more than 600 tools, which are implemented in dynamically loadable libraries or shared objects and represent the broad scope of SAGA in numerous fields of geoscientific endeavor and beyond. In this paper, we describe the system's architecture, functionality, and its current state of development and implementation. Furthermore, we highlight the wide spectrum of scientific applications of SAGA in a review of published studies, with special emphasis on the core application areas of digital terrain analysis, geomorphology, soil science, climatology and meteorology, as well as remote sensing.


Introduction
During the last 10 to 15 years, free and open source software (FOSS) has become a recognized counterpart to commercial solutions in the field of geographic information systems and science. Steiniger and Bocher (2009) give an overview of free and open source geographic information system (GIS) software with a focus on desktop solutions. More recently, Bivand (2014) discussed FOSS for geocomputation. The System for Automated Geoscientific Analyses (SAGA) (http://saga-gis.org), the subject of this paper, is one of the recognized developments in this field. SAGA has been designed for an easy and effective implementation of spatial algorithms and hence serves as a framework for the development and implementation of geoscientific methods and models. Today, this modularly organized, programmable GIS software offers more than 600 methods comprising the entire spectrum of contemporary GIS, from multiple file operations, referencing and projection routines over a range of topological and geometric analyses of both raster and vector data up to comprehensive modeling applications for various geoscientific fields.
The idea for the development of SAGA evolved in the late 1990s during the work on several research and development projects at the Dept. for Physical Geography, Göttingen, carried out on behalf of federal and state environmental authorities. In view of the cooperating agencies' specific needs for high-quality and spatially explicit environmental information, the original research focus was the analysis of raster data, particularly of digital elevation models (DEM), which have been used to predict soil properties, terrain-controlled process dynamics as well as climate parameters at high spatial resolution. The development and implementation of apparently new methods for spatial analysis and modeling resulted in the design of three applications for digital terrain analysis, namely SARA (System zur Automatischen Reliefanalyse), SADO (System für Automatische Diskretisierung von Oberflächen) and DiGeM (Programm für Digitale Gelände-Modellierung), each with specific features but distinctly different architectures.
Due to the heterogeneity of the applied operating systems and tools in the working group, a cross operating system platform with integrated support for geodata analysis seemed necessary for further development and implementation of geoscientific methods. Due to the lack of a satisfying development platform at that time, SAGA has been created as a common developer basis and was first published as free open source software in 2004 in order to share its advantageous capabilities with geoscientists worldwide. Since then SAGA has built up a growing global user community (Fig. 1), which also led to many contributions from outside the developer core team and moreover fostered the foundation of the SAGA User Group Association in 2005, aiming to support a sustainable long-term development covering the whole range of user interests. Since 2007 the core development group of SAGA has been situated at the University of Hamburg, coordinating and actively driving the development process.
The momentum and dynamics of the SAGA development in the past 10 years are mirrored both in the increasing number of methods and tools (Fig. 2), which rose from 119 tools in 2005 (version 1.2) to more than 600 tools in the present version 2.1.4, and, particularly, in the fast-growing user community. With about 100 000 downloads annually in the last 3 years (Fig. 3), SAGA today is an internationally renowned GIS developer platform for geodata analysis and geoscientific modeling. Figure 4 gives a rough overview of the different fields of data analysis and management addressed by the SAGA toolset. The categories have been derived from the menu structure, which might not accurately reflect the usability of all tools, e.g., in the case of multipurpose tools. Nevertheless, it shows that there is a quite comprehensive set of general tools for raster as well as vector data analysis and management, and that terrain analysis remains a strength of SAGA. This paper aims to respond to the frequent user requests for a review article. In the first section, we introduce the architecture of the SAGA framework, the state of development and implementation and highlight basic functionalities. Thereafter, we demonstrate its utility in various geoscientific disciplines by reviewing important methods as well as publications in the core fields of digital terrain analysis and geomorphology, digital soil mapping, climatology and meteorology, remote sensing and image processing.

The system
The initial motivation for the SAGA development was to establish a framework that supports an easy and effective implementation of algorithms or methods for spatial data analyses. Furthermore, the integration of such implementations into more complex workflows for certain applications and the immediate accessibility in a user-friendly way was one major concern. Thus, instead of creating one monolithic program, we designed a modular system with an application programming interface (API) at its base, method implementations, in the following referred to as tools, organized in separate program or tool libraries, and a graphical user interface (GUI) as a standard front end (Fig. 5). A command line interpreter as well as additional scripting environments were integrated as alternative front ends to run SAGA tools. In 2004, SAGA was first published as free software. Except for the API, source codes are licensed under the terms of the GNU General Public License (GPL) (Free Software Foundation, 2015). The API utilizes the Lesser GPL (LGPL), which also allows development of proprietary tools on its basis. The SAGA project is hosted at SourceForge (http://sourceforge.net), a web host for FOSS projects providing various additional services like version control systems, code trackers, forums and newsgroups. Although many details have changed since its first version, the general system architecture has remained the same. The following comments refer to the most recent SAGA version 2.1.4.
The system is programmed in the widespread C++ language (Stroustrup, 2014). Besides its support for object-oriented programming, one of its advantages is the availability of numerous additional GPL libraries and code snippets. Apart from the C++ standard library, SAGA's core system solely depends on the cross-platform wxWidgets GUI library (Smart et al., 2005). The GUI in particular makes extensive use of the classes and functions of wxWidgets, but the API also employs the library, amongst others for string manipulation, platform-independent file access, dynamic library management and XML (eXtensible Markup Language) formatted input and output. Several SAGA tool libraries link to other third party libraries, of which some are discussed later more explicitly. Owing to the use of the wxWidgets library, SAGA compiles and runs on MS Windows and most Unix-like operating systems including FreeBSD and, with some limitations regarding the GUI, Mac OS X. Makefiles and projects are provided for gcc and Visual C++ compilers with support for parallel processing based on the OpenMP library (http://openmp.org).

Application programming interface
The main purposes of SAGA's API are the provision of data structures, particularly for geodata handling, and the definition of tool interfaces. Central instances to store and request any data and tools loaded by the system are the Data Manager and the Tool Manager. Besides these core components, the API offers various additional classes and functions related to geodata management and analysis as well as general computational tasks, comprising tools for memory allocation, string manipulation, file access, formula parsing, index creation, vector algebra and matrix operations, and geometric and statistical analysis. In order to support tool developers, an API documentation is generated by means of the Doxygen help file generator (http://www.doxygen.org) and published at the SAGA homepage (http://www.saga-gis.org/saga_api_doc/html).
All classes related to geodata share a common base class that provides general information and functionality such as the data set name, the associated file path, and other specific metadata (Fig. 6). The supported data types currently comprise raster (grids) and tables with or without a geometry attribute, e.g., vector data representing either point, multipoint, polyline or polygon geometries (shapes). Specific vector data structures are provided by point cloud and TIN classes. The point cloud class is a container for storing mass point data as generated for instance by LiDAR scans. The TIN class creates a triangular irregular network for a given set of points providing topological information concerning point neighborhoods. Each data type supports a generic built-in file format. Raster data use a SAGA-specific binary format with an accompanying header. Table data use either tabbed text, comma separated values, or the DBase format. The latter is also applied for storing vector data attributes with the ESRI shapefile format. In order to enhance read and write performance, point clouds also employ a SAGA-specific binary file format. Besides, each stored data file is accompanied by a metadata file providing additional information such as map projection and original data source. Additionally, the metadata contain a data set history, which assembles information about all tools and settings that have been involved to create the data set.
SAGA tools are implemented in dynamically loadable libraries (DLLs) or shared objects, thus supporting the concept of modular plug-ins. Each SAGA tool is derived from a tool base class, which is specified in the API and defines the standard interface and functionality. In this class, the tool-specific input and output data of various data types as well as tool options are declared in a parameter list. At least two functions of the base class have to be implemented by each tool. The constructor defines the tool's interface with its name, a description of its usage and methodology, and the list of tool-specific parameters. Its parameter list is automatically evaluated by the system's framework prior to the execution of a tool. The execution itself is started by a call to the second compulsory function, which implements the tool's functionality.
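The two-function tool pattern described above can be illustrated with a minimal, self-contained sketch. It is written in Python for brevity and is not SAGA's C++ API: the class names `Tool` and `ScaleGrid`, the methods `add_parameter`, `set_parameter`, `run` and `on_execute`, and the parameter identifiers are all hypothetical and chosen for illustration only.

```python
class Tool:
    """Hypothetical stand-in for a tool base class (not the SAGA API)."""

    def __init__(self, name, description):
        self.name = name
        self.description = description
        self._declared = {}   # parameter identifier -> required flag
        self._values = {}     # parameter identifier -> value set by a front end

    def add_parameter(self, identifier, required=True):
        # Called from the constructor of a concrete tool to declare its interface.
        self._declared[identifier] = required

    def set_parameter(self, identifier, value):
        if identifier not in self._declared:
            raise KeyError("unknown parameter: " + identifier)
        self._values[identifier] = value

    def get(self, identifier):
        return self._values.get(identifier)

    def run(self):
        # The framework evaluates the parameter list before execution starts.
        missing = [key for key, required in self._declared.items()
                   if required and key not in self._values]
        if missing:
            raise ValueError("missing parameters: " + ", ".join(missing))
        return self.on_execute()

    def on_execute(self):
        # Second compulsory function: the tool's actual functionality.
        raise NotImplementedError


class ScaleGrid(Tool):
    """Example tool: multiplies every cell of a grid by a factor."""

    def __init__(self):
        super().__init__("Scale Grid", "Multiplies every cell of a grid by a factor.")
        self.add_parameter("GRID")
        self.add_parameter("FACTOR")

    def on_execute(self):
        factor = self.get("FACTOR")
        return [[cell * factor for cell in row] for row in self.get("GRID")]
```

A front end would create the tool, set its parameters and call `run()`; if a required parameter is missing, the check in `run()` raises an error before `on_execute()` is reached.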
Specialized variants of the tool base class are available for enhanced processing of single raster systems or for interaction of the tool with the GUI (i.e., to respond to mouse events occurring in a map). The API uses a callback system to support communication with the front end, e.g., giving a message of progress, error notification, or to force immediate update of a data set's graphical representation. The tool manager loads the DLLs and makes them accessible for the front ends. The tool manager also facilitates the call of existing tools, e.g., to run a tool out of another one. The GUI uses this feature, e.g., to read data file formats that are not generically supported by SAGA, for projecting geographic coordinate grids to be displayed in a map view, and to access and manipulate data through a database management system. Furthermore, this possibility of executing any loaded tool is used for the processing of tool chains. Tool chains are comparable to the models created with ArcGIS ModelBuilder (ESRI, 2015) or QGIS Processing Modeler (QGIS Development Team, 2014), but unlike these, SAGA does not yet include a graphical tool chain designer. Tool chains are defined in a simple XML-based code that is interpreted by the tool chain class, another variant of the general tool class. This code has two major sections. The first part comprises the definition of the tool interface, e.g., the tool's name, description and a list of input, output and optional parameters. The second is a listing of the tools in the desired execution order. Tool chains are an efficient way to create new tools based on existing ones and perform exactly like hard coded tools. Since it is possible to create a tool chain directly from a data set history, a complex workflow can be developed interactively and then be automated for the analysis of further data sets.
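The two-section layout of a tool chain definition (interface first, then the ordered tool listing) can be sketched as follows. The XML element and attribute names used here (`toolchain`, `interface`, `parameter`, `tools`, `tool`, `library`) are assumptions made for this illustration and do not reproduce SAGA's actual tool chain schema; the library and tool names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical tool chain definition; element names are illustrative only.
CHAIN_XML = """
<toolchain>
  <interface>
    <name>Slope from filled DEM</name>
    <description>Fill sinks, then derive slope.</description>
    <parameter id="DEM" role="input"/>
    <parameter id="SLOPE" role="output"/>
  </interface>
  <tools>
    <tool library="preprocessing" name="fill_sinks"/>
    <tool library="morphometry" name="slope"/>
  </tools>
</toolchain>
"""


def read_chain(xml_text):
    """Parse the interface section and the ordered tool listing."""
    root = ET.fromstring(xml_text.strip())
    interface = root.find("interface")
    return {
        "name": interface.findtext("name"),
        "description": interface.findtext("description"),
        # First section: parameters that make up the chain's interface.
        "parameters": [(p.get("id"), p.get("role"))
                       for p in interface.findall("parameter")],
        # Second section: tools in the desired execution order.
        "steps": [(t.get("library"), t.get("name"))
                  for t in root.find("tools").findall("tool")],
    }


chain = read_chain(CHAIN_XML)
```

An interpreter for such a definition would validate the interface parameters once and then execute the listed steps in document order, which is why the order of the `tool` elements matters.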

General purpose tools
Besides more specific geoscientific methods, SAGA provides a wide range of general purpose tools. Since SAGA has a limited generic support for data file formats, the group of data import and export tools is an important feature to read and write data from various sources and store them to specific file formats supported by other software. Within this group, a toolset interfacing the Geospatial Data Abstraction Library (GDAL) should be highlighted (http://www.gdal.org) (Bivand, 2014). The GDAL itself provides drivers for more than 200 different raster and vector formats, and therefore the SAGA API's data manager automatically loads unknown file formats through the GDAL by default.
A powerful alternative to file-based data storage is provided by database management systems (DBMS), which offer the possibility of querying user defined subset selections. Various DBMS can be addressed with a toolset based on the Oracle, ODBC and DB2-CLI Template Library (OTL, http://otl.sourceforge.net). A second toolset allows accessing of PostgreSQL databases (http://www.postgresql.org) and supports direct read and write access for vector and raster data, as provided by the PostGIS extension for spatial and geographic objects (http://postgis.net).
Tools related to georeferencing and coordinate systems are indispensable for the work with spatial data. Particularly the coordinate transformation tools make use of two alternative projection libraries, the Geographic Translator GEOTRANS (http://earth-info.nga.mil/GandG/geotrans), and the Cartographic Projections Library PROJ.4 (http://trac.osgeo.org/proj/). Due to SAGA's original focus on raster data analysis, numerous tools are available for addressing this field, comprising tools for map algebra, resampling, and mosaicking. Nevertheless, the tool sets related to vector data also cover common operations such as overlays, buffers, spatial joins, and selections based on attributes or location. Overlay operations like intersection, difference, and union utilize the functions provided by the Clipper polygon clipping and offsetting library (http://sourceforge.net/projects/polyclipping). Besides, various methods for raster-vector and vector-raster conversions are available, including contour line derivation and interpolation of scattered point data.

Graphical user interface
In order to apply SAGA tools for geoprocessing, a front end program is needed, which controls tools and data management. SAGA's GUI allows an intuitive approach to the management, analysis, and visualization of spatial data (Fig. 7). It interactively gives access to the data and tool management and is complemented by a map management component. General commands can be executed through a menu and a tool bar. More specific commands for all managed elements, i.e., tools, data, and maps, are available through context menus. The properties of the selected element are shown in a separate control. While the number and type of properties depend on the respective element, a group of settings and a description are common to all managed elements. In the case of a tool for instance, the settings give control to input and output data selection as well as to further tool-specific options, while in the case of a data set, it provides several options for visualization in maps. Maps are the standard way of geodata display and offer various additional features, including scale bars, graticules, printing, and clipboard copying. Supplementary data visualization tools comprise histograms, charts, scatterplots, and 3-D views. Tools can be executed either from the tools manager or through the main menu's geoprocessing subgroup, where by default all tools can be found following submenu categories. Due to the large number of tools, a find and run command is a supplementary option to conveniently access all tools. In summary, the GUI is a good choice for interactive work on a single data selection with immediate visualization. However, if complex work flows are applied repeatedly to numerous data sets, alternative front ends with scripting support are certainly more suitable.

Scripting and integration with other systems
The SAGA command line interpreter (CLI) is used to execute SAGA tools from a command line environment without any visualization or data management facilities. Therefore, the file paths for all input and output data have to be specified within the command. The CLI enables the creation of batch or shell script files with subsequent calls of SAGA tools to automate complex workflows and automatically apply them to similar data sets. Furthermore, the CLI allows calling of SAGA tools from external programs in an easy way. This feature is used by the RSAGA package, which integrates SAGA tools with the R scripting environment (Brenning, 2008). Likewise, the Sistema EXTremeño de ANálisis TErritorial (SEXTANTE) makes SAGA tools accessible for various Java-based GIS programs (gvSIG, OpenJUMP). In 2013, SEXTANTE was ported to Python to become a functional addition to QGIS (QGIS Development Team, 2014), another popular free and open source GIS software, thus spreading SAGA tools amongst many more GIS users. As an alternative to CLI-based scripting, the SAGA API can also be accessed directly from Python. This connection is generated by means of the Simplified Wrapper and Interface Generator (SWIG, http://www.swig.org) and provides access to almost the complete API. While this allows higher level scripting, the CLI remains easier to use for most purposes. Another option for integrating SAGA is direct linkage of the API. A very recent development is the integration of SAGA by the ZOO-Project (http://zoo-project.org/), which is a framework for setting up web processing services (Fenoy et al., 2013). MicroCity (http://microcity.sourceforge.net) is a branch of SAGA, which adds support for the Lua scripting language, and has been used for city road network analyses (Sun, 2015). Laserdata LiS is proprietary software, mainly a toolset extension for the work with massive point data from LiDAR prospection (http://www.laserdata.at).
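As a sketch of CLI-based batch processing, the following Python fragment only assembles the command lines for a list of data sets and does not execute them. The executable name `saga_cmd`, the library name `ta_morphometry`, the tool index `0`, the option names `ELEVATION` and `SLOPE`, and the `.sgrd` file names are assumptions made for this illustration; they should be checked against the installed SAGA version before use.

```python
import shlex


def build_command(library, tool, options, executable="saga_cmd"):
    """Assemble one CLI call as an argument list (nothing is executed here)."""
    args = [executable, library, str(tool)]
    for key, value in options.items():
        # Every input and output file path is passed explicitly as an option.
        args += ["-" + key, str(value)]
    return args


def batch_commands(dem_paths):
    """Apply the same (assumed) slope tool to several elevation grids."""
    commands = []
    for dem in dem_paths:
        slope = dem.replace(".sgrd", "_slope.sgrd")
        commands.append(build_command(
            "ta_morphometry", 0, {"ELEVATION": dem, "SLOPE": slope}))
    return commands


commands = batch_commands(["site_a.sgrd", "site_b.sgrd"])
# Shell-quoted text form, e.g. for writing a batch or shell script file.
script_lines = [" ".join(shlex.quote(a) for a in cmd) for cmd in commands]
```

Each argument list could be handed to `subprocess.run` from an external program, or the quoted lines could be written to a shell script to automate the workflow for similar data sets.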
Table 1 summarizes the third party software mentioned above and underlines that SAGA is recognized first of all as a geoprocessing engine. Only MicroCity and LiS make at least partial use of SAGA's GUI capabilities.

Review of SAGA-related studies and applications
Due to its plethora of tools, covering a broad spectrum of geoscientific analysis and modeling applications and its user friendly environment, SAGA has been increasingly utilized for the processing of geodata, the implementation and calibration of statistical and process-based models in various fields, and the visualization of results. The following chapter provides an overview of studies using SAGA in selected geoscientific fields, which were identified as major applications of the software. However, due to the vast number of studies, this chapter only gives an outline without any claim to comprehensiveness. An overview is given in Table 2.

Digital terrain analysis and geomorphology
SAGA is a successor of three applications that were designed for digital terrain analysis, namely SARA, SADO, and DiGeM, and up to today, the analysis of DEMs has remained a major focus. SAGA provides a comprehensive set of tools ranging from the preprocessing of DEMs (e.g., filtering and filling procedures) through the generation of simple first- and second-order terrain derivatives, such as slope and curvature, to more sophisticated and process-oriented terrain parameters, e.g., the altitude above the channel network, the relative slope position or the SAGA wetness index. The strong focus of SAGA in this particular field is distinctly reflected by its frequent utilization. This section gives a brief overview of available methods, applications, and studies with a special focus on the preprocessing of raw data, the derivation of terrain-based predictor variables for statistical modeling approaches, the classification of distinct geomorphographic units and the implementation of suitable tools for specific investigations. For further information on principles and applications in terrain analysis, including some of the methods that are implemented in SAGA, we refer to Wilson and Gallant (2000). Olaya and Conrad (2009) provided an introduction to geomorphometry in SAGA.
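To illustrate what a first-order terrain derivative is, the following sketch computes slope from a small DEM with plain central differences. This is a generic textbook scheme chosen for illustration and is not necessarily one of the slope algorithms implemented in SAGA; the function name `slope_degrees` and the list-of-lists grid representation are assumptions of this example.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope in degrees for interior cells using central differences.

    dem is a list of rows of elevations; border cells are left as None
    because they lack a full set of neighbors.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Elevation gradient in the column (x) and row (y) directions.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2.0 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2.0 * cell_size)
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out


# Inclined plane rising by one elevation unit per cell in x direction.
plane = [[float(c) for c in range(3)] for _ in range(3)]
slope = slope_degrees(plane, cell_size=1.0)
```

For the inclined plane the gradient magnitude is 1, so the interior cell has a slope of 45 degrees; second-order derivatives such as curvature are obtained analogously from differences of these gradients.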

Preprocessing of raster data
Filtering of bare ground from radar interferometry or laser scanning data sets is a prerequisite for many applications. In order to make these data sets applicable for geomorphic and hydrologic analyses, SAGA offers tools to reduce elevation of forest canopies in radar-based DEMs (SRTM) and to identify and eliminate man-made terrain features in laser-scanning-based data sets (Köthe and Bock, 2009). Wichmann et al. (2008) created digital terrain models (DTM) from airborne LiDAR data in different grid-cell sizes and investigated the effect on the simulation results of a debris flow model. DTM preparation included several processing steps such as morphological filtering and surface depression filling. The implementation of the debris flow model used in this study was described in Wichmann and Becht (2005) and Wichmann (2006). Peters-Walker et al. (2012) used SAGA and the Laserdata information system, a software extending SAGA's point cloud data management and analysis capabilities (Petrini-Montferri et al., 2009; Rieg et al., 2014) to derive a high-resolution DTM from LiDAR data. SAGA was subsequently applied to prepare all relevant catchment and channel network information to finally model discharge and bedload transport with the SimAlp/HQsim hydrologic model. In order to investigate climate and glacier changes from DEM and imagery data, Bolch (2006) and Bolch and Kamp (2006) proposed methods on glacier mapping from SRTM, ASTER and LANDSAT data. SAGA was used for DEM pre-processing, including import, projection and merging of data, as well as gap filling, curvature calculation and cluster analysis. Sediment transport in a proglacial river was investigated by Morche et al. (2012). The authors measured suspended sediment load and bed load along the river and quantified surface changes of sediment sources by comparison of multi-temporal terrestrial and airborne laser scanning data. LiDAR data, both airborne and terrestrial, were investigated by Haas et al. (2012) to quantify and analyze a rockfall event in the western Dolomites. Volume, axial ratio and run-out length of single boulders were derived from the point clouds and statistically analyzed. Furthermore, the surface roughness in the run-out zone of the rockfall was estimated based on point cloud data. The authors also proposed approaches on how to use the derived surface roughness with a rockfall simulation model and compared the simulation results for different rock radii and both airborne and terrestrial laser scanning derived surface roughness data sets.

Using terrain analysis for the derivation of predictor variables
Assuming that topographic characteristics are important drivers of various regional- and local-scale geodynamic processes, derived terrain parameters are frequently utilized as predictor variables in statistical modeling applications. The close cooperation of the SAGA developer team with various research projects resulted in the implementation of distinct terrain parameters, particularly suitable for specific investigations. Targeting the derivation of a spatial map of landslide risk, Varga et al. (2006) used a certainty-factor analysis including data sets of slope, curvature, land use, geology and primary dipping. Mantovani et al. (2010) proposed a new approach for landslide geomorphological mapping, using SAGA for tasks like DTM interpolation, slope and aspect calculations and the delineation of watersheds and stream network. Heckmann et al. (2005) and Heckmann (2006) investigated the sediment transport by avalanches in alpine catchments. Besides quantitative field measurements, SAGA was utilized to implement a spatial model of geomorphic avalanche activity. Potential initiation sites were delineated with a certainty-factor model (Heckmann and Becht, 2006). Process pathways were modeled by a random walk model, while run-out distance was calculated with a two-parameter friction model. The simulation results (flow height, flow velocity and slope) and field measurements were finally used in a discriminant analysis to establish an empirical relationship. As an example of statistical geocomputing combining R and SAGA in the RSAGA package, Brenning (2008) presented a landslide susceptibility analysis with generalized additive models. It was shown that several local as well as catchment-related morphometric attributes are important, mostly nonlinear, predictors of landslide occurrence. In a later study (Muenchow et al., 2012), RSAGA was employed to estimate geomorphic process rates of landslides along a humidity gradient in the tropical Andes.

Terrain classification
The usage of terrain (or landform) units allows the provision of soil scientists with conceptual spatial entities that are useful for mapping. The border lines of landform units highlight changing landform conditions that are frequently used to explain changing soil conditions. The classification concept of geomorphographic maps (Köthe et al., 1996) utilized DTM derivatives generated with SAGA. Locally adjusted thresholds of terrain parameters such as the SAGA wetness index, altitude above channel network, slope, relative slope position and terrain classification index for lowlands (Bock et al., 2007b) divide a DTM into main classes along a gradient from the relative bottom (bottom areas mostly corresponding to valley floors) to the relative top (summit areas corresponding to crests, peaks and ridges) of the terrain, with slopes and terraces as intermediate classes. This semi-automated terrain-based landscape structure classification is also useful for the analysis of physical and ecological settings. Wehberg et al. (2013) derived geomorphographic units (GMUs) as discrete terrain entities on the basis of a SRTM digital elevation model with SAGA-based terrain analysis. It was found that the GMUs reproduced the physiogeographic settings of the Okavango catchment appropriately and provided a basis for further mappings of vegetation or soil data. Brenning et al. (2012) investigated the detection of rock glacier flow structures by Gabor filters and IKONOS imagery. The authors used SAGA to calculate morphometric features like local slope, upslope contributing area, catchment slope, and catchment height. Also, the all-year potential incoming solar radiation was computed. These terrain attributes were then used in combination with texture attributes for classification.

Implementation of specific tools
The open source and modular architecture of SAGA makes it easy to integrate new tools, and, if required, already available tools can be reused within them. Thus, several working groups utilize SAGA as a framework for specific analyses and modeling applications. Grabs et al. (2010) proposed a new algorithm to compute side-separated contributions along stream networks for the differentiation of the riparian zone and adjacent upland lateral contributions on each side of a stream. They implemented a new method, SIDE (stream index division equations), in SAGA, which determines the orientation of flow lines relative to the stream flow direction and allows distinguishing between stream left and right sides. Haas (2008) and Haas et al. (2011) used a rule-based statistical model for the estimation of fluvial sediment transport rates from hillslopes and small hillslope channels. They introduced the concept of a "sediment contributing area", derived by terrain analysis, and implemented the algorithm in SAGA. The index was finally used in a regression model to derive sediment transport rates. The same model was applied in a later study concerning the impact of forest fires on geomorphic processes (Sass et al., 2012). Studying alpine sediment cascades, Wichmann and Becht (2005) and Wichmann (2006) implemented a rockfall model in SAGA. The model can be used to delineate the process area of rockfalls and for geomorphic process and natural hazard zonation by combining a random walk path finding algorithm with several friction models. Wichmann and Becht (2006) reviewed several rockfall models implemented in SAGA. Three different methods for run-out distance calculation, an empirical model and two process-based models, were compared in greater detail regarding their applicability for natural hazard zonation and the analysis of geomorphic activity. Fey et al. (2011) applied an empirical rockfall model (including the modeling of process pathway and run-out distance) for the back calculations of medium-scale rockfalls. Heckmann et al. (2012) integrated a modification of the rockfall model by Wichmann (2006) in order to re-calculate a rockfall event. Wichmann and Becht (2004) described the development of a model for torrent bed type debris flow, including the delineation of debris flow initiation sites, process pathways, as well as erosion and deposition zones. Potential process initiation sites were derived from channel slope, upslope contributing area, and the sediment contributing area. Pathway and run-out distance were modeled by combining a grid-based random walk model with a two-parameter friction model. Zones of erosion and deposition were derived by threshold functions of channel gradient and modeled velocity. The model was validated with field measurements after a high magnitude rainstorm event.
Recently, in an analytical study of the susceptibility of geological discontinuities to gravitational mass movements, Jansen (2014) applied geological engineering methods implemented in SAGA. The existing methodology (Günther, 2003; Günther et al., 2004) was thereby enhanced by interpolation routines available in SAGA, which resulted in an increase in the plausibility of the results.

Digital soil mapping
Digital soil mapping is one of the major applications of SAGA, which still reflects the initial focus of the software on developing terrain analysis methods for soil science. Due to its tool diversity, SAGA became a standard software package in the field, which is underlined by citations in relevant reviews and textbooks (Behrens and Scholten, 2006; Boettinger, 2010; Hartemink et al., 2008; Hengl and Reuter, 2009; Lal and Stewart, 2014).
According to McBratney et al. (2003), digital soil mapping is generally understood as a collection of methods for the estimation of spatial information on soils. These estimates can comprise specific soil properties (continuous data), entire soil types or soil associations (classified data), or the susceptibility of soils to certain soil threats. Based on existing point measurements and/or spatial data from other sources, so-called predictor variables (sometimes referred to as covariates), digital soil mapping techniques can be applied in order to generate extensive soil information. The most important SAGA-specific predictor in the field of digital soil mapping is the SAGA wetness index (Böhner et al., 2002), which is derived from a DTM and reflects the theoretical distribution of lateral water accumulation.
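As a point of reference for terrain-based wetness predictors, the classical topographic wetness index is defined as TWI = ln(a / tan β), where a is the specific catchment area and β the local slope. The following minimal Python sketch evaluates this classical form only. It does not reproduce the modified catchment area calculation on which the SAGA wetness index is based, and the function name, the slope floor and the example values are illustrative assumptions.

```python
import math


def topographic_wetness_index(specific_catchment_area, slope_rad, min_slope_rad=1e-3):
    """Classical TWI = ln(a / tan(beta)).

    specific_catchment_area: upslope area per unit contour width (m)
    slope_rad: local slope in radians
    min_slope_rad: hypothetical floor that avoids division by zero on flat cells
    """
    if specific_catchment_area <= 0:
        raise ValueError("specific catchment area must be positive")
    beta = max(slope_rad, min_slope_rad)
    return math.log(specific_catchment_area / math.tan(beta))


# Hypothetical cells: a gently sloping valley floor and a steep ridge position
valley = topographic_wetness_index(5000.0, math.radians(1.0))
ridge = topographic_wetness_index(20.0, math.radians(25.0))
print(valley > ridge)  # large catchment and low slope give the higher index
```

Cells with a large contributing area and a low slope receive high index values, which is the behavior expected of a predictor for lateral water accumulation.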
Several studies determine a correlation between one or more soil properties and spatial predictor variables. Russ and Riek (2011) derived groundwater depths from SAGA covariates with the help of pedo-transfer functions. Other authors used statistical methods to model soil parameters, e.g., to apply a multiple regression model for the derivation of groundwater depths (Bock and Köthe, 2008). A more sophisticated regression model was developed by Kühn et al. (2009), who explained the spatial distribution of several soil parameters, such as soil organic carbon or carbonate content, by means of interpolated apparent electrical conductivity data and DEM covariates.
A number of studies refer to geostatistical methods implemented in SAGA in order to derive soil parameters. Böhner and Köthe (2003) combined geostatistical regionalization (Kriging) of grain size fractions with pedo-transfer functions to develop a whole set of physical soil properties. Schauppenlehner (2008) and Lado et al. (2008) used regression Kriging to estimate the distribution of heavy metals in surface soils at the European scale. Schauppenlehner (2008) compared different geostatistical methods for spatial estimation of soil quality values (Ackerzahl), while Kidd and Viscarra Rossel (2011) tested numerous derivatives from DEM and remote sensing data as covariates for the geostatistical modeling of soil properties. Furthermore, SAGA-based predictor variables can be used to directly model the distribution of classified data of soil types or soil associations by (external) machine learning algorithms such as Random Forest (Roecker et al., 2010) or Classification Tree (Willer et al., 2009). Besides, SAGA offers internal classification routines, such as the statistical cluster analysis, which was applied to combine DEM covariates to serve as a conceptual soil map (Bock et al., 2007a).
Extensive work has been done to model the spatial distribution of soil threats. In addition to the mentioned study on heavy metal distribution in European soils (Lado et al., 2008), several studies estimated soil degradation caused by erosion using SAGA routines. Besides the attempt to record existing soil degradation from erosion with the help of covariates (Milevski, 2008; Milevski et al., 2007), a particular focus is the evaluation of soil erosion risk using the SAGA revised slope length factor (Böhner and Selige, 2006) based on the empirical universal soil loss equation (Wischmeier and Smith, 1978). Patriche et al. (2012) and Enea et al. (2012) are further examples of studies dealing with this issue. Recently, SAGA was extended by a process-based soil erosion model (Setiawan, 2012). The WEELS model allows wind erosion modeling based on SAGA routines (Böhner et al., 2002). The vulnerability towards landslides was modeled using a combination of SAGA and the statistical package R (Brenning, 2008; Goetz et al., 2011).
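To make the role of the slope length factor in erosion risk assessment concrete, the universal soil loss equation multiplies a rainfall erosivity factor R, a soil erodibility factor K, a combined slope length and steepness factor LS, a cover management factor C and a support practice factor P. The Python sketch below simply evaluates this product; the function name and all factor values are hypothetical and are not taken from any of the cited studies.

```python
def usle_soil_loss(r_factor, k_factor, ls_factor, c_factor, p_factor=1.0):
    """Mean annual soil loss A = R * K * LS * C * P.

    The unit of the result follows from the unit system chosen for R and K;
    LS, C and P are dimensionless.
    """
    factors = (r_factor, k_factor, ls_factor, c_factor, p_factor)
    if any(f < 0 for f in factors):
        raise ValueError("USLE factors must be non-negative")
    return r_factor * k_factor * ls_factor * c_factor * p_factor


# Hypothetical factor values for two cells that differ only in their LS factor
gentle_slope = usle_soil_loss(60.0, 0.3, 0.5, 0.2)
steep_slope = usle_soil_loss(60.0, 0.3, 3.0, 0.2)
print(steep_slope / gentle_slope)  # ratio of the two LS factors
```

Because the equation is a plain product, a terrain-derived LS grid scales the estimated soil loss cell by cell when the other factors are held constant.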
The SAGA Landscape Evolution Model (SALEM) is situated between the disciplines of geomorphology and soil science. SALEM (Bock et al., 2012) was designed for simulating processes that comprise the critical zone (National Research Council, 2001) and depicts the landscape elements in a process-oriented and time-dynamical way.

Climatology and meteorology
Historically, climatology and meteorology are probably the natural science disciplines where most of the epistemological progress was based on the spatially explicit analysis of local observations. Indeed, the ingenious inventions of climate measuring instruments since the late sixteenth century by Galileo Galilei, Evangelista Torricelli, Otto von Guericke, and others would hardly have contributed to the enormous scientific progress since the late eighteenth century if instrumental observations had not been taken at different locations, enabling a "synoptic" analysis of spatial climate variations. Today, core functionalities of contemporary GIS, e.g., a range of basic spatial analysis routines, are well suited for this systematic examination of climate variations over space, typically applied to infer spatially continuous (gridded) climate layers from point source observations. Especially for this purpose of climate spatialization, SAGA is equipped with numerous tools, ranging from interpolation methods (local, global and regression-based regionalization) to complex surface parameterization techniques, supporting both statistical and numerical climate modeling.
Interpolation methods: although SAGA provides an almost complete collection of local and global interpolation techniques, comprising deterministic methods (inverse distance weighting, local polynomial, radial basis functions) and, in particular, geostatistical Kriging methods and their derivatives (e.g., ordinary, universal, regression Kriging), interpolation techniques have rarely been used as a standalone method in climate spatialization (Fader et al., 2012). Instead, these methods were frequently combined with more complex statistical or numerical climate modeling approaches, using, e.g., ordinary Kriging, thin-plate splines or inverse distance weighting for the interpolation of model residuals, in order to obtain a correction layer for the adjustment of modeling results (Böhner, 2006, 2005, 2004; Gerlitz et al., 2014; Kessler et al., 2007; Soria-Auza et al., 2010; Weinzierl et al., 2013).
Regression-based regionalization: in topographically structured terrain and especially in high mountain environments, where the often low density and non-representative distribution of meteorological stations lead to unsatisfying interpolation results, statistical methods utilizing explanatory fields (e.g., DEM elevation and its first- and second-order partial derivatives) as statistical predictors are distinctly preferred in climate spatialization. In order to achieve a proper estimation of the deterministic, topographically determined component of climate variations, SAGA provides a set of DEM parameters, explicitly constructed to represent orographically induced topoclimatic effects. A comprehensive explanation and justification of DEM-based measures for the parameterization of prominent topoclimatic phenomena, such as the anisotropic heating and formation of warm zones at slopes, cold air flow and cold air potentials, orographic effects on rainfall variations and wind velocities, and terrain determined alterations of shortwave and longwave radiation fluxes, is given in Böhner and Antonic (2009). Soria-Auza et al. (2010) conducted a systematic evaluation of SAGA-based climate spatialization results in comparison with Worldclim data (Hijmans et al., 2005), highlighting the advantages of SAGA-based surface parameterization methods and their added value for ecological modeling applications. Further spatialization results from mountainous modeling domains in Asia, Europe and South America are presented in Bolch (2006), Dietrich and Böhner (2008), Kessler et al. (2007), Lehrling (2006) and Stötter and Sailer (2012).
Numerical and statistical modeling: fostered by advanced computational capacities on the one hand, and increasingly freely available topographic data sets on the other, regression-based techniques have evolved into a common standard in climate spatialization since the 1990s. The results, though often of high spatial resolution, are most commonly static representations of mean (monthly, seasonal or annual) climate values, whilst to date, the temporally high-resolution, dynamical representation of climate processes and values remains a domain of high-performance computing-based climate modeling, rarely covered by GIS. To overcome this disadvantage, Böhner (2006, 2005, 2004) introduced a SAGA-based climate spatialization approach, which basically merges statistical downscaling and surface parameterization techniques. Assuming the spatiotemporal variability of a climatic variable to be predominantly controlled by both tropospheric and terrain-forced processes, different DEM parameters and monthly resolution tropospheric fields from NCAR-NCEP or ERA Interim reanalyses were considered as statistical predictors, supporting a monthly resolution estimation of climate variables for Germany (Böhner, 2004), the Okavango catchment  and different modeling domains in Central and High Asia (Böhner, 2006; Gerlitz et al., 2015; Klinge et al., 2015). In order to allow a dynamical representation of numerical climate simulations via SAGA, an efficient climate model interface has most recently been realized, which enables one to import and process even complex and vast climate model outputs. Resulting options for climate spatialization and its advancements in terms of applicability, precision and temporal resolution, using daily and sub-daily resolution ERA Interim reanalysis as dynamic forcings for statistical downscaling, are presented in  and Gerlitz (2014).
Daily resolution values of climate variables for Baden-Württemberg on a 50 m × 50 m grid were derived by a SAGA-based downscaling approach, refining regional climate model (RCM) simulations of present-day climates and IPCC A1B and A2 climate scenarios, supporting the quantification of climate change effects on the forest site index for major tree species (Nothdurft et al., 2012).
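The combination of regression-based regionalization with an interpolated residual correction layer, as outlined in this section, can be sketched in a few lines of Python. The example is a simplified illustration under stated assumptions and not the implementation used in SAGA or in the cited studies: it uses elevation as the only predictor, inverse distance weighting for the residuals, and invented station coordinates and temperatures.

```python
import math


def fit_linear(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def idw(points, values, target, power=2.0):
    """Inverse distance weighted estimate at target from scattered points."""
    numerator = denominator = 0.0
    for (px, py), value in zip(points, values):
        distance = math.hypot(px - target[0], py - target[1])
        if distance == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical stations: (x, y) in km, elevation in m, mean temperature in deg C
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
elevation = [200.0, 800.0, 1400.0, 2000.0]
temperature = [14.1, 10.0, 6.3, 2.2]

# Step 1: deterministic, topographically determined component
intercept, slope = fit_linear(elevation, temperature)

# Step 2: residuals at the stations, to be interpolated as a correction layer
residuals = [t - (intercept + slope * z) for t, z in zip(temperature, elevation)]


def predict(target_xy, target_elevation):
    """Regression trend plus IDW-interpolated residual correction."""
    trend = intercept + slope * target_elevation
    return trend + idw(stations, residuals, target_xy)


print(round(slope * 1000.0, 2))  # fitted lapse rate in deg C per km of elevation
print(round(predict((5.0, 5.0), 1100.0), 2))
```

In a gridded application, `predict` would be evaluated for every cell with the DEM elevation of that cell; at the station locations the corrected estimate reproduces the observations.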

Remote sensing and image processing
SAGA offers a large number of scientific methods and tools for multispectral, hyperspectral and thermal remote sensing, including geometrical preprocessing and spectral filtering techniques, multispectral and Fourier transformations, supervised and unsupervised classification algorithms, change detection, as well as segmentation methods for object-oriented image analysis (Blaschke, 2010). The filtering techniques include standard linear bandpass filters (Gaussian, Laplacian, user defined) and nonlinear filters (majority, rank, and the morphological filters dilation, erosion, opening, closing, morphological gradient, top hat, and black hat). Besides different vegetation indices, tasseled cap transformation, sharpening techniques for multi-sensor data and principal component analysis for feature reduction are implemented. Available classifiers comprise k-means (unsupervised), binary encoding, parallelepiped, minimum and Mahalanobis distance, maximum likelihood, spectral angle mapper, decision trees, random forest, support vector machines, artificial neural networks, as well as ensemble-based classifiers, while region growing and watershed algorithms are available for image segmentation. Besides, some tools are available for the processing of data from specific sensors, like calibration to reflectance and generation of cloud masks for Landsat data. Furthermore, SAGA offers great possibilities to process LiDAR point data (cf. Sect. 3.4). The code of the remote sensing and image processing tools is partly based on free libraries and open source software including OpenCV (Open Source Computer Vision, http://opencv.org/), ViGrA (Vision with Generic Algorithms, Köthe, 2000), LIBSVM (Chang and Lin, 2011) and GRASS GIS (Geographical Resources Analysis Support System) (Neteler et al., 2012).
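The morphological filters listed above are all built from two elementary operations, dilation and erosion. The following Python sketch illustrates their textbook definitions on a small raster with a 3 × 3 square structuring element; it is a generic illustration and not the SAGA code, and the function names, the border handling (windows are clipped at the raster edge) and the test raster are assumptions.

```python
def _filter_3x3(grid, reducer):
    """Apply reducer (min or max) to the 3 x 3 neighborhood of every cell."""
    rows, cols = len(grid), len(grid[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            result[r][c] = reducer(window)
    return result


def dilate(grid):
    return _filter_3x3(grid, max)


def erode(grid):
    return _filter_3x3(grid, min)


def opening(grid):
    """Erosion followed by dilation: removes features smaller than the window."""
    return dilate(erode(grid))


def closing(grid):
    """Dilation followed by erosion: fills gaps smaller than the window."""
    return erode(dilate(grid))


def morphological_gradient(grid):
    """Dilation minus erosion: highlights object boundaries."""
    dilated, eroded = dilate(grid), erode(grid)
    return [[d - e for d, e in zip(d_row, e_row)] for d_row, e_row in zip(dilated, eroded)]


# Hypothetical binary raster: a 3 x 3 object and one isolated noise pixel
raster = [[0] * 7 for _ in range(7)]
for r in range(1, 4):
    for c in range(1, 4):
        raster[r][c] = 1
raster[5][5] = 1

opened = opening(raster)
print(sum(map(sum, raster)), sum(map(sum, opened)))  # cell counts before and after
```

In this example the opening keeps the 3 × 3 object and removes the isolated pixel, which is the typical use of this filter for suppressing small-scale noise in classified imagery.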
A number of remote-sensing-based studies using SAGA have been published in different fields. In forestry, multispectral remote sensing data from different times and sensors and fragmentation models were used to examine deforestation and fragmentation of peat swamp forest covers in Malaysia (Kamlisa et al., 2012; Phua et al., 2008). Bechtel et al. (2008) applied different segmentation algorithms to identify individual tree crowns in very high resolution imagery. Furthermore, a new vegetation map of the central Namib was produced by a supervised classification of remote sensing data from MODIS and ETM+ 7 in combination with other environmental parameters such as climatic and topographic data by Jürgens (2013). In glaciology, SAGA remote sensing tools were used for the integration of different terrain data sets into a combined topographical and multispectral classification of rock glaciers (Brenning, 2009), to analyze the decrease in glacier cover in Kazakhstan (Bolch, 2007, 2006) and to detect flow structures in IKONOS imagery . Recently, several remote sensing studies successfully applied SAGA to generate surface parameters for local-scale climatic mapping and analysis in urban areas. In , digital height models from interferometric synthetic aperture radar data were established to derive roughness parameters and anemometric characteristics of urban surfaces, while the thermal properties were investigated regarding the annual cycles of surface temperatures (Bechtel, 2015, 2012, 2011a). Furthermore, the urban surface parameters were implemented for urban climatic modeling applications (Bechtel et al., 2012b; Bechtel and Schmidt, 2011) and the classification of local climate zones (Bechtel, 2011b; Bechtel et al., 2015, 2012a; Bechtel and Daneke, 2012).
Additionally, SAGA was utilized to develop downscaling schemes for land surface temperature from geostationary satellites to spatial resolutions of up to 100 m (Bechtel et al., 2012c;Bechtel et al., 2013) and to estimate in situ air temperatures (Bechtel et al., 2014).
The image processing capacities have also attracted interest in other disciplines. Asmussen et al. (2015) developed a workflow for petrographic thin section images, which comprises the delineation of rock forming minerals and data acquisition of various fabric parameters. The method is based on SAGA's seeded simple region growing algorithm to obtain flexible and precise object detection for any occurring mineral type in weathered sub-arkose sandstone material, and benefits from a reproducible and transparent GIS database.
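The principle of seeded region growing can be illustrated with a short Python sketch. It is a generic, simplified variant and not SAGA's implementation: a region starts at a seed cell and absorbs 4-connected neighbors whose value differs from the seed value by no more than a tolerance. The function name, the similarity criterion and the small test image are assumptions made for illustration.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Return the set of (row, col) cells connected to seed and similar to it."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbors of the current cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Hypothetical gray values of two adjacent mineral grains
image = [
    [10, 11, 50, 52],
    [12, 10, 51, 49],
    [11, 13, 12, 50],
]

dark_grain = grow_region(image, (0, 0), tolerance=5)
bright_grain = grow_region(image, (0, 2), tolerance=5)
print(len(dark_grain), len(bright_grain))
```

Each seed yields one connected object, so placing one seed per grain delineates the grains as separate regions whose size and shape parameters can then be derived.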

Miscellaneous
Besides its core applications, such as terrain analysis, geomorphometry, soil mapping and climate spatialization, SAGA has been used in numerous studies of very different fields and wide scope. One physical geographic subdiscipline, which should be outlined separately due to a series of recent publications, is biogeography and particularly plant geography. Marini et al. (2007), e.g., studied the influence of local environmental factors on plant species richness in the Alps. Different terrain attributes like elevation or slope were tested in statistical models besides other climatic or nutrient parameters, in order to explain biodiversity patterns on different Alpine meadows. A similar approach was carried out by Marini et al. (2009) with the focus on the impact of farm size and terrain attributes on insect and plant diversity of managed grasslands in the Alps. Another example of a biogeographically relevant application can be found in Vanselow and Samimi (2011), who preprocessed a DEM by means of the fill sinks method and afterwards derived terrain attributes, e.g., altitude above channel network and slope, in order to designate potential future pastures in Tajikistan. Other environmental studies comprise Heinrich and Conrad (2008), who applied a cellular automaton approach to test the simulation of flow and diffusion dynamics in shallow water bodies, and Czegka and Junge (2008), who used SAGA as a mobile field tool in environmental geochemistry research activities (GPS coupled navigation on lakes, in situ observation monitoring). Liersch and Volk (2008) implemented a metric conceptual rainfall-runoff model including calibration tools within the SAGA framework with the intention of flood risk prediction.
Besides geoscience, GIS products are increasingly utilized in various fields dealing with spatially explicit data. For example, in archaeology, Bernardini et al. (2013) investigated airborne LiDAR derived images, in order to detect and to monitor hitherto unknown anthropogenic structures and archaeological sites in the Trieste Karst landscape. Leopold et al. (2011) likewise produced a DEM, based on LiDAR information, and subsequently derived a high-resolution cross section with the objective of revealing the location of a former production site of bronze statues at the southern slope of the Acropolis of Athens. The approach was compared to other methods of demarcating prehistoric surface structures (e.g., magnetometry, electrical resistivity tomography, ground-penetrating radar) and proved to be suitable as an additional archaeological reconstruction tool. Kaye (2013) utilized SAGA-based terrain analysis, in particular the catchment water balance methodology and the implemented hydrological tools. By means of the GIS-based approach, the author reconstructed historical water availability in southern England as an important logistic factor for the Roman Army and contributed to finding the location of Boudica's last battle in AD 60 or 61.
Finally, two very specific examples for the utilization of SAGA should be highlighted, which particularly indicate the wide scope of the software. Different spatial detection techniques of radioactive matter being randomly dispersed on a free area were tested for the purpose of quantifying dose rates, surface activities, mass concentrations in aerosols and their temporal and spatial distributions (Prouza et al., 2010). Most peculiarly, the morphometric investigation of chewing surfaces of animals should be emphasized, which was based on an automated analysis using specific interpolation and segmentation approaches (Czech, 2010).

Conclusions and outlook
Since the first public release in 2004, SAGA has very rapidly developed from rather specialized niche software for digital terrain analyses to a mature stage FOSS GIS platform, offering the entire spectrum of geodata analysis, mapping and modeling applications of contemporary GIS software. Right from the beginning, the multiple options of an object-oriented programming environment and the consequently modular organized architecture fostered the development of specific methods often distinctly beyond off-the-shelf GIS products. In the present version 2.1.4, SAGA offers more than 600 tools, many of them reflecting a leading paradigm in the SAGA development: advancements in GIS development are only to be achieved if the development is closely embedded in research processes. Indeed, in responding to scientific questions and needs, SAGA is not only a research supporting tool but likewise its outcome, which complements related scientific publications with the ultimate documentation of used methodologies in the form of source codes.
Today SAGA is maintained and enhanced at the Institute of Geography, Physical Geography Section, at Hamburg University; however, a fast growing user community and developers all over the world contribute to the evolution by their specific needs and applications. For the near future, support for multidimensional raster, e.g., addressing volumes, time series and hyperspectral data, as well as a stronger integration with DBMS like PostgreSQL/PostGIS, are envisaged to broaden the application of SAGA. A further challenging field of the coming SAGA development will be the enhancement of tools and methods, supporting a scale-crossing amalgamation of climate and environmental modeling applications. Already today, SAGA enables an assimilation, dynamical representation and statistical downscaling of climate model outputs, required to interlink dynamical climate forcings with process models for case studies and climate impact analyses. By bridging both spatial scales and scientific disciplines, SAGA responds to the steadily increasing needs for high-quality, spatially explicit data and information, ultimately tracing GIS to its roots. Indeed, already in 1965, Michael F. Dacey and Duane F. Marble stated in probably the first written document to use the term "Geographic Information System" that "the primary function of a Geographic Information System is to make spatially oriented information available in a usable form". Happy 50th birthday GIS!

Code availability
The SAGA source code repository is hosted at http://sourceforge.net/projects/saga-gis/ using an Apache Subversion (SVN) server as a versioning and revision control system. Read-only access is possible without login. A branch is provided for SAGA version 2.1.4, which is referred to in this paper (release-2-1-4, revision 2335). Alternatively, the source code for this version can be downloaded directly from the files section at http://sourceforge.net/projects/saga-gis/.
The Supplement related to this article is available online at doi:10.5194/gmd-8-1991-2015-supplement.