The System for Automated Geoscientific Analyses (SAGA) is an open source
geographic information system (GIS), mainly licensed under the GNU General
Public License. Since its first release in 2004, SAGA has rapidly developed
from a specialized tool for digital terrain analysis to a comprehensive and
globally established GIS platform for scientific analysis and modeling. SAGA
is coded in C++.
Over the last 10 to 15 years, free and open source software (FOSS) has become a
recognized counterpart to commercial solutions in the field of geographic
information systems and science. Steiniger and Bocher (2009) give an overview
of free and open source geographic information system (GIS) software with a
focus on desktop solutions. More recently, Bivand (2014) discussed FOSS for
geocomputation. The System for Automated Geoscientific Analyses (SAGA)
(
SAGA was designed for the easy and effective implementation of spatial algorithms and hence serves as a framework for the development and implementation of geoscientific methods and models (Conrad, 2007). Today, this modularly organized, programmable GIS software offers more than 600 methods comprising the entire spectrum of contemporary GIS, from multiple file operations, referencing and projection routines, through a range of topological and geometric analyses of both raster and vector data, to comprehensive modeling applications for various geoscientific fields.
The idea for the development of SAGA evolved in the late 1990s during the work on several research and development projects at the Dept. for Physical Geography, Göttingen, carried out on behalf of federal and state environmental authorities. In view of the specific needs for high-quality and spatially explicit environmental information of the cooperating agencies, the original research focus was the analysis of raster data, particularly of digital elevation models (DEM), which have been used to predict soil properties, terrain controlled process dynamics as well as climate parameters at high spatial resolution. The development and implementation of apparently new methods for spatial analysis and modeling resulted in the design of three applications for digital terrain analysis, namely SARA (System zur Automatischen Reliefanalyse), SADO (System für Automatische Diskretisierung von Oberflächen) and DiGeM (Programm für Digitale Gelände-Modellierung), each with specific features but distinctly different architectures.
Because the working group used heterogeneous operating systems and tools, a cross-platform system with integrated support for geodata analysis seemed necessary for the further development and implementation of geoscientific methods. Since no satisfactory development platform was available at that time, SAGA was created as a common developer basis and was first published as free open source software in 2004 in order to share its capabilities with geoscientists worldwide. Since then, SAGA has built up a growing global user community (Fig. 1), which has led to many contributions from outside the core developer team and fostered the foundation of the SAGA User Group Association in 2005, which aims to support sustainable long-term development covering the whole range of user interests. Since 2007 the core development group of SAGA has been situated at the University of Hamburg, coordinating and actively driving the development process.
Figure 1. Total downloads by country (2004–2014; source: SourceForge.net, 2014).
The momentum of SAGA development over the past 10 years is mirrored both in the increasing number of methods and tools (Fig. 2), which rose from 119 tools in 2005 (version 1.2) to more than 600 tools in the present version 2.1.4, and, particularly, in the fast-growing user community. With about 100 000 downloads annually in the last 3 years (Fig. 3), SAGA today is an internationally renowned GIS developer platform for geodata analysis and geoscientific modeling. Figure 4 gives a rough overview of the different fields of data analysis and management addressed by the SAGA toolset. The categories have been derived from the menu structure, which might not accurately reflect the usability of all tools, e.g., in the case of multipurpose tools. Nevertheless, the overview shows a comprehensive set of general tools for raster as well as vector data analysis and management, and that terrain analysis remains a strength of SAGA.
Figure 2. Number of tools between 2005 (v1.2) and 2014 (v2.1.4).
Figure 3. Average monthly downloads per year (source: SourceForge.net, 2014).
Figure 4. Number of tools by category. Subcategories are shown for the three largest groups: grid tools, shapes tools, and terrain analysis.
This paper aims to respond to the frequent user requests for a review article. In the first section, we introduce the architecture of the SAGA framework, the state of development and implementation and highlight basic functionalities. Thereafter, we demonstrate its utility in various geoscientific disciplines by reviewing important methods as well as publications in the core fields of digital terrain analysis and geomorphology, digital soil mapping, climatology and meteorology, remote sensing and image processing.
The initial motivation for the SAGA development was to establish a framework that supports an easy and effective implementation of algorithms or methods for spatial data analyses. Furthermore, the integration of such implementations into more complex work flows for certain applications and their immediate accessibility in a user-friendly way were major concerns. Thus, instead of creating one monolithic program, we designed a modular system with an application programming interface (API) at its base, method implementations, in the following referred to as tools, organized in separate program or tool libraries, and a graphical user interface (GUI) as a standard front end (Fig. 5). A command line interpreter as well as additional scripting environments were integrated as alternative front ends to run SAGA tools.
In 2004, SAGA was first published as free software. Except for the API,
source codes are licensed under the terms of the GNU General Public License
(GPL) (Free Software Foundation, 2015). The API utilizes the Lesser GPL
(LGPL), which also allows development of proprietary tools on its basis. The
SAGA project is hosted at SourceForge (
The system is programmed in the widespread C++ language.
Figure 5. System architecture.
The main purposes of SAGA's API are the provision of data structures, particularly for geodata handling, and the definition of tool interfaces. Central instances to store and request any data and tools loaded by the system are the Data Manager and the Tool Manager.
Besides these core components, the API offers various additional classes and
functions related to geodata management and analysis as well as general
computational tasks, comprising tools for memory allocation, string
manipulation, file access, formula parsing, index creation, vector algebra
and matrix operations, and geometric and statistical analysis. In order to
support tool developers, API documentation is generated by means of the
Doxygen help file generator (
All classes related to geodata share a common base class that provides general information and functionality such as the data set name, the associated file path, and other specific metadata (Fig. 6). The supported data types currently comprise raster (grids) and tables with or without a geometry attribute, e.g., vector data representing either point, multipoint, polyline or polygon geometries (shapes). Specific vector data structures are provided by point cloud and TIN classes. The point cloud class is a container for storing mass point data as generated, for instance, by LiDAR scans. The TIN class creates a triangular irregular network for a given set of points, providing topological information concerning point neighborhoods. Each data type supports a generic built-in file format. Raster data use a SAGA-specific binary format with an accompanying header. Table data use either tabbed text, comma separated values, or the DBase format. The latter is also applied for storing vector data attributes with the ESRI shapefile format. In order to enhance read and write performance, point clouds also employ a SAGA-specific binary file format. In addition, each stored data file is accompanied by a metadata file providing further information such as map projection and original data source. The metadata also contain a data set history, which records all tools and settings that were involved in creating the data set.
Figure 6. Data object hierarchy.
SAGA tools are implemented in dynamically loadable libraries (DLLs) or shared objects, thus supporting the concept of modular plug-ins. Each SAGA tool is derived from a tool base class, which is specified in the API and defines the standard interface and functionality. In this class, the tool-specific input and output data of various data types as well as tool options are declared in a parameter list. At least two functions of the base class have to be implemented by each tool. The constructor defines the tool's interface with its name, a description of its usage and methodology, and the list of tool-specific parameters. Its parameter list is automatically evaluated by the system's framework prior to the execution of a tool. The execution itself is started by a call to the second compulsory function, which implements the tool's functionality.
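The interface pattern described above can be illustrated with a minimal sketch. It is written in Python for brevity and does not reproduce SAGA's C++ API: the class names `Tool` and `GridScale`, the methods `add_parameter`, `run` and `on_execute`, and the parameter identifiers are all hypothetical. It only shows the division of labor between a constructor that declares the interface, a framework step that evaluates the parameter list, and a compulsory execution function.

```python
from abc import ABC, abstractmethod


class Tool(ABC):
    """Hypothetical stand-in for a tool base class (not SAGA's actual API)."""

    def __init__(self, name, description):
        self.name = name
        self.description = description
        self.parameters = {}  # parameter id -> declaration

    def add_parameter(self, ident, kind, default=None, required=True):
        # Declare an input, output, or option in the tool's parameter list.
        self.parameters[ident] = {
            "kind": kind, "default": default, "required": required}

    def run(self, **values):
        # Framework side: evaluate the parameter list before execution.
        unknown = set(values) - set(self.parameters)
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        resolved = {}
        for ident, decl in self.parameters.items():
            if ident in values:
                resolved[ident] = values[ident]
            elif decl["default"] is not None or not decl["required"]:
                resolved[ident] = decl["default"]
            else:
                raise ValueError(f"missing required parameter: {ident}")
        return self.on_execute(resolved)

    @abstractmethod
    def on_execute(self, params):
        """Compulsory function that implements the tool's functionality."""


class GridScale(Tool):
    """Example tool: multiply every cell of a grid by a constant factor."""

    def __init__(self):
        # The constructor defines name, description, and parameters.
        super().__init__("Grid Scale", "Multiplies all grid cells by a factor.")
        self.add_parameter("GRID", kind="input")
        self.add_parameter("FACTOR", kind="option", default=1.0)

    def on_execute(self, params):
        factor = params["FACTOR"]
        return [[cell * factor for cell in row] for row in params["GRID"]]
```

A front end would only ever call `run`, so parameter checking stays in one place and each tool contains nothing but its interface declaration and its algorithm.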
Specialized variants of the tool base class are available for enhanced processing of single raster systems or for interaction of the tool with the GUI (i.e., to respond to mouse events occurring in a map). The API uses a callback system to support communication with the front end, e.g., to report progress, to issue error notifications, or to force an immediate update of a data set's graphical representation. The tool manager loads the DLLs and makes them accessible to the front ends. The tool manager also facilitates the call of existing tools, e.g., to run one tool from within another. The GUI uses this feature, e.g., to read data file formats that are not generically supported by SAGA, to project geographic coordinate grids to be displayed in a map view, and to access and manipulate data through a database management system. Furthermore, this possibility of executing any loaded tool is used for the processing of tool chains. Tool chains are comparable to the models created with ArcGIS ModelBuilder (ESRI, 2015) or QGIS Processing Modeler (QGIS Development Team, 2014), but unlike these, SAGA does not yet include a graphical tool chain designer. Tool chains are defined in a simple XML-based code that is interpreted by the tool chain class, another variant of the general tool class. This code has two major sections. The first part comprises the definition of the tool interface, e.g., the tool's name, description and a list of input, output and optional parameters. The second is a listing of the tools in the desired execution order. Tool chains are an efficient way to create new tools based on existing ones and perform exactly like hard-coded tools. Since it is possible to create a tool chain directly from a data set history, a complex workflow can be developed interactively and then be automated for the analysis of further data sets.
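The two-section structure of a tool chain can be sketched as follows. This is an illustration only: the XML element and attribute names are invented for this example and do not follow SAGA's actual tool chain format, and the `TOOLS` registry with its two toy tools merely stands in for a tool manager.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; NOT SAGA's actual tool chain schema.
CHAIN_XML = """
<chain name="scale_then_offset">
  <interface>
    <input id="GRID"/>
    <option id="FACTOR" default="2"/>
    <option id="OFFSET" default="1"/>
    <output id="RESULT"/>
  </interface>
  <steps>
    <step tool="scale" input="GRID" output="TMP" value="FACTOR"/>
    <step tool="offset" input="TMP" output="RESULT" value="OFFSET"/>
  </steps>
</chain>
"""

# Registry standing in for a tool manager: tool name -> callable.
TOOLS = {
    "scale": lambda grid, value: [[c * value for c in row] for row in grid],
    "offset": lambda grid, value: [[c + value for c in row] for row in grid],
}


def run_chain(xml_text, data, **options):
    """Interpret the chain: read the interface, then run steps in order."""
    root = ET.fromstring(xml_text)

    # Section 1: interface definition (inputs, options with defaults, outputs).
    iface = root.find("interface")
    settings = {o.get("id"): float(o.get("default"))
                for o in iface.findall("option")}
    settings.update(options)
    store = {i.get("id"): data[i.get("id")] for i in iface.findall("input")}

    # Section 2: tools listed in the desired execution order.
    for step in root.find("steps").findall("step"):
        tool = TOOLS[step.get("tool")]
        store[step.get("output")] = tool(
            store[step.get("input")], settings[step.get("value")])

    return {o.get("id"): store[o.get("id")] for o in iface.findall("output")}
```

Calling `run_chain(CHAIN_XML, {"GRID": grid})` behaves like a single tool, although it only wires existing tools together. Intermediate results such as `TMP` never leave the chain.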
Besides more specific geoscientific methods, SAGA provides a wide range of
general purpose tools. Since SAGA has only limited generic support for data file
formats, the group of data import and export tools is an important feature to
read and write data from various sources and store them to specific file
formats supported by other software. Within this group, a toolset interfacing
the Geospatial Data Abstraction Library (GDAL) should be highlighted
(
A powerful alternative to file-based data storage is provided by database
management systems (DBMS), which offer the possibility of querying user
defined subset selections. Various DBMS can be addressed with a toolset based
on the Oracle, ODBC and DB2-CLI Template Library (OTL,
Tools related to georeferencing and coordinate systems are indispensable for
the work with spatial data. Particularly the coordinate transformation tools
make use of two alternative projection libraries, the Geographic Translator
GEOTRANS (
Due to SAGA's original focus on raster data analysis, numerous tools are
available for addressing this field, comprising tools for map algebra,
resampling, and mosaicking. Nevertheless, the tool sets related to vector
data also cover common operations such as overlays, buffers, spatial joins,
and selections based on attributes or location. Overlay operations like
intersection, difference, and union utilize the functions provided by the
Clipper polygon clipping and offsetting library
(
In order to apply SAGA tools for geoprocessing, a front end program is needed, which controls tools and data management. SAGA's GUI allows an intuitive approach to the management, analysis, and visualization of spatial data (Fig. 7). It interactively gives access to the data and tool management and is complemented by a map management component. General commands can be executed through a menu and a tool bar. More specific commands for all managed elements, i.e., tools, data, and maps, are available through context menus. The properties of the selected element are shown in a separate control. While the number and type of properties depend on the respective element, a group of settings and a description are common to all managed elements. In the case of a tool, for instance, the settings give control over input and output data selection as well as further tool-specific options, while in the case of a data set, they provide several options for visualization in maps. Maps are the standard way of geodata display and offer various additional features, including scale bars, graticules, printing, and clipboard copying. Supplementary data visualization tools comprise histograms, charts, scatterplots, and 3-D views. Tools can be executed either from the tool manager or through the main menu's geoprocessing subgroup, where by default all tools can be found arranged in submenu categories. Due to the large number of tools, a find and run command is a supplementary option to conveniently access all tools. In summary, the GUI is a good choice for interactive work on a single data selection with immediate visualization. However, if complex work flows are applied repeatedly to numerous data sets, alternative front ends with scripting support are more suitable.
Figure 7. Graphical user interface.
The SAGA command line interpreter (CLI) is used to execute SAGA tools from a
command line environment without any visualization or data management
facilities. Therefore, the file paths for all input and output data have to
be specified within the command. The CLI enables the creation of batch or
shell script files with subsequent calls of SAGA tools to automate complex
work flows and automatically apply them to similar data sets. Furthermore,
the CLI allows calling of SAGA tools from external programs in an easy way.
This feature is used by the RSAGA package, which integrates SAGA tools with
the R scripting environment (Brenning, 2008). Likewise, the Sistema
EXTremeño de ANálisis TErritorial (SEXTANTE) makes SAGA tools
accessible for various Java-based GIS programs (gvSIG, OpenJUMP). In 2013,
SEXTANTE was ported to Python to become a functional addition to QGIS (QGIS
Development Team, 2014), another popular free and open source GIS software,
thus spreading SAGA tools amongst many more GIS users. As an alternative to
CLI-based scripting, the SAGA API can also be accessed directly from Python.
This connection is generated by means of the Simplified Wrapper and Interface
Generator – SWIG (
Another option for integrating SAGA is direct linkage of the API. A very
recent development is the integration of SAGA by the ZOO-Project
(
Table 1 summarizes the third party software mentioned above and underlines that SAGA is recognized first of all as a geoprocessing engine. Only MicroCity and LiS make at least partial use of SAGA's GUI capabilities.
Table 1. Software utilizing SAGA.
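The batch-style automation through the command line interpreter described above can be sketched from a scripting environment as follows. The executable name `saga_cmd` and the general pattern of library, tool, and `-PARAMETER value` arguments correspond to the SAGA command line interpreter. The specific library name (`ta_morphometry`), tool index (`0`), parameter identifiers (`ELEVATION`, `SLOPE`) and the `.sgrd` file extension are assumptions that should be checked against the installed SAGA version. The helper names `build_saga_command` and `run_batch` are hypothetical.

```python
import shutil
import subprocess


def build_saga_command(library, tool, **params):
    """Assemble an argument list for the SAGA command line interpreter."""
    cmd = ["saga_cmd", library, str(tool)]
    for key, value in params.items():
        # All file paths and options have to be passed explicitly.
        cmd += [f"-{key}", str(value)]
    return cmd


def run_batch(dem_paths, dry_run=True):
    """Apply the same tool call to several DEM files.

    With dry_run=True the commands are only assembled and returned,
    which makes it possible to inspect them before anything is executed.
    """
    commands = []
    for dem in dem_paths:
        slope = dem.replace(".sgrd", "_slope.sgrd")
        # Library, tool index, and parameter names are assumptions.
        cmd = build_saga_command(
            "ta_morphometry", 0, ELEVATION=dem, SLOPE=slope)
        commands.append(cmd)
        if not dry_run and shutil.which("saga_cmd"):
            subprocess.run(cmd, check=True)
    return commands
```

Building the argument list separately from executing it keeps the work flow testable on machines without a SAGA installation. It also avoids shell quoting problems, because the arguments are passed as a list.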
Due to its plethora of tools, covering a broad spectrum of geoscientific analysis and modeling applications and its user friendly environment, SAGA has been increasingly utilized for the processing of geodata, the implementation and calibration of statistical and process-based models in various fields, and the visualization of results. The following chapter provides an overview of studies using SAGA in selected geoscientific fields, which were identified as major applications of the software. However, due to the vast number of studies, this chapter only gives an outline without any claim to comprehensiveness. An overview is given in Table 2.
Table 2. Studies utilizing SAGA in various research areas.
SAGA is a successor of three applications that were designed for digital terrain analysis, namely SARA, SADO, and DiGeM, and to this day, the analysis of DEMs has remained a major focus. SAGA provides a comprehensive set of tools ranging from the preprocessing of DEMs (e.g., filtering and filling procedures) through the generation of simple first- and second-order terrain derivatives, such as slope and curvature, to more sophisticated and process-oriented terrain parameters, e.g., the altitude above the channel network, the relative slope position or the SAGA wetness index. The strong focus of SAGA in this particular field is distinctly reflected by its frequent utilization. This section gives a brief overview of available methods, applications, and studies with a special focus on the preprocessing of raw data, the derivation of terrain-based predictor variables for statistical modeling approaches, the classification of distinct geomorphographic units and the implementation of suitable tools for specific investigations. For further information on principles and applications in terrain analysis, including some of the methods that are implemented in SAGA, we refer to Wilson and Gallant (2000). Olaya and Conrad (2009) provided an introduction to geomorphometry in SAGA.
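As an illustration of a first-order terrain derivative, the following sketch computes slope from a gridded DEM with plain central differences. This is a generic textbook scheme chosen for brevity and is not claimed to be the algorithm used by any particular SAGA tool. The function name `slope_degrees` and the convention of leaving border cells undefined are assumptions of this example.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope in degrees for interior cells of a DEM (list of rows).

    Uses central differences in x and y; border cells are left as None
    because they lack a full neighborhood.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Elevation gradient components (rise over run).
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            # Slope angle from the gradient magnitude.
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out
```

For a plane that rises by one cell size per cell in the x direction, the gradient magnitude is 1 and the interior slope is 45 degrees, which is a convenient sanity check.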
Filtering of bare ground from radar interferometry or laser scanning data sets is a prerequisite for many applications. In order to make these data sets applicable for geomorphic and hydrologic analyses, SAGA offers tools to reduce elevation of forest canopies in radar-based DEMs (SRTM) and to identify and eliminate man-made terrain features in laser-scanning-based data sets (Köthe and Bock, 2009). Wichmann et al. (2008) created digital terrain models (DTM) from airborne LiDAR data in different grid-cell sizes and investigated the effect on the simulation results of a debris flow model. DTM preparation included several processing steps such as morphological filtering and surface depression filling. The implementation of the debris flow model used in this study was described in Wichmann and Becht (2005) and Wichmann (2006). Peters-Walker et al. (2012) used SAGA and the Laserdata information system, a software extending SAGA's point cloud data management and analysis capabilities (Petrini-Montferri et al., 2009; Rieg et al., 2014) to derive a high-resolution DTM from LiDAR data. SAGA was subsequently applied to prepare all relevant catchment and channel network information to finally model discharge and bedload transport with the SimAlp/HQsim hydrologic model. In order to investigate climate and glacier changes from DEM and imagery data, Bolch (2006) and Bolch and Kamp (2006) proposed methods for glacier mapping from SRTM, ASTER and LANDSAT data. SAGA was used for DEM pre-processing, including import, projection and merging of data, as well as gap filling, curvature calculation and cluster analysis. Sediment transport in a proglacial river was investigated by Morche et al. (2012). The authors measured suspended sediment load and bed load along the river and quantified surface changes of sediment sources by comparison of multi-temporal terrestrial and airborne laser scanning data. LiDAR data, both airborne and terrestrial, were investigated by Haas et al. (2012) to quantify and analyze a rockfall event in the western Dolomites. Volume, axial ratio and run-out length of single boulders were derived from the point clouds and statistically analyzed. Furthermore, the surface roughness in the run-out zone of the rockfall was estimated based on point cloud data. The authors also proposed approaches on how to use the derived surface roughness with a rockfall simulation model and compared the simulation results for different rock radii and both airborne and terrestrial laser scanning derived surface roughness data sets.
Assuming that topographic characteristics are important drivers of various regional- and local-scale geodynamic processes, derived terrain parameters are frequently utilized as predictor variables in statistical modeling applications. The close cooperation of the SAGA developer team with various research projects resulted in the implementation of distinct terrain parameters, particularly suitable for specific investigations.
Targeting the derivation of a spatial map of landslide risk, Varga et al. (2006) used a certainty-factor analysis including data sets of slope, curvature, land use, geology and primary dipping. Mantovani et al. (2010) proposed a new approach for landslide geomorphological mapping, using SAGA for tasks like DTM interpolation, slope and aspect calculations and the delineation of watersheds and stream network. Heckmann et al. (2005) and Heckmann (2006) investigated the sediment transport by avalanches in alpine catchments. Besides quantitative field measurements, SAGA was utilized to implement a spatial model of geomorphic avalanche activity. Potential initiation sites were delineated with a certainty-factor model (Heckmann and Becht, 2006). Process pathways were modeled by a random walk model, while run-out distance was calculated with a two-parameter friction model. The simulation results (flow height, flow velocity and slope) and field measurements were finally used in a discriminant analysis to establish an empirical relationship. As an example of statistical geocomputing combining R and SAGA in the RSAGA package, Brenning (2008) presented a landslide susceptibility analysis with generalized additive models. It was shown that several local as well as catchment-related morphometric attributes are important, mostly nonlinear, predictors of landslide occurrence. In a later study (Muenchow et al., 2012), RSAGA was employed to estimate geomorphic process rates of landslides along a humidity gradient in the tropical Andes.
Terrain (or landform) units provide soil scientists with conceptual spatial entities that are useful for mapping. The boundaries of landform units highlight changing landform conditions that are frequently used to explain changing soil conditions. The classification concept of geomorphographic maps (Köthe et al., 1996) utilized DTM derivatives generated with SAGA. Locally adjusted thresholds of terrain parameters such as the SAGA wetness index, altitude above channel network, slope, relative slope position and terrain classification index for lowlands (Bock et al., 2007b) divide a DTM into main classes along a gradient from the relative bottom (bottom areas mostly corresponding to valley floors) to the relative top (summit areas corresponding to crests, peaks and ridges) of the terrain, with slopes and terraces as intermediate classes. This semi-automated terrain-based landscape structure classification is also useful for the analysis of physical and ecological settings. Wehberg et al. (2013) derived geomorphographic units (GMUs) as discrete terrain entities on the basis of an SRTM digital elevation model with SAGA-based terrain analysis. It was found that the GMUs reproduced the physiogeographic settings of the Okavango catchment appropriately and provided a basis for further mappings of vegetation or soil data. Brenning et al. (2012) investigated the detection of rock glacier flow structures by Gabor filters and IKONOS imagery. The authors used SAGA to calculate morphometric features like local slope, upslope contributing area, catchment slope, and catchment height. Also, the all-year potential incoming solar radiation was computed. These terrain attributes were then used in combination with texture attributes for classification.
The open source and modular architecture of SAGA makes it easy to integrate new tools; where required, already available tools can be incorporated as well. Thus, several working groups utilize SAGA as a framework for specific analyses and modeling applications.
Grabs et al. (2010) proposed a new algorithm to compute side-separated contributions along stream networks for the differentiation of the riparian zone and adjacent upland lateral contributions on each side of a stream. They implemented a new method – SIDE (stream index division equations) – in SAGA, which determines the orientation of flow lines relative to the stream flow direction and allows distinguishing between stream left and right sides. Haas (2008) and Haas et al. (2011) used a rule-based statistical model for the estimation of fluvial sediment transport rates from hillslopes and small hillslope channels. They introduced the concept of a “sediment contributing area”, derived by terrain analysis, and implemented the algorithm in SAGA. The index was finally used in a regression model to derive sediment transport rates. The same model was applied in a later study concerning the impact of forest fires on geomorphic processes (Sass et al., 2012). Studying alpine sediment cascades, Wichmann and Becht (2005) and Wichmann (2006) implemented a rockfall model in SAGA. The model can be used to delineate the process area of rockfalls and for geomorphic process and natural hazard zonation by combining a random walk path finding algorithm with several friction models. Wichmann and Becht (2006) reviewed several rockfall models, implemented in SAGA. Three different methods for run-out distance calculation, an empirical model and two process-based models were compared in greater detail regarding their applicability for natural hazard zonation and the analysis of geomorphic activity. Fey et al. (2011) applied an empirical rockfall model (including the modeling of process pathway and run-out distance) for the back calculations of medium-scale rockfalls. Heckmann et al. (2012) integrated a modification of the rockfall model by Wichmann (2006) in order to re-calculate a rockfall event. 
Wichmann and Becht (2004) described the development of a model for torrent bed type debris flow, including the delineation of debris flow initiation sites, process pathways, as well as erosion and deposition zones. Potential process initiation sites were derived from channel slope, upslope contributing area, and the sediment contributing area. Pathway and run-out distance were modeled by combining a grid-based random walk model with a two-parameter friction model. Zones of erosion and deposition were derived by threshold functions of channel gradient and modeled velocity. The model was validated with field measurements after a high magnitude rainstorm event. Recently, in an analytical study of the susceptibility of geological discontinuities to gravitational mass movements, Jansen (2014) applied geological engineering methods implemented in SAGA. The existing methodology (Günther, 2003; Günther et al., 2004) was thereby enhanced by interpolation routines available in SAGA, which resulted in an increase in the plausibility of the results.
Digital soil mapping is one of the major applications of SAGA, which still reflects the initial focus of the software on developing terrain analysis methods for soil science. Due to its tool diversity, SAGA became a standard software package in the field, which is underlined by citations in relevant reviews and textbooks (Behrens and Scholten, 2006; Boettinger, 2010; Hartemink et al., 2008; Hengl and Reuter, 2009; Lal and Stewart, 2014).
According to McBratney et al. (2003), digital soil mapping generally is understood as a collection of methods for the estimation of spatial information on soils. These estimates can comprise specific soil properties (continuous data), entire soil types or soil associations (classified data), or the susceptibility of soil to certain soil threats. Based on existing point measurements and/or spatial data from other sources, so-called predictor variables (sometimes referred to as covariates), digital soil mapping techniques can be applied in order to generate extensive soil information. The most important SAGA-specific predictor in the field of digital soil mapping is the SAGA wetness index (Böhner et al., 2002), which is derived from a DTM and reflects the theoretical distribution of lateral water accumulation.
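To make the idea of a DTM-derived wetness predictor concrete, the sketch below computes the classical topographic wetness index, ln(a / tan β), from a specific catchment area a and a local slope β. It is not the SAGA wetness index of Böhner et al. (2002), whose modifications are not reproduced here. The function name and the minimum-slope clamp used to avoid division by zero on flat cells are assumptions of this example.

```python
import math


def topographic_wetness_index(specific_catchment_area, slope_radians,
                              min_slope=1e-3):
    """Classical topographic wetness index ln(a / tan(beta)).

    specific_catchment_area: upslope area per unit contour width (> 0).
    slope_radians: local slope angle beta in radians.
    min_slope: assumed lower bound on beta so that flat cells stay finite.
    """
    if specific_catchment_area <= 0:
        raise ValueError("specific catchment area must be positive")
    tan_beta = max(math.tan(slope_radians), math.tan(min_slope))
    return math.log(specific_catchment_area / tan_beta)
```

Large contributing areas and gentle slopes both increase the index, which matches the interpretation as a proxy for lateral water accumulation.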
Several studies determine a correlation between one or more soil properties and spatial predictor variables. Russ and Riek (2011) derived groundwater depths from SAGA covariates with the help of pedo-transfer functions. Other authors used statistical methods to model soil parameters, e.g., to apply a multiple regression model for the derivation of groundwater depths (Bock and Köthe, 2008). A more sophisticated regression model was developed by Kühn et al. (2009), who explained the spatial distribution of several soil parameters, such as soil organic carbon or carbonate content, by means of interpolated apparent electrical conductivity data and DEM covariates.
A number of studies refer to geostatistical methods implemented in SAGA in order to derive soil parameters. Böhner and Köthe (2003) combined geostatistical regionalization (Kriging) of grain size fractions with pedo-transfer functions to develop a whole set of physical soil properties. Lado et al. (2008) used regression Kriging to estimate the distribution of heavy metals in surface soils at European scale. Schauppenlehner (2008) compared different geostatistical methods for spatial estimation of soil quality values (Ackerzahl), while Kidd and Viscarra Rossel (2011) tested numerous derivatives from DEM and remote sensing data as covariates for the geostatistical modeling of soil properties. Furthermore, SAGA-based predictor variables can be used to directly model the distribution of classified data of soil types or soil associations by (external) machine learning algorithms such as Random Forest (Roecker et al., 2010) or Classification Tree (Willer et al., 2009). Besides, SAGA offers internal classification routines, such as the statistical cluster analysis, which was applied to combine DEM covariates to serve as a conceptual soil map (Bock et al., 2007a).
Extensive work has been done to model the spatial distribution of soil threats. In addition to the mentioned study on heavy metal distribution in European soils (Lado et al., 2008), several studies estimated soil degradation caused by erosion using SAGA routines. Besides the attempt to record existing soil degradation from erosion with the help of covariates (Milevski, 2008; Milevski et al., 2007), a particular focus is the evaluation of soil erosion risk using the SAGA revised slope length factor (Böhner and Selige, 2006) based on the empirical universal soil loss equation (Wischmeier and Smith, 1978). Patriche et al. (2012) and Enea et al. (2012) are further examples of studies dealing with this issue. Recently, SAGA was extended by a process-based soil erosion model (Setiawan, 2012). The WEELS model allows wind erosion modeling based on SAGA routines (Böhner et al., 2002). The vulnerability towards landslides was modeled using a combination of SAGA and the statistical package R (Brenning, 2008; Goetz et al., 2011).
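The empirical universal soil loss equation mentioned above estimates mean annual soil loss as the product of a rainfall erosivity factor R, a soil erodibility factor K, a combined slope length and steepness factor LS, a cover management factor C and a support practice factor P. The following sketch applies this product cell by cell to a grid of LS values. The function names and the use of spatially constant R, K, C and P are simplifying assumptions of this example. It does not reproduce the SAGA revised slope length factor.

```python
def usle_soil_loss(r, k, ls, c, p=1.0):
    """Universal soil loss equation: A = R * K * LS * C * P."""
    for name, value in (("r", r), ("k", k), ("ls", ls), ("c", c), ("p", p)):
        if value < 0:
            raise ValueError(f"factor {name} must be non-negative")
    return r * k * ls * c * p


def soil_loss_grid(r, k, ls_grid, c, p=1.0):
    """Apply the equation to every cell of a terrain-derived LS grid.

    R, K, C and P are assumed constant over the grid for simplicity.
    """
    return [[usle_soil_loss(r, k, ls, c, p) for ls in row] for row in ls_grid]
```

Because the terrain enters only through LS, a DEM-derived LS grid is enough to map the relative spatial pattern of erosion risk when the other factors are held constant.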
The SAGA Landscape Evolution Model (SALEM) sits between the disciplines of geomorphology and soil science. SALEM (Bock et al., 2012) was designed for simulating processes that comprise the critical zone (National Research Council, 2001) and depicts the landscape elements in a process-oriented and time-dynamical way.
Historically, climatology and meteorology are probably the natural science disciplines in which most of the epistemological progress was based on the spatially explicit analysis of local observations. Indeed, the ingenious inventions of climate measuring instruments since the late sixteenth century by Galileo Galilei, Evangelista Torricelli, Otto von Guericke, and others would hardly have contributed to the enormous scientific progress since the late eighteenth century if instrumental observations had not been taken at different locations, enabling a “synoptic” analysis of spatial climate variations. Today, core functionalities of contemporary GIS, e.g., a range of basic spatial analysis routines, are well suited for this systematic examination of climate variations over space, typically applied to infer spatially continuous (gridded) climate layers from point source observations. Especially for this purpose of climate spatialization, SAGA is equipped with numerous tools, ranging from interpolation methods (local, global and regression-based regionalization) to complex surface parameterization techniques, supporting both statistical and numerical climate modeling.
Interpolation methods: although SAGA provides an almost complete collection of local and global interpolation techniques, comprising deterministic methods (inverse distance weighting, local polynomial, radial basis functions) and in particular geostatistical Kriging methods and their derivatives (e.g., ordinary, universal, regression Kriging), interpolation techniques have rarely been used as a standalone method in climate spatialization (Fader et al., 2012). Instead, these methods were frequently combined with more complex statistical or numerical climate modeling approaches, using, e.g., ordinary Kriging, thin-plate spline or inverse distance weighting for the interpolation of model residuals, in order to obtain a correction layer for the adjustment of modeling results (Böhner, 2006, 2005, 2004; Gerlitz et al., 2014; Kessler et al., 2007; Soria-Auza et al., 2010; Weinzierl et al., 2013).
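The residual correction described above can be sketched in a few lines. The example below interpolates station residuals (observation minus model estimate) with inverse distance weighting and adds the result to the model value at a target location. It is a minimal illustration in plain Python, not the SAGA implementation; the function names, the power parameter of 2 and the station values are assumptions chosen for this sketch.

```python
import math


def idw(points, values, target, power=2.0):
    """Inverse distance weighted estimate at `target` from scattered points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(points, values):
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def corrected_estimate(model_at_target, stations, observed, modelled, target, power=2.0):
    """Adjust a model estimate by the interpolated station residuals."""
    residuals = [obs - mod for obs, mod in zip(observed, modelled)]
    return model_at_target + idw(stations, residuals, target, power)


# Hypothetical stations (x, y), observations and model estimates.
stations = [(0.0, 0.0), (2.0, 0.0)]
observed = [10.0, 14.0]
modelled = [9.0, 11.0]
print(corrected_estimate(10.0, stations, observed, modelled, target=(1.0, 0.0)))
```

Applied to every grid cell, the interpolated residuals form the correction layer; at the station locations themselves the corrected estimate reproduces the observation exactly.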
Regression-based regionalization: in topographically structured terrain and especially in high mountain environments, where the often low density and non-representative distribution of meteorological stations lead to unsatisfying interpolation results, statistical methods utilizing explanatory fields (e.g., DEM elevation and its first- and second-order partial derivatives) as statistical predictors are distinctly preferred in climate spatialization. In order to achieve a proper estimation of the deterministic, topographically determined component of climate variations, SAGA provides a set of DEM parameters, explicitly constructed to represent orographically induced topoclimatic effects. A comprehensive explanation and justification of DEM-based measures for the parameterization of prominent topoclimatic phenomena, such as the anisotropic heating and formation of warm zones at slopes, cold air flow and cold air potentials, orographic effects on rainfall variations and wind velocities, and terrain determined alterations of shortwave and longwave radiation fluxes, is given in Böhner and Antonic (2009). Soria-Auza et al. (2010) conducted a systematic evaluation of SAGA-based climate spatialization results in comparison with Worldclim data (Hijmans et al., 2005), highlighting the advantages of SAGA-based surface parameterization methods and their added value for ecological modeling applications. Further spatialization results from mountainous modeling domains in Asia, Europe and South America are presented in Bolch (2006), Dietrich and Böhner (2008), Kessler et al. (2007), Lehrling (2006) and Stötter and Sailer (2012).
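As a minimal illustration of the regression-based approach, the following sketch fits a least-squares line between station elevation and an observed climate variable and applies it to every cell of a DEM. It deliberately uses elevation as the only predictor, whereas the studies cited above draw on a larger set of DEM parameters; the function names and the station and DEM values are assumptions made for this example.

```python
def fit_linear(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_grid(dem, intercept, slope):
    """Apply the fitted relation to every cell of a DEM (list of rows)."""
    return [[intercept + slope * z for z in row] for row in dem]


# Hypothetical station elevations (m) and mean temperatures (deg C).
station_elevation = [0.0, 1000.0, 2000.0]
station_temperature = [15.0, 8.5, 2.0]

intercept, slope = fit_linear(station_elevation, station_temperature)
dem = [[500.0, 1500.0], [250.0, 1750.0]]
temperature_grid = predict_grid(dem, intercept, slope)
print(slope, temperature_grid)
```

The differences between station observations and the fitted values are the residuals that may subsequently be interpolated to obtain a correction layer.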
Numerical and statistical modeling: fostered by advanced
computational capacities on the one hand, and increasingly freely available
topographic data sets on the other, regression-based techniques have evolved into a
common standard in climate spatialization since the 1990s. The results,
although often of high spatial resolution, are most commonly static
representations of mean (monthly, seasonal or annual) climate values, whilst
to date, the temporally high-resolution, dynamical representation of climate
processes and values remains a domain of high-performance computing-based
climate modeling, rarely covered by GIS. To overcome this disadvantage,
Böhner (2006, 2005, 2004) introduced a SAGA-based climate spatialization
approach, which basically merges statistical downscaling and surface
parameterization techniques. Assuming the spatiotemporal variability of a
climatic variable to be predominantly controlled by both tropospheric and
terrain-forced processes, different DEM parameters and monthly resolution
tropospheric fields from NCAR-NCEP or ERA-Interim reanalyses were
considered as statistical predictors, supporting a monthly resolution
estimation of climate variables for Germany (Böhner, 2004), the Okavango
catchment (Weinzierl et al., 2013) and different modeling domains in Central
and High Asia (Böhner, 2006; Gerlitz et al., 2015; Klinge et al., 2015).
In order to allow a dynamical representation of numerical climate simulations
via SAGA, most recently, an efficient climate model interface has been
realized, which enables one to import and process even complex and vast
climate model outputs. Resulting options for climate spatialization and its
advancements in terms of applicability, precision and temporal resolution,
using daily and sub-daily resolution ERA-Interim reanalysis as dynamic
forcings for statistical downscaling are presented in Böhner et
al. (2013) and Gerlitz (2014). Daily resolution values of climate variables
for Baden-Württemberg on a 50 m
Climate layers produced with SAGA, however, are seldom stand-alone results, but are mostly considered as basic predictor variables or forcings for further environmental analysis and modeling. Applications range from paleoclimate reconstructions (Aichner et al., 2010; Böhner and Lehmkuhl, 2005; Herzschuh et al., 2011, 2010; Wang et al., 2014) over environmental resource assessment and regionalization (Böhner and Kickner, 2006; Böhner and Langkamp, 2010; Kessler et al., 2007; Klinge et al., 2003; Lehmkuhl et al., 2003; Miehe et al., 2014; Soria-Auza, 2010) to climate impact assessment and soil erosion modeling (Böhner, 2004; Böhner et al., 2003; Böhner and Köthe, 2003; Conrad et al., 2006).
SAGA offers a large number of scientific methods and tools for multispectral,
hyperspectral and thermal remote sensing, including geometrical preprocessing
and spectral filtering techniques, multispectral and Fourier transformations,
supervised and unsupervised classification algorithms, change detection, as
well as segmentation methods for object-oriented image analysis (Blaschke,
2010). The filtering techniques include standard linear bandpass filters
(Gaussian, Laplacian, user defined) and nonlinear filters (majority, rank,
and the morphological filters dilation, erosion, opening, closing,
morphological gradient, top hat, and black hat). Besides different vegetation
indices, tasseled cap transformation, sharpening techniques for multi-sensor
data and principal component analysis for feature reduction are implemented.
Available classifiers comprise k-means (unsupervised), binary encoding,
parallelepiped, minimum and Mahalanobis distance, maximum likelihood,
spectral angle mapper, decision trees, random forest, support vector
machines, artificial neural networks, as well as ensemble-based classifiers,
while region growing and watershed algorithms are available for image
segmentation. Besides, some tools are available for the processing of data
from specific sensors, like calibration to reflectance and generation of
cloud masks for Landsat data. Furthermore, SAGA provides extensive tools for
processing LiDAR point data (cf. Sect. 3.4). The code of the remote sensing and
image processing tools is partly based on free libraries and open source
software including OpenCV (Open Source Computer Vision,
A number of remote-sensing-based studies using SAGA have been published in
different fields. In forestry, multispectral remote sensing data from
different times and sensors and fragmentation models were used to examine
deforestation and fragmentation of peat swamp forest covers in Malaysia
(Kamlisa et al., 2012; Phua et al., 2008). Bechtel et al. (2008) applied
different segmentation algorithms to identify individual tree crowns in very
high resolution imagery. Furthermore, a new vegetation map of the central
Namib was produced by a supervised classification of remote sensing data from
MODIS and ETM
The image processing capabilities have also attracted interest in other disciplines. Asmussen et al. (2015) developed a workflow for petrographic thin section images, which comprises the delineation of rock-forming minerals and the acquisition of various fabric parameters. The method is based on SAGA's seeded simple region growing algorithm to obtain flexible and precise object detection for any occurring mineral type in weathered sub-arkose sandstone material, and benefits from a reproducible and transparent GIS database.
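The general principle of seeded region growing can be illustrated with a short sketch: starting from a seed cell, neighbouring cells are added to the region as long as their value stays within a tolerance of the seed value. This is a generic textbook variant in plain Python, not the algorithm as implemented in SAGA; the 4-neighbourhood, the fixed tolerance and the small example image are assumptions made for this illustration.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a 4-connected region around `seed` (row, col).

    A neighbouring cell joins the region if its value differs from the
    seed value by no more than `tolerance`.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Hypothetical grey values: a dark grain surrounding brighter cells.
image = [
    [1, 1, 9],
    [1, 8, 9],
    [1, 1, 1],
]
grain = region_grow(image, seed=(0, 0), tolerance=1)
print(sorted(grain))
```

Each grown region corresponds to one detected object, from which shape and fabric parameters could then be derived.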
Besides its core applications, such as terrain analysis, geomorphometry, soil mapping and climate spatialization, SAGA has been used in numerous studies of very different fields and wide scope. One sub-discipline of physical geography that should be outlined separately, owing to a series of recent publications, is biogeography and particularly plant geography. Marini et al. (2007), e.g., studied the influence of local environmental factors on plant species richness in the Alps. Different terrain attributes like elevation or slope were tested in statistical models alongside other climatic or nutrient parameters, in order to explain biodiversity patterns on different Alpine meadows. A similar approach was taken by Marini et al. (2009), with a focus on the impact of farm size and terrain attributes on insect and plant diversity of managed grasslands in the Alps. Another example of a biogeographically relevant application can be found in Vanselow and Samimi (2011), who preprocessed a DEM by means of the fill sinks method and afterwards derived terrain attributes, e.g., altitude above channel network and slope, in order to designate potential future pastures in Tajikistan. Other environmental studies comprise Heinrich and Conrad (2008), who applied a cellular automaton approach to test the simulation of flow and diffusion dynamics in shallow water bodies, and Czegka and Junge (2008), who used SAGA as a mobile field tool in environmental geochemistry research activities (GPS coupled navigation on lakes, in situ observation monitoring). Liersch and Volk (2008) implemented a metric conceptual rainfall–runoff model including calibration tools within the SAGA framework with the intention of flood risk prediction.
Besides geoscience, GIS products are increasingly utilized in various fields dealing with spatially explicit data. For example, in archaeology, Bernardini et al. (2013) investigated airborne LiDAR-derived images in order to detect and monitor hitherto unknown anthropogenic structures and archaeological sites in the Trieste Karst landscape. Leopold et al. (2011) likewise produced a DEM based on LiDAR information and subsequently derived a high-resolution cross section with the objective of revealing the location of a former production site of bronze statues at the southern slope of the Acropolis of Athens. The approach was compared to other methods of demarcating prehistoric surface structures (e.g., magnetometry, electrical resistivity tomography, ground-penetrating radar) and proved to be suitable as an additional archaeological reconstruction tool. Kaye (2013) utilized SAGA-based terrain analysis, in particular the catchment water balance methodology and the implemented hydrological tools. By means of this GIS-based approach, the author reconstructed historical water availability in southern England as an important logistic factor for the Roman Army and contributed to finding the location of Boudica's last battle in AD 60 or 61.
Finally, two very specific examples for the utilization of SAGA should be highlighted, which particularly indicate the wide scope of the software. Different spatial detection techniques for radioactive matter randomly dispersed over an open area were tested for the purpose of quantifying dose rates, surface activities, mass concentrations in aerosols and their temporal and spatial distributions (Prouza et al., 2010). Most unusually, the morphometric investigation of chewing surfaces of animals should be emphasized, which was based on an automated analysis using specific interpolation and segmentation approaches (Czech, 2010).
Since the first public release in 2004, SAGA has very rapidly developed from rather specialized niche software for digital terrain analysis to a mature FOSS GIS platform, offering the entire spectrum of geodata analysis, mapping and modeling applications of contemporary GIS software. Right from the beginning, the multiple options of an object-oriented programming environment and the consequently modular architecture fostered the development of specific methods, often distinctly beyond off-the-shelf GIS products. In the present version 2.1.4, SAGA offers more than 600 tools, many of them reflecting a leading paradigm in the SAGA development: advancements in GIS development are only to be achieved if the development is closely embedded in research processes. Indeed, in responding to scientific questions and needs, SAGA is not only a research supporting tool but likewise its outcome, which complements related scientific publications with the ultimate documentation of the methodologies used in the form of source code.
Today SAGA is maintained and enhanced at the Institute of Geography, Physical Geography Section, at Hamburg University; however, a fast growing user community and developers all over the world contribute to its evolution through their specific needs and applications. For the near future, support for multidimensional raster data, e.g., addressing volumes, time series and hyperspectral data, as well as a stronger integration with DBMS like PostgreSQL/PostGIS, are envisaged to broaden the application of SAGA. A further challenging field of the coming SAGA development will be the enhancement of tools and methods supporting a scale-crossing amalgamation of climate and environmental modeling applications. Already today, SAGA enables the assimilation, dynamical representation and statistical downscaling of climate model outputs, required to interlink dynamical climate forcings with process models for case studies and climate impact analyses. By bridging both spatial scales and scientific disciplines, SAGA responds to the steadily increasing need for high-quality, spatially explicit data and information, ultimately tracing GIS to its roots. Indeed, already in 1965, Michael F. Dacey and Duane F. Marble stated in probably the first written document to use the term “Geographic Information System” that “the primary function of a Geographic Information System is to make spatially oriented information available in a usable form”. Happy 50th birthday, GIS!
The SAGA source code repository is hosted at
The authors would like to thank all SAGA developers for their contributions to the code and the documentation and numerous users for their valuable remarks. Furthermore, we thank I. Friedrich for editorial support with the manuscript. Finally, we thank M. F. Dacey and D. F. Marble for using the term “Geographic Information System” in their Technical Note “Some Comments on Technical Aspects of Geographic Information Systems” from 1965.
Edited by: S. Easterbrook