GO2OGS: a versatile workflow to integrate complex geological information

Introduction Conclusions References


Problem description and motivation
To gain understanding of phenomena in our environment, it is common practice to develop models (Kolditz et al., 2012b). Since real world problems show a high complexity, it is necessary for various disciplines to cooperate in order to achieve a proper description of different aspects of a model (Wycisk et al., 2009;Refsgaard et al., 2012;Sharpe et al., 2002). Yet, the various scientific disciplines use varying types of models CAD software in the SGrid format, into simulation models usable by the Finite Element simulator OpenGeoSys in the VTK framework format. Geological structure models are used, for instance, to gain better understanding of the development and formation of geological units. Based on the characteristics of the geological units, it is possible to create a hydrogeological simulation model to study the fluid flow in the subsurface.
Thus, a geological structure model can aid to obtain a valid simulation model, by providing the parametrization in a heterogeneous domain. A valid simulation model can help to understand various phenomena and makes predictions of the future possible.
Even if a simulation model includes only few parameters, it can be a challenging task to set up the model appropriately and solve it analytically. In most cases of simulation 15 models, analytical solutions are not available at all. To solve such problems, numerical methods need to be utilized; in this case, we therefore use OpenGeoSys. Similar to Park et al. (2014), who show an approach to integrate Petrel data into OpenGeoSys (PET2OGS), we developed the workflow GO2OGS for the transition from GOCAD to OpenGeoSys.
Thuringian Highlands in the Southeast.
On the basis of this geological structure model, various scientific aspects should be investigated. Among others, it was necessary to analyse the relevance of the faults for the groundwater flow within basin. To investigate this subject, we selected Open-GeoSys as a numerical modeling tool. OpenGeoSys (Kolditz et al., 2012a) is a well es- 10 tablished, open source, platform independent simulation package for coupled THM/C processes (thermal, hydraulic, mechanical and chemical processes). The numerical simulation model utilizes the finite element method (FEM) and has been applied at various instances in environmental sciences (Sun et al., 2011;Walther et al., 2012bWalther et al., , 2013Nagel et al., 2013;Maxwell et al., 2013;Kolditz et al., 2015). 15 Geological models can be built in several data formats; in this case a GOCAD stratigraphic grid format (SGrid) was generated for further usage. OpenGeoSys makes use of the VTK framework to read the mesh files in the VTU format.

Previous work and literature
As a link between geological and visualization sciences, ParaViewGeo 2 offers inter- 20 faces to many file formats used in the geological community. Amongst others, it is able to read GOCAD ASCII files. However, we were unable to utilize this for our purposes, as not all features of the GOCAD SGrid format, e.g. the so called split nodes, important to integrate the faults, are supported by the reading algorithms. We could also not continue the work of the ParaViewGeo GOCAD SGrid reader as ParaViewGeo, with the Introduction (2014), a converting algorithm from GOCAD SGrid models to FLAC3D simulation models is shown. Zehner et al. (2015) describe and compare three GOCAD internal workflows to generate a mesh suitable for numerical simulations of electromagnetic fields. The workflows use external open source meshing tools (e.g. GMSH or TetGen), or an interface to the modeling environment Ansys 3 . Recently, Qu et al. (2015) offered 10 a method to generate volumetric gridded fault zones by adding elements in the respective areas to the existing grids. This method, although shown to be applicable on a multitude of setups, will require an existing base mesh of good quality, requires the use of additional software, and is not open-sourced. Unfortunately, none of these tools satisfy every criterium mentioned before, i.e. publicly available algorithms, well docu-15 mented workflows, platform independency, as well as the ability to include faults in an unstructured mesh, which renders them unusable for our purposes. Numerical simulation studies on the hydrogeology in the Thuringian Syncline have been conducted by Rödiger (2005), who created a hydrogeological simulation model for the eastern part of Thuringian Syncline. Rödiger shows awareness of the existence of 20 faults but states that the used numerical modelling software (VISUAL MODFLOW) does not support to incorporate the faults. Furthermore, Zech (2013) investigated transport processes in the subsurface within the Thuringian Syncline. Zech used a workflow provided by Zehner (2011) to obtain two vertical 2-D cross sections from GOCAD for numerical simulations with OpenGeoSys. In this case, the data preparation was solely 25 done within GOCAD. Unfortunately, we were unable to base GO2OGS on this, as the reduced complexity of the 2-D setups did not incorporate the necessary features in a full 3-D domain. Introduction As far as the authors are aware, similar groundwater flow simulation codes, e.g. GMS 4 , Petrel, or FEFLOW, do not have a streamlined interface, that satisfies the afore mentioned requirements, to make use of the GOCAD SGrid format.

Structure of the paper
The conversion workflow to transform a geological structure mesh (SGrid) into a usable 5 simulation mesh (VTU) is sketched in Fig. 2, which simultaneously shows the structure of the paper. In the following, we use the notation , , and to refer to subfigures of Fig. 2 and steps of the workflow, respectively.
In order to foster and to encourage the interested scientific audience, the equivalent source code for the individual algorithms can be found in an open-access repository 5 , 10 and is available to the community via https://github.com/envinf/GO2OGS. In the paper, we will use the style GO2OGS: algorithm to refer to a specific algorithm.
After a brief description of the underlying mathematical equations, we introduce meshing terms and element quality measures. Based on this, we present an algorithm for the conversion of the GOCAD SGrid to an open data format mesh . By 15 evaluating the element quality of , we propose a reconstruction algorithm to obtain a corrected mesh . Finally, in a third section, we utilize to delineate a mesh that is used in an application of a regional scale groundwater flow modelling that makes use of the heterogeneous information initially obtained by the geological modelling in .

Fundamentals and methods
Partial differential equations are used as models for natural processes. For instance, a linear elliptic partial differential equation of second order defined on a domain Ω ⊂ R can be written as where a i j are coefficients, u is a potential and f describes the sources and sinks of a quantity. Furthermore conditions on the boundary Γ = ∂Ω have to be specified: where Eq. (1b) are denoted first type or Dirichlet boundary conditions and Eq. (1c) describes flow through the surface normal of the boundary, i.e. second type or Neumann boundary conditions. Unfortunately, even for this simple linear problem, an analytical solution does not exist. To solve the equation, a numerical method can be used com-10 puting an approximation to the solution.
In the following, we want to apply the Finite Element Method (FEM, Zienkiewicz et al., 2000), which has already been incorporated in the open-source simulation toolbox OpenGeoSys. In order to apply the FEM we need a partition of the domain Ω, where the problem Eq. (1) is defined on, i.e. we need to generate a mesh where the equation 15 can be parametrized.

Meshing and element quality
We used volume elements for the partition of a polyhedral approximation of the problem domain Ω into simpler parts. This partitioning is referred to as meshing. Introduction operations are not exact and the operations' (intermediate) results are rounded. For example, two nodes can be such close to each other, that during the computation of their distance, cancellation will occur, i.e. the difference of the operands of the subtraction contains a smaller number of accurate (significant) digits than the original operands. Additionally, the element quality depends on the physical process that should be simu-5 lated. For example, in transport process simulations, the Peclet number has to be acknowledged (see Johannsen et al., 2006;Johannsen, 2006;Graf and Degener, 2011). In general, mesh quality influences both, the accuracy and efficiency of a FEM. Therefore, the quality of the cells has to satisfy specific quality criteria. Knupp (2007) gives an overview on the relevance of quality criteria, among others, for FE methods.

Conclusions
The OpenGeoSys DataExplorer , a data exploration framework as part of the OpenGeoSys project, provides several quality measures. The Open-GeoSys DataExplorer can be used, for example, to calculate the edge aspect ratio r ea for an element e (face or cell) following 15 where i is the i th straight line segment and |e| is the number of segments of e. As a rule of thumb, elements with r ea 1 are unsuitable for the application of the FEM (Zienkiewicz et al., 2000).

Data structures and visualization
OpenGeoSys can read and write the file format VTU (Visualization Toolkit unstructured 20 grid) of the well documented Visualization Toolkit (VTK, Schroeder et al., 2006). The documentation for this open data format, as well as the source code of the VTK project is publicly available from the project web site 6 . We use ParaView (Ayachit, 2015)  visualize and evaluate the particular converting steps. ParaView is a widely used open source visualization toolbox, also in hydrogeological sciences Helbig et al., 2014;Walther et al., 2014). ParaView uses the VTK framework to apply various filters in a pipeline combining a large variety of visualization algorithms. 5 A GOCAD SGrid data set is comprised of several files. A central file manages all important metadata. This file (extension.sg) may already include the necessary data to describe the geological model and its properties, or may also refer to external data containers. In the following we will describe the methodology to read the GOCAD SGrid data and write to an open data format .

Read original GOCAD data
Algorithm 1 contains the main steps performed to convert the GOCAD SGrid data to an open data format. The algorithm was implemented within the OpenGeoSys project, i.e. it uses its data structures to read and convert the data. The algorithm as well as the definition and implementation of data structures used within the algorithm can be found  Executing steps 1-3 of the algorithm results in a structured grid (see Fig. 4a and b). After integrating the split nodes (step 7), the mesh becomes an unstructured grid (Fig. 4d). 5 We chose two different hydrogeological examples to study the applicability of the resulting meshes in FE simulations of the conversion workflow given through Algorithm 1. CAD model used in this study, and the geological background will be publicly available in the report TLUG (2015). In the following, we will restrain on relevant information for our workflow to acquire a FE mesh and set up a simulation model. The domain is comprised of 13, non-continuous stratigraphic units. Furthermore, the model domain contains 54 vertical faults. Converging stratigraphic units and faults, 5 that are realized through the integration of split nodes (compare Fig. 4d), led to corrupted mesh elements in the original GOCAD SGrid model, with e.g. non-planar element faces, or near-zero volume elements. Figure 5 visualizes a discontinuous stratigraphic unit in three levels of detail; at the finest level ( Fig. 5 right), one can recognize an example for a corrupted mesh element in the Rotliegendes (lower, brown layer).

Description of setups
To evaluate the overall quality of the mesh, we calculated the edge ratio criterion for every cell read from the original GOCAD SGrid (Fig. 6, gray bars). About 60 % of the mesh elements show an aspect ratio r ea ≤ 10 −2 ; as mentioned in Sect. 2.1, elements with r ea 1 are not suitable for FE simulations. Therefore, we will have to employ further conversion steps on Setup B to correct and provide a mesh that can be used in 15 a FE model.

Re-construction of a complex geological 3-dimensional model
To increase the mesh quality of the original GOCAD SGrid mesh, we will use the method of reconstructing the mesh, converting to (see Fig. 2). The method of reconstruction offers the ability to specify element dimensions prior to meshing and, thus, 20 positively influence mesh quality. The aim of the reconstruction is to achieve the closest possible approximation of the geological structures of the original GOCAD SGrid model preferably using a small number of high quality cells.
For reconstruction, one could employ TetGen (Si, 2013), a mesh generator that uses tetrahedra to construct a 3-D domain. In order to mesh the domain with TetGen, one 25 needs to provide piece-wise linear complexes along all outer and internal boundaries. The acquisition of these complexes is especially challenging in areas close to faults, where split nodes constitute discontinuities. Additionally, TetGen will generate an ex-6319 to satisfy element quality criteria. This will lead to too many elements to finish the FE simulations and acquire results within reasonable amounts of time. To encounter these challenges, we follow an alternative way for reconstructing the mesh domain by utilizing a hexahedra resampling method.

Reconstruction using hexahedra resampling
For the hexahedra resampling based reconstruction, we use the implementation provided by Algorithm 2. In the first step, employing GO2OGS: resample, a structured grid within the bound-10 ing box of the original unstructured grid is generated. Depending on the needs of the simulation, it is possible to adjust the resolution in the x, y, or z dimensions. Properties like the affiliation to a geological unit are transmitted from unstructured cells to the cells of the structured grid. In order to determine the property value of a cell c s in the structured grid we look for the cell c u of the unstructured grid that contains the 15 geometric central point of c s . The cell c s is assigned the same property value as is assigned to c u . Cells within the bounding box, but not located in the unstructured grid, are marked as invalid. They are preserved to maintain the structure of the original grid and to simplify the indexing.
In the second step, cells intersecting faults surfaces are marked using GO2OGS: areal element in the GOCAD SGrid model becomes a volume element in the structured grid.
In the last step, in GO2OGS: removeInvalidCells.py, cells marked as invalid in the first step are removed. The resulting unstructured grid is stored in the open data format VTU . 5 The selection of the resolution for the reconstruction influences several aspects, including (1) the element quality, (2) the number of cells needed for the domain approximation, (3) the level of detail of the geological information, and (4) the approximation quality for the investigated processes of the FEM. Issue (1) is relevant to ensure that the generated elements are not corrupted (i.e. r ea 1), and most often achieved by 10 increasing the number of elements for the representation of heterogeneous features. Yet, this is contrary to issue (2), which at first depends on the system resources of the computing machine (i.e. whether the simulation can be run at all), but eventually influences the runtime of the FE model. Furthermore, issue (3) refers to the fact that in order to omit loosing information of the topology of the geological units, an appropriate 15 z resolution depending on the information density of the underlying geological model is required. Exemplary for the representation of the geological features with respect to the resolution, Fig. 7a shows the original GOCAD mesh in comparison with two resolutions after reconstruction (Fig. 7b and c). Finally, (4) is a necessary consideration that will become important after the FE simulation has finished, i.e. whether the investigated 20 processes have been rendered correctly, or that numerical stability criteria were not violated (e.g. Peclet, Courant numbers).

Application example
In the preceding sections, we introduced a workflow to convert a heterogeneous GO-CAD mesh, as a result of geological modeling, into an input file for OpenGeoSys for 25 the application of the FEM method. In the following, we want to show how we used the converted mesh of Setup B in a catchment scale modeling, primarily as a proof of concept for the conversion workflow. As a secondary result, we expect first insight into the significance of flow regimes of the different hydrogeological units and faults. This should provide a basis for further investigations within the framework of the project Influins and future work in the study area. We want to simulate steady state, confined groundwater flow in a catchment of the 5 Thuringian Syncline. The chosen study area, the Unstrut catchment until the gauge Oldisleben (compare Fig. 1), is part of the Thuringian Syncline. Following Eq. (1), and neglecting storage for a long-term mean, the groundwater flow in porous media can be modeled by the equation with the conditions on the boundary:

Delineation of a catchment domain
To define the border of the numerical model, we had to cut the geological model of the Thuringian Syncline with the catchment boundary of the Unstrut river. Therefore, we

Parametrization of the simulation model
With specified boundaries, we could parametrize Eq. (2), and apply boundary conditions to the domain. For the proof of concept, and out of simplicity, the numerical 15 model was set up and parameterized based on plain, but reasonable assumptions. Doubtlessly, there is potential for improving the model setup and, through calibration techniques the results, to approach more realistic projections. Yet, this is beyond the scope of this work, but we will yield recommendations in an outlook for future modeling endeavours.

Fluid and medium properties
We assumed common constant values for dynamic viscosity µ = [1 × 10 −3 ] Pa · s, and density ρ = [1 × 10 3 ] kg · m −3 of water. As our model features a steady state simulation, the only additional parameter group that had to be defined was intrinsic permeability k F, L (with superscript F for faults and L for layers). Within the material groups, the parameters were set to a homogeneous value. We defined two distinct scenarios (S1, S2): S1 is the result of the presented workflow with a heterogeneous domain, with commonly cited values for the inherent geological units (Seidel, 2003). In scenario S1, we treated the faults to be impermeable (i.e. low permeability κ F κ L ). In contrast to S1, scenario S2 is defined through a homogeneous value for all material groups 5 providing a setup for assessing the importance of heterogeneities and faults within the chosen catchment on regional scale. Table 1 gives an overview for the used values, and scenarios of the parameter permeability.
As the Basement ("G") layer from the GOCAD model does not significantly contribute to the flow regime (personal communication with J. Geletneky, Thüringer Landesanstalt 10 für Umwelt und Geologie, TLUG), we neglected this material group in the setup. The respective elements were removed following a similar approach as described in Sect. 3.1 (GO2OGS: removeMeshElements). The same is probably true for material groups 10-12 (i.e. Zechstein and Upper Rotliegendes), but removing the associated elements resulted in a non-continuous mesh, i.e. holes in the domain; therefore, we did not re-15 move these elements.

Boundary conditions
Assuming that surface and subsurface catchments are generally identical, and that the basement ("G") layer is practically impermeable, we can set No-Flow boundary conditions after Eq. (2c), i.e. ∂ n h(x) = 0, x ∈ Γ, at the vertical borders and the bottom 20 of the chosen catchment domain.
A homogeneous groundwater recharge was assumed on the top of the model with [150] mm · a −1 , which lies within a common bandwidth for the area (Seidel, 2003;Rödiger, 2005). We introduced a top layer (see material group 13 in Table 1) with a high permeability to represent infiltrating capabilities of the soil, which are enforced by weather- 25 ing processes even on initially low permeable units that strike the ground surface. The top layer comprises all cells which own nodes that are exposed to the upper boundary of the mesh. 6324 We set Dirichlet-type boundary conditions along the main rivers, which are acting as major water sources or sinks in the area. The hydraulic head of the river nodes was set to the respective elevation of the river nodes (compare also light blue dots in Fig. 1 for position of river nodes).  Figure 8 shows the simulated hydraulic head in the model domain of the Unstrut catchment for scenario S1 (i.e. heterogeneous setup). The model provides a reasonable distribution of the hydraulic head; larger values can be observed in the higher southwestern and northern mountainous regions, while smaller values are calculated for the lower north-eastern area near the catchment outlet. The rivers, as boundary conditions, alternate between loosing and gaining sections. A comparison of the observed and simulated depth from the surface to the groundwater table (Fig. 9) affirms that the model S1 is generally capable to represent the regional groundwater level distribution in the domain; topographical effects are recognizable through larger distances in ele- 15 vated areas, and smaller distances in lower regions, whereas additionally, groundwater levels reach the ground surface in the vicinity to rivers.

Results
The difference between the two scenarios S1 and S2, i.e. the influence of a heterogeneous vs. a homogeneous parametrization, can be observed in Fig. 10 that shows the absolute difference of hydraulic heads from S1 and S2 at the top of layer 2. Although, 20 for the many areas of the domain, the differences are mostly small (< [2] m), simulations results vary in the northern part of the catchment, where the rivers as boundary conditions dominate the flow regime, and in the southern part, where faults significantly alter the hydraulic head.
The parametrization of the faults and its influence on the scenarios become even 25 more important, when observing flow patterns: Fig. 11 shows a particle tracking within an area where two faults reduce the available flow width (see Fig. 10  Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | the detailed view). In S1, Fig. 11a, the low-permeable faults act as barriers, limiting the pathways of the particles and directing the flow through a "bottleneck" in lower layers (yellow, green, blue, and purple pathlines). As the Top layer (MatGroup 13) was interpreted as an upper stratum being exposed to weathering processes with increased permeability, the faults are not hindering the flow paths and particles may pass (red 5 pathlines). Contrary, in the homogeneous case S2, Fig. 11b, flow paths are not limited by faults and show an exchange between the rivers as the overall flow regime is primarily governed by the rivers' boundary conditions.

Discussion
While the simulation generally yields reasonable results for the hydraulic head and its 10 distribution within the model domain, and the comparison to observed values of the depth to the groundwater table shows generally similar values, the model still offers large potential for improvement from a hydrogeological point of view (compare also Sect. 4.1). For example, groundwater levels in the upper elevations (northern areas) are currently governed by the 1st-type river boundary conditions; utilizing a 3rd-type 15 boundary condition might reflect a more realistic situation. Also, the varying influent and effluent conditions of the streams that are results of the simulations should be compared with observations, and e.g. the near-river hydrogeology adapted. Although the simulated particle flow paths most likely do not represent the local flow regime correctly, more importantly, the comparison of the scenarios underlines the relevance for

Conclusions
Dealing with complex subsurface structures, e.g. sedimentary basins such as the Thuringian Syncline, is still a challenge for process modelling, e.g. fluid dynamics of subsurface systems. Even though, sophisticated professional tools for structural mod-5 elling exist (e.g. GOCAD, PETREL), the translation of structural to accurate numerical models remains to be a demanding task due to different (i) classic, and even (ii) new aspects. (i) The complex-physics-simple-structure vs. complex-structure-simplephysics paradigm is still valid (Fletcher, 1991), may this concern e.g. data acquisition and model parametrization through data scarcity (Ritzema et al., 2010;Gräbe 10 et al., 2013), or model validity (Nguyen and de Kok, 2007). Also, (ii) modern geophysical exploration methods provide increasingly detailed information and high data accuracy, that adequate numerical representation is required (Thorleifson et al., 2010;Schmelzbach et al., 2011;Van Dam, 2012;Lucia et al., 2015). Thanks to the extending availability of supercomputing facilities, e.g. Peta-Flop architectures, solving numeri-15 cal models with 10 9 degrees-of-freedom became possible, offering a new dimension of hyper-resolution subsurface modelling (Wang et al., 2014). With respect to numerical modelling, we consider meshing as a non-trivial, complex task that needs special attention in order to provide a sound basis for FE simulations. With GO2OGS, we provide a complete workflow to utilize complex hydrogeological 20 information of a geological structure model for the setup of numerical subsurface simulation models (Fig. 2). Specifically, important achievements of this work are: -It has been shown that output of composite geological modeling tools (here GO-CAD) may include elements of bad quality; these meshes cannot directly be used in FEM simulation tools.

25
-Corresponding conversion algorithms have been developed that generate mesh domains of good quality and arbitrary spatial resolution. GOCAD SGrid input and write OpenGeoSys , as well as VTK framework files and .
-Converted results from could be directly used for numerical modelling; here for simulation of flow in small-scale sedimentary structures (Kunkel et al., 2013).
-As and are provided in an open data format, the presented workflow and 5 tools are potentially useful for other applications, aiding in fundamental research (McKenna et al., 2003;Pozdniakov et al., 2012), offering easier and straightforward integration of subsurface heterogeneous information (Walther et al., 2012b), or to increase prediction quality (Sutanudjaja et al., 2011;de Hoyos et al., 2012). Respective readers and conversion methods can easily be written follow-10 ing the VTK framework documentation.
-As a proof of concept for , we applied the developed workflow on a regional scale catchment (Unstrut river till gauge Oldisleben). Employing two different scenarios, we could show that the parametrization of faults in the study area is important for the regional and local flow regime. In particular, hydraulic head and particle flow 15 paths are altered by faults, which may effect, e.g. groundwater age, or contaminant spreading. Further investigations are needed to reduce uncertainty of the parametrization and the model.

Outlook
With the presentation and implementation of a general workflow for the setup of high-20 resolution numerical models based on structural information, this technical paper constitutes a prerequisite for further investigations. With the workflow, we are now able to tackle scientific questions related to a better understanding of the dynamics of the Thuringian Syncline's subsurface systems with accurate numerical subsurface models. Among others, relevant scientific question are: -What is the influence of tilted fractures, or the anisotropy of fractures on the communication between layers? Can isotope measurements (Matter et al., 2005;Tian et al., 2015) or simulation of groundwater age (Goode, 1996) help to identify the flow paths in the field?
-How important is the representation of rivers, i.e. 3rd-type boundary conditions to represent colmation of river sediments, or alternatively, what is the impact from the usage of the river network model output?
-In continuation of Zech (2013), can a regional-scale, heterogeneous, threedimensional subsurface model help to understand thermohaline effects?
In addition, we see the possibility to extend the workflow by employing automatic calibration methods to obtain better parameter estimates (hydraulic conductivity, porosity, leaching factor of rivers, recharge etc.), e.g. by utilizing PEST (Gallagher and Doherty,15 2007) as in Li et al. (2009);Sun et al. (2011). It is also possible to approach numerical improvements, e.g. to include other mesh elements in the reconstruction, which offers the possibility to reduce the mesh size, i.e. the number of elements, while still incorporating relevant structural small-scale features through local grid refinement. Finally, as the underlying geological model is subject to an active and constant update process, 20 the state-of-the-art information of a more detailed model may easily be incorporated into future numerical setups employing the workflow GO2OGS.  Waterloo, Canada, 19-22 July, IAHS Publ. 297, 159-168, 2004. 6313 Matter, J. M., Waber, H. N., Loew, S., and Matter, A.: Recharge areas and geochemical evolution of groundwater in an alluvial aquifer system in the Sultanate of Oman, Hydrogeol. J., 14,