Introduction
Turbidity currents are density currents driven by sediment particles that are
suspended by turbulence in the containing fluid. They occur
frequently throughout the Earth's oceans and are one of the main processes by
which sediment is moved from the continental shelf to the deep ocean. The
largest turbidity currents can involve several hundred cubic kilometres of
sediment and can travel for hundreds of kilometres
across the seabed at speeds of tens of metres per second.
This schematic representation of a dense gravity current (left) and a
corresponding depth-averaged shallow water approximation (right) shows
current height h, volume fraction c, depth-averaged volume fraction
ψ, velocity u, the forward component of the depth-averaged
velocity u, and deposit depth η.
The vast majority of the available data on turbidity currents are contained in
the sedimentary deposits that they leave behind. Significant effort is spent
on attempting to diagnose details about the turbidity current that produced
these deposits. describe the current theories on how
deposits found in the field are formed. The experimental evidence cannot yet
validate all of these theories. Computer models, along with laboratory
experiments, have been useful tools in improving our understanding of the
dynamics of turbidity currents. However, computer models have
not often been directly applied to recreating deposits found in the field,
despite their capacity to do so. They are generally applied to idealised
cases to understand a specific physical mechanism. It is useful to directly
apply models in an attempt to recreate real-world deposits, but this requires
good knowledge of the initial and boundary conditions and accurate estimates
of values for other controlling model parameters, which are often hard to
determine.
The task of obtaining a set of model input parameters based upon a desired
model output represents an inverse problem. It can also be interpreted as an
optimisation problem for which model parameters are sought to minimise the misfit
between the deposit profile generated by the model and a target deposit
profile, which is produced from measurements taken in the field.
In this paper, a shallow water model is used to simulate turbidity currents.
The shallow water equations are a set of partial differential equations
(PDEs). The optimisation of PDE-based models occurs throughout science and
engineering and is already applied, for instance, in ocean science,
renewable energy, and design problems. In addition, there is a growing
interest in applying inverse modelling techniques to the modelling of
turbidity currents. In particular, a gradient-free optimisation method has
been applied to reconstruct parameters for a turbidity model.
PDE models of turbidity currents require the definition of initial and
boundary conditions. In the simplest case, this could involve the definition
of a static lock-release laboratory configuration with uniform sediment depth
and a single, uniform sediment grain size. Such a simple configuration would
at least require the definition of the initial depth of the current,
the concentration of sediment in the fluid, the ratio of initial depth to length, and
the parameters controlling the particle settling velocity and flow front speed.
More realistic initial conditions would be an inflow condition with
time-varying depth, velocity, and concentrations of a wide range of sediment
grain sizes along with information defining the topography of the bed, its
composition, the parameterisations governing bed erosion rates, flow
rheology, and bedload transport. As the model complexity and the number of
choices for the boundary and initial conditions increase, the range of deposit
shapes that the model can generate also increases, so the model can better
recreate the range of deposits found in the field. However, with this added
complexity, the parameter space grows and manual tuning of the parameters
becomes a greater challenge.
This paper presents a shallow water sediment-laden density current model,
released under the name AdjointTurbidity 1.0, that uses a novel finite-element mixed discontinuous Galerkin function space with adaptive
time stepping (Sect. ). The model implementation is verified
through comparison with analytical solutions and convergence analyses
(Sect. ). The model is then differentiated using the adjoint
method, which is an efficient way of computing the sensitivity of a model
output to many input parameters (Sect. ). This enables the
use of fast-converging gradient-based optimisation techniques. Finally, a
gradient-based optimisation technique is applied to minimise the data misfit
between the modelled sediment deposit and field measurements taken in the
Miocene Marnoso-arenacea Formation (Sect. ). To the
best of the authors' knowledge, this paper represents the first published work
in which adjoint-based optimisation is applied to turbidity currents and
demonstrates the usefulness of these techniques for interpreting sedimentary
successions that have been deposited by turbidity currents.
Model
Shallow water models solve the Navier–Stokes equations in depth-averaged
form (Fig. ). They are a valid approximation when the
horizontal length scale, or the length of the current, is much larger than the
vertical length scale, or the height of the current. This is the case for all
sediment-laden density currents a short period after an initial
release. In this case, the vertical pressure gradients are in near hydrostatic
balance. Sediment in the current is assumed to be well mixed by the
turbulence in the flow such that there is a vertically uniform sediment
distribution.
Shallow water sediment-laden density current models come in a variety of
forms. proposed the “four-equation” model. This is a
complex model which accounts for entrainment of sediment from the bed and
entrainment of ambient fluid into the flow. It has an extra equation for the
internal kinetic energy of the flow, which is translated into potential
energy through these mixing processes. A drag force is applied along the
length of the current which takes into account the viscous forces impeding
the flow motion at the base and top of the flow. This model has been applied
to large-scale turbidity currents. It is dependent upon the selection of
numerous governing parameters and hence is a good case for
inverse modelling. A similar but slightly simplified model was used by
for modelling the plume of a dense pyroclastic
basal flow. This model also included a dense underflow that has been applied
in direct comparison with field measurements.
proposed one- and two-layer sediment-laden shallow water
density current models. The two-layer model includes equations for the motion
of the ambient fluid through which the density current is propagating. This
is important when the ambient fluid depth is similar to the initial current
depth. The one- and two-layer models presented by use a
coordinate system that adapts relative to the length of the flow. The moving
coordinate system allows the speed to be prescribed at the front of the
current. This speed can be well approximated using the Froude number, the
height, and the volume fraction of sediment in the current. This is a good
approach, as the speed of the front of a gravity current is governed by
dynamics that cannot be resolved by a vertically averaged model. The moving
coordinate system also results in a discretisation that scales with the
horizontal length scale of the flow. This is beneficial for capturing the
important flow features. This model has been used extensively in
understanding turbidity current flow characteristics, including the effects
of modelling polydisperse suspensions and the effects of external flow.
The model used in this paper is based upon the single
layer shallow water model of .
Governing equations
The equations governing the current column height, h, and the vertically
integrated momentum, q=uh, with u being the depth-averaged current
velocity, are described in non-dimensionalised conservative form
as
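$$
\frac{\partial h}{\partial t} + \frac{\partial q}{\partial x} = 0, \qquad
\frac{\partial q}{\partial t} + \frac{\partial}{\partial x}\left(\frac{q^2}{h} + \frac{\varphi h}{2}\right) = 0
$$
with boundary conditions
$$
q = 0 \ \text{at} \ x = 0, \qquad
q = \dot{x}_N h \ \text{at} \ x = x_N(t), \qquad
\text{given that} \ \dot{x}_N = \mathrm{Fr}\,\varphi^{1/2} \ \text{at} \ x = x_N(t),
$$
where $x_N$ is the location of the front of the current, $\dot{x}_N$ is the
velocity of the front of the current, $\mathrm{Fr}$ is the Froude number, and
$\varphi = \psi h$ is the vertically integrated volume fraction of sediment,
where $\psi$ is the depth-averaged volume fraction of sediment within the
flow. Through experimentation, the Froude number for a density current with
a head height < 0.075 of the total water depth has been found to be 1.19.
The evolution of $\varphi$ is described using
$$
\frac{\partial \varphi}{\partial t} + \frac{\partial}{\partial x}\left(\frac{q\varphi}{h}\right) = -\beta\frac{\varphi}{h},
$$
where $\beta$ is a constant particle-settling parameter. Hence, the gravitational
forcing term in (Eq. ) (the last term on the left-hand side) changes
with time as $\varphi$ is advected and settles out of the column.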
This single layer model ignores the effect of the motion of the overlying
fluid on the current. This approximation is valid for flows in which the maximum
column height is significantly less than the depth of the ambient fluid.
Viscous forces are also ignored. For high
Reynolds number flows, the viscous forces will be negligible in relation to
the buoyancy forces. found that this was valid while
the Reynolds number was greater than O(1).
The amount of deposited sediment, η, is also recorded and is calculated
using
$$
\frac{\partial \eta}{\partial t} = \beta\frac{\varphi}{h}.
$$
The model is non-dimensionalised with the length, time, and velocity scales
$h_0$, $(h_0/g_0')^{1/2}$, and $(h_0 g_0')^{1/2}$ respectively. Here,
$h_0$ is the dimensional depth of the initial sediment release,
$g_0' = \psi_0 g (\rho_p - \rho_a)/\rho_a$ is the
initial reduced gravity of the current, $\rho_p$ is the sediment
particle density, $\rho_a$ is the ambient fluid density, which is
assumed to equal the interstitial fluid density, $\psi_0$ is the initial
volume fraction of sediment, and $g$ is the acceleration due to gravity.
Finally, the volume fraction is scaled such that $\int_0^{x_N(0)} \varphi \, dx = 1$.
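As a rough illustration of these scales, the sketch below evaluates them for a hypothetical release. The values of `h0` and `psi0` and the default densities are assumptions chosen only for illustration; they are not values taken from this paper.

```python
import math

def scales(h0, psi0, rho_p=2650.0, rho_a=1000.0, g=9.81):
    """Return (g0', length, time, velocity) scales used to non-dimensionalise.

    h0 [m] and psi0 [-] are illustrative inputs; densities are in kg m^-3.
    """
    g0_prime = psi0 * g * (rho_p - rho_a) / rho_a   # initial reduced gravity
    length = h0                                      # length scale h0
    time = math.sqrt(h0 / g0_prime)                  # time scale (h0 / g0')^(1/2)
    velocity = math.sqrt(h0 * g0_prime)              # velocity scale (h0 * g0')^(1/2)
    return g0_prime, length, time, velocity

# Hypothetical release: 100 m deep with 1 % sediment by volume.
g0p, length_scale, time_scale, velocity_scale = scales(h0=100.0, psi0=0.01)
print(g0p, length_scale, time_scale, velocity_scale)
```

By construction the velocity scale multiplied by the time scale recovers the length scale, which is a quick consistency check on any chosen set of inputs.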
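Following , a coordinate transformation from $(x,t)$ to
$(y,\tau)$ is applied where $y = x/x_N(t)$ and $t = \tau$. This is a convenient
form for the equations as the front of the current is always at the right-hand
boundary of a fixed computational domain, and hence the boundary condition at
the front of the flow is applied at the right-hand side of the domain. The
transformed derivatives are given by
$$
\frac{\partial}{\partial t} = \frac{\partial}{\partial \tau} - \frac{y\dot{x}_N}{x_N}\frac{\partial}{\partial y}, \qquad
\frac{\partial}{\partial x} = \frac{1}{x_N}\frac{\partial}{\partial y}.
$$
Applying this coordinate transformation, but keeping $t$ in place of $\tau$
for notational simplicity following , produces
the system of equations
$$
\begin{aligned}
\frac{\partial h}{\partial t} &= \frac{1}{x_N}\left(y\dot{x}_N\frac{\partial h}{\partial y} - \frac{\partial q}{\partial y}\right),\\
\frac{\partial q}{\partial t} &= \frac{1}{x_N}\left(y\dot{x}_N\frac{\partial q}{\partial y} - \frac{\partial}{\partial y}\left(\frac{q^2}{h} + \frac{\varphi h}{2}\right)\right),\\
\frac{\partial \varphi}{\partial t} &= \frac{1}{x_N}\left(y\dot{x}_N\frac{\partial \varphi}{\partial y} - \frac{\partial}{\partial y}\left(\frac{q\varphi}{h}\right)\right) - \beta\frac{\varphi}{h},\\
\frac{\partial \eta}{\partial t} &= \frac{1}{x_N}y\dot{x}_N\frac{\partial \eta}{\partial y} + \beta\frac{\varphi}{h},\\
\frac{\partial x_N}{\partial t} &= \dot{x}_N
\end{aligned}
$$
with boundary conditions
$$
q = 0 \ \text{at} \ y = 0, \qquad
q = \dot{x}_N h \ \text{at} \ y = 1, \qquad
\eta = 0 \ \text{at} \ y = 1, \qquad
\text{given that} \ \dot{x}_N = \mathrm{Fr}\,\varphi^{1/2} \ \text{at} \ y = 1.
$$
Note that Eq. () for $x_N$ has been introduced to close
the system.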
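As a worked step, offered as a sketch that uses only the transformed derivatives defined above, substituting them into the mass conservation equation recovers the first equation of the transformed system (with $\tau$ subsequently written as $t$):

```latex
\begin{aligned}
0 &= \frac{\partial h}{\partial t} + \frac{\partial q}{\partial x}
   = \frac{\partial h}{\partial \tau}
     - \frac{y\dot{x}_N}{x_N}\frac{\partial h}{\partial y}
     + \frac{1}{x_N}\frac{\partial q}{\partial y} \\
\Rightarrow\quad
\frac{\partial h}{\partial \tau}
  &= \frac{1}{x_N}\left(y\dot{x}_N\frac{\partial h}{\partial y}
     - \frac{\partial q}{\partial y}\right).
\end{aligned}
```

The momentum and sediment equations follow in the same way, with the settling term left outside the factor $1/x_N$ because it contains no spatial derivative.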
It is now shown that the boundary conditions (see
Eqs. –) are sufficient to uniquely
solve this system. Equations ()–() are a
hyperbolic system of PDEs. For such a system to be well posed, there must be a
boundary condition for each inwardly propagating characteristic. This system
of equations has four characteristic velocities:
$$
\frac{dy}{dt} = c_\pm := \frac{1}{x_N}\left(u - y\dot{x}_N \pm \varphi^{1/2}\right), \qquad
\frac{dy}{dt} = c := \frac{1}{x_N}\left(u - y\dot{x}_N\right), \qquad
\frac{dy}{dt} = c_\eta := -\frac{1}{x_N}y\dot{x}_N.
$$
These are obtained using the method of characteristics, where $c_\pm$ is the
characteristic velocity of waves in shallow water, $c$ is the advection velocity
of sediment, and $c_\eta$ is the advection velocity of deposited sediment which
is advected away from the current head as the domain length increases.
Due to the boundary conditions on momentum, the following is true: $u = q/h = 0$
at $y = 0$ and $u = q/h = \dot{x}_N$ at $y = 1$. Hence, $c = 0$ at both $y = 0$ and
$y = 1$. Therefore, there are three inwardly propagating characteristics:
$c_+ = \varphi^{1/2}/x_N$ at $y = 0$, $c_- = -\varphi^{1/2}/x_N$ at $y = 1$, and
$c_\eta = -y\dot{x}_N/x_N$ at $y = 1$. Hence, three boundary conditions
are required for the problem to be well posed, and the three boundary
conditions (Eqs. –) are exactly what is
required.
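The counting argument above can be sketched numerically. The snippet below is a minimal illustration, not part of the model code: it evaluates the four characteristic velocities at each boundary and lists those that point into the domain. The values of `phi_left`, `phi_right`, and `x_n` are hypothetical.

```python
import math

NAMES = ("c+", "c-", "c", "c_eta")

def characteristic_speeds(y, u, phi, x_n, x_n_dot):
    """Return (c_plus, c_minus, c, c_eta) in the transformed coordinate y."""
    base = (u - y * x_n_dot) / x_n
    wave = math.sqrt(phi) / x_n
    return base + wave, base - wave, base, -y * x_n_dot / x_n

def inward_characteristics(phi_left, phi_right, x_n, x_n_dot):
    """List (name, boundary) pairs for characteristics entering the domain [0, 1]."""
    inward = []
    # Left boundary y = 0, where u = 0: inward means a positive speed.
    for name, speed in zip(NAMES, characteristic_speeds(0.0, 0.0, phi_left, x_n, x_n_dot)):
        if speed > 0.0:
            inward.append((name, 0.0))
    # Right boundary y = 1, where u = x_n_dot: inward means a negative speed.
    for name, speed in zip(NAMES, characteristic_speeds(1.0, x_n_dot, phi_right, x_n, x_n_dot)):
        if speed < 0.0:
            inward.append((name, 1.0))
    return inward

fr = 1.19                               # Froude number quoted in the text
phi_front = 0.5                         # illustrative value at the front
front_speed = fr * math.sqrt(phi_front)
result = inward_characteristics(phi_left=0.8, phi_right=phi_front,
                                x_n=2.0, x_n_dot=front_speed)
print(result)
```

For any positive front speed and positive values of the integrated volume fraction at the boundaries, the same three characteristics are returned, which matches the three boundary conditions imposed.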
Discretisation and numerical method
As the cell size grows throughout the simulation, it is possible to use a much
larger time step at the end of the simulation than at the start of the
simulation. To exploit this property, an adaptive time-stepping scheme is used in
this model. A new time-dependent variable, $\Delta t$, is introduced, which
varies according to a CFL criterion, $C$, based upon a velocity scale,
$\dot{x}_N$, and the mesh element size such that
$$
\Delta t = \frac{C\,\Delta x}{\dot{x}_N} = \frac{C\,x_N\,\Delta y}{\dot{x}_N},
$$
where $\Delta x$ is the mesh element size in $x$, and $\Delta y$ is the mesh
element size in the transformed coordinate system $y$. The time-dependent
model variables are defined as a vector
$$
U = \left(h, q, \varphi, \eta, x_N, \dot{x}_N, \Delta t\right)^T.
$$
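To make the time-step criterion concrete, a minimal sketch follows. The function name and the numerical values are hypothetical; only the formula itself comes from the text above.

```python
def adaptive_time_step(cfl, x_n, dy, x_n_dot):
    """Time step from the criterion dt = C * x_N * dy / x_N_dot."""
    if x_n_dot <= 0.0:
        raise ValueError("front speed must be positive")
    return cfl * x_n * dy / x_n_dot

# Early in a run the current is short and fast; later it is long and slow.
dt_early = adaptive_time_step(cfl=0.5, x_n=1.0, dy=0.01, x_n_dot=0.8)
dt_late = adaptive_time_step(cfl=0.5, x_n=10.0, dy=0.01, x_n_dot=0.4)
print(dt_early, dt_late)
```

With a fixed mesh in the transformed coordinate, the permitted step grows as the current lengthens and decelerates, which is the property the adaptive scheme is designed to exploit.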
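The system is discretised in time using a second-order explicit Runge–Kutta
time discretisation. An implicit term is added to the
semi-discrete system in order to solve for the diagnostic variables
$\dot{x}_N$ and $\Delta t$. With $U^n$ as the solution at the beginning of
the time step, $U^{n+1}$ as the solution at the end of the time step, and
$U^{(0)}$, $U^{(1)}$, and $U^{(2)}$ as intermediate values, the system of
equations (discretised in time) can be written as
$$
\begin{aligned}
U^{(0)} &= U^{n},\\
U^{(1)} &= A(U^{(0)}) + L(U^{(0)}) + K(U^{(1)}),\\
U^{(2)} &= A(U^{(1)}) + L(U^{(1)}) + K(U^{(2)}),\\
U^{n+1} &= \tfrac{1}{2}U^{(0)} + \tfrac{1}{2}U^{(2)},
\end{aligned}
$$
where
$$
\begin{aligned}
A(U) &= \left(h, q, \varphi, \eta, x_N, 0, 0\right)^T,\\
L(U) &= \Delta t\left[\frac{1}{x_N}\left(y\dot{x}_N\frac{\partial f_1(U)}{\partial x} - \frac{\partial f_2(U)}{\partial x}\right) + f_3(U)\right],\\
f_1 &= \left(h, q, \varphi, \eta, 0, 0, 0\right)^T,\\
f_2 &= \left(q, \frac{q^2}{h} + \frac{\varphi h}{2}, \frac{q\varphi}{h}, 0, 0, 0, 0\right)^T,\\
f_3 &= \left(0, 0, -\beta\frac{\varphi}{h}, \beta\frac{\varphi}{h}, 0, 0, 0\right)^T.
\end{aligned}
$$
$A(U)$ is non-zero where there is a time derivative term. $L(U)$ is the explicit
right-hand side term multiplied by $\Delta t$. $K(U)$ contains the implicit
right-hand side terms. $K(U)$ is most easily described in weak form, so
this is defined later.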
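The structure of the two-stage update can be illustrated on a scalar test problem. The sketch below is an assumption-laden simplification: it drops the implicit term K entirely and applies the remaining stages to the hypothetical ODE du/dt = -u, purely to show that the averaged update is second-order accurate.

```python
import math

def rk2_step(u, dt, rhs):
    """One step of the two-stage scheme, without the implicit term K."""
    u0 = u
    u1 = u0 + dt * rhs(u0)      # U(1) = A(U(0)) + L(U(0))
    u2 = u1 + dt * rhs(u1)      # U(2) = A(U(1)) + L(U(1))
    return 0.5 * u0 + 0.5 * u2  # U^{n+1} = (U(0) + U(2)) / 2

def integrate(n_steps, t_end=1.0):
    """Integrate du/dt = -u from u(0) = 1 to t_end with a fixed step."""
    u, dt = 1.0, t_end / n_steps
    for _ in range(n_steps):
        u = rk2_step(u, dt, lambda v: -v)
    return u

exact = math.exp(-1.0)
err_coarse = abs(integrate(50) - exact)
err_fine = abs(integrate(100) - exact)
order = math.log2(err_coarse / err_fine)
print(order)
```

Halving the step reduces the error by roughly a factor of four, so the observed order is close to two.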
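The spatially weak form of the semi-discrete system (Eq. ) is
obtained by multiplying by a test function $\Psi$ and integrating over the
domain $\Omega$. This gives the following: for all $\Psi$ in an appropriately
chosen test space,
$$
\begin{aligned}
\int_\Omega \Psi\cdot U^{(0)}\,d\Omega &= \int_\Omega \Psi\cdot U^{n}\,d\Omega,\\
\int_\Omega \Psi\cdot U^{(1)}\,d\Omega &= \int_\Omega \Psi\cdot A(U^{(0)})\,d\Omega + \int_\Omega \Psi\cdot L(U^{(0)})\,d\Omega + \int_\Omega \Psi\cdot K(U^{(1)})\,d\Omega,\\
\int_\Omega \Psi\cdot U^{(2)}\,d\Omega &= \int_\Omega \Psi\cdot A(U^{(1)})\,d\Omega + \int_\Omega \Psi\cdot L(U^{(1)})\,d\Omega + \int_\Omega \Psi\cdot K(U^{(2)})\,d\Omega,\\
\int_\Omega \Psi\cdot U^{n+1}\,d\Omega &= \frac{1}{2}\int_\Omega \Psi\cdot\left(U^{n} + U^{(2)}\right)d\Omega.
\end{aligned}
$$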
Piecewise linear discontinuous Galerkin (DG) elements are used to discretise the
spatially varying state variables. Thus, the spatial and temporal discretisations
both have second-order accuracy. DG element types are known to be particularly
suitable for advection dominated problems. They are good at
preserving discontinuities as they produce stable discretisations without the
need for diffusive stabilisation strategies, such as streamline upwinding.
These are important features in shallow water
particle-laden density current models.
In order to construct a DG formulation, a regular partition $T_h = \{e\}$ of
$\Omega$ into non-overlapping subdomains $\Omega_e \subset \Omega$
with boundaries $\partial\Omega_e$ is considered. The piecewise linear DG
function space is denoted $DG_1$. For this function space,
piecewise linear test functions with no global continuity requirement are
considered; i.e. functions that have the potential to be double valued on
$\partial\Omega_e$. $x_N$, $\dot{x}_N$, and $\Delta t$ are defined on a
function space, $R$, which is constant throughout the spatial
domain. Therefore, the vector of model unknowns $U$ is defined on a mixed
function space, $V_h = DG_1^4 \times R^3$. The test
function in the mixed function space is denoted by $\Psi_h \in V_h$ and the
discretised approximation of the state variable by
$U_h \in V_h$.
Notice that L(U) contains derivatives of discontinuous functions.
Its undiscretised weak form is
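$$
\int_\Omega \Psi\cdot L(U)\,d\Omega = \Delta t\int_\Omega \Psi\cdot\left[\frac{1}{x_N}\left(y\dot{x}_N\frac{\partial f_1(U)}{\partial x} - \frac{\partial f_2(U)}{\partial x}\right) + f_3(U)\right]d\Omega.
$$
The discretised DG formulation of Eq. () is then
$$
\sum_{e\in T_h}\int_{\Omega_e}\Psi_h\cdot L(U_h)\,d\Omega = \Delta t\sum_{e\in T_h}\int_{\Omega_e}\left[\Psi_h\cdot\frac{1}{x_N}\left(y\dot{x}_N\frac{\partial f_1(U_h)}{\partial x} - \frac{\partial f_2(U_h)}{\partial x}\right) + \Psi_h\cdot f_3(U_h)\right]d\Omega.
$$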
Integrating the gradient terms by parts and slightly rearranging yields
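$$
\begin{aligned}
\sum_{e\in T_h}\int_{\Omega_e}\Psi_h\cdot\frac{x_N}{\Delta t}L(U_h)\,d\Omega ={}& -\sum_{e\in T_h}\int_{\Omega_e}\frac{\partial}{\partial x}\left(\Psi_h y\dot{x}_N\right)\cdot f_1(U_h)\,d\Omega
+ \sum_{e\in T_h}\int_{\partial\Omega_e}\hat{\Psi}_h\cdot y\dot{x}_N\hat{f}_1(U_h)\,\hat{n}\,d\sigma\\
& + \sum_{e\in T_h}\int_{\Omega_e}\frac{\partial \Psi_h}{\partial x}\cdot f_2(U_h)\,d\Omega
- \sum_{e\in T_h}\int_{\partial\Omega_e}\hat{\Psi}_h\cdot\hat{f}_2(U_h)\,\hat{n}\,d\sigma\\
& + \sum_{e\in T_h}\int_{\Omega_e}\Psi_h\cdot x_N f_3(U_h)\,d\Omega,
\end{aligned}
$$
where $\hat{\cdot}$ indicates that the function is double valued and
special attention is required. The various summations can now be rewritten as
integrals over the entire domain $\Omega$, all element interfaces $\Sigma_h$,
and the domain boundaries $\Gamma_h$. Note that
$$
\sum_{e\in T_h}\int_{\partial\Omega_e}\Psi_h\cdot\hat{U}_h\,d\sigma \equiv \int_{\Sigma_h}\Psi_h\cdot\hat{U}_h\,d\sigma + \int_{\Gamma_h}\Psi_h\cdot U_0\,d\sigma,
\qquad \text{and} \qquad
\sum_{e\in T_h}\int_{\Omega_e}\Psi_h\cdot U_h\,d\Omega \equiv \int_{\Omega}\Psi_h\cdot U_h\,d\Omega.
$$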
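Additionally, note that within domain boundary integrals, the
$\hat{\cdot}$ notation is dropped as the function is single valued at
this location. $U_h$ is also replaced with $U_0$, which is either the
boundary value if a Dirichlet boundary condition is present or the function
value at the boundary if it is not. Note that in the case of the boundary
condition for $q$ at $y = 1$, $U_0$ is still a function of $U_h$. Applying
Eqs. () and () to () yields
$$
\begin{aligned}
\int_{\Omega}\Psi_h\cdot\frac{x_N}{\Delta t}L(U_h)\,d\Omega ={}& -\int_{\Omega}\frac{\partial}{\partial x}\left(\Psi_h y\dot{x}_N\right)\cdot f_1(U_h)\,d\Omega
+ \int_{\Sigma_h}\hat{\Psi}_h\cdot y\dot{x}_N\hat{f}_1(U_h)\,\hat{n}\,d\sigma
+ \int_{\Gamma_h}\Psi_h\cdot y\dot{x}_N f_1(U_0)\,n\,d\sigma\\
& + \int_{\Omega}\frac{\partial \Psi_h}{\partial x}\cdot f_2(U_h)\,d\Omega
- \int_{\Sigma_h}\hat{\Psi}_h\cdot\hat{f}_2(U_h)\,\hat{n}\,d\sigma
- \int_{\Gamma_h}\Psi_h\cdot f_2(U_0)\,n\,d\sigma\\
& + \int_{\Omega}\Psi_h\cdot x_N f_3(U_h)\,d\Omega.
\end{aligned}
$$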
A choice of flux term must be made to handle the double-valued terms. This will
involve some coupling between the elements on either side of the interface. An
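upwind flux is used for the advection term, $\hat{f}_1$, and based upon
experience an average flux works well for $\hat{f}_2$. This gives
$$
\begin{aligned}
\int_{\Omega}\Psi_h\cdot\frac{x_N}{\Delta t}L(U_h)\,d\Omega ={}& -\int_{\Omega}\frac{\partial}{\partial x}\left(\Psi_h y\dot{x}_N\right)\cdot U_h\,d\Omega
+ \int_{\Gamma_h}\Psi_h\cdot y\left(\dot{x}_N n\right)_{\mathrm{down}} f_1(U_0)\,d\sigma\\
& + \int_{\Sigma_h}\left(\Psi_h^{+} - \Psi_h^{-}\right)\cdot y\left(f_1^{+}(U_h)\left(\dot{x}_N n^{+}\right)_{\mathrm{up}} + f_1^{-}(U_h)\left(\dot{x}_N n^{-}\right)_{\mathrm{up}}\right)d\sigma\\
& + \int_{\Omega}\frac{\partial \Psi_h}{\partial x}\cdot f_2(U_h)\,d\Omega
- \int_{\Gamma_h}\Psi_h\cdot f_2(U_0)\,n\,d\sigma\\
& - \int_{\Sigma_h}\left(\Psi_h^{+} - \Psi_h^{-}\right)\cdot\frac{1}{2}\left(f_2(U_h)^{+} + f_2(U_h)^{-}\right)n^{+}\,d\sigma
+ \int_{\Omega}\Psi_h\cdot x_N f_3(U_h)\,d\Omega,
\end{aligned}
$$
where $(\cdot)^{+}$ and $(\cdot)^{-}$ indicate the function values on either side of an
interior element boundary. $(\cdot)_{\mathrm{up}}$ is equal to $(\cdot)$ where
$\dot{x}_N n^{\pm} > 0$ and 0 otherwise. Conversely, $(\cdot)_{\mathrm{down}}$ is equal
to $(\cdot)$ where $\dot{x}_N n^{\pm} < 0$ and 0 otherwise. $K(U)$ can be described in weak
discretised form as
$$
\begin{aligned}
\int_{\Omega}\Psi_h\cdot K(U_h)\,d\Omega &= \int_{\Omega}\Psi_h\cdot K_\Omega(U_h)\,d\Omega + \int_{\partial\Omega_R}\Psi_h\cdot K_\sigma(U_h)\,d\sigma,\\
K_\Omega(U_h) &= \left(0, 0, 0, 0, 0, 0, \frac{C x_N \Delta y}{\dot{x}_N}\right)^T,\\
K_\sigma(U_h) &= \left(0, 0, 0, 0, 0, \mathrm{Fr}\,\varphi^{1/2}, 0\right)^T,
\end{aligned}
$$
where $\partial\Omega_R$ is the right-hand boundary at $y = 1$ such that a solution for
$\dot{x}_N$ is obtained by solving only at the front of the
current.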
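The two interface flux choices can be sketched for a single scalar quantity at one interior interface in one dimension. This is only an illustration of the selection rules stated above for the "up" operator and the average flux; the function names are hypothetical and the snippet is not taken from the model implementation.

```python
def up(value):
    """The (.)_up operator: keep the value where it is positive, else 0."""
    return value if value > 0.0 else 0.0

def advective_interface_flux(f_plus, f_minus, speed, normal_plus):
    """Flux f^+ (speed n^+)_up + f^- (speed n^-)_up at an interior interface.

    normal_plus is the outward unit normal (+1 or -1 in 1-D) of the '+' cell;
    the '-' cell has the opposite normal.
    """
    normal_minus = -normal_plus
    return f_plus * up(speed * normal_plus) + f_minus * up(speed * normal_minus)

def average_interface_flux(f_plus, f_minus, normal_plus):
    """Average flux 0.5 * (f^+ + f^-) n^+ at an interior interface."""
    return 0.5 * (f_plus + f_minus) * normal_plus

# Illustrative values on either side of an interface.
flux_a = advective_interface_flux(2.0, 5.0, speed=1.0, normal_plus=1.0)
flux_b = advective_interface_flux(2.0, 5.0, speed=-1.0, normal_plus=1.0)
flux_c = average_interface_flux(2.0, 4.0, normal_plus=1.0)
print(flux_a, flux_b, flux_c)
```

Only one side contributes to the advective flux for a given sign of the speed, whereas the average flux always couples both neighbouring elements.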
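Using Eqs. (), (), and () and
applying Eq. (), the full weak, discontinuous form of
Eq. () can be obtained. Find $U_h^{(0)}, U_h^{(1)}, U_h^{(2)}, U_h^{n+1} \in V_h$
such that $\forall\,\Psi_h \in V_h$
$$
\begin{aligned}
\int_{\Omega}\Psi_h\cdot U_h^{(0)}\,d\Omega &= \int_{\Omega}\Psi_h\cdot U_h^{n}\,d\Omega,\\
\int_{\Omega}\Psi_h\cdot U_h^{(1)}\,d\Omega &= \int_{\Omega}\Psi_h\cdot A(U_h^{(0)})\,d\Omega + \int_{\Omega}\Psi_h\cdot L(U_h^{(0)})\,d\Omega + \int_{\Omega}\Psi_h\cdot K(U_h^{(1)})\,d\Omega,\\
\int_{\Omega}\Psi_h\cdot U_h^{(2)}\,d\Omega &= \int_{\Omega}\Psi_h\cdot A(U_h^{(1)})\,d\Omega + \int_{\Omega}\Psi_h\cdot L(U_h^{(1)})\,d\Omega + \int_{\Omega}\Psi_h\cdot K(U_h^{(2)})\,d\Omega,\\
\int_{\Omega}\Psi_h\cdot U_h^{n+1}\,d\Omega &= \frac{1}{2}\int_{\Omega}\Psi_h\cdot\left(U_h^{(0)} + U_h^{(2)}\right)d\Omega.
\end{aligned}
$$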
This set of equations is solved for each time step of the simulation as a
nonlinear variational problem using Newton's method with an LU decomposition
solver for the linear problems.
Schematic diagram of the lock-release static initial
condition (a) and the following dam-break (b) and
slumping (c) phases with the shock wave propagation direction
indicated by
().
Slope limiting
Discontinuous Galerkin discretisations for convection dominated problems can
suffer from over- and undershoots at discontinuities that can cause
instability problems. Slope limiting can be
applied to solve this problem, but this typically involves discontinuous
operations, which are problematic in a gradient-based optimisation framework.
Therefore, we do not use slope limiting here and limit ourselves to the
assumption of smooth initial conditions where slope limiting is not
necessary. It would be possible to formulate a continuous slope-limiting
function to overcome this limitation if it were required.
Implementation
The shallow water sediment-laden density current model described above was
built using the FEniCS framework, an open-source
software project that provides features for the automated, efficient solution
of differential equations. Using a high-level interface, the model partial
differential equations are described in variational form using UFL (Unified
Form Language). This can be achieved in Python or C++ code
in a way that is remarkably similar to how one would describe the equations
on paper. At runtime, this model description is compiled into efficient C++
kernels that handle the assembly of the required matrices to generate the systems
of equations that are then solved using PETSc.
Forward model verification
Many laboratory experiments and computer models are based around the
classical lock-release static initial condition (Fig. a).
Following the release of the lock-gate, the current accelerates forwards. This is
known as the dam-break stage (Fig. b).
As the lock-gate is released, a shock forms which travels in the opposite
direction to the front of the current. This shock carries information that
sets the fluid in motion. Once this shock reaches the rear wall, all of the
fluid behind the lock-gate is in motion. This marks the point of transfer
from the dam-break to the slumping phase
(Fig. c). For a non-depositional current (i.e. β=0)
with initial h=1 and xN=1, the slumping phase begins at t=1. The current
front height and velocity remain approximately constant during this phase of
motion. The rear propagating shock is reflected off the no-flow boundary
and travels faster than the front of the current. A short while later, it
reaches the front of the current, marking the end of the slumping phase. The
current is now able to “forget” the initial condition and begins adjusting
to self-similar propagation. For a non-depositional
current (i.e. β=0), the reflected shock reaches the front of the current
at t=3.
showed that a similarity solution, which is a solution that looks the same
at all times or at all length scales, could be obtained for a single layer shallow
water density current model during the self-similar phase
of propagation. This is described as
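$$
x_N = \kappa t^{2/3}, \qquad
u = \frac{2}{3}\kappa t^{-1/3}u_s, \qquad
h = \kappa^{-1}t^{-2/3}h_s, \qquad
\psi = 1,
$$
where
$$
y = \frac{x}{\kappa t^{2/3}}, \qquad
\kappa = \left(\frac{27\,\mathrm{Fr}^2}{12 - 2\,\mathrm{Fr}^2}\right)^{1/3}, \qquad
u_s = y, \qquad
h_s = \frac{4}{9}\kappa^3\left(\frac{y^2}{4} - \frac{1}{4} + \frac{1}{\mathrm{Fr}^2}\right).
$$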
Similarity convergence analysis. All variables are shown to converge
on the correct solution at the correct order. ϵ(*) indicates the
L2 norm of the error in the solution obtained for variable (*).
The domain is unit length, as in all cases for this model. This solution is
valid for the model described in this paper so long as the settling velocity
of particles, β, is equal to 0 (i.e. no particle settling). This
analytical solution is useful in verifying the implementation of the
governing equations and boundary conditions for this model. As the mesh
resolution is refined, the solution to the model PDEs should converge on this
analytical solution at the rate expected for the temporal and spatial
discretisations. The use of piecewise discontinuous linear elements and a
second-order time-stepping regime means that the convergence order should be
quadratic in both space and time.
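Two properties of the similarity solution can be checked directly, as in the sketch below. It evaluates the solution with the Froude number quoted earlier and confirms, under the stated scaling with ψ = 1, that the front condition holds and that the current volume integrates to one. The helper names and the choice of t = 3 are illustrative.

```python
import math

FR = 1.19  # Froude number quoted in the text

def kappa(fr=FR):
    """Similarity constant kappa = (27 Fr^2 / (12 - 2 Fr^2))^(1/3)."""
    return (27.0 * fr**2 / (12.0 - 2.0 * fr**2)) ** (1.0 / 3.0)

def similarity_solution(y, t, fr=FR):
    """Return (x_N, u, h) of the similarity solution at position y and time t."""
    k = kappa(fr)
    x_n = k * t ** (2.0 / 3.0)
    u_s = y
    h_s = 4.0 / 9.0 * k**3 * (y**2 / 4.0 - 0.25 + 1.0 / fr**2)
    u = 2.0 / 3.0 * k * t ** (-1.0 / 3.0) * u_s
    h = h_s / (k * t ** (2.0 / 3.0))
    return x_n, u, h

t = 3.0
x_n, u_front, h_front = similarity_solution(1.0, t)

# Front condition: u at y = 1 should equal Fr * phi^(1/2), with phi = h as psi = 1.
front_residual = u_front - FR * math.sqrt(h_front)

# Volume of the current, x_N * integral of h over y in [0, 1], by the midpoint rule.
n = 2000
volume = x_n * sum(similarity_solution((i + 0.5) / n, t)[2] for i in range(n)) / n
print(front_residual, volume)
```

Both checks are independent of the time chosen, which reflects the self-similar form of the solution.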
For the convergence test, the analytical solution is projected onto the model
function space forming the initial condition at t=3. At t=10, the L2
norm of the difference between the model variables and the analytical
solution is obtained and used to measure convergence. The analysis shows that
all variables converge on the analytical solution at the correct order
(Fig. ). Note that the time step is adaptive and
will therefore decrease along with the element size such that this test
checks both spatial and temporal convergence. This verifies that the model
equations are implemented correctly. A qualitative
comparison shows that the solution matches the analytical solution very well
(Fig. ).
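The observed order of convergence in a test of this kind can be computed from the errors on successive meshes. The sketch below uses synthetic errors that behave like 0.3 Δx², which are hypothetical and are not the results reported in this paper.

```python
import math

def observed_orders(element_sizes, errors):
    """Observed convergence order between successive mesh refinements."""
    orders = []
    pairs = list(zip(element_sizes, errors))
    for (dx0, e0), (dx1, e1) in zip(pairs, pairs[1:]):
        orders.append(math.log(e0 / e1) / math.log(dx0 / dx1))
    return orders

# Hypothetical L2 errors (not results from the paper), behaving like 0.3 * dx^2.
dx = [1.0 / 20, 1.0 / 40, 1.0 / 80, 1.0 / 160]
err = [0.3 * d**2 for d in dx]
orders = observed_orders(dx, err)
print(orders)
```

An order close to two on each refinement is what a second-order scheme in both space and time would be expected to produce when the time step shrinks with the element size.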
Similarity results for the finest resolution mesh (solid lines)
compared against the analytical results (dashed lines) at t=10.0.
The adjoint model
Here, we describe the adjoint model and its derivation generally rather than
specifically applying it to this model.
Consider a problem with N input parameters forming a vector, m. Let F(U,m)=0 denote the set of PDEs that describe a model where U represents the model
variables throughout time. Note that U can be seen as an implicit function of
m, U=U(m), by finding a solution to F(U,m)=0. Suppose now that
the aim is to minimise an objective functional, J(U,m), by optimising
m. Here, J(U,m) will be defined as a function measuring the difference
between a deposit profile generated by the model and a target deposit
profile. Where optimisation is required to a tolerance of $\delta m_i$, with
$i$ being the index of each parameter and where each parameter has bounds
spanning a range $\Delta m_i$, optimisation through a brute force approach will
require $\prod_{i}^{N}\Delta m_i/\delta m_i$ evaluations of the model to find the
solution. $N$ may be very large for a sediment-laden density current model, which
could potentially have time-varying boundary conditions for sediment
concentration, velocity and height, uncertainty in the elevation profile,
the friction coefficient of the surface over which the current is flowing, and
uncertainty in the parameters that govern the physics of the flow, such as entrainment of
ambient fluid, front speed, and sediment erosion. Many of these parameters vary
over space and time such that the parameter space grows as the resolution in
time or space increases. Such a large potential parameter space motivates the
use of a more advanced and efficient optimisation strategy.
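The growth of the brute force cost with the number of parameters can be illustrated with a short calculation. The ranges and tolerances below are hypothetical and serve only to show the scaling.

```python
import math

def brute_force_evaluations(ranges, tolerances):
    """Number of model runs for an exhaustive search: prod(range_i / tol_i)."""
    return math.prod(math.ceil(r / tol) for r, tol in zip(ranges, tolerances))

# Hypothetical bounds and tolerances: 100 candidate values per parameter.
n_three = brute_force_evaluations([100.0] * 3, [1.0] * 3)
n_ten = brute_force_evaluations([100.0] * 10, [1.0] * 10)
print(n_three, n_ten)
```

With 100 candidate values per parameter, three parameters already require a million model runs, and ten parameters require 10^20.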
Numerous algorithms have been developed to improve this brute force approach.
These optimisation algorithms begin with an initial guess of the input
parameters and iterate, generating improved estimates until they terminate,
hopefully at the optimised solution. The authors refer the reader to
for an extensive description of the range of numerical
optimisation methods.
Most of these optimisation algorithms require the gradient of the objective
functional with respect to the input parameters, dJ/dm.
Approximation techniques, such as finite differencing, could be used to
evaluate the gradient, but this will require an excessive number of PDE
evaluations and may suffer from noise. Here, the adjoint
model is used to efficiently calculate the gradient. This approach is
favoured as it calculates dJ/dm for any number of input
parameters with a single evaluation of the adjoint model.
Obtaining the adjoint model begins by applying the chain rule to
dJ/dm:
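$$
\frac{dJ(U(m),m)}{dm} = \left\langle\frac{\partial J}{\partial U}, \frac{dU}{dm}\right\rangle + \frac{\partial J}{\partial m}.
$$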
∂J/∂U and ∂J/∂m are both vectors,
and they are typically straightforward to compute as J is typically a given
analytical function of U and m. dU/dm, on the other
hand,
is a matrix that is typically dense and expensive to compute. A
relationship for dU/dm can be obtained by taking the total
derivative of F(U,m)=0 with respect to m:
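$$
0 = \frac{dF(U(m),m)}{dm} = \frac{\partial F}{\partial U}\frac{dU}{dm} + \frac{\partial F}{\partial m}
\quad\Rightarrow\quad
\frac{\partial F}{\partial U}\frac{dU}{dm} = -\frac{\partial F}{\partial m}.
$$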
Equation is termed the tangent linear
equation. ∂F/∂U and ∂F/∂m are
both matrices. The solution of this equation is obtained by solving N
systems of equations. When there are many functionals, J, and a small set
of parameters, m, then this equation can be useful for obtaining
dJ/dm via Eq. (). With a large set of
parameters and only one functional, as is the case here, this is not an
efficient approach.
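However, suppose that $\partial F/\partial U$ in
Eq. () is invertible so that one can obtain
$$
\frac{dU}{dm} = -\left(\frac{\partial F}{\partial U}\right)^{-1}\frac{\partial F}{\partial m}.
$$
This expression can be substituted for $dU/dm$ directly
into Eq. () to obtain
$$
\frac{dJ(m)}{dm} = -\left\langle\frac{\partial J}{\partial U}, \left(\frac{\partial F}{\partial U}\right)^{-1}\frac{\partial F}{\partial m}\right\rangle + \frac{\partial J}{\partial m}.
$$
A simple property of inner products, $\langle y, Ax\rangle = \langle A^{*}y, x\rangle$,
where $A^{*}$ is the conjugate transpose or adjoint of $A$, can be used to shift
$(\partial F/\partial U)^{-1}$ to the left-hand side of the inner product:
$$
\frac{dJ(m)}{dm} = -\left\langle\left(\frac{\partial F}{\partial U}\right)^{-*}\frac{\partial J}{\partial U}, \frac{\partial F}{\partial m}\right\rangle + \frac{\partial J}{\partial m}.
$$
Gathering the left-hand side of the inner product into a new variable,
$$
\lambda := \left(\frac{\partial F}{\partial U}\right)^{-*}\frac{\partial J}{\partial U},
$$
yields the linear system of equations that can be solved for the adjoint
variable, $\lambda$:
$$
\left(\frac{\partial F}{\partial U}\right)^{*}\lambda = \frac{\partial J}{\partial U}.
$$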
Equation () is termed the adjoint equation.
The right-hand side is a vector and only one evaluation is required to obtain
λ for a specific functional, J. Once
Eq. () is solved, dJ/dm can easily
be computed with respect to any parameter m by substituting the value of
λ into Eq. ().
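The derivation can be illustrated on a small algebraic problem. The sketch below is not the turbidity model: it assumes a hypothetical linear system F(U, m) = A U - B m = 0 with a least-squares functional, computes the gradient with one adjoint solve, and compares it against central finite differences, which need two forward solves per parameter.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_param = 6, 4

# Hypothetical linear model F(U, m) = A U - B m = 0 (illustrative only).
A = rng.normal(size=(n_state, n_state)) + n_state * np.eye(n_state)
B = rng.normal(size=(n_state, n_param))
d = rng.normal(size=n_state)           # synthetic "observations"

def solve_forward(m):
    """Solve F(U, m) = 0 for U."""
    return np.linalg.solve(A, B @ m)

def functional(m):
    """J = 0.5 * |U(m) - d|^2, which has no explicit dependence on m."""
    U = solve_forward(m)
    return 0.5 * np.sum((U - d) ** 2)

def adjoint_gradient(m):
    """Gradient dJ/dm from a single adjoint solve."""
    U = solve_forward(m)
    dJ_dU = U - d
    lam = np.linalg.solve(A.T, dJ_dU)  # adjoint equation: (dF/dU)^* lambda = dJ/dU
    dF_dm = -B
    return -dF_dm.T @ lam              # dJ/dm = -<lambda, dF/dm>; partial dJ/dm is zero

m = rng.normal(size=n_param)
grad_adj = adjoint_gradient(m)

# Central finite differences: 2 * n_param forward solves.
eps = 1e-6
grad_fd = np.array([
    (functional(m + eps * e) - functional(m - eps * e)) / (2 * eps)
    for e in np.eye(n_param)
])
print(np.max(np.abs(grad_adj - grad_fd)))
```

The cost of the adjoint gradient does not grow with the number of parameters, whereas the finite-difference cost grows linearly with it.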
As mentioned above, $\partial J/\partial U$ and $\partial J/\partial m$ are typically
straightforward to compute. However, $(\partial F/\partial U)^{*}$ and
$\partial F/\partial m$ still need to be
derived and implemented, which is not a simple task for a large set of complex
PDEs. The challenge of obtaining these matrices is the main obstacle to using
the adjoint model. However, the high-level abstraction of the coding provided
by FEniCS to create this model makes calculating $(\partial F/\partial U)^{*}$
and $\partial F/\partial m$ an automatable task using
an additional tool, dolfin-adjoint. This powerful tool
automatically derives the discrete adjoint and tangent linear models from a
forward model written in FEniCS. This makes differentiating the forward
model and solving the adjoint equation to obtain the derivative of the
objective functional a much simpler task. Additionally,
dolfin-adjoint contains tools for carrying out the optimisation of the model
parameters by interfacing with IPOPT optimisation
algorithms.
Estimation of the parameters for the turbidity
current that generated Bed 1.1 in the Marnoso-arenacea Formation
The Marnoso-arenacea Formation spans 17 to 7 Ma (Late Burdigalian to
Tortonian) and is over 3500 m thick. Deposition
occurred from two sources: the northwestern Alpine source and the
southwestern Apennine source. The
depositional environment was an elongated foreland basin adjacent to the
Apennine thrust belt with turbidites deposited in a relatively wide
(> 60 km) basin in a non-channelised manner. The formation provides the most
extensive and detailed correlation of flow deposits (beds) in any ancient
turbidite system and is therefore a natural laboratory for studying turbidite
depositional processes. It has been extensively mapped with
more than 100 sections accurately recorded over a correlated distance
of more than 120 km. Bed volumes range from
$O(10^{-3})$ km$^3$ to several km$^3$. It
contains extensive data for evaluating the performance of the adjointed
turbidity current model described here.
The sandstone depths measured for Bed 1.1 along the Pietralunga and
Ridracoli structural elements orientated approximately parallel to the
palaeoflow. This has been reconstructed from Fig. 5 in .
A fourth-order polynomial approximation of the deposit profile,
ηT, is also shown. This is used as a target for the optimisation
algorithm. The base of the bed is shown as a horizontal datum in order to
illustrate lateral changes in deposit thickness. Note that a different datum
is used in the source figure, which uses the top of Bed 1.2.2 rather than the
top of Bed 1. The palaeoelevation of the base of the bed would have varied
spatially, reflecting the basin floor relief.
In this section, an optimisation algorithm is used to select model parameters
that produce an output deposit that best matches part of Bed 1.1 in the
Marnoso-arenacea Formation, as recorded by . This is defined
as a small volume of flow deposit with a total sediment volume of
≈ 0.215 km3 . produced
an approximate one-dimensional deposit parallel to the palaeoflow
(Fig. ). The shape of the deposit strongly resembles
that of very low concentration currents in laboratory tests, and it also
resembles the shape of bed profiles generated by the
model. This implies that the flow that created this deposit was a very low
concentration current. The model used in this paper is very simple. It does
not model any stratification or particle–particle interactions in the flow.
As such, its application is limited to very low concentration flows, and hence
Bed 1.1 is a good candidate as a case study for this model.
The deposit consists of sandstone and mudstone components. The focus here
will be on attempting to recreate only the sandstone portion of the deposit.
It is likely that ponding effects have influenced the shape of the mud
deposit in this bed, which this model cannot replicate.
The outcrop quality also deteriorates beyond the extent of the sandstone
deposit. Therefore, no attempt is made to model this portion of the
bed.
Choice of initial conditions and parameters
The initial conditions are based upon the analytical solution for a
non-depositional flow at a non-dimensional time, t=ts=3, after a
column collapse as described by Eq. (). Some assumptions
are therefore made as to the initial shape, sediment concentration profile,
and velocity profile of the flow.
The non-dimensional particle-settling velocity, $\beta$, is calculated using the
standard Stokes settling law for a particle in suspension,
non-dimensionalised by $(h_0 g_0')^{1/2}$, to give
$$
\beta = \frac{g' D^2}{18\nu\,(h_0 g_0')^{1/2}} = \frac{g'^{1/2} D^2}{18\nu\,(h_0\psi_0)^{1/2}},
$$
where $D$ is the average sediment diameter and $\nu$ is the kinematic viscosity
of the fluid. The sediment-reduced gravity,
$g' = g_0'/\psi_0 = (\rho_p - \rho_a)g/\rho_a = 16$, is
based upon the reduced gravity of silica in water. Using these initial
conditions, there are three unknowns: $h_0$, the dimensional length scale of
the current; $\psi_0$, the initial sediment concentration throughout the
current; and $D$, the mean sediment diameter. These become the set of input
parameters that will be optimised with $m = (h_0, \psi_0, D)^T$.
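A short sketch of how the settling parameter depends on the three optimisation parameters is given below. The kinematic viscosity of 1e-6 m² s⁻¹ is an assumed value for water that the text does not specify, and the parameter values are hypothetical rather than the optimised values.

```python
import math

def settling_parameter(h0, psi0, D, g_prime=16.0, nu=1.0e-6):
    """Non-dimensional Stokes settling velocity beta.

    g_prime = 16 m s^-2 is the value quoted in the text; nu = 1e-6 m^2 s^-1 is
    an assumed kinematic viscosity for water, which the text does not specify.
    """
    stokes_velocity = g_prime * D**2 / (18.0 * nu)    # dimensional, m s^-1
    velocity_scale = math.sqrt(h0 * psi0 * g_prime)   # (h0 * g0')^(1/2), g0' = psi0 * g'
    return stokes_velocity / velocity_scale

# Hypothetical parameter values, not the optimised values from the paper.
beta = settling_parameter(h0=100.0, psi0=0.01, D=1.0e-4)
print(beta)
```

Because the grain diameter enters as a square, the settling parameter is far more sensitive to D than to the initial depth or concentration, which enter only through a square root.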
The beginning of the basin is defined as being at the front of the current at
t=ts such that the current is not in the basin prior to the start
of the simulation. The current enters the domain as soon as the simulation
starts. The end of the simulation, tf, is defined as the time at
which the total suspended sediment is less than 1 % of the starting
quantity.
Choice of optimisation functional
The aim here is to reduce the difference between the deposit profile generated
by this model and the target deposit profile from field measurements. To do
this,
we need to map the non-dimensional, transformed results from the model back to
the observation space. We also only measure the variation over the length of the
measured deposit. Therefore, the functional that we will aim to minimise, J,
has the form
$$J(U(m), m) = \int_0^{\hat{x}_{\max}} \left(\tilde{\eta} - \eta_T\right)^2 \mathrm{d}\hat{x},$$
where $\eta_T$ is the dimensional target deposit profile,
$\tilde{\eta} = \psi_0 h_0 \eta$ is the dimensionalised modelled
deposit, $\hat{x}_{\max}$ = 82 000 is the extent of the measured data, and
$\hat{x} = \tilde{x} - \tilde{x}_N(t_s)$ is a coordinate
transformation such that $\hat{x} = 0$ at the front of the current at
$t = t_s$. Here, $\tilde{x} = y\,\tilde{x}_N(t_f)$ is the dimensionalised
reverse of the coordinate transformation outlined in
Sect. , and $\tilde{x}_N(t) = h_0 x_N(t)$ is
the dimensional length of the current.
To calculate this functional, ηT must be a function of
x^. The deposit is approximated using a fourth-order polynomial
$$\eta_T = \sum_{i=0}^{4} c_i \hat{x}^i,$$
where ci is the ith coefficient. The coefficients are obtained using the
least squares method. The fourth-order approximation fits the measured data
points well (Fig. ).
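A least-squares polynomial fit of this kind can be sketched in plain Python by solving the normal equations. The data below are synthetic and purely hypothetical (they are not the Bed 1.1 measurements), and the coordinate is scaled to [0, 1] as an assumption made here to keep the normal equations well conditioned.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_polynomial(xs, ys, order=4):
    """Least-squares coefficients c_0..c_order of sum_i c_i * x**i."""
    n = order + 1
    normal = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    moments = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    return solve_linear(normal, moments)


def evaluate(coeffs, x):
    """Evaluate the polynomial with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


# Hypothetical target profile sampled at 21 points on the scaled coordinate.
true_coeffs = [0.4, -0.3, 0.2, -0.5, 0.2]
xs = [i / 20.0 for i in range(21)]
ys = [evaluate(true_coeffs, x) for x in xs]
fitted = fit_polynomial(xs, ys, order=4)
```

With noise-free synthetic data the fit recovers the generating coefficients. With real measurements, the residual between the fitted polynomial and the data points would indicate the quality of the approximation.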
It is important to note that at the end of the simulation, t=tf,
the length of the current does not necessarily match the length of the
measured deposit, i.e. $\hat{x}_N \neq \hat{x}_{\max}$, where $\hat{x}_N(t) = \tilde{x}_N(t) - \tilde{x}_N(t_s)$ is the dimensional length of the
modelled deposit within the basin. This complicates the calculation of the
above integral.
The calculation of J is split into two components: an integral, J0, over the lesser of
the length of the modelled current and the length of the measured data,
and an integral, J1, over any remaining length of measured data, such that
$$J = J_0 + J_1.$$
The first integral takes the form
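$$J_0 = \int_0^{\min(\hat{x}_N, \hat{x}_{\max})} \left(\eta_T(\hat{x}) - \tilde{\eta}(\hat{x})\right)^2 \mathrm{d}\hat{x}.$$
This can be approximately transformed into the model coordinate system as
$$J_0 = \int_\Omega \left(\gamma_0 \eta_T(\hat{x}(y)) - \gamma_0 \tilde{\eta}(\hat{x}(y))\right)^2 \mathrm{d}y,$$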
where γ0 is a scaled filter. This filter is 0 in the regions
x^<0 and x^>x^max. Elsewhere, the filter value is a
constant such that the integral of the filter over the domain is equal to the
length of the dimensional integral, min(x^N,x^max). The filter defines the region of the domain over which the integral is
evaluated and scales the resultant value appropriately. The filter is defined
as
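$$\gamma_0(\hat{x}) = \min(\hat{x}_N, \hat{x}_{\max}) \frac{\exp\left(\min(\hat{x} - \hat{x}_{\max}, \hat{x}, 0)\right)}{s_{\gamma_0}}, \qquad s_{\gamma_0} = \int_\Omega \exp\left(\min(\hat{x} - \hat{x}_{\max}, \hat{x}, 0)\right) \mathrm{d}\hat{x}.$$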
It is important that the functional is differentiable. Therefore, the min and
max functions are replaced by smooth approximations fmin and fmax
defined as
$$f_{\min}(a, b) = -\frac{\ln\left(\exp(-10a) + \exp(-10b)\right)}{10}, \qquad f_{\max}(a, b) = -f_{\min}(-a, -b).$$
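A smooth minimum and maximum of the log-sum-exp type can be sketched as follows. The sign convention is an assumption made here so that the smooth minimum lies just below the true minimum, and the shift by the smaller argument is a standard rearrangement to avoid floating-point overflow. The function names are hypothetical.

```python
import math

SHARPNESS = 10.0  # the factor 10 that appears in the smooth approximations


def smooth_min(a, b, k=SHARPNESS):
    """Differentiable approximation to min(a, b); approaches it from below."""
    shift = min(a, b)  # subtracting the minimum keeps every exponent <= 0
    total = math.exp(-k * (a - shift)) + math.exp(-k * (b - shift))
    return shift - math.log(total) / k


def smooth_max(a, b, k=SHARPNESS):
    """Differentiable approximation to max(a, b); approaches it from above."""
    return -smooth_min(-a, -b, k)


well_separated = smooth_min(1.0, 2.0)   # very close to 1.0
equal_arguments = smooth_min(0.0, 0.0)  # offset by ln(2) / 10 from 0.0
```

The approximation error is largest when the two arguments are equal, where it is ln(2)/10, and decays exponentially as the arguments separate.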
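The second integral, J1, takes the form
$$J_1 = \int_{\min(\hat{x}_N, \hat{x}_{\max})}^{\hat{x}_{\max}} \eta_T(\hat{x})^2 \,\mathrm{d}\hat{x},$$
such that J1 integrates the target deposit volume beyond the extent of the
modelled deposit. If the modelled deposit length exceeds the length of the
measured data, this integral will be 0. Again, this can be approximately
transformed into the model coordinate system as
$$J_1 = \int_\Omega \gamma_1 \eta_T(\tilde{x}(y))^2 \,\mathrm{d}y, \qquad \tilde{x}(y) = \min\left(\hat{x}_N + y(\hat{x}_{\max} - \hat{x}_N), \hat{x}_N\right),$$
where γ1 is a scaled filter similar to γ0 defined as
$$\gamma_1(\hat{x}) = \left(\hat{x}_{\max} - \min(\hat{x}_N, \hat{x}_{\max})\right) \cdot \frac{\exp\left(\min(\hat{x} - \hat{x}_{\max}, 0)\right)}{s_{\gamma_1}}, \qquad s_{\gamma_1} = \int_\Omega \exp\left(\min(\hat{x} - \hat{x}_{\max}, 0)\right) \mathrm{d}\hat{x}.$$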
Again, min and max are replaced by smooth differentiable alternatives.
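The split of the misfit into J0 and J1 can be illustrated directly in the dimensional coordinate. The sketch below is a simplification: it uses a hard minimum and a trapezoidal rule rather than the scaled filters and smooth approximations described above, and all names and profiles are hypothetical.

```python
def trapezoid(func, lower, upper, intervals=2000):
    """Composite trapezoidal rule; returns 0 for an empty interval."""
    if upper <= lower:
        return 0.0
    width = (upper - lower) / intervals
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, intervals):
        total += func(lower + i * width)
    return total * width


def misfit_components(eta_model, eta_target, x_n, x_max):
    """Return (J0, J1) for a modelled deposit of length x_n.

    J0: squared difference over the overlap [0, min(x_n, x_max)].
    J1: squared target thickness over the remainder [min(x_n, x_max), x_max].
    """
    x_split = min(x_n, x_max)
    j0 = trapezoid(lambda x: (eta_target(x) - eta_model(x)) ** 2, 0.0, x_split)
    j1 = trapezoid(lambda x: eta_target(x) ** 2, x_split, x_max)
    return j0, j1


# Hypothetical linear target profile over 100 length units.
def target(x):
    return 1.0 - x / 100.0


# A modelled deposit that matches the target but stops halfway.
j0_short, j1_short = misfit_components(target, target, 50.0, 100.0)
```

In this example J0 vanishes because the profiles agree over the overlap, while J1 penalises the part of the target deposit that the modelled deposit does not reach.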
Verification of the gradient calculation
The gradient computation was verified using the Taylor remainder convergence
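test. Let $\hat{J}(m) \equiv J(U(m), m)$ be a pure function
of m. The zero-order Taylor expansion states that
$$\left|\hat{J}(m + \delta m) - \hat{J}(m)\right| = O\left(\|\delta m\|\right),$$
while the first-order Taylor expansion states that
$$\left|\hat{J}(m + \delta m) - \hat{J}(m) - \frac{\mathrm{d}\hat{J}(m)}{\mathrm{d}m}\delta m\right| = O\left(\|\delta m\|^2\right).$$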
Even small errors in the derivative destroy the second-order convergence in
Eq. (). Therefore, testing the convergence of these
expansions with the gradient calculated from the adjoint yields a strong
indicator of whether the adjoint gradient computation is correct.
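Taylor remainders $R_0 = \left|\hat{J}(x_N(0) + \delta x_N) - \hat{J}(x_N(0))\right|$ and $R_1 = \left|\hat{J}(x_N(0) + \delta x_N) - \hat{J}(x_N(0)) - \frac{\mathrm{d}\hat{J}(x_N(0))}{\mathrm{d}x_N(0)}\delta x_N\right|$ for the model with the functional
given by $J(\eta) = \int_\Omega \eta(t = t_f)^2 \,\mathrm{d}\Omega$.

δxN        R0(δxN)          order    R1(δxN)          order
1.0        1.16 × 10^-8     -        2.67 × 10^-9     -
0.5        5.11 × 10^-9     1.18     6.54 × 10^-10    2.03
0.25       2.39 × 10^-9     1.01     1.56 × 10^-10    2.07
0.0125     1.15 × 10^-9     1.05     3.54 × 10^-11    2.14
0.00625    5.64 × 10^-10    1.03     6.97 × 10^-12    2.34
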
The above convergence test was successfully carried out for the implemented adjoint
model with a number of different controls and functionals. As an example,
Table shows the results with m as the initial
deposit length. The convergence of the remainder term is second order
with respect to varying magnitudes of δm, providing strong evidence
that the adjoint model and gradient computation are implemented correctly.
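The Taylor remainder convergence test can be sketched on a toy scalar problem. The functional used here (the exponential function, whose derivative is known exactly) is a stand-in chosen for illustration and is unrelated to the turbidity current model; all names are hypothetical.

```python
import math


def taylor_remainders(functional, gradient, m, deltas):
    """Return the zero-order (R0) and first-order (R1) remainders."""
    j_m = functional(m)
    dj_dm = gradient(m)
    r0 = [abs(functional(m + d) - j_m) for d in deltas]
    r1 = [abs(functional(m + d) - j_m - dj_dm * d) for d in deltas]
    return r0, r1


def observed_orders(remainders, deltas):
    """Convergence order between successive perturbation sizes."""
    return [
        math.log(remainders[i - 1] / remainders[i]) / math.log(deltas[i - 1] / deltas[i])
        for i in range(1, len(deltas))
    ]


# Halve the perturbation at each step.
deltas = [1.0e-2 / 2 ** i for i in range(5)]

# Correct gradient: R0 converges at first order, R1 at second order.
r0, r1 = taylor_remainders(math.exp, math.exp, 0.5, deltas)
orders_r0 = observed_orders(r0, deltas)
orders_r1 = observed_orders(r1, deltas)

# A gradient that is wrong by 1 % destroys the second-order convergence.
_, r1_wrong = taylor_remainders(math.exp, lambda m: 1.01 * math.exp(m), 0.5, deltas)
orders_wrong = observed_orders(r1_wrong, deltas)
```

The last case illustrates why the test is a sensitive check: with even a 1 % error in the derivative, the observed order of R1 falls back towards 1.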
Optimisation of a model with one sediment class
With confidence that the forward and backward models are working, optimisation
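of the input parameters, $m = (h_0, \psi_0, D)^T$, to minimise the
objective functional J can now be performed as
$$\min_{m} J(U(m), m),$$
with the following bounding constraints on the input parameters:
$$10\,\mathrm{m} \le h_0 \le 10\,\mathrm{km}, \qquad 0.001\,\% \le \psi_0 \le 50\,\%, \qquad 1\,\mu\mathrm{m} \le D \le 1\,\mathrm{mm}.$$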
These bounding constraints are chosen based upon very loose limits of expected
values that each parameter may possibly take. The principal purpose of these
bounds is to avoid invalid negative values being generated for any of the
parameters.
The nonlinear optimisation library, IPOPT , is used to
solve this problem. This library implements a primal-dual interior point
algorithm which has good global and local convergence properties
. The interface to this library is supplied by
dolfin-adjoint . The initial input parameters are set to
$$m = \begin{pmatrix} h_0 = 2.3\,\mathrm{km} \\ \psi_0 = 0.07\,\% \\ D = 200.0\,\mu\mathrm{m} \end{pmatrix}.$$
The aim is to recreate the sand deposit by modelling only the sand in the
flow using a single average grain size, D. The value of ψ0 is based
upon a combined initial volumetric concentration for the sand and mud mixture
of 0.5 %, with 86 % of the mixture being mud. The starting value for
h0 is based upon the area of the two-dimensional deposit profile and the
value of ψ0. The average sediment grain size is a reasonable estimate
of the average grain size based upon the information provided by
. The input parameters provided to the optimisation
algorithm, $\bar{m}$, are normalised by their initial values such that they are all equal to
1 at the start of the optimisation. Thus, element-wise,
$$m_i = \bar{m}_i m_0,$$
where m0 indicates the initial parameter values in Eq. (), and
mi indicates the value of m after optimisation iteration i. This
scaling helps the optimisation algorithm work effectively
.
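The choice of ψ0 and the normalisation of the parameters can be sketched as below. The dictionary layout and function names are hypothetical and do not reflect the dolfin-adjoint or IPOPT interfaces; the numerical values are the initial values and bounds quoted in the text, converted to SI units and fractions.

```python
# Sand concentration: 0.5 % combined concentration, 86 % of which is mud.
COMBINED_CONCENTRATION = 0.5e-2
MUD_FRACTION = 0.86
psi0_sand = COMBINED_CONCENTRATION * (1.0 - MUD_FRACTION)  # about 0.07 %

INITIAL = {"h0": 2300.0, "psi0": 0.07e-2, "D": 200.0e-6}
BOUNDS = {
    "h0": (10.0, 10.0e3),
    "psi0": (0.001e-2, 50.0e-2),
    "D": (1.0e-6, 1.0e-3),
}


def normalise(m, m0):
    """Scale each parameter by its initial value."""
    return {name: m[name] / m0[name] for name in m0}


def denormalise(m_bar, m0):
    """Recover dimensional parameters from normalised ones."""
    return {name: m_bar[name] * m0[name] for name in m0}


def normalised_bounds(bounds, m0):
    """Express the bounding constraints in normalised form."""
    return {name: (lo / m0[name], hi / m0[name]) for name, (lo, hi) in bounds.items()}


start = normalise(INITIAL, INITIAL)        # every entry equals 1
limits = normalised_bounds(BOUNDS, INITIAL)
```

After normalisation all three controls start at 1, so the optimiser works with quantities of comparable magnitude even though the dimensional values span roughly seven orders of magnitude.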
Values of the parameters over the optimisation iterations against the
value of the objective functional, J, that we are aiming to
minimise. The values shown are normalised by their starting values. (*)n is
the value of parameter * at the start of iteration n.
The dimensional deposit output η̃0 from the initial parameter
guess (Eq. ) and the optimised dimensional deposit output
η̃ from the optimised parameter (Eq. )
shown against the field measurement from Bed 1.1 and the
fourth-order polynomial target deposit profile, ηT.
The criterion for finishing the parameter optimisation is based upon the relative
change in J between iterations such that
$$\left|\frac{J_i - J_{i-1}}{J_i}\right| < 1.0 \times 10^{-5},$$
where Ji is the value of J after the ith iteration. The optimisation
is completed in 21 iterations with a final functional value of J=1.75
(Fig. ). The optimised deposit profile, η,
compares relatively well with ηT
(Fig. ). Most notably, there is a significant
variation in the thickness towards the end of the deposit. This will be
addressed later. The final optimised values are
$$m = \begin{pmatrix} h_0 = 2.56\,\mathrm{km} \\ \psi_0 = 0.0494\,\% \\ D = 103\,\mu\mathrm{m} \end{pmatrix}.$$
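The relative-change stopping rule used for this optimisation can be sketched as a small helper. The sequence of functional values below is hypothetical and is not the history of the optimisation reported here.

```python
def has_converged(j_previous, j_current, tolerance=1.0e-5):
    """True when the relative change in J falls below the tolerance."""
    return abs((j_current - j_previous) / j_current) < tolerance


def first_converged_iteration(history, tolerance=1.0e-5):
    """Index of the first iteration that satisfies the stopping rule."""
    for i in range(1, len(history)):
        if has_converged(history[i - 1], history[i], tolerance):
            return i
    return None


# Hypothetical functional values over successive iterations.
history = [10.0, 4.0, 2.0, 1.76, 1.75001, 1.75]
stop_at = first_converged_iteration(history)
```

A rule of this kind stops on stagnation of J rather than on a small gradient, so it cannot by itself distinguish a local minimum from a global one.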
These optimised values are not completely acceptable. The value for h0
represents the initial height of the current starting from a static
lock-release initial condition. This translates to an initial current height
of 993.3 m at the start of this simulation and as the current enters the
basin plain. This value appears to be quite large for a relatively small
turbidity current. Additionally, the average sediment diameter of
103 µm is lower than expected. defines the
sandstone interval as dominated by sediment grains estimated to be
larger than ≈ 125 µm.
With the exception of the sediment diameter, the optimised values
are fairly similar to those chosen as input values. This suggests that the
input parameters chosen were sensible predictions of the starting conditions
for the gravity current. To test this hypothesis, we ran the same optimisation starting from
a number of alternative initial parameter values. We found that there are
a number of local minima. An optimisation with initial values
$$m = \begin{pmatrix} h_0 = 3.0\,\mathrm{km} \\ \psi_0 = 0.02\,\% \\ D = 100.0\,\mu\mathrm{m} \end{pmatrix}$$
converged to
$$m = \begin{pmatrix} h_0 = 3.95\,\mathrm{km} \\ \psi_0 = 0.02\,\% \\ D = 154.0\,\mu\mathrm{m} \end{pmatrix}.$$
Figure shows a comparison of the two generated
deposit profiles. The two profiles are very similar, even though the
alternative profile is created by a much larger, but much less dense, initial
current.
The dimensional deposit output η̃ from the optimisation started
from the initial parameter guess (Eq. ) and the dimensional
deposit output η̃alt from an alternative minimum reached
by starting from the initial parameter guess (Eq. ), shown against the
fourth-order polynomial target deposit profile, ηT.
The existence of alternative minima must always be considered when running
optimisations of this type. It is important to have a good understanding of
the problem to choose sensible initial starting conditions and also to assess
the resultant optimised values. A regularisation approach would avoid this
problem but assumes prior knowledge about the target profile.
A clear omission from the model is the presence of mud in the suspension. The
presence of mud will significantly alter the energy budget of the flow. A mud
sediment class can easily be included so that the model produces more
realistic optimised values. This is detailed below.
Extending the model to include an additional sediment class for the mud in suspension
Investigating the effect of including mud in the sediment mixture can be
achieved relatively simply by including an additional transport equation with
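a form identical to Eq. (),
$$\frac{\partial \varphi_m}{\partial t} = \frac{1}{x_N}\left(y \dot{x}_N \frac{\partial \varphi_m}{\partial y} - \frac{\partial}{\partial y}\left(\frac{q \varphi_m}{h}\right)\right) - \beta_m \frac{\varphi_m}{h},$$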
where φm=ψmh is the vertically integrated
volume fraction of mud in the suspension. ψm is the
depth-averaged volume fraction of mud within the flow, and βm
is the settling velocity of the mud particles. Using a single tracer
equation,
we approximate the distribution of mud particle sizes using a single mud
diameter, the same way the distribution of sand is modelled in the
flow. We neglect the flocculation of mud particles. Assuming that the density of
both sediment classes is the same, (Eq. ) is modified to
include this new sediment class in the gravity term:
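$$\frac{\partial q}{\partial t} = \frac{1}{x_N}\left(y \dot{x}_N \frac{\partial q}{\partial y} - \frac{\partial}{\partial y}\left(\frac{q^2}{h} + \frac{(\varphi + \varphi_m) h}{2}\right)\right).$$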
Finally, φ and φm are scaled such that at the start of
the simulation φ+φm=1, where previously
φ=1. The aim is still to recreate the deposit of sand, and hence the
equation for η stays the same. We term the sand deposit generated by
this modified model η2. The discretisation for
Eq. () is consistent with the rest of the model, as
presented in Sect. .
The initial condition needs to be altered to include the new sediment class.
The initial vertically averaged volume fraction of sand is changed to
ψ=fs, and a new initial condition for the vertically averaged
volume fraction of mud is introduced as ψm=1-fs. The sand
fraction, fs, is estimated by to be 0.14 and
is kept fixed.
βm must also be calculated. This is done in the same way as for
β, except that a different sediment diameter parameter is used and optimised: Dm,
the mean diameter of mud particles in the flow. The
equation for βm is therefore
$$\beta_m = \frac{g'^{1/2} D_m^2}{18\nu (h_0 \psi_0)^{1/2}}.$$
Note that ψ0 is now the combined initial volume fraction of sand and mud
in the flow.
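Because the sand and mud settling velocities differ only in the grain diameter, their ratio depends on the two diameters alone. As a worked illustration, assuming both expressions share the same g′, ν, h0 and combined ψ0, and taking example diameters of D = 200 µm and Dm = 20 µm:

```latex
\frac{\beta_m}{\beta}
  = \frac{g'^{1/2} D_m^2 / \left(18\nu (h_0\psi_0)^{1/2}\right)}
         {g'^{1/2} D^2 / \left(18\nu (h_0\psi_0)^{1/2}\right)}
  = \left(\frac{D_m}{D}\right)^2
  = \left(\frac{20}{200}\right)^2
  = 0.01
```

Under these assumptions the mud settles roughly 100 times more slowly than the sand.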
Optimisation for a model with two sediment classes
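The set of optimised input parameters is redefined as $m = (h_0, \psi_0, D, D_m)^T$. An additional bounding constraint is added for
Dm such that the new bounding constraints for m are
$$10\,\mathrm{m} \le h_0 \le 10.0\,\mathrm{km}, \qquad 0.001\,\% \le \psi_0 \le 50\,\%, \qquad 1.0\,\mu\mathrm{m} \le D \le 1.0\,\mathrm{mm}, \qquad 1.0\,\mu\mathrm{m} \le D_m \le 100.0\,\mu\mathrm{m}.$$
The initial input parameters are set to
$$m = \begin{pmatrix} h_0 = 2.1\,\mathrm{km} \\ \psi_0 = 5\,\% \\ D = 200.0\,\mu\mathrm{m} \\ D_m = 20.0\,\mu\mathrm{m} \end{pmatrix}.$$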
The input parameters are normalised as detailed in Sect. before
being passed to the optimisation algorithm. The criteria for finishing the
optimisation are consistent with the previous optimisation (see
Eq. ).
Values of the parameters from the model with both mud and sand sediment
classes over the optimisation iterations against the value of the objective
functional, J, that we are aiming to minimise. The values shown are normalised
by their starting values. (*)n is the value of parameter * at the start
of iteration n.
Optimised dimensional deposit output from the model with both mud
and sand sediment classes, η̃2, shown against the optimised
results from the single sediment class model, η̃, the field
measurement from Bed 1.1 , and the fourth-order polynomial
target deposit profile, ηT.
The optimisation of the model with two sediment classes is completed in 17
iterations with a final functional value of J=2.13 (see
Fig. ). Therefore, the fit is quantitatively
slightly worse when mud is included in the model. This is a surprising
result,
as the model now more closely matches reality. Qualitatively, it is very hard
to determine which model fits the data better. The resultant deposit is very
similar in shape to that obtained when only modelling sand in the flow (see
Fig. ). The fit appears to be worse at the start of
the deposit. The runout length is slightly longer when mud is included such
that the fit towards the end of the deposit is slightly improved.
The fit with the measured data is still poor towards the end of the deposit.
noted that the distal section of small deposits in the
Marnoso-arenacea Formation shows evidence of transport in a tractional
boundary layer. This simulation does not model bedload transport or erosion,
which is the likely reason for the difference in the results. The velocity of the
head of the turbidity current in this simulation varies between 10 and
2.4 m s$^{-1}$ over the period during which sand is deposited
(Fig. ). At these head velocities, erosion is very
likely to occur. Models for erosion and bedload transport exist
. These could be added in future work.
The final optimised values are
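$$m = \begin{pmatrix} h_0 = 1.92\,\mathrm{km} \\ \psi_0 = 5.94 \times 10^{-3} \\ D = 125\,\mu\mathrm{m} \\ D_m = 28.1\,\mu\mathrm{m} \end{pmatrix}.$$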
A comparison of these results to those obtained without a mud sediment class
shows that the
value of h0 has reduced by 25 % and translates to an initial current
height of 745.0 m as the current enters the basin plain. The average
sediment diameter has also increased by 21 % to 125 µm,
bringing the average diameter in line with the estimates from the field
measurements by . Arguably, the sand and mud classes
should be subdivided further. described how polydisperse
density currents will have longer runout distances than equivalent currents
with uniform sediment at the mean value of the polydisperse current.
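The percentage changes quoted in this comparison follow directly from the two sets of optimised values, as the short check below shows. The head-height ratio at the end is an observation derived from the reported numbers, not a quantity stated in the text, and the helper name is hypothetical.

```python
def relative_change(old, new):
    """Fractional change from old to new."""
    return (new - old) / old


# Optimised values without and with the mud sediment class.
h0_change = relative_change(2.56, 1.92)    # km
d_change = relative_change(103.0, 125.0)   # micrometres

# Initial head height as a fraction of h0 in each optimisation (m / m).
ratio_sand_only = 993.3 / 2560.0
ratio_sand_mud = 745.0 / 1920.0
```

Both optimisations give an initial head height of about 0.39 h0, which is what would be expected if both start from the same non-dimensional initial profile.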
It is also interesting to assess the sensitivity of the model to variations in
the input parameters by analysing the final gradient of the objective
functional,
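$$\frac{\mathrm{d}J}{\mathrm{d}\bar{m}} = \begin{pmatrix} \mathrm{d}J/\mathrm{d}\bar{h}_0 = 5.1 \times 10^{-3} \\ \mathrm{d}J/\mathrm{d}\bar{\psi}_0 = -1.8 \times 10^{-3} \\ \mathrm{d}J/\mathrm{d}\bar{D} = -2.4 \times 10^{-3} \\ \mathrm{d}J/\mathrm{d}\bar{D}_m = 1.5 \times 10^{-6} \end{pmatrix},$$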
where ⋅‾ indicates a parameter value normalised by its value on the
initial optimisation iteration. The sensitivity of the functional to changes in
the mud diameter is several orders of magnitude smaller than the sensitivity to
changes in the other variables.
It is indeed found that changing this value has very little effect on the
obtained deposit. The same simulation is run with the mud diameter decreased by
2 orders of magnitude such that the input parameter values are
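$$m = \begin{pmatrix} h_0 = 1.92\,\mathrm{km} \\ \psi_0 = 5.94 \times 10^{-3} \\ D = 125\,\mu\mathrm{m} \\ D_m = 0.281\,\mu\mathrm{m} \end{pmatrix}.$$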
The resulting functional value is J=2.13, which is identical to that obtained
for the optimised simulation. There is no discernible difference in the
resulting deposit, η3 (Fig. ). The head
height and velocity only vary a small amount over the period during which sand is
deposited (Fig. a and b). The
current properties vary significantly after the sand has been deposited and mud
is still in suspension, but this does not have any effect on the sand deposit.
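The relative sensitivities can be made concrete by comparing the magnitudes of the reported gradient components. The sketch below is a simple post-processing step on those four numbers, with a hypothetical function name.

```python
import math

# Final gradient of J with respect to the normalised parameters.
GRADIENT = {"h0": 5.1e-3, "psi0": -1.8e-3, "D": -2.4e-3, "Dm": 1.5e-6}


def orders_below_largest(gradient):
    """Orders of magnitude separating each component from the largest one."""
    largest = max(abs(value) for value in gradient.values())
    return {name: math.log10(largest / abs(value)) for name, value in gradient.items()}


gaps = orders_below_largest(GRADIENT)
```

The mud diameter component is about 3.5 orders of magnitude smaller than the largest component, while the other three lie within half an order of magnitude of one another.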
Optimised dimensional deposit output from the model with both mud
and sand sediment classes, η̃2, shown against the results from the
same model with the same parameters but a mud-settling velocity reduced by
2 orders of magnitude, η̃3.
Although the sandstone deposits generated by the single and two sediment class
models are very similar, the properties of the turbidity currents that produced them
are very different (Fig. ). The turbidity current with
mud in suspension travels approximately twice as quickly due to the increased
gravitational forces produced by the sand and mud mixture (Fig. b).
Sand also drops out of the suspension much more
rapidly (Fig. d). All of the sand is deposited within
approximately 6 h. The model without mud deposits sand over a period of
more than 20 h. This is due to the reduced height of the current at the
start of the simulation and the faster decrease in the height of the current as
a result of the higher head velocity (Fig. a and d).
Clearly, the presence of mud in the suspension has a
significant impact on the resultant flow and must be included in the model.
The time evolution of the dimensionalised variables for three simulations: a
simulation with a single sand sediment class and optimised input parameters
to match the Bed 1.1 sand deposit, a simulation with sand and mud classes
and optimised input parameters to match the Bed 1.1 sand deposit, and a
simulation with sand and mud classes and the same optimised input parameters
but a mud diameter 2 orders of magnitude smaller. The results are shown
against the dimensional time, $\tilde{t} = t (h_0/g_0')^{1/2}$.
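The conversion between non-dimensional and dimensional time can be sketched as follows, taking the time scale as (h0/g0′)^(1/2) with g0′ = g′ψ0. This is consistent with the velocity scale (h0 g0′)^(1/2) used to non-dimensionalise the settling velocity. The function names are hypothetical, and the inputs are the optimised two-class values.

```python
import math


def time_scale(h0, psi0, g_prime=16.0):
    """Seconds per unit of non-dimensional time."""
    return math.sqrt(h0 / (g_prime * psi0))


def velocity_scale(h0, psi0, g_prime=16.0):
    """Metres per second per unit of non-dimensional velocity."""
    return math.sqrt(h0 * g_prime * psi0)


# Optimised two-class values: h0 = 1.92 km, psi0 = 5.94e-3.
t_scale = time_scale(1920.0, 5.94e-3)
u_scale = velocity_scale(1920.0, 5.94e-3)

# Six hours expressed in non-dimensional time units.
six_hours_nondimensional = 6.0 * 3600.0 / t_scale
```

With these inputs one non-dimensional time unit is a little over two minutes, so a six-hour depositional period corresponds to roughly 150 non-dimensional time units.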
The simulated turbidity currents that produced η2 and η3 deposited
sand over a similar time period. After all of the sand had fallen out of
suspension, less than 25 % of the mud had settled from the flow in both of
these currents; the current head was still > 50 m tall and moving at
> 1.0 m s$^{-1}$ (Fig. ). Hence, there is still a significant
amount of energy in the flow. The remaining mud suspension will reach the end of
the basin (x̃N≈ 130 km) and will still have a significant amount
of energy left when it does so. It is very hard to predict what will happen
after this point. The current may be partly reflected, and ponding of the
suspended mud is likely to occur. This result is in agreement with the
explanations of .
The height of the current in the optimised simulation with both sand and mud
sediment classes is ≈ 750 m as it enters the basin, although this
decreases very quickly as the current propagates. It is possible that including
processes such as fluid entrainment, erosion, and bedload transport may reduce
the necessity for such a large initial current height in producing this
deposit. More complex initial and boundary conditions may also have a
significant impact on this value. It is unclear what effect an inflow boundary
condition with time-varying height, sediment concentration, and velocity would
have on the results. This would be an interesting addition to the model's
capabilities.
The model also neglects variations in the bed profile. The gradient of the sea
floor in the basin where the Marnoso-arenacea Formation was created was
substantially less than 1 degree . Variations in gradient of
this magnitude will have a negligible impact on the head velocity
. However, small variations will have an impact on the
velocity of the body of the current. Future work will address this.