The Finite Element Sea Ice-Ocean Model ( FESOM ) v . 1 . 4 : formulation of an ocean general circulation model

The Finite Element Sea Ice-Ocean Model (FESOM) is the first global ocean general circulation model based on unstructured-mesh methods that has been developed for the purpose of climate research. The advantage of unstructured-mesh models is their flexible multi-resolution modelling functionality. In this study, an overview of the main features of FESOM will be given; based on sensitivity experiments a number of specific parameter choices will be explained; and directions of future developments will be outlined. It is argued that FESOM is sufficiently mature to explore the benefits of multi-resolution climate modelling and that its applications will provide information useful for the advancement of climate modelling on unstructured meshes.


Introduction
Climate models are becoming increasingly important to a wider range of users. They provide projections of anthropogenic climate change; they are extensively used for subseasonal, seasonal and decadal predictions; and they help us to understand the functioning of the climate system.
Despite substantial progress in climate modelling, even the most sophisticated models still show substantial shortcomings. Simulations of the Atlantic meridional overturning circulation, for example, still vary greatly in strength and pattern between climate models (Randall et al., 2007). Furthermore, many climate models still show substantial problems when it comes to simulating the observed path of the Gulf Stream. It is increasingly being recognized that a lack of sufficiently high spatial resolution is one of the main causes of the existing model shortcomings. In fact, many climate-relevant processes are too small scale in nature to be explicitly resolved by state-of-the-art climate models on available supercomputers and therefore need to be parametrized (Griffies, 2004;Shukla et al., 2009;Jakob, 2010).
In recent years a new generation of ocean models that employ unstructured-mesh methods has emerged. These models allow the use of high spatial resolution in dynamically active regions while keeping a relatively coarse resolution otherwise. It is through this multi-resolution flexibility that unstructured-mesh models provide new opportunities to advance the field of climate modelling. Most existing unstructured-mesh models have dealt with coastal or regional applications (e.g. Chen et al., 2003;Fringer et al., 2006;Zhang and Baptista, 2008). This paper focuses on the setting of unstructured-mesh models for global applications, that is, on configurations more geared towards climate research applications. More specifically, the latest version of the Finite Element Sea Ice-Ocean Model (FESOM) -the first mature, global sea-ice ocean model that employs unstructured-mesh methods -will be described. The fact that FESOM solves the hydrostatic primitive equations for the ocean and comprises a finite element sea ice module makes it an ideal candidate for climate research applications. FESOM has been developed at the Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research (AWI) over the last 10 yr (Danilov et al., 2004;Wang et al., 2008;Timmermann et al., 2009).
The present study is meant to give a thorough overview of FESOM in the context of global ocean modelling. Providing the climate modelling community with such an overview of the most mature global multi-resolution sea-ice ocean model seems justified given that unstructured mesh modelling does not feature yet in standard textbooks on ocean modelling. Furthermore, the field is advancing so rapidly that details of the implementation of FESOM described in previous papers (Danilov et al., 2004;Wang et al., 2008;Timmermann et al., 2009) are already outdated. It is also expected that other modelling groups working on the development of similar models (e.g. Ringler et al., 2013) will benefit from a detailed overview of our implementation of unstructured-mesh methods in global models. Finally, we expect that the present study, which also entails details on parametrizations and model tuning, will stimulate discussions and therefore ultimately advance the development of multi-resolution models.
The basic numerical formulation of FESOM including spatial and temporal discretization is described in Sect. 2. Section 3 represents the key elements of FESOM which are fundamental in formulating ocean climate models. This section partly takes a review form, describing various physical parametrizations and numerical methods presented in the literature. A short summary will be given in the last section. A brief historical review of FESOM's development is given in the Appendix.

Spatial discretization
Here we briefly explain the implementation of the finite element method in FESOM. For a detailed description of the implementation see Wang et al. (2008). The variational formulation with the FE method involves two basic steps. First, the partial differential equations (primitive equations) are multiplied by a test function and integrated over the model domain. Second, the unknown variables are approximated with a sum over a finite set of basis functions. FESOM uses the combination of continuous, piecewise linear basis functions in two dimensions for surface elevation and in three dimensions for velocity and tracers. For example, sea surface elevation η is discretized using basis functions M i as where η i is the discrete value of η at grid node i of the 2-D computational mesh. The test functions are the same as basis functions, leading to the standard Galerkin formulation. In two dimensions FESOM uses triangular surface meshes. Figure 1 shows the schematic of 2-D basis functions on a triangular mesh. The basis function M i is equal to one at grid node i and goes linearly to zero at its neighbour nodes; it equals zero outside the stencil formed by the neighbour nodes. The 3-D mesh is generated by dropping vertical lines starting from the surface 2-D nodes, forming prisms which are then cut into tetrahedral elements (Fig. 2). Except for layers adjacent to sloping ocean bottom each prism is cut into three tetrahedra; over a sloping bottom not all three tetrahedra are used in order to employ shaved cells, in analogy to  the shaved cells used by Adcroft et al. (1997). Keeping the 3-D grid nodes vertically aligned (i.e. all 3-D nodes have their corresponding 2-D surface nodes above them) is necessitated by the dominance of the hydrostatic balance in the ocean.
For a finite element discretization the basis functions for velocity and pressure (surface elevation in the hydrostatic case) should meet the so-called LBB condition (Ladyzhenskaya, 1969;Babuska, 1973;Brezzi, 1974), otherwise spurious pressure modes can be excited. These modes are similar to the pressure modes of Arakawa A and B grids (Arakawa, 1966). The basis functions used in FESOM for velocity and pressure do not satisfy the LBB condition, so some measures to stabilize the code against spurious pressure modes are required. Note that pressure modes on unstructured meshes are triggered more easily than in finitedifference models and robust stabilization is always needed.
In the early model version the Galerkin least squares (GLS) method proposed by Codina and Soto (1997) was used to solve the difficulty related to the LBB condition. In the current model version the GLS method is replaced by a pressure projection method described by Zienkiewicz et al. (1999) to circumvent the LBB condition. With the GLS method the iterative solver needs to solve the surface elevation equation and the vertically integrated momentum equations together (Danilov et al., 2004), whereas with the pressure projection method the solution of surface elevation is separated and no barotropic velocity is introduced . Therefore using the pressure projection method reduces the computational cost. It also leads to a more consistent code, as in the GLS case the horizontal velocity and vertically integrated Schematic of spatial discretization. The column under each surface triangle is cut to prisms (a), which can be divided to tetrahedra (b). Except for layers adjacent to sloping ocean bottom each prism is cut to three tetrahedra. Schematic of spatial discretization. The column under each surface triangle is cut into prisms (a), which can be divided into tetrahedra (b). Except for layers adjacent to sloping ocean bottom each prism is cut into three tetrahedra. horizontal velocity cannot be in the same functional space in the presence of bottom topography, leading to projection errors. The Appendix provides an overview on the development history of FESOM.

Temporal discretization
The advection term in the momentum equation is solved with the so-called characteristic Galerkin method (Zienkiewicz and Taylor, 2000), which is effectively the explicit secondorder finite-element Taylor-Galerkin method. The method is based on taking temporal discretization using Taylor expansion before applying spatial discretization. Using this method with the linear spatial discretization as mentioned above, the leading-order error of the advection equation is still second order and generates numerical dispersion (Durran, 1999), thus requiring friction for numerical stability.
The horizontal viscosity is solved with the explicit Euler forward method (Sect. 3.9). The vertical viscosity is solved with the Euler backward method because the forward time stepping for vertical viscosity is unstable with a typical vertical resolution and time step. 1 To ensure solution efficiency, we solve the implicit vertical mixing operators separately from other parts of the momentum and tracer equations. 2 The surface elevation is solved implicitly to damp fast gravity 1 The stability of an explicit Euler forward time stepping method for vertical viscosity requires A v t z 2 ≤ 1/2, where A v is the vertical mixing coefficient. A typical time step of t = 40 min and a surface vertical resolution of z = 10 m require A v ≤ 0.02 m 2 s −1 . Vertical mixing coefficients in the surface boundary layer obtained from the KPP scheme (see Sect. 3.6) can readily be higher than this value locally.
2 To guarantee global conservation the vertical diffusion equations are solved using the finite element method. For continuous basis functions the discretized vertical diffusion equations involve horizontal connections through the mass matrix with the time derivative term, and they cannot be solved efficiently. We chose to lump the matrix of time derivative terms (lumping means to sum all entries in waves, and needs iterative solvers. The Coriolis force term uses the semi-implicit method to well represent inertial oscillations. The default tracer advection scheme is an explicit fluxcorrected-transport (FCT) scheme (Sect. 3.5). The GM parametrization is incorporated into the model with the Euler forward method (see Sect. 3.10 for the description of the GM parametrization), while the vertical diffusivity uses the Euler backward method for the same reason as for vertical viscosity.
An external iterative solver is called for solving the surface elevation equation. The final momentum and tracer equations have only matrices of time derivative terms on the lefthand side of the equations, which can be relatively efficiently solved. 3 Overall the dynamics and thermodynamics in the model are staggered in time with a half time step. That is, the new velocity is used to advect tracers, and the updated temperature and salinity are then used to calculate density.

Model efficiency
Models formulated on unstructured meshes are slower than their structured-mesh counterparts because of indirect memory addressing and increased number of numerical operations. If an unstructured-mesh model is n s times slower than typical structured-mesh models, simulations using it can computationally benefit from its unstructured-mesh functionality when the fine resolution region occupies less than 1/n s portion of the total computational area and most of the computational degrees of freedom are confined there. In the course of FESOM development we have chosen spatial and temporal discretization schemes by taking both model accuracy and efficiency into account. FESOM is about 10 times slower than a typical structured-mesh model . In its practical applications we therefore limit the locally refined region to less than 10 % of the total domain area (e.g. Q. Timmermann et al., 2012;Timmermann and Hellmer, 2013;Wekerle et al., 2013;Wang et al., 2013). In these applications mesh resolutions are increased locally by factors ranging from 8 to 60 and most computational grid nodes are located inside the refinement region, ensuring great computational benefit.
The run-time memory access in the current FESOM version, hindered by its 1-D storage of variable arrays, is one of the bottlenecks for model efficiency. Some software engineering work is required in the future to identify the potential in improving memory access efficiency. Other unstructuredmesh numerical methods have shown potential in developing more efficient ocean models (for a review see Danilov, 2013). In our model development team, we reserve some a row to the diagonal in the matrix). This leads to a decoupled equation for each water column, which can be very efficiently solved.
3 Implicit advection and lateral friction would require iterative solvers with pre-conditioning in every time step, thus slowing down the solution. human resources for research on different numerical methods, although currently the main effort is on FESOM development and its climate scale applications.

Key elements of the model
Different numerical and parametrization schemes are available in FESOM. A detailed description of all the available features of FESOM is beyond the scope of the present study.
Here the focus will be on those model elements that are crucial for climate scale applications. Illustrations of model sensitivity to different choices are presented here only for some of the model elements. A relatively complete description of the key model elements does not allow us to pursue thorough sensitivity studies for all of them here.
When configuring an OGCM for the purpose of climate research, choices for numerical and parametrization schemes are made according to the model developers' practical experience as well as knowledge from the literature. For the sake of the paper we provide a review form in some parts of the model description in this section. The review details the background for making our choices and importantly lists recommendations from previous studies which can improve model integrity. In our future work it will be valuable to evaluate those recommended options that have not been tested with FESOM yet. We also note that the reviews made in this section are not complete in themselves as they are organized following our own model development issues considered so far. For a thorough review on ocean model fundamentals, see Griffies (2004).
All sensitivity tests presented in this section are based on atmospheric forcing fields taken from the inter-annual CORE-II data provided by Large and Yeager (2009). All simulations were carried out for 60 yr over the period of forcing provided by Large and Yeager (2009).

Two-dimensional mesh
FESOM uses spherical coordinates, so the meridional and zonal velocities would be poorly approximated on a triangle covering the North Pole. To avoid the singularity a spherical coordinate system with the north pole over Greenland (40 • W, 75 • N) is used. 4 FESOM uses triangular surface meshes. There are a few free triangle-mesh generators available, including DistMesh (Persson and Strang, 2004) and Triangle (Shewchuk, 1996). The mesh quality, the extent to which the triangles are close to equilateral ones, can be further improved after mesh generation by relaxation of grid point locations. An abrupt change in resolution can lead to bad triangles (with too small/big inner angles and very different edge lengths) thus degrading the quality of meshes, so a transition zone between high and  coarse resolutions is generally introduced. The resolution of a triangle is defined by its minimum height.
In practical applications with limited computational resources, we keep the horizontal mesh resolution coarse in most parts of the global ocean (for example, at nominal one degree as used in popular climate models), and refine particularly chosen regions. The equatorial band with meridionally narrow currents and equatorially trapped waves requires higher resolution on the order of 1/3 • (Latif et al., 1998;Schneider et al., 2003). Figure 3 shows the horizontal resolution of an example mesh with increased resolution in the equatorial band. It is nominal one degree in most parts of the ocean and increases to 24 km poleward of 50 • N. The resolution of this mesh is designed in terms of both kilometre (north of 50 • N) and geographical degree (south of 50 • N) as a particular example, although it is not necessarily so in general cases. Thanks to the flexibility of unstructured meshes, one can fully avoid the imprint of geographic coordinates and design meshes based on distances along the spherical surface.
Many key passages between ocean basins such as Indonesian Throughflow, Bering Strait and Canadian Arctic Archipelago (CAA) are important for basin exchange but narrow and difficult to resolve in global models. Figure 4 shows an example mesh which was configured to study the Arctic Ocean freshwater circulation (Wekerle et al., 2013). This global mesh uses a coarse resolution south of 50 • N and increases resolution to 24 km poleward, with the CAA region resolved at a 5 km scale. Traditional climate models cannot resolve the straits as narrow as in the CAA (27-53 km wide at the narrowest locations in the two largest CAA passages), so they are usually widened and/or deepened to allow adequate throughflow. However, such empirical treatment of CAA results in a very large range in the simulated CAA freshwater transport among a set of state-of-the-art ocean models (Jahn et al., 2012). The improved simulation by explicitly resolving the narrow straits with a global FESOM setup indicates the potential of unstructured meshes in representing narrow strait throughflows in global models (Wekerle et al., 2013).
It is worth mentioning that the model time step size is constrained by the finest resolution on the mesh. 5 If the number of grid points in the high-resolution regions is only a few percent of the total grid points, we cannot enjoy the advantage of using FESOM because the overall time step size has to be set small. Therefore, in practice, increasing horizontal resolution in narrow straits is usually implemented in applications when some part of the ocean basin is also locally refined. In such applications a large portion of the computational grid points are in fine-resolution areas. To benefit from the multi-resolution capability even in cases when only a very small portion of the computational grid points have locally increased resolutions, multirate time stepping schemes are needed. Seny et al. (2013) gave an example of such schemes applied in a discontinuous Galerkin model.

Vertical coordinates and discretization
The choice of vertical coordinates or vertical grids is one of the most important aspects in the design of ocean circulation models . Coordinated projects have been carried out to study the performance of different types of vertical discretization, -for example, DAMEE-NAB (Chassignet et al., 2000) and DYNAMO . Different vertical coordinates are used in ocean models and each of them has its own advantages and disadvantages (Haidvogel and Beckmann, 1999;Griffies, 2004).
FESOM uses z coordinates (also called geopotential coordinates) in the vertical. The primitive equations are discretized on z coordinates without coordinates transformation, while sigma or more general vertical grids can be conveniently used because of the FE formulation . The 2-D mesh and 3-D discretization (including grid types and vertical resolution) are set during the mesh generation stage off-line before carrying out simulations.
Similar to traditional sigma grid models, truncation errors in computing hydrostatic pressure gradient exist on sigma grids in FESOM (see Sect. 3.4). The error in hydrostatic pressure gradient can be reduced by introducing high-order interpolations, but it is not trivial and can potentially degrade the solutions in climate scale simulations. Therefore, z-level grids are recommended in setting up ocean climate models. We use shaved cells over the ocean bottom on z-level grids to better represent the gentle topographic slopes (Adcroft et al., 1997;Wang et al., 2008). A faithful representation of bottom topography by using shaved (or partial) cells can generally improve the integrity of ocean model simulations (Maier-Reimer et al., 1993;Adcroft et al., 1997;Pacanowski and Gnanadesikan, 1998;Myers and Deacu, 2004;Barnier et al., 2006). 5 While an implicit advection scheme is stable in terms of temporal discretization, it is not necessarily accurate if the Courant number (U t/ , where U is the current speed, t is the time step and is the grid spacing) is large.  Combining z levels in the bulk of the ocean with sigma grids in shelf regions of interest is a viable alternative to zlevel grids. A schematic of such hybrid grids is shown in Fig. 5. These types of grids are used in FESOM when the ice cavities are included in the model . This hybrid grid is similar to the generalized coordinate used in POM (Princeton Ocean Model) described by Ezer and Mellor (2004). As illustrated in Fig. 5, sigma grids are used under ice shelf cavities and along the continental shelf around the Antarctic, while the z-level grids are used in all other parts of the global ocean to minimize pressure gradient errors.
There are a few reasons for using sigma grids in these marginal seas. Sigma grids offer the flexibility in vertical refinement (near the ocean surface, near the bottom, or in the whole column over the shallow continental shelves). The iceshelf-ocean interactions can be better represented with vertically refined resolution near the ocean surface. Increased vertical resolution is also beneficial for representing continental  shelf and ocean basins exchange processes, including dense shelf water outflow and circumpolar water inflow; z-level grids with shaved cells under ice shelves are found to be useful in simulating the ocean circulation in ice cavities (Losch, 2008), while the merits of sigma grids are flexible vertical resolution and less grid scale noise, thus less spurious mixing.
On the z-level grid the vertical resolution is usually set finer in the upper 100-200 m depth to better resolve the surface boundary layer and becomes coarser with depth. Shaved cells are generally used at the bottom. In the region of sigma grids the vertical resolution is set depending on scientific interest, for example, increasing the near-surface resolution under the ice shelf and the near-bottom resolution where continental shelf and basin water mass exchange is important. The vertical resolution distribution function of Song and Haidvogel (1994) is used in the mesh generator for adjusting the sigma grid resolution. 6 6 A convenient recipe is to define the sigma levels as is the vertical layer index, h min is the minimum depth in the sigma grid region, H is the water column thickness, s(k) = −(k − 1)/(N − 1) and N is the number of vertical layers (Song and Haidvogel, 1994). 0 ≤ θ ≤ 20 and 0 ≤ θ b ≤ 1 are the tuning parameters for designing the vertical discretization. Larger θ leads to more refined near-surface layers and if θ b approaches 1 resolution at the bottom is also refined. A transition zone is required to smoothly connect the sigma and z-level grids.

Bottom topography
A blend of several bottom topography data sets is used to provide the bottom topography for FESOM. North of 69 • N the 2 km resolution version (version 2) of International Bathymetric Chart of the Arctic Oceans (IBCAO version 2, Jakobsson et al., 2008) is used, while south of 64 • N the 1 min resolution version of General Bathymetric Chart of the Oceans (GEBCO) is used. Between 64 • N and 69 • N the topography is taken as a linear combination of the two data sets. The ocean bottom topography, lower ice surface height and ice shelf grounding line in the ice cavity regions around the Antarctica are derived from the one minute Refined Topography data set (Rtopo-1, Timmermann et al., 2010), which is based on the BEDMAP version 1 data set (Lythe et al., 2001). 7 The raw topography data are used to determine the ocean coastlines. They are bilinearly interpolated to the model grid points. After the raw topography is interpolated to the model grid, grid scale smoothing of topography is applied to get rid of grid scale noise. The smoothing for each 2-D node is performed over its 2-D stencil (consisting of all 2-D nodes connected by edges with it). 8 Preliminary smoothing on coarse intermediate meshes should be avoided because it may be too strong for the fine part of the model mesh. One may choose to explicitly resolve narrow ocean straits using locally increased resolution in some cases and apply manual mesh and topography modification at unresolved straits in other cases. Modellers need to decide how to treat individual narrow straits depending on the research interest and overall mesh design.
The topography is bilinearly interpolated from the data grid with fine resolution (2 km and 1 min) to model grids. If the model resolution is much lower than the topography data resolution, adequate smoothing of model topography can have a positive impact on the simulated ocean circulation. It is recommended to repeat the grid scale smoothing several times. Figures 6a and b show the bottom topography after applying stencil smoothing three and one times respectively for the coarse mesh shown in Fig. 3. Sensitivity experiments using these two versions of topography indicate that the former leads to a more realistic ocean circulation (Fig. 7).
The barotropic stream functions with the two versions of model topography are different mainly in the Southern Ocean, along the western boundary and in the North Atlantic 7 Improved ice bed, surface and thickness data sets for Antarctica (BEDMAP2, Fretwel et al., 2013) and a new bathymetry data set for the Arctic Ocean (IBCAO3, Jakobsson et al., 2012) have been released recently and their impact on model simulations compared to previous data sets need to be tested in sensitivity studies. 8 The smoothing at node i is a weighted mean over its stencil. The neighbour nodes have a weight one and node i has a weight 2n, where n is the total number of neighbour nodes. As an abrupt change in mesh resolution is avoided, the variation in distance between node i and neighbour nodes is not accounted for in the smoothing.  subpolar gyre. In the Weddell Sea, the maximum transport in observational estimates is 29.5 Sv along the transect between the northern tip of the Antarctic Peninsula and Kapp Norvegia (Fahrbach et al., 1994) and more than 60 Sv at the Greenwich Meridian (Schroeder and Fahrbach, 1999). Observations suggest mean southward transport at the Labrador Sea exit at 53 • N ranging from 37 Sv (Fischer et al., 2004) to 42 Sv (Fischer et al., 2010). Circulations in the Southern Ocean and Labrador Sea are dynamically controlled by the Joint Effect of Baroclinicity and Relief (JEBAR, Olbers et al., 2004;Eden and Willebrand, 2001), so the model results have large sensitivity to the treatment of bottom topography in these places. Because of the relatively coarse mesh (Fig. 3), maximum barotropic transport in these regions is weaker than observed in both simulations. The maximum gyre transport in Weddell Sea and North Atlantic subpolar gyre is about 34 Sv and 28 Sv respectively for the topography shown in Fig. 6a (see Fig. 7a). If topography smoothing is applied only once, the transport is lower by about 10 Sv in both regions (see Fig. 7b).
Over the terrain-following part of the mesh the topography is smoothed by adjusting the slope parameter r 0 (also called Beckmann and Haidvogel number, Beckmann and Haidvogel, 1993) and the hydrostatic inconsistency number  r 1 (also called Haney number, Haney, 1991). 9 The smoothing helps to alleviate hydrostatic pressure gradient errors and maintain numerical stability. In practice we recommend the criteria r 0 ≤ 0.2 and r 1 ≤ 3. The smoothing is done on each 2-D stencil starting from the shallowest grid point until the deepest grid point. This procedure is repeated until the criteria are satisfied throughout the mesh.

Hydrostatic pressure gradient
Care should be taken in the calculation of the hydrostatic pressure gradient on sigma grids. Pressure gradient errors are not avoidable when the sigma grid surface deviates from the geopotential coordinate, but can be reduced with carefully designed numerics (Shchepetkin and McWilliams, 2003). A few measures are taken to reduce the errors in FE-SOM. The widely used method of exchanging the sequence 9 r o and r 1 are defined on edges between two neighbouring where H is the water column thickness, and subscripts i and j indicate two neighbouring nodes. r 1 = is the vertical coordinate at layer k. The smoothing is done for r 0 first and then for r 1 , over stencils starting from the shallowest depth to the deepest. The smoothing procedure usually needs to be repeated to satisfy the criteria throughout the mesh. of integration and differentiation is employed (Song, 1998;Song and Wright, 1998). The horizontal derivatives of in situ density are taken first and pressure gradient forces are calculated then. In this way pressure gradient force errors are reduced but still present because of truncation errors in representing density with linear functions. The second measure is to use high-order interpolation in the vertical to interpolate density to a common depth for computing the density gradient. The idea is discussed and assessed in Wang et al. (2008).
In practice more measures are taken to control the pressure gradient errors on sigma grids. A common additional recipe is to apply topography smoothing to satisfy the criteria for r 0 and r 1 as described in Sect. 3.3. Increasing resolution also helps to reduce pressure gradient errors . Furthermore, the sigma grid as a part of the hybrid grid is only used around the Antarctic continental shelf and under ice shelves (Sect. 3.2), where we commonly use increased resolution to resolve small geometrical features.

Tracer advection
The commonly used tracer advection scheme in FESOM is an explicit second-order flux-corrected-transport (FCT) scheme. The classical FCT version described by Löhner et al. (1987) is employed as it works well for transient problems. The FCT scheme preserves monotonicity and eliminates overshoots, a property useful for maintaining numerical stability on eddying scales. Upon comparison to a secondorder scheme without flux limiter and an implicit second order scheme in idealized 2-D test cases, at coarse resolution the FCT scheme tends to slightly reduce local maxima even for a smooth field, but it well represents a sharp front and shows least dispersion errors in general (Wang, 2007).
Advection schemes should be able to provide adequate dissipation on grid scales and keep large scales less dissipated.  show that it is important to adequately resolve the admitted scales of motion in order to maintain a small amount of spurious diapycnal mixing in z-coordinate models with commonly used advection schemes. They find that spurious diapycnal mixing can reach more than 10 −4 m 2 s −1 depending on the advection scheme and the flow regime. 10 Ilicak et al. (2012) demonstrate that spurious dianeutral transport is directly proportional to the lateral grid Reynolds number. Our preliminary tests show that the effective spurious diapycnal mixing associated with the FESOM FCT scheme is similarly high as shown in . Systematic research is needed for exploring alternative transport schemes and limiters and for investigating the dependence on the Reynolds number, especially in the context of mesh irregularity.

Diapycnal mixing
Diapycnal mixing in the ocean has a strong impact on the dynamics of the ocean circulation and on the climate system as a whole (e.g. Bryan, 1987;Park and Bryan, 2000;Wunsch and Ferrari, 2004). The mixing processes are not resolved in present ocean models and need to be parametrized. The k-profile parametrization (KPP) proposed by Large et al. (1994) provides a framework accounting for important diapycnal mixing processes, including wind stirring and buoyancy loss at the surface, non-local effects in the surface boundary layer, shear instability, internal wave breaking and double diffusion. Previous studies (Large et al., 1997;Gent et al., 1998) suggest that the KPP scheme is preferable in climate simulations. It is implemented in many current climate models. It is also used in FESOM for large-scale simulations. 11 Mixing induced by double diffusion (due to salt fingering and double diffusive convection) was found to have a relatively small impact on the mixed layer depth (Danabasoglu et al., 2006) and upper ocean temperature and salinity (Glessmer et al., 2008) in sensitivity studies, although its impact (mainly from salt fingering) on biogeochemical properties is pronounced and cannot be neglected in ecosystem modelling (Glessmer et al., 2008). The double diffusion mixing scheme modified by Danabasoglu et al. (2006) is implemented in FESOM.

Diapycnal mixing from barotropic tides
Mixing due to shear instability is parametrized as a function of Richardson number (Large et al., 1994). To include the mixing from barotropic tides interacting with ocean bottom, especially in the relatively shallow continental shelf regions, the tidal speed is accounted for in the computation of the Richardson number as proposed by Lee et al. (2006). As the tidal speed is large along the coast (Fig. 8), the Richardson number is small and vertical mixing is large in these regions. The original tidal mixing scheme of Lee et al. (2006) leads to too strong vertical mixing even far away vertically from the ocean bottom, as manifested by unrealistic winter polynyas in the central Weddell Sea in our simulations (not shown). The exponential decay as a function of distance from the ocean bottom suggested by Griffies (2012) is implemented in FESOM. It helps to remove the spurious large mixing. 12 11 Other mixing schemes are also used in climate models. For example, the current version of the MPIOM-ECHAM6 Earth System Model  uses the Pacanowski and Philander (1981) scheme. 12 The parametrization of mixing from barotropic tides interacting with the continental shelf is given by κ tidal v = κ max (1 + σ Ri) −p exp −(H −|z|)/z tide , where κ max = 5 × 10 −3 m 2 s −1 , σ = 3, p = 1/4, Ri is the Richardson number based on tidal speed (see Lee et al., 2006, for details), H is the water column thickness, and z tide is an exponential decay length scale (proportional to tidal speed  The barotropic tidal mixing was found to be useful as it assists in the horizontal spreading of river water at certain river mouths . Our simulation shows that this is the case especially for the Arctic river runoff. The increased horizontal spreading of river water from the Arctic marginal seas leads to an increase in the freshwater flux at Fram Strait (see Fig. 9). The freshwater transport at Fram Strait shows the largest spread among the four Arctic gates in a recent Arctic Ocean model intercomparison, partly attributed to uncertainties in simulated salinity in the western Arctic Ocean (Jahn et al., 2012). Due to its impact on river water spreading and salinity, tidal mixing is among the key processes that need to be investigated to understand the reported model biases.
Mixing due to barotropic tides has a large-scale impact as it reduces the Atlantic Meridional Overturning Circulation (AMOC) (Fig. 10). The Labrador Sea Water production and AMOC are more sensitive to the freshwater exported through Fram Strait than through Davis Strait (Wekerle et al., 2013). The increased freshwater export through Fram Strait is an important mechanism through which tidal mixing can consequently weaken the Labrador Sea deep convection and AMOC, while there could be other relevant processes. It should be noticed that the tests shown here are carried out with an ocean-alone model and surface salinity restoring to climatology is enforced. Salinity restoring can provide a local salt sink/source. We speculate that the impact of tidal mixing on the Arctic freshwater export and large-scale circulation is also significant in coupled climate models (without surface salinity restoring). To include the impact of tides, Mueller et al. (2010) added a tidal model into a coupled climate model (MPIOM-ECHAM5). In this case the tidal velocity is simulated and tidal mixing is explicitly taken into times the M2 tide period). The exponential decay term was introduced by Griffies (2012) and does not exist in the original formula of Lee et al. (2006).  account through the dependence of mixing coefficients on the Richardson number. They also reported that tides have pronounced influence on the ocean circulation, including weakening of the Labrador Sea deep convection.

Diapycnal mixing associated with internal wave energy dissipation
The background vertical diffusivity in the KPP scheme represents the mixing due to internal wave breaking, which provides mechanical energy to lift cold water across the thermocline and increase the potential energy of water, thus sustaining the large-scale overturning circulation (Huang, 1999;Wunsch and Ferrari, 2004). Wind and tides are the main energy sources for this mechanical energy in the abyss (Munk and Wunsch, 1998). Observational estimates indicate that the diapycnal diffusivity is of the order of 0.12 ± 0.02 − 0.17 ± 0.02 × 10 −4 m 2 s −1 (Ledwell et al., 1993(Ledwell et al., , 1998 in the subtropical Atlantic pycnocline and 0.15 ± 0.07 × 10 −4 m 2 s −1 in the North Pacific Ocean (Kelley and Van Scoy, 1999), and much smaller values were observed near the equator (Gregg et al., 2003). In the deep ocean the diffusivity is observed to be small (0.1 × 10 −4 m 2 s −1 ) over smooth topography and much larger (1 − 5 × 10 −4 m 2 s −1 ) near the bottom in regions of rough topography Toole et al., 1997;Ledwell et al., 2000;Morris et al., 2001;Laurent et al., 2012). A modified version of the Bryan and Lewis (1979) background vertical tracer diffusivity is used poleward of 15 • in the model formulation with FESOM (Fig. 11). 13 The minimum value is 0.1 × 10 −4 m 2 s −1 at the surface and the maximum value is 1.1 × 10 −4 m 2 s −1 close to the ocean bottom. Motivated by observations (Gregg et al., 2003) the magnitude of this vertical profile is made one order smaller within the ±5 • latitude range (0.01 × 10 −4 m 2 s −1 at ocean surface) and increased linearly to the off-equator value at 15 • N/S. Using a coupled climate model Jochum et al. (2008) found 13 The background vertical tracer diffusivity poleward of 15 • N/S is computed as a function of depth {0.6 + 1.0598/π × atan[4.5 × 10 −3 × (|z| − 2500.0)]} × 10 −4 .  that using the small background vertical diffusivity (0.01 × 10 −4 m 2 s −1 ) in the equatorial band improves the tropical precipitation, although the improvement is only minor compared to existing biases. We use a constant background vertical viscosity of 10 −4 m 2 s −1 , and there is no observational justification for this value. Enhanced vertical mixing in the thermocline arising from Parametric Subharmonic Instability (PSI) of the M2 tide at the 28.9 • N/S band (Tian et al., 2006;Alford et al., 2007;Hibiya et al., 2007) is not accounted for in our model formulation. The model sensitivity study by Jochum et al. (2008) shows that increasing the off-equator background vertical diffusivity in the thermocline toward the observational estimate (0.17×10 −4 m 2 s −1 instead of 0.1×10 −4 m 2 s −1 ) or accounting for the mixing arising from PSI worsens the North  Atlantic results in their particular tests. Anyway, observations for diapycnal diffusivity have motivated the utilization of more realistic diffusivity values in present climate models (see e.g. Danabasoglu et al., 2012).
The importance of the Arctic Ocean in the climate system especially in a warming world and the reported difficulty in robustly representing the surface and deep circulation in the Arctic Ocean in state-of-the-art ocean models (e.g. Karcher et al., 2007;Zhang and Steele, 2007;Jahn et al., 2012) warrant research on improving numerical models including diapycnal mixing parametrizations. The diapycnal mixing in the halocline in the central Arctic Ocean is small compared to mid-latitude, largely due to the presence of sea ice (Rainville and Winsor, 2008;Fer, 2009). The diapycnal diffusivity is 0.05 ± 0.02 × 10 −4 m 2 s −1 averaged over 70-220 m depth in the Amundsen Basin and as low as 0.01 × 10 −4 m 2 s −1 in the upper cold halocline (Fer, 2009). A small background vertical diffusivity of 0.01×10 −4 m 2 s −1 was used in the KPP scheme and found to be optimal in some regional Arctic models (Zhang and Steele, 2007;Nguyen et al., 2009). The decrease in diapycnal diffusivity in the Arctic Ocean was taken into account in present climate models (e.g. Jochum et al., 2013). In our practice we found that using this small value indeed improves the representation of the summer warm layer, but it increases the misfit of the halocline (too fresh in the upper halocline and too saline in the lower halocline in the model) and leads to too low freshwater export through Fram Strait. Therefore, this local tuning of background diapycnal diffusivity for the Arctic Ocean is not adopted in FESOM. Presumably, using a more realistic vertical profile of diapycnal diffusivity with a range 0.01-0.1 × 10 −4 m 2 s −1 in the halocline as suggested by observations (Fer, 2009) can more adequately simulate the Arctic Ocean circulation. This hypothesis has not been tested yet. Improved understanding of mixing processes in the ocean has led to a parametrization of abyssal mixing induced by internal wave breaking associated with baroclinic tidal energy (St Laurent et al., 2002;Simmons et al., 2004). Concentrating intense mixing above rough topography where major tidal energy dissipates was found to be preferable for representing deep ocean stratification and Southern Ocean heat uptake in climate models (Saenko, 2006;Exarchou et al., 2013). The model sensitivity study by Jayne (2009) shows that using the tidal mixing parametrization proposed by St Laurent et al. (2002) can significantly enhance the deep cell of the meridional overturning circulation (MOC) in comparison with only using an ad hoc background vertical diffusivity, although the upper cell of the MOC and the poleward heat transport (the often used diagnostics for adjudging climate models) are not strongly affected by this parametrization. Present climate and earth system models tend to use the St Laurent et al. (2002) parametrization instead of the Bryan and Lewis (1979) type of background diffusivity (e.g. Danabasoglu et al., 2012;Delworth et al., 2012;Dunne et al., 2012). The merit of the abyssal diapycnal mixing parametrization of St Laurent et al. (2002) is that it is based on energy conservation and is more consistent with physical principles. Compared to the tidal mixing scheme by Lee et al. (2006) which tends to increase vertical diffusivity in regions of low Richardson numbers (particularly the continental shelf regions), the parametrization of St Laurent et al. (2002) and Simmons et al. (2004) allows for enhanced tidal mixing in deep ocean regions. Comparing these different approaches remains for our future work.
Energy associated with mesoscale eddies is another importance source for turbulent mixing. Saenko et al. (2012) have recently investigated the individual effects of the tide and eddy dissipation energies on the ocean circulation. They showed that the overturning circulation and stratification in the deep ocean are too weak when only the tidal energy maintains diapycnal mixing. With the addition of the eddy dissipation, the deep-ocean thermal structure became closer to that observed and the overturning and stratification in the abyss became stronger. Jochum et al. (2013) developed a parametrization for wind-generated near-inertial waves (NIWs) and found that tropical sea surface temperature and precipitation and mid-latitude westerlies are sensitive to the inclusion of NIW in their climate model. They concluded that because of its importance for global climate the uncertainty in the observed tropical NIW energy needs to be reduced. Presumably the recent progress in the understanding of diapycnal mixing processes will increase model overall fidelity when practical parametrizations for these processes are taken into account.

Penetrative short-wave radiation
The infrared radiation from the solar heating is almost completely absorbed in the upper 2 m water column, while the ultraviolet and visible part of solar radiation (wavelengths < 750 nm) can penetrate deeper into the ocean depending on the ocean colour. In the biologically unproductive waters of the subtropical gyres, solar radiation can directly contribute to the heat content at depths greater than 100 m. Adding all the solar radiation to the uppermost cell in ocean models with a vertical resolution of 10 m or finer can overheat the ocean surface in regions where the penetration depth is deep in reality.
Many sensitivity studies have shown that adequately accounting for short-wave radiation penetration and its spacial and seasonal variation is important for simulating sea surface temperature (SST) and mixed layer depth at low latitude, equatorial undercurrents, and tropical cyclones and El Niño-Southern Oscillation (ENSO) (Schneider and Zhu, 1998;Nakamoto et al., 2001;Rochford et al., 2001;Murtugudde et al., 2002;Timmermann et al., 2002;Sweeney et al., 2005;Marzeion et al., 2005;Anderson et al., 2007;Ballabrera-Poy et al., 2007;Zhang et al., 2009;Jochum et al., 2010;Gnanadesikan et al., 2010;Liu et al., 2012). The solar radiation absorption is influenced by ocean colour on a global scale, so the bio-physical feedbacks have a global impact on the simulated results, including sea-ice thickness in the Arctic Ocean and the MOC (Lengaigne et al., 2009;Patara et al., 2012).
One traditional way to account for the spatial variation of short-wave penetration in climate models is to use spatially varying attenuation depths in an exponential penetration profile, which was found to be preferable compared to using a constant depth (Murtugudde et al., 2002). The seasonal variability of the attenuation depth plays an important role in the interannual variability in the tropical Pacific (Ballabrera-Poy et al., 2007). The interannual variability in short-wave absorption was also found to be important in representing ENSO in climate models (Jochum et al., 2010).
We use the short-wave penetration treatment as suggested by Sweeney et al. (2005) and Griffies et al. (2005). The optical model of Morel and Antoine (1994) is used to compute visible light absorption. 14 The chlorophyll seasonal climatology of Sweeney et al. (2005) (see Fig. 12) is used in the computation. The visible light attenuation profile is obtained from the optical model, and the difference between two vertical grid levels is used to heat the cells in between. Sweeney et al. (2005) show that the optical models of Morel and Antoine (1994) and Ohlmann (2003) produce a relatively small difference in their ocean model. 14 The attenuation profile of downward radiation in the visible band is computed via I VIS (x, y, z) = I 0 VIS (x, y)(V 1 e z/ζ 1 + V 2 e z/ζ 2 ), where V 1 , V 2 , ζ 1 and ζ 2 are computed from an empirical relationship as a function of chlorophyll a concentration as suggested by Morel and Antoine (1994). I 0 VIS is 54 % of the downward solar radiation to the ocean, and the other part is infrared radiation and is directly added to the ocean surface.  In some Earth System Models ecosystem models are used to better represent chlorophyll fields and the bio-physical feedbacks (Lengaigne et al., 2009;Loeptien et al., 2009;Jochum et al., 2010;Patara et al., 2012). Prognostic biogeochemistry is potentially beneficial in improving the fidelity of climate prediction through adaptive bio-physical feedbacks. An ecosystem model (REcoM, Schartau et al., 2007) has just been coupled to FESOM but is not included in the short-wave penetration parametrization in the present model version.

Vertical overturning
The hydrostatic approximation necessitates the use of a parametrization for unresolved vertical overturning processes. One approach is to use the convection adjustment schemes from Cox (1984) or Rahmstorf (1993). The latter scheme can efficiently remove all static instability in a water column. Another approach is to use a large vertical mixing coefficient (e.g. 10 m 2 s −1 ) to quickly mix vertically unstable water columns and it is employed in FESOM. 15 As indicated by Klinger et al. (1996), using a large but finite vertical mixing coefficient can improve the simulation compared to instantaneous convection adjustment. Note that the vertical diffusion approach can only be realized through implicit time stepping. 16

Horizontal viscosity
Horizontal momentum friction in ocean models is employed mainly for practical computational reasons and not motivated 15 Super-parametrization as an alternative is found to be greatly superior to the convection adjustment parametrization at much less computational cost than running non-hydrostatic models (Campin et al., 2011). Its potential in climate modelling needs to be explored. 16 Implicit time stepping methods for vertical diffusion are needed in general. See footnote 1. by first principles (see the review of . As a numerical closure, horizontal friction is intended to suppress grid noise associated with the grid Reynolds number and to resolve viscous boundary currents (Bryan et al., 1975;Large et al., 2001;Smith and McWilliams, 2003). 17 In practice, horizontal friction in climate models is kept as small as feasible provided the grid noise is at an acceptable level and the western boundary layers are properly resolved.
Both Laplacian and biharmonic momentum friction operators are used in large-scale ocean simulations, and there is no first principle motivating either form. With respect to the dissipation scale-selectivity, the biharmonic operator is favourable compared to the Laplacian operator as it induces less dissipation at the resolved scales and concentrates dissipation at the grid scale Griffies, 2004). Large et al. (2001) and Smith and McWilliams (2003) proposed an anisotropic viscosity scheme by distinguishing the along and cross-flow directions in strong jets in order to reduce horizontal dissipation while satisfying the numerical constraints. Larger zonal viscosities were used in the equatorial band to maintain numerical stability in the presence of strong zonal currents, and larger meridional viscosities were employed along the western boundaries to resolve the Munk boundary layer (Munk, 1950), while the meridional viscosity remained small in the equatorial band to better capture the magnitude and structure of the equatorial current. This approach was adopted in the previous GFDL climate model , while isotropic viscosities are restored in a new GFDL Earth System Model to "allow more vigorous tropical instability wave activity at the expense of adding zonal grid noise, particularly in the tropics" (Dunne et al., 2012).
Different choices for viscosity values were made in different ocean climate models. One choice is to use the prescribed viscosity. Due to the convergence of the meridians grid resolution also varies on structured meshes in some traditional models. To avoid numerical instability associated with large viscosity in an explicit time stepping scheme, viscosity is often scaled with a power of the grid spacing (e.g. Bryan et al., 2007;Hecht et al., 2008). Another approach is to use flow-dependent Smagorinsky viscosities (Smagorinsky, 1963(Smagorinsky, , 1993. The Smagorinsky viscosity is proportional to the local horizontal deformation rate times the squared grid spacing in the case of the Laplacian operator. It is enhanced in regions of large horizontal shear, thus providing increased 17 In the case of Laplacian viscosity the grid Reynolds number is defined as Re = U /A, where U is the speed of the currents, is the grid resolution, and A is the viscosity. In one dimension, the centred discretization of momentum advection requires Re < 2 (or A > U /2) to suppress the dispersion errors. The second constraint A > ( √ 3/π) 3 β 3 ensures that the frictional western boundary is resolved by at least one grid point (Bryan et al., 1975). Here β is the meridional gradient of the Coriolis parameter. Additionally, an explicit time stepping (Euler forward) method enforces an upper bound for horizontal viscosity, i.e. A < 2 /(2 t). dissipation where it is required to maintain stability. Its dependence on grid spacing eliminates the requirement for additional scaling as done for a priori specified viscosities.
We use the biharmonic friction with a Smagorinsky viscosity in FESOM large-scale simulations.  provide a thorough review on this scheme. As linear (first order) basis functions are used in FESOM, a direct formulation of the biharmonic operator cannot be achieved. Therefore, a two-step approach (first evaluating nodal Laplacian operators, then constructing the biharmonic operators) is used as described by Wang et al. (2008). The biharmonic Smagorinsky viscosity is computed as Laplacian Smagorinsky viscosity times 2 /8 as suggested from the linear stability analysis , 18 where is the local grid resolution. The dimensionless scaling parameter in the computation of Smagorinsky viscosity is set equal to π in our practice. To resolve the western boundaries, a minimum biharmonic viscosity of β 5 is set at the four grid points close to the western boundaries, where β is the meridional gradient of the Coriolis parameter. When grid resolution is increased along the western boundaries and the velocity becomes more vigorous, the western boundary constraint becomes less stringent than the Reynolds constraints (see the discussion in . Figure 13a shows the sensitivity of barotropic stream functions to the two forms of friction operators (biharmonic and Laplacian). The Smagorinsky viscosity is used in both simulations. The difference of barotropic stream functions is seen mainly in the Southern Ocean and northern North Atlantic. In both regions the biharmonic viscosity leads to a stronger circulation, by about 4 Sv for the Antarctic Circumpolar Current (ACC) and 8 Sv for the subpolar gyre in the North Atlantic. Strong and narrow currents are sensitive to the form of friction operators. The strength of North Atlantic Current, South Atlantic Current and Pacific equatorial current is enhanced by 2-4 Sv for the biharmonic case. Local impacts along ACC near topographic features or where the current narrows are also visible.
Consistent with the sensitivity study of Jochum et al. (2008), the boundary currents off east Greenland and west Greenland are enhanced with reduced momentum friction by using the biharmonic operator, thus increasing warm, saline Irminger Current water inflow to the Labrador Sea and decreasing Labrador Sea sea-ice area (not shown). 19 Enhanced 18 For centred differences in space and forward difference in time in a 2-D case, the stability requirement is A < 2 4 t for the Laplacian operator and B < 4 32 t for the biharmonic operator, thus a ratio of 2 /8 between B and A, where A is the Laplacian viscosity, B is the biharmonic viscosity, t is the time step and is the horizontal resolution. 19 Although a similar conclusion is obtained, our sensitivity tests are different from that of Jochum et al. (2008). They reduce momentum dissipation by replacing the combination of background and Smagorinsky viscosity by only the background one, while we The same as in (a) but between a run with biharmonic viscosity scaled with third power of the horizontal resolution and a run with biharmonic Smagorinsky viscosity (the latter minus former). The results are the last 10 yr mean in 60 yr simulations.

Fig. 13. (a) Barotropic stream function difference (Sv) between runs with Laplacian and biharmonic Smagorinsky viscosities (the latter minus the former). (b)
The same as in (a) but between a run with biharmonic viscosity scaled with third power of the horizontal resolution and a run with biharmonic Smagorinsky viscosity (the latter minus the former). The results are the last 10 yr mean in 60 yr simulations.
fraction of Atlantic Water in the Labrador Sea weakens the stratification and results in stronger deep convection there, thus an enhanced AMOC upper cell (by about 1 Sv, see Fig. 14a). The increase in subpolar gyre strength (Fig. 13a) is associated with increased density in the Labrador Sea resulting from enhanced Atlantic Water inflow and deep water ventilation.
More Atlantic Water accumulates south of the Greenland-Scotland Ridge (GSR) in the case of Laplacian friction, leading to stronger deep convection north of 60 • N, which is manifested by the strengthening of the overturning circulation at intermediate depth between 50-60 • N (Fig. 14a). The commonly called Labrador Sea Water feeding the Deep Western Boundary Current has its origin both in the Labrador Sea and south of GSR including the Irminger Sea (Pickart et al., 2003;Vage et al., 2008;de Jong et al., 2012). Deep mixed layers indicate the presence of deep convection in both regions in both simulations, but reduced dissipation in the biharmonic case favours it in the Labrador Sea. Since reduced dissipation drives both the AMOC and subpolar gyre strength toward compare the Laplacian friction to the biharmonic friction, with the latter having less dissipation on resolved scales.  observations in our test, we choose to use the low dissipative biharmonic friction operator. Different regions of the global ocean were analysed in viscosity sensitivity experiments by Jochum et al. (2008), and generally improved ocean circulations were observed in their coupled climate model with reduced dissipation (at the expense of an increase in numerical noise). Figures 13b and 14b compare the impact of Smagorinsky and flow-independent, prescribed viscosities. In both simulations the biharmonic friction is used. In the prescribed viscosity case, viscosity is a function of the cubed grid resolution, B 0 3 / 3 0 , where B 0 = −2.7 × 10 13 m 4 s −1 and 0 = 112 km. No obvious instability is visible in both simulations. The difference in the large-scale circulation between the two simulations is clearly less significant than for the two friction operators. With the Smagorinsky viscosity the barotropic stream function is higher by 2-3 Sv in the central Labrador Sea. The increase in the AMOC upper cell is also rather small (0.1-0.2 Sv). The small difference between the two simulations is not unexpected because the prescribed viscosity is relatively small.
Further evaluation of the impact of momentum dissipation on the large-scale circulation still needs to be pursued, especially for eddy-permitting and eddy-resolving simulations. For example, an intermediate value of biharmonic viscosity (0.5B 0 3 / 3 0 ) is found to produce good Gulf Stream separation and realistic North Atlantic Current penetration into the Northwest Corner region in eddy resolving (0.1 • ) simulations by Bryan et al. (2007). They got a southward displaced Gulf Stream separation with a lower viscosity and large SST errors at the subtropical-subpolar gyre boundary with a higher viscosity, indicating that only a small range in the parameter space exists for tuning their eddy-resolving model. To overcome the problems of too early separation of the Gulf Stream with a small biharmonic viscosity and establishment of a permanent eddy north of Cape Hatters with a large biharmonic viscosity, Chassignet and Garraffo (2001) and Chassignet and Marshall (2008) recommend to jointly use biharmonic and Laplacian viscosity in eddy resolving models, with both values smaller than those when only one friction form is used. They found that by combining the two operators it is possible to retain the scale selectiveness of the biharmonic operator and to provide useful damping at larger scales, the latter of which helps to eliminate the wrong permanent eddy. Some recent eddy permitting coupled climate models have chosen to use the biharmonic friction operator (Farneti et al., 2010;Delworth et al., 2012). Providing a unified closure for momentum dissipation valid in various dynamical situations and on meshes refined in different ways and in different regions remains an important and challenging task in developing unstructured-mesh models.
In the sigma grid region in the case of using a hybrid grid, we apply momentum friction along the sigma grid slope to maintain numerical stability. 20 For unknown reasons the biharmonic operator turns out to be unstable even if it acts along the sigma grid slope. The two-step implementation of the biharmonic operator is presumably the cause of this problem. Therefore we currently employ the Laplacian operator on sigma grids together with the Smagorinsky viscosity.

Eddy mixing and stirring
Much of the mixing induced by mesoscale eddies is oriented along locally referenced potential density surfaces (neutral surfaces, McDougall, 1987), which has motivated the utilization of rotated tracer diffusion (Redi, 1982;Olbers et al., 1985;Griffies et al., 1998). 21 The use of isoneutral diffusion 20 As the sigma grid slope can be very steep, a horizontal friction imposes a large component perpendicular to the grid slope, which can readily lead to instability even with a very small time step in the case of a forward time stepping. 21 Neutral diffusion is described by Laplacian operators in ocean climate models, although eddy-resolving models use neutral biharmonic operators to add dissipation at grid scales to maintain numer-(often called Redi diffusion in the literature) significantly reduces the unphysical diapycnal mixing associated with horizontal diffusion (Veronis, 1975;Böning et al., 1995), thus improving model integrity. In the bulk of the ocean interior the neutral slope is small, which motivates the application of the small slope approximation to simplify the diffusion tensor (Gent and McWilliams, 1990). Gent and McWilliams (1990) and Gent et al. (1995) (referred to as GM90 hereafter) provided a closure to represent the adiabatic stirring effects of mesoscale eddies. They suggested a form of eddy-induced bolus velocity for zcoordinate models by considering the reduction of available potential energy through baroclinic instability. This bolus velocity is added in tracer equations to advect tracers together with resolved velocity. The implementation of the GM parametrization in z-coordinates models significantly improves the model results including temperature distribution, heat transport and especially deep convection (Danabasoglu et al., 1994;Danabasoglu and McWilliams, 1995;Robitaille and Weaver, 1995;Duffy et al., 1995Duffy et al., , 1997England, 1995;England and Hirst, 1997;McDougall, 1996, 1998;Hirst et al., 2000). The eddy-induced velocity as given by GM90 is v * = −∂ z (κ gm S) +ẑ∇ · (κ gm S), where κ gm is the GM thickness diffusivity and S is the neutral slope. It involves computing the derivative of the thickness diffusivity and neutral slope and appears to be noisy in numerical realizations, which is one of the factors that motivated the derivation of the skew flux formulation for eddy-induced transport in Griffies (1998). Using the skew flux formulation also unifies the tracer mixing operators arising from Redi diffusion and GM stirring. It turns out to be very convenient in its implementation in the variational formulation in FESOM. Overall, the small slope approximation for neutral diffusion (Gent and McWilliams, 1990) and the skew diffusion form for eddy stirring  are the standard neutral physics options in FESOM.
In FESOM there is a caveat with hybrid grids, for which the sigma grid is used around the Antarctic coast when ice shelf cavities are present. Because sigma grid slopes and neutral slopes are very different, using neutral physics parametrization will lead to numerical instability on sigma grids. For the moment we use along-sigma diffusivity. The roles played by mesoscale eddies on continental shelf and in ice cavities around the Antarctic are unclear. In practice we use high resolution (∼ 10 km or finer) in the sigma grid region, which can eliminate the drawback to some extent.

Diabatic boundary layer
Neutral diffusion represents mesoscale mixing in the adiabatic ocean interior. In the surface diabatic boundary layer where eddies reaching the ocean surface are kinematically ical stability (Roberts and Marshall, 1998). Griffies (2004) provides a thorough review on the properties of biharmonic operators and explains why Laplacian operators are preferable for tracer diffusion. constrained to transport horizontally, horizontal diffusion is more physical and should be applied (Treguier et al., 1997). This idea has been commonly taken in ocean climate model practice (e.g. Griffies, 2004;Griffies et al., 2005;Danabasoglu et al., 2008). We use the mixed layer depth (MLD) as an approximation of the surface diabatic boundary layer depth, within which horizontal diffusion is applied. The MLD is defined as the shallowest depth where the interpolated buoyancy gradient matches the maximum buoyancy gradient between the surface and any discrete depth within that water column (Large et al., 1997).
The diffusion tensor is not bounded as the neutral slopes increase, so numerical instability can be incurred when the neutral slopes are very steep. Therefore, the exponential tapering function suggested by Danabasoglu and McWilliams (1995) is applied to the diffusion tensor to change neutral diffusion to horizontal diffusion in regions of steep neutral slopes. 22 The same tapering function is also applied to the GM thickness diffusivity κ gm below the MLD to avoid unbounded eddy velocity v * , which is proportional to the gradient of neutral slopes.
Within the surface boundary layer we treat the skew flux as implemented by Griffies et al. (2005). The product (κ gm S) is linearly tapered from the value at the base of the surface boundary layer to zero at the ocean surface, as suggested by Treguier et al. (1997). A linear function of κ gm S with depth means that the horizontal eddy velocity u * = −∂ z (κ gm S) is vertically constant in the surface boundary layer. Maintaining an eddy-induced transport in the boundary layer is supported by the fact that baroclinic eddies are active in deep convection regions (see the review by Griffies, 2004). Indeed, simulations parametrizing eddy-induced velocity in the surface boundary layer show significant improvements compared to a control integration that tapers the effects of the eddies as the surface is approached . Our implementation of the GM eddy flux near the surface is different from that of Griffies et al. (2005) with respect to the definition of the boundary layer. We define the surface diabatic boundary layer depth using the MLD definition of Large et al. (1997), while the boundary layer base is set to where the magnitude of the slope S in either horizontal direction is just greater than a critical value in Griffies et al. (2005).
Using the MLD to define the surface diabatic layer eliminates the requirement to choose a critical neutral slope for defining the boundary layer. Additionally, tests with FESOM show that boundary layer depth fields defined via critical neutral slopes are less smooth than MLD, which is possibly 22 The tapering function is only applied to the off-diagonal entries of the diffusion tensor, thus maintaining a horizontal diffusion when these entries are tapered to zero. The function suggested by Danabasoglu and McWilliams (1995) is f (S) = 0.5(1 + tanh( S max −|S| S d )), where S d = 0.001 is the width scale of the tapering function and S max = 0.05 is the cut-off value beyond which f decreases to zero rapidly. linked to the fact that static instability is not completely removed instantaneously through the large but finite vertical diffusivity (Sect. 3.8). Based on theoretical consideration, Ferrari et al. (2008) proposed an eddy parametrization for the near-boundary regions. They introduced a transition layer connecting the quasi-adiabatic interior and the turbulent boundary layer where eddy-induced velocity is parallel to the boundary. A simplified version of this parametrization was implemented by Danabasoglu et al. (2008). Our implementation is the same as the case with vanishing transition layers in Danabasoglu et al. (2008), who reported only minor difference induced by nontrivial transition layers.
Eddy-induced bolus velocity can go infinite if the neutral slope is not limited from above. In addition to the exponential tapering function being applied below the surface boundary layer as mentioned above, the magnitudes of neutral slopes at the base of the surface boundary layer are also constrained below a critical value S max to ensure finite bolus velocity without incurring numerical instability in the surface boundary layer. Sensitivity tests show that using a critical slope larger than 0.1 can entail numerical instability sometimes. Another consideration for the choice of S max is associated with the tapering function for the ocean interior (see footnote 22). The magnitude of neutral slopes is mostly less than 0.01 below the surface layer, so we have empirically chosen S max = 0.05. As suggested by Gerdes et al. (1991) and Griffies et al. (2005) the GM parametrization is changed back to horizontal diffusion at grid points adjacent to ocean bottom to avoid overshoots in tracer fields.

Neutral and thickness diffusivity
Methods to specify neutral diffusivity and thickness diffusivity differ among ocean climate models, including using a constant value (Danabasoglu et al., 2006), horizontally varying diffusivity depending on vertically averaged flow fields (Visbeck et al., 1997;Griffies et al., 2005), and diffusivity varying in three dimensions depending on flow fields (Danabasoglu and Marshall, 2007;Eden et al., 2009). Estimates from observations (e.g. Ledwell et al., 1998;Bauer et al., 1998;Sundermeyer and Price, 1998;Zhurbas and Oh, 2003;Marshall et al., 2006) and high-resolution ocean models (e.g. Bryan et al., 1999;Eden, 2006;Eden et al., 2007) have revealed pronounced variability of eddy diffusivity in space and time. Many numerical and theoretical studies have focused on the prescription for the vertical variation Killworth, 1997;Treguier, 1999) and horizontal variation (Held and Larichev, 1996;Visbeck et al., 1997;Griffies et al., 2005) of eddy diffusivity. More recent efforts in prescribing eddy diffusivity have focused on schemes of eddy diffusivity varying in three dimensions and time. Motivated by the finding that the squared buoyancy frequency (N 2 ) shows a vertical structure similar to the diagnosed diffusivity (Ferreira et al., 2005;Eden, 2006;Eden et al., 2007;Ferreira and Marshall, 2006), Danabasoglu and Marshall (2007) have investigated the impacts of using diffusivity proportional to N 2 in model simulations. Eden and Greatbatch (2008) proposed a closure for eddy thickness diffusivity consisting of a prognostic equation for the eddy kinetic energy and an eddy length scale.
The N 2 -dependent thickness diffusivity as suggested by Ferreira et al. (2005) and Ferreira and Marshall (2006) was implemented in the NCAR Community Climate System Model (CCSM3), and it leads to improved results with respect to observations compared to using a constant diffusivity (Danabasoglu and Marshall, 2007). It is further used in the updated NCAR climate model CCSM4 (Danabasoglu et al., 2012). Currently this approach is also used in FESOM simulations. The thickness diffusivity is calculated as κ gm (x, y, y) is a reference diffusivity at horizontal location (x, y) and N (x, y, z, t) is the local buoyancy frequency. N ref (x, y, t) is the reference buoyancy frequency taken just below MLD, provided that N 2 > 0 there. Otherwise N 2 ref is the first stable N 2 below MLD. Following Danabasoglu and Marshall (2007) where N min sets the lower bound for diffusivity. The neutral diffusivity is set equal to the thickness diffusivity below the MLD. Within the MLD the linear tapering is applied to eddy skew flux and the horizontal diffusivity is set to the reference diffusivity.
The reference diffusivity κ ref (x, y) is set to a constant (1500 m 2 s −1 ) for regions where horizontal resolution is coarser than 50 km, and scaled down when resolution is finer. 23 We prescribed the scaling function for diffusivity based on experience obtained so far. Sensitivity tests with 25 km resolution in the Arctic Ocean (where the first baroclinic Rossby radius is less than 8 km in the Eurasian Basin as derived from climatology data -Qun Li, personal communication, 2012) show that using a neutral diffusivity larger than 50 m 2 s −1 leads to a too diffused boundary currents in the Arctic Atlantic Water layer. 24 Therefore we reduce the reference diffusivity rapidly from 1500 m 2 s −1 at 50 km resolution to 50 m 2 s −1 at 25 km resolution. The ratio of the first Rossby radius to the grid scale (λ 1 / ) is a pertinent control parameter for scaling mesoscale eddy diffusivity. 25 Using it to construct a scaling function may need case-based tuning in 23 The reference diffusivity [m 2 s −1 ] is set to 1500 for ≥ 50, 50+58( −25) for 25 ≤ < 50, and 50( /25) 2 for < 25. Here is local horizontal resolution with unit km. 24 Other parametrizations such as the anisotropic GM parametrization suggested by Smith and Gent (2004) and the Neptune parametrization (Maltrud and Holloway, 2008;Holloway and Wang, 2009) could improve the solution of the Arctic boundary currents. These options need to be explored in the future. 25 The recent work of Hallberg (2013) provides insight into this subject.
Geosci. Model Dev., 7, 663-693  multi-resolution climate simulations and remains a research topic for FESOM applications. 26 We set N min = 0.2, meaning that the diffusivity is confined above 300 m 2 s −1 in regions with resolution coarser than 50 km. The time mean thickness diffusivity at 300 m depth on the reference mesh (Fig. 3) is shown in Fig. 15. Largest values are found in regions where intense eddy activity is expected, including the ACC and western boundary currents. Due to the resolution dependence of the reference diffusivity, reduced values are found in the equatorial band and north of 50 • N where the grid spacing is small (Fig. 3). As also noticed in Danabasoglu and Marshall (2007), the diffusivity scheme produces undesirable large values in the eastern South Pacific. The zonal-mean distribution (Fig. 16) shows that the diffusivity decreases from the base of surface diabatic layer downwards as expected from vertical distribution of squared buoyancy frequency. Largest values are found in the Tropics just below the diabatic layer, with the value in the equatorial band scaled down due to higher horizontal resolution. Deep penetration of large diffusivity occurs at midto high latitude on both hemispheres, while the deepest penetration is in the Southern Ocean. The deep reaching high diffusivity north of 60 • N in Danabasoglu and Marshall (2007) (their Fig. 1, associated with deep convection regions in the  North Atlantic) is absent in Fig. 16 because the reference diffusivity is scaled to about 50 m 2 s −1 on our reference mesh.
N min turns out to be one of the key tuning parameters in the calculation of diffusivity. The residual meridional overturning stream functions (Eulerian mean plus eddy contribution parametrized by thickness diffusivity) in the Southern Ocean from simulations with N min = 0.2 and 0.1 are shown in Fig. 17a and b, respectively. Both the Deacon cell and Antarctic Bottom Water (AABW) cell show very similar structures between the two simulations. State estimate using an adjoint eddy-permitting (1/6 • ) model by Mazloff et al. (2010) shows a Southern Ocean Ekman transport of about 31 Sv, a Deacon cell in depth space (Doos and Webb, 1994) reaching more than 3000 m depth and a maximum AABW transport of about 16 Sv (their Fig. 10a). Both simulations reasonably reproduce the meridional overturning circulation structure reported by Mazloff et al. (2010), with some underestimation of the circulation strength of both bottom water and intermediate water. Figures 17c and d compare the parametrized eddy meridional overturning stream functions between the two simulations. With a decrease of N min from 0.2 to 0.1, the eddy MOC maximum reduces from about 10 Sv to 5 Sv. Although decreasing N min (the lower bound of diffusivity) can have impacts on the diffusivity mainly below 1500 m depth for the Southern Ocean region (see Fig. 16), the weakening of the eddy meridional overturning is over the whole water column. Along with the weakening of the eddy-induced transport, the AABW transport also weakens ( Fig. 17a and b), consistent with other model results (Farneti and Gent, 2011).
Mesoscale eddies in the Southern Ocean can buffer the ocean response to atmospheric changes (Meredith and Hogg, 2006;Hallberg and Gnanadesikan, 2006;   2008; Farneti et al., 2010;Viebahn and Eden, 2010;Jones et al., 2011;Abernathey et al., 2011;Meredith et al., 2012;Morrison and Hogg, 2013;Munday et al., 2013), so assessing and improving their parametrization in climate models is critically important. It is possible to use the present GM parametrizations to produce a response of the Southern Ocean to changing wind stress in coarse climate models that is broadly consistent with what is seen in eddying ocean models (e.g. Gent and Danabasoglu, 2011). However, eddy parametrizations have only demonstrated some success in reproducing the eddy compensation but not the eddy saturation. 27 As remarked by Munday et al. (2013), eddy compensation is achieved at the expense of being not able to realize the eddy-saturated regime using the parametrization. Hence they suggested that parametrizations with a prognostic eddy kinetic energy (EKE) variable (Eden and Greatbatch, 2008;Marshall and Adcroft, 2010), which can be tied directly to 27 Eddy saturation refers to the phenomenon that ACC transport shows limited response to increased wind stress. It can be explained by a rough balance between the tendency for Ekman transport to steepen isopycnals and for eddies to flatten them. Eddy compensation refers to the phenomenon that changes in eddy-induced MOC can partially compensate those of Ekman transport. These two phenomena are linked but with dynamical distinction (e.g. Morrison and Hogg, 2013). wind stress, be preferable schemes for thickness diffusivity. However, schemes such as the one proposed by Eden and Greatbatch (2008), while probably a better way to go, assume that the time evolution of EKE can be parametrized in a model. This is not a trivial task as the relationship of changes in EKE to changes in forcing is one of the big unsolved problems.
Attention has been paid to the practical implementation of traditional GM parametrizations. For example, Gnanadesikan et al. (2007) and Farneti and Gent (2011) found that model results are very sensitive to the critical neutral slope in their simulations; some simulated features can be improved at the expense of worsening some other features when increasing the critical neutral slope. These findings also indicate that much research is still required for mesoscale eddy parametrizations. Ferrari et al. (2010) propose a parametrization for mesoscale eddy transport which solves a boundary value problem for each vertical ocean column. They show that this scheme works robustly and performs as well as their implementation of the more conventional GM scheme. Further study is required to explore its potential in increasing the fidelity of ocean model simulations.

River runoff distribution
River discharge is one of the important processes that redistribute water masses in the earth system. For example, the Arctic Ocean is the largest freshwater reservoir in the global ocean, with 38 % of its freshwater source provided by river runoff (Serreze et al., 2006). Faithfully representing the circulation of freshwater supplied by rivers in ocean models is important, but depends on the numerical treatment of the runoff. As reported in Griffies et al. (2005), adding river runoff into the surface grid cell can lead to too much freshwater on the surface stabilizing the water column. This motivated them to insert river runoff into the upper four model grid cells. This approach is a parametrization for unresolved processes that can influence river runoff distribution in reality, including tidal mixing. Due to the current model numerics we did not implement this approach in FESOM. The diapycnal mixing parametrization for barotropic tides proposed by Lee et al. (2006) is a remedy we use to improve river runoff representation, as mentioned in Sect. 3.6.1.
Another approach used in climate models is to spread river runoff over a wide region near river mouths (Danabasoglu et al., 2006). This approach is expected to remedy possibly under-resolved spreading of river runoff at coarse resolution, for example by eddies. Using a high-resolution model McGeehan and Maslowski (2011) showed how eddies transport water masses of the Labrador Sea boundary current into the gyre interior. The first baroclinic Rossby radius of deformation λ 1 is typically small in coastal regions. For example, λ 1 is of the order of 3 km on the western Arctic shelf, so resolving mesoscale eddies cannot be afforded even in regional models. Arguably, adding river runoff over a wide region can be a poor man's approach to account for the underresolved processes that facilitate freshwater penetration to ocean basins.
Typically we distribute river runoff around river mouths using a linear function decreasing from one at the river mouths to zero at 400 km distance. Figure 18a shows the river runoff distribution of the long-term climatology derived from Dai et al. (2009). We carry out sensitivity tests using this distribution (reference run) and another one where the river runoff is distributed within 100 km distance from river mouths (sensitivity run, Fig. 18b). The difference between the two experiments for salinity at surface and 100 m depth is shown in Fig. 19a and b. As expected, the largest difference is close to river mouths where the difference is directly enforced, with lower salinity immediately at river mouths and higher salinity around them in the sensitivity run.
The impact of applying different river runoff distribution on the Arctic basin develops with time. The salinity in the halocline is characterized by local difference patches of ±0.1 psu in the Arctic basin (Fig. 19b). The changes of salinity between the two runs are nonuniform with both positive and negative signs, implying that associated changes in local circulation are also responsible for the observed difference  in salinity. To assess the significance of the impact from adjusting the river runoff distribution, we compared the difference in salinity in the halocline to that induced by adjusting background vertical diffusivity (changing from the currently used value 0.1 × 10 −4 m 2 s −1 to 0.01 × 10 −4 m 2 s −1 as used by Nguyen et al. (2009) and Zhang and Steele (2007), see discussion in Sect. 3.6.2). We found that the difference obtained here is a few times smaller. Further analysis shows that the freshwater export flux remains almost intact for both Fram Strait and CAA in the sensitivity run, so the impact of applying different river runoff distribution in the Arctic Ocean is mainly limited to the Arctic basin.
Both the temperature and salinity in the North Atlantic subpolar gyre are increased in the sensitivity run (Fig. 19). This is consistent with the strengthening of the AMOC upper cell (Fig. 20), which increases the supply of warm, saline Atlantic Water to the subpolar gyre. As the freshwater export from the Arctic Ocean remains the same, the changes in AMOC are linked to modified river runoff distribution along the North American and Greenland coasts. This is not an unexpected impact as confining river runoff more to the coast will facilitate deep convection in the Labrador Sea thus a stronger AMOC. Although the impact of a wide spreading of river runoff is moderate as tested here, we keep this option in the model.

Free surface formulation
While rigid-lid ocean models are becoming obsolete, ocean models with the free-surface method and fixed volume for tracer budget have been widely used during the last decade. In the latter type of models, the sea surface height equation has a free-surface algorithm whereas tracers cannot experience dilution or concentration associated with ocean volume changes. To have the impact of surface freshwater flux (evaporation, precipitation, river runoff, and freshwater associated with ice/snow thermodynamics) on salinity, a virtual salt flux has to be introduced for the salinity equation as a surface boundary condition. In reality there are no salt changes in the ocean except for changes through storage of salt in sea ice. Therefore the virtual salt flux formulation is unphysical, although it parametrizes most of the effect of surface freshwater flux on surface salinity.
The virtual salt flux is given by F virtual, salt = s ref q w , where q w is total surface freshwater flux and s ref is a reference salinity. If s ref is set to a constant, the ocean salt is conserved upon that the global integral of q w is zero. A problem with this choice is that the local sea surface salinity can be very different from the reference salinity and the dilution effect of surface freshwater on salinity cannot be well represented in the model. When the local sea surface salinity is far from the reference salinity, virtual salt flux formulation can lead to too small or too large salinity and thus model blowup. Another choice for calculating the virtual salt flux is to use local sea surface salinity as the reference salinity. In this case the local feedback on salinity from freshwater flux is properly simulated, but the total salt conservation in the ocean is not automatically ensured even when the global integral of q w is zero. One practical remedy for this is to calculate the global integral of virtual salt flux and remove it by subtracting its global mean over the globe during the model runtime. Effectively this remedy alters local salinity unphysically and might spoil model integrity on climate scales.
In a full free-surface formulation the ocean volume changes with the vertical movement of surface grid points and tracer concentration directly reacts to these changes. The virtual salt flux is not required and the surface water flux is accounted for in the sea surface height equation. Although the comparison by Yin et al. (2010) shows that the difference between the results using virtual salt fluxes and freshwater  Fig. 120. AMOC difference of a run with river runoff distributed over 400 km distance from a run with a 100 km distribution distance. The results are the last 10 yr mean in a 60 yr simulation. fluxes is statistically insignificant in both unforced control runs and water-hosing runs with freshwater forcing resembling future projections, the practice of employing virtual salt fluxes is physically compromised, prompting the trend to full free-surface formulation. Indeed, the majority of present ocean climate models are using full free-surface formulation (see, for example, the models used in Danabasoglu et al., 2014).
FESOM has taken the free-surface formulation with fixed ocean volume since its first construction (Danilov et al., 2004). The basic numerical core of the current model version was finished in 2009 and still assumed fixed ocean volume Timmermann et al., 2009). Because of human limitations it was another two years before the full free-surface formulation was updated in the model. The full free-surface algorithm uses the arbitrary Lagrangian Eulerian (ALE) formulation (Farhat et al., 2001;Formaggia and Nobile, 2004;Nithiarasu, 2005;Badia and Codina, 2006). An example of ALE formulation implementation in a finite element ocean model is described by White et al. (2008). In principle, the ALE formulation allows vertical movement of all grid layers (an analogue to the z * coordinate). However, matrices and derivatives need to be updated when the mesh geometry is changed, which is costly, so we only allow the surface grid points to move. Tests show that only the moving surface layer in the full free-surface formulation increases about 10 % of the total computation time on typical meshes. The drawback of only moving the surface grid points is that the sea ice loading is constrained by the first layer thickness. We limit the loading from sea ice to half of the first layer thickness to make sure that the first layer thickness will not vanish. Limiting ice load cannot realistically represent oceanic variability associated with ice-loading effects in the response to wind as shown by Campin et al. (2008), while the importance of such high-frequency variability on climate scales is unclear. The freshwater flux boundary condition, however, is not influenced by applying this constraint. At the moment the model is still not updated to support full free-surface on hybrid grids.
By now all published and most on-going applications of FESOM have been performed with the virtual salt flux formulation. The current coupling of FESOM to an atmospheric model also uses the virtual salt flux option in the on-going model evaluation. For new applications and for the coupled model in a later stage the physically more consistent full freesurface formulation is the recommended option.

Ice shelf model
Ice sheets are an important component of the earth system. They should be adequately taken into account in order to predict and understand sea level rise. Ice shelves provide an important interface between the Antarctic Ice Sheet and the surrounding ocean. Ice shelf basal melting feeds the AABW and modifies the ice shelves, the latter of which can potentially influence the ice sheet dynamics. FESOM has an ice shelf component which can explicitly simulate the ocean dynamics in sub-ice-shelf cavities and ice-shelf-ocean interaction .
A three-equation system is used to compute the temperature and salinity in the boundary layer between ice and ocean and the melt rate at the ice shelf base as proposed by Hellmer and Olbers (1989) and refined by Holland and Jenkins (1999). Turbulent fluxes of heat and salt are computed with coefficients depending on the friction velocity following the work of Jenkins (1991). To initialize temperature and salinity in ice cavities we take the temperature and salinity profiles at the nearest ice shelf edge. Enough spin-up time (O(20) yr) is necessary to adjust the hydrography under the ice shelf. By now we assume a steady state for ice shelf thickness and cavity geometry; basal mass loss is assumed to be in equilibrium with surface accumulation and the divergence of the ice shelf flow field. Investigating the impact of varying cavity geometry and grounding-line migration is an active research topic.
The numerical formulation of the ice shelf model is summarized here. Locally refined resolution is needed to resolve small ice shelves; sigma grids are used for ice cavities and surrounding continental shelf regions (Sect. 3.2), with measures to control pressure gradient errors (Sect. 3.4); bottom topography and ice shelf draft data with improved quality compiled by Timmermann et al. (2010) are used (Sect. 3.3). As shown by Timmermann et al. (2012) and Timmermann and Hellmer (2013), basal mass fluxes are in most cases realistic from the model, but differences from observations suggest that further improvement is still desirable. The major issues are linked to the utilization of sigma grids, including possible distortion to flow dynamics caused by smoothing of bottom bathymetry and ice shelf draft required for reducing pressure gradient errors, requirement for individual numerical formulation of GM parametrization (Sect. 3.10), and missing support for the full free-surface option in the sigma grid region (Sect. 3.12). These aspects remain to be explored and improved.

Conclusions
Unstructured-mesh models open new horizons for climate modelling: local dynamics can be better resolved with locally increased resolution without traditional nesting and the improved local processes can provide feedback to the largescale circulation. In this paper we give an overview about the formulation of FESOM, which is the first ocean general circulation model that uses unstructured meshes and therefore makes it possible to carry out multi-resolution simulations. We described the key model elements including the two-dimensional mesh, vertical discretization, bottom topography, pressure gradient calculation, tracer advection scheme, diapycnal mixing parametrization, penetrative short-wave treatment, convection adjustment, horizontal momentum friction, GM parametrization, river runoff distribution, free-surface formulation and ice shelf modelling. The progress reported here is a result of our own continuing model development efforts as well as the recent advances by the ocean modelling community in general. The model version described here is the standard version for our ocean stand-alone studies and is employed as the sea-ice ocean component of a new coupled climate model . Along with the model description we briefly reviewed some of the key elements related to ocean climate models. Discussions on the knowledge gained in the community provide the guideline for making choices in constructing our model. Griffies (2004) has provided a thorough review on ocean model fundamentals. Due to limitations in resources we did not implement or test all numerical and parametrization options recommended in other studies. Investigations to further improve numerical and physical schemes are required as outlined in Sect. 3. There are other model components that should be updated in FESOM -for example, the overflow parametrization (there are different schemes suggested in previous studies, e.g. Beckmann and Döscher, 1997;Danabasoglu et al., 2010). Parametrizations with scale selectivity are critically important in unstructured-mesh models. We are only beginning to explore the multi-resolution aspects of parametrizations. More sensitivity studies and multiresolution tests are desirable to improve the formulation and implementation of such parametrizations. We note that broad collaborations, like the ongoing international joint project COREs (Griffies et al., 2009;Danabasoglu et al., 2014), are helpful to identify common issues in present state-of-the-art ocean models and consolidate efforts in ocean model development.
In summary, we would argue that unstructured-mesh seaice ocean models have matured substantially in recent years. Consequently, they have become attractive options for simulating multi-resolution aspects of the climate system. First climate-relevant applications are appearing (e.g. X. Hellmer et al., 2012;Timmermann et al., 2012;Timmermann and Hellmer, 2013;Wekerle et al., 2013;Jung et al., 2014). However, further research is urgently required to explore the full potential of multi-resolution modelling in climate research. Large model uncertainty as shown in the previous IPCC reports and recent COREs model intercomparisons (Griffies et al., 2009;Danabasoglu et al., 2014) indicates that model development requires long-term continuous efforts in the broad modelling community; both international collaboration and individual effort from each model development group are necessary to advance the field.

Overview on the development history of FESOM
The first FESOM version (version 1.1) was documented by Danilov et al. (2004). In that version the model used the GLS stabilization which required to introduce the barotropic velocity. It needed to be solved for together with the sea surface elevation (as mentioned in Sect. 2.1). Advection and friction operators for both momentum and tracers were implicit in time, so iterative solvers were called for all equations and particularly needed to be pre-conditioned in every time step. Overall, the approach proved to be too slow for climate scale simulations.
The issue of model efficiency prompted the model development team to pursue different numerical formulations (version 1.2, Wang et al., 2008). With the purpose to build up a more efficient and robust ocean model, the pressure projection method was adopted to decouple the solution of surface elevation and velocity, the momentum advection and lateral diffusivity and viscosity terms were changed to explicit schemes, the FCT tracer advection scheme was introduced, the hybrid grid functionality was developed, and some physical parameterizations were incorporated. All these features are kept in the current model version. Further experience was obtained through the work of Timmermann et al. (2009, version 1.3), who concentrated on coupling a finite-element seaice model to the ocean code. Since that work was initiated before the work reported in Wang et al. (2008), it was based on a preliminary ocean model code void of most of model updates except for the pressure projection method.
The explored model features from Wang et al. (2008) and Timmermann et al. (2009) have been combined afterwards (version 1.4). For the sake of model development the prism elements (see Fig. 2a) were used in Wang et al. (2008). In prisms basis functions are bilinear (horizontal by vertical) on z-level grids, which allows one to carry out analytical computations of integrals. They deviate from bilinear on general meshes (like sigma grids or shaved prisms) and require to use slower quadrature rules. The code should handle these situations separately for the purpose of high numerical efficiency and turns out to be inconvenient to maintain. In contrast, tetrahedral elements always allow for analytical integration. The final model hence uses tetrahedral elements as illustrated in Fig. 2b. The new model version is about 10 times faster than the early version described in Danilov et al. (2004).
There have been a few accomplishments with FESOM development since the last FESOM reports in Wang et al. (2008) and Timmermann et al. (2009). First, a finalized model version combining features obtained during the past development phase is released. It remains stable with respect to its dynamical core over past three years and is recently employed in a coupled climate model. Second, the functionality of modelling ice shelves is added (Sect. 3.13), which utilizes a hybrid grid (Sect. 3.2). Third, the full free-surface formulation is added (Sect. 3.12). This is the recommended option for future applications. Finally, in contrast to the earlier development phase when our focus was mostly on the numerical core, more attention is paid to verifying parameterizations and evaluating global models (Sidorenko et al., 2011;X. Wang et al., 2012;Scholz et al., 2013). Although the development of FESOM has reached a milestone, much research is still required on the route of our model development as outlined along the discussions in Sect. 3.