Calibrating the sqHIMMELI v 1 . 0 wetland methane emission model with hierarchical modeling and adaptive MCMC

Estimating methane (CH4) emissions from natural wetlands is complex and the estimates contain large uncertainties. The models used for the task are typically heavily parametrized and the parameter values are not well known. In this study we perform a Bayesian model calibration for a new wetland CH4 emission model to improve quality of the predictions and to understand the limitations of such models. The detailed process model that we analyze contains descriptions for CH4 production from anaerobic respiration, CH4 5 oxidation, and gas transportation by diffusion, ebullition, and the aerenchyma cells of vascular plants. The processes are controlled by several tunable parameters. We use a hierarchical statistical model to describe the parameters and obtain the posterior distributions of the parameters and uncertainties in the processes with adaptive MCMC, importance resampling and timeseries analysis techniques. For the estimation, the analysis utilizes measurement data from the Siikaneva flux measurement site in Southern Finland. 10 The uncertainties related to the parameters and the modeled processes are described quantitatively. At the process level, the flux measurement data are able to constrain the CH4 production processes, methane oxidation and the different gas transport processes. The posterior covariance structures explain how the parameters and the processes are related. Additionally, the flux and flux component uncertainties are analyzed both at the annual and daily levels. The parameter posterior densities obtained provide information regarding importance of the different processes, which is also useful for development of wetland methane 15 emission models other than sqHIMMELI. The hierarchical modeling allows us to assess the effects of some of the parameters on an annual basis. The results of the calibration and the cross validation suggest that the early spring net primary production could be used to predict parameters affecting the annual methane production. Even though the calibration is specific to the Siikaneva site, the hierarchical modeling approach is well suited for larger 20 scale studies and the results of the estimation pave way for a regional or global scale Bayesian calibration of wetland emission models.

Abstract.Estimating methane (CH 4 ) emissions from natural wetlands is complex, and the estimates contain large uncertainties.The models used for the task are typically heavily parameterized and the parameter values are not well known.In this study, we perform a Bayesian model calibration for a new wetland CH 4 emission model to improve the quality of the predictions and to understand the limitations of such models.
The detailed process model that we analyze contains descriptions for CH 4 production from anaerobic respiration, CH 4 oxidation, and gas transportation by diffusion, ebullition, and the aerenchyma cells of vascular plants.The processes are controlled by several tunable parameters.We use a hierarchical statistical model to describe the parameters and obtain the posterior distributions of the parameters and uncertainties in the processes with adaptive Markov chain Monte Carlo (MCMC), importance resampling, and time series analysis techniques.For the estimation, the analysis utilizes measurement data from the Siikaneva flux measurement site in southern Finland.
The uncertainties related to the parameters and the modeled processes are described quantitatively.At the process level, the flux measurement data are able to constrain the CH 4 production processes, methane oxidation, and the different gas transport processes.The posterior covariance structures explain how the parameters and the processes are related.Additionally, the flux and flux component uncertain-ties are analyzed both at the annual and daily levels.The parameter posterior densities obtained provide information regarding importance of the different processes, which is also useful for development of wetland methane emission models other than the square root HelsinkI Model of MEthane buiLd-up and emIssion for peatlands (sqHIMMELI).
The hierarchical modeling allows us to assess the effects of some of the parameters on an annual basis.The results of the calibration and the cross validation suggest that the early spring net primary production could be used to predict parameters affecting the annual methane production.
Even though the calibration is specific to the Siikaneva site, the hierarchical modeling approach is well suited for larger-scale studies and the results of the estimation pave way for a regional or global-scale Bayesian calibration of wetland emission models.

Introduction
Methane is the third most important gas in the atmosphere in terms of its capacity to warm the climate, after water vapor and carbon dioxide, currently with the radiative forcing of 0.97 W m −2 (IPCC, 2013).This is a sizable part of the total effect of well-mixed greenhouse gases, which is approximately 3.0 W m −2 .According to IPCC (2013), the amount of CH 4 in the atmosphere has risen to its highest level in at Published by Copernicus Publications on behalf of the European Geosciences Union.
least the last 800 000 years due to human activity, and based on ice core measurements, also its growth rate is presently very likely at its highest level in the last 22 000 years.
The sources of CH 4 are both anthropogenic and natural.In the years 2003-2012, 60 % of the global emissions were anthropogenic (range 50-65 %), and about one-third came from natural wetlands.The most important source of uncertainty in the global methane budget is attributable to emissions from wetlands and other inland waters.Combining topdown and bottom-up estimates, natural wetland emissions range from 127 to 227 Tg CH 4 yr −1 (Saunois et al., 2016).Anthropogenic sources include rice paddies, landfills, enteric fermentation and manure, incomplete combustion of hydrocarbons, and natural gas leaks (Ciais et al., 2013).
The methane from wetlands is produced by prokaryotic archaea under anaerobic conditions.The main sink for atmospheric CH 4 is its oxidation in troposphere by OH, and the average lifetime of a CH 4 molecule in the atmosphere is 9.1 ± 0.9 years (Prather et al., 2012;IPCC, 2013).
The wetlands in the boreal zone are a significant contributor to the total CH 4 emissions from wetlands (Kirschke et al., 2013), and for this reason the CH 4 emissions from them have been intensively studied, also with models, during the past years (Wania et al., 2010;Kaiser et al., 2016;Petrescu et al., 2015).However, major discrepancies between predictions from those models remain (Melton et al., 2013;Bohn et al., 2015).
The need for improved wetland methane emission modeling is amplified by the fact that although annual mean precipitation is projected to increase in the boreal zone (Ruosteenoja et al., 2016), changes in the frequency and duration of severe drought may follow an alternate path (Lehtonen et al., 2014), manifesting the need to study wetland responses to extreme events.
Changes to hydrological conditions such as draining or recurring low water table depth can alter the balance of greenhouse gas emissions (Frolking et al., 2011;Petrescu et al., 2015).Modeling and calibrating for such exceptional events can be difficult, as was found, for instance, by Mäkelä et al. (2016).
The HelsinkI Model of MEthane buiLd-up and emIssion for peatlands (HIMMELI) is a relatively full-featured wetland/peatland CH 4 emission model and it is described in detail in Raivonen et al. (2017).The model contains process descriptions for CH 4 production from anaerobic respiration, O 2 consumption and CO 2 production from oxic respiration, and gas transport processes via diffusion, ebullition, and plant transport.Modeling the concentrations of CH 4 , O 2 , and CO 2 in the peat column is explicitly included.The peat column depth can be set at any desired value, and the water table movement determines the part of the peat column that is favorable for CH 4 production.The version of HIMMELI in this work has additional processes, described in Sect.3.1, and the modified model is referred to as sqHIMMELI (square root HIMMELI), as it contains a description of CH 4 production from root exudates.The sqHIMMELI model is geared towards site-level studies, whereas HIMMELI is more suited for integration directly as a component in, e.g., land surface schemes.
Even well-constructed computer models describing environmental processes accumulate error at many levels (Sanso et al., 2008).The sources include time and space discretization, compromises in model physics and biochemistry descriptions due to computational constraints, insufficient information about the initial states of the model, and numerical errors, along with parameterization-induced inaccuracies of the subgrid-size processes.This leads to a need to calibrate and optimize models, as the physical variables do not necessarily exactly correspond to the model variables, and hence the model parameters cannot often be directly measured.Of course, any physically insightful interpretation of calibration results makes sense only for a well-constructed physical model.
Several current CH 4 models include the important physical processes controlling both CH 4 production and transport in the peat column (Kaiser et al., 2016;Lai, 2009b;Müller et al., 2015;Grant and Roulet, 2002).The modeled peat column depth affects the total modeled CH 4 emission from the peatlands and it is directly included in some models (Lai, 2009b;Walter and Heimann, 2000).These models are in general highly sensitive to changes in the values of the parameters (van Huissteden et al., 2009).However, even though algorithmic parameter optimization has been done in some studies, the stress is often on parameter efficiencies (van Huissteden et al., 2009) or optimal values (Müller et al., 2015), and hence the full uncertainty of the values of parameters in these models is not well understood.
Methane models typically use measured values from field campaigns and parameters estimated from those studies where applicable (Lai, 2009b;Walter and Heimann, 2000;Tang et al., 2010;Riley et al., 2011), and, when needed, include extra tuning parameters for processes (Walter and Heimann, 2000).This is a practical and much-used route, as information regarding all of the needed parameters is not available at all sites (van Huissteden et al., 2009;Walter and Heimann, 2000).Wide variability can be expected from some parameters, such as those controlling CH 4 oxidation (Segers, 1998).Emissions from different areas of the same wetland can also vary, due to microtopography and differences between how fast the peat decomposes in different areas (Lai, 2009a;Cresto Aleina et al., 2016), making straightforward parameter value assignment difficult.
Due to these uncertainties, values of parameters vary widely from research to research.For instance, for the Q 10 value controlling the temperature dependence of CH 4 production, Walter and Heimann (2000) use the value 6, handpicking it from the interval of 1.7-16, whereas van Huissteden et al. (2009) use a range of 3-8, and Müller et al. (2015) constrain the value between 1 and 10, with the default value of 1.33 and eventually optimize it to the value of 1 for two of the three optimizations presented.For other parameters, such as those controlling diffusion rates in peat, the situation is similar.
Calibration done for the models is usually quite basic.Wania et al. (2010) tune their model by running it with parameters from a parameter grid, containing only three values for each of the seven parameters tested, and Riley et al. (2011) follow a similar procedure for the wetland CH 4 model component, CLM4Me, of the Community Land Model.Such sensitivity studies obviously are not able to find out how a model is able to perform at its best.Müller et al. (2015) have further optimized the CLM4Me model using an emulator combined with a simple minimization algorithm, with respect to several different sites, which are bound to have quite different physical characteristics, and are yielding optimal values often at the borders of the prescribed allowed area of variation.In a sensitivity analysis of the PEATLAND-VU model, a derivative of the Walter Heimann model, van Huissteden et al. (2009) look at the efficiencies of the different parameters but do not elaborate on other qualities of the posterior.
Using hierarchical modeling to estimate annually varying parameters is sensible, since the flux measurement site has both properties that change from year to year (e.g., small changes in vegetation, plant roots, and microbe populations) and properties that are more permanent (e.g., peat quality and plant species).With fixed parameter values for all years, the model sometimes does not accurately and appropriately describe the observations.On the other hand, with different parameters for all the years, the parameters are easily overfitted, meaning that while the resulting model fits the data well, it does not accurately predict future fluxes (Gelman et al., 2013).Hierarchical modeling provides a solution for these problems.
In the present study, the sqHIMMELI model is calibrated using adaptive Markov chain Monte Carlo (MCMC) and importance resampling techniques to evaluate a hierarchical statistical model for the model parameters.The calibration is done for the boreal Siikaneva site.This study complements the work in Raivonen et al. (2017) in describing the effects of various parameters on the processes and fluxes, and analyzing what kinds of configurations best describe the studied boreal wetland.
Merely optimizing model parameters may lead to misleading results due to the presence of several local minima in the objective function; for example, Müller et al. (2015) reported in a study where they used a surrogate model to calibrate the parameters of the CH 4 model component of the Community Land Model.This multimodality can be accommodated for by using MCMC techniques.Utilizing MCMC methods for optimizing environmental models and studying their uncertainties is not new (Laine, 2008;Ricciuto et al., 2008;Hararuk et al., 2014), but to our knowledge they have not been used for wetland CH 4 model parameter estimation before.Moreover, the research that the authors are aware of does not investigate the interannual variability of parameters, as is done in this study.
The main objective of this work is to analyze the capabilities and limitations of a modern feature-filled wetland CH 4 model by looking into the shape of the posterior parameter distributions, parameter correlations, and the roles, identifiabilities, interdependencies, and interconnections of the parameters and the processes they control.As a part of this work, knowledge about how the methane and carbon dioxide flux data are able constrain the parameters and processes is obtained.
2 Siikaneva wetland flux measurement site and model input data Methane and carbon dioxide flux measurements were needed for estimating the model parameters, and for that purpose observational data from the Siikaneva peatland flux measurement site in southern Finland (61 • 50 N, 24 • 12 E) were used.
The site is a boreal oligotrophic fen with a peat depth of up to 4 m.Measurement of ecosystem-scale gas fluxes started in 2005, and in this work eddy covariance (EC) CH 4 and CO 2 flux measurements from the years 2005 to 2014 were used.In the current application of the EC method, the gas fluxes were calculated from the wind speed and direction, and CH 4 and CO 2 concentration information.All these variables were sampled with 10 Hz and fluxes were calculated over 30 min averaging time in order capture to the whole spectrum of turbulent exchange.During the measurement period, several different instruments were used for methane concentration measurements: Campbell TGA-100 (2005-2007and April-August 2010), Los Gatos RMT-200 (January 2008-February 2014), Picarro G1301-f (April 2010-October 2011), and Los Gatos FGGA (2014).Carbon dioxide concentrations were measured throughout the period with a LI-7000 manufactured by LI-COR Inc.The wind velocity vector was analyzed by a USA-1 acoustic anemometer by METEK (Rinne et al., 2007).All the EC data were postprocessed in a consistent manner using an in-house software EddyUH (Mammarella et al., 2016).Flux data were screened for instrumental problems and for insufficient turbulent mixing.Due to instrument problems, data from 2009 were not available.
For this study, daily means of CH 4 fluxes were calculated from the screened data that contained gaps.This is a viable approach, since CH 4 fluxes do not show a diel pattern at this site (Rinne et al., 2007).However, before calculating the daily values of net ecosystem exchange of CO 2 , standard gap-filling methods for peatland CO 2 fluxes were applied (Aurela et al., 2001(Aurela et al., , 2007)).In short, the gap-filling algorithm estimated the CO 2 flux dependency on photosynthetic photon flux density, air temperature, and water table position, and the algorithm was used to fill periods when CO 2 For using these carbon dioxide data with the cost function, the CO 2 flux produced by sqHIMMELI was matched with the sum of net ecosystem exchange and the net primary production of all plants.We assumed that the share of aerenchymatous plants is 70 % of the total net primary production (NPP).The fact that the net primary production is not a measured but modeled quantity (see below) introduces some uncertainty into the CO 2 flux against which the model is calibrated.
The required inputs for sqHIMMELI are daily soil temperatures, water table depths (WTDs), NPP, and leaf area indices (LAIs).The soil temperature profile for the grid used was generated by interpolating from measurement data between the measurement depths (−5, −10, −20, −35, and −50 cm) and assuming that at −3 m and below the temperature is a constant +7 • C.This was the mean temperature of all the years at −50 cm depth.The WTD data used were available as measurement data, and where data were missing they were gap-filled by repeating the previous measured value.Net primary production cannot be measured in a direct way, and hence values obtained from a regression model were used.The methodology is explained in Appendix E and still further in Raivonen et al. (2017).Similarly for LAI, a simple model was used for obtaining the input.The details are, again, given in Appendix E. A summary of the data used is given in Table 1.

The sqHIMMELI model
The HIMMELI model (Raivonen et al., 2017) is a detailed model for estimating CH 4 emissions from wetlands.It was developed at the University of Helsinki in collaboration with the Finnish Meteorological Institute and the Max Planck Institute for Meteorology in Hamburg.The model is designed to be used as a submodel in different modeling environments, such as regional and global biosphere models.It contains processes describing the production of CH 4 and CO 2 including anaerobic production of CO 2 , the loss of CH 4 and O 2 , and transport of CH 4 , O 2 , and CO 2 between the soil and the atmosphere.The CH 4 transport can take place by diffusion in peat (in water and in the air), by ebullition (transport by bubble formation), and by diffusion in the porous aerenchyma tissues in vascular plants.The model is driven with peat temperature, WTD, and LAI of the aerenchymatous plants.The process descriptions are mainly adopted from previous wetland CH 4 models such as Arah and Stephen (1998), Wania et al. (2010), andTang et al. (2010).The version of the model used here differs slightly from that presented in Raivonen et al. (2017) and is therefore called with the different name of sqHIMMELI to avoid confusion.
The model simulates the processes in a discretized peat column.The number and thickness of the peat layers can be varied, but in this work six 10 cm layers are used, similarly to, e.g., Kaiser et al. (2016), with one thicker bottom layer under these, so that the total modeled peat column depth is 85 % of the maximum observed 4 m depth of the wetland, i.e., 3.4 m.The water table divides the column into waterfilled and air-filled parts, and CH 4 is produced only in the inundated anoxic layers.In the present configuration, the NPPrelated CH 4 production is allocated into the layers according to the vertical distribution of the root mass, described in Sect.3.2.The internal time resolution of the model is dynamically adjusted depending on the model state, and the output interval is set to 1 day.
At present, the model does not contain descriptions for processes related to snow pack or ice such as diffusion through snow, or release of accumulated gas bubbles under ice in springtime as described by, e.g., Mastepanov et al. (2013) andSriskantharajah et al. (2012).
HIMMELI itself, as presented in Raivonen et al. (2017), does not simulate carbon uptake (photosynthesis) or peat carbon pools but instead it takes as input the rate of anoxic respiration.The differences between HIMMELI and sqHIMMELI are described below in Sect.3.1 and 3.2 and in Sect.3.5.3.
For each modeled process in sqHIMMELI, there are parameters regulating the process, affecting the concentrations of CH 4 , O 2 , and CO 2 in the peat column, and the wetland methane emissions.The equations describing the physics relevant to the optimized parameters are listed in Sect.3.4.Other relevant model equations are listed in Sect.3.5.

Root exudates and peat decomposition
Methanogens prefer recently assimilated fresh carbon as their energy source, for instance, the root exudates of vascular plants (Joabsson and Christensen, 2001).A connection between ecosystem productivity and CH 4 emission has been observed in several wetland studies (Bellisario et al., 1999;Whiting and Chanton, 1993).However, anoxic decomposition of litter and older peat also produces CH 4 (Hornibrook et al., 1997).Many models form CH 4 substrates by extracting directly a fraction of the net primary production (van Huissteden et al., 2009;Wania et al., 2010), and some rely on heterotrophic peat respiration only (Riley et al., 2011).In sqHIMMELI, both primary production and anaerobic peat decomposition were included.
The modified sqHIMMELI model contains an exudate pool description, from which it produces methane (Eqs.3 and 15).The exudate pool itself is described by Eq. ( 4), detailing how the modeled NPP turns into root exudates.Effectively, a fraction of NPP determined by the parameter ζ exu (-) produces root exudates, which are then distributed as anaerobic respiration according to the root distribution into the peat column at the rate determined by the model parameter τ exu (s).The part ending up under the water table produces CH 4 and CO 2 , depending on the oxygen content of the water, and above the water table the exudates are respired into CO 2 .
The second source of anaerobic respiration, the anaerobic peat decomposition, is modeled in sqHIMMELI with a simple Q 10 model adopted from Schuldt et al. (2013).The peat under the water table is prescribed a turnover time, based on which anaerobic respiration and CH 4 are produced according to Eqs. ( 5) and (16).

Root distributions
The sqHIMMELI model differs from HIMMELI in the details regarding the root distribution model.Compared to measurement data of root distributions of aerenchymatous sedges from Saarinen (1996), the original root distribution π(z), adopted from Wania et al. (2010) and described by does not describe the distribution of roots well.Here, z is depth, and λ root is a parameter describing the steepness of the decaying exponential curve.This formula is replaced with With the Gaussian shape, the new root density decreases faster with depth.Without this change, the optimization process calibrates the model to have very high root masses below 50 cm underground.The other difference between the models is that in the original model there are vanishingly few roots below the depth of 1 m, but according to Saarinen (1996) Saarinen (1996).
The new root distribution curve with optimized parameters is shown along with the curves resulting from the MCMC optimization.The original distribution gives more root mass to depths of 50-80 cm than the MCMC-optimized curves of the new root distribution.All curves are normalized to the same total root mass.
sedge roots can reach to as low as 2.3 m under the surface.
The term C 1 in Eq. ( 2) was added to remedy this.Before starting the optimization, the parameters C 0 , C 1 , and z 0 were fitted to data from Saarinen (1996), resulting in values of C 0 = 215, C 1 = 6, and z 0 = 0.105.The different root distributions are shown in Fig. 1.

Peat depth
Methane is produced from anaerobic peat decomposition at all peat depths in the sqHIMMELI model, and its transport and oxidation affect the modeled CH 4 emission.The homogeneous model description of the peat column is highly idealized, as in reality the peat column varies from place to place with respect to CH 4 production rate, production depth, and gas transport.We model the peat column to be 3.4 m deep, which is 85 % of the maximum observed depth of the Siikaneva wetland.Small uncertainty in the value of the parameter is acceptable since the parameter τ cato , which regulates the rate of peat decomposition into CH 4 , can partly compensate for this uncertainty.

Parameter descriptions for sqHIMMELI
The parameters for the optimization were chosen to constrain the processes most important for the CH 4 emission.Of the optimized parameters, all but ζ exu (-) and Q 10 (-) are the same for all years.However, ζ exu and Q 10 change year to year to reflect the changes in the relative CH 4 input to the www.geosci-model-dev.net/11/1199/2018/Geosci.Model Dev., 11, 1199-1228, 2018 Table 2. Parameters that were not calibrated.Based on an initial sensitivity analysis, the Michaelis-Menten parameters K were not constrained by the data enough strongly and consistently to include them in the optimization.The same applies for the ebullition half-life, which is understandable given the temporal resolution of the observed data.The peat porosity was dropped from optimization in favor of the diffusivity parameters f D,w and f D,a , and the specific leaf area (SLA) was not chosen for optimization since the optimized parameters τ (m m −1 ) and ρ (m 2 kg −1 ) are already part of Eq. ( 22) where SLA appears.system from peat decomposition and NPP-based production.This will allow to analyze the year-to-year changes in relative importance of the production pathways.The setup is natural; for example, Bergman et al. (2000) report the Q 10 values changing from measurement date to another, even within a single year.As the values reported for minerotrophic lawn in Bergman et al. (2000) indicate that they may vary quite irregularly within a growing season, the modeling performed here does not take intra-annual variations into account and concentrates on the year-to-year variation.Possible mechanisms for the parameter variations include variations in substrate supply and desiccation stress, and are discussed in, e.g., Davidson et al. (2006).Table 2 shows the parameters that are used in the equations below but not optimized in this work, along with their values and explanations of why they were left out.The list of calibrated parameters along with their physical meanings is presented below.
CH 4 production-related parameters 1. τ exu (s) controls the decay rate of exudates, ν, from the root exudate pool P exu , 2. ζ exu (-) is the fraction of NPP carbon that goes to the root exudate pool.
where ψ t is the rate of NPP at time t, P exu is size of the root exudate pool, and ν was given by Eq. (3).
4. Q 10 (-) controls the temperature dependence of the rate of peat decomposition into CH 4 in anaerobic conditions via factor k cato , given by 5. f exu CH 4 (-) is the fraction controlling the methane production from anaerobic respiration of root exudates in Eq. ( 15).
Oxidation and respiration parameters 6. V R0 (mol m −3 s −1 ) is the respiration parameter controlling the rate of heterotrophic respiration, which consumes O 2 and produces CO 2 .This affects the rate of temperature dependent heterotrophic respiration, V R (z), given by Here, E R (J mol −1 ) is a parameter affecting the temperature dependence of the heterotrophic respiration, R is the universal gas constant, and T (z) is temperature at depth z.
7. E R (J mol −1 ) is described above in the context of Eq. ( 6).
8. V O0 (mol m −3 s −1 ) is the CH 4 oxidation parameter controlling the potential rate of CH 4 oxidation V O : 9. E oxid is described in Eq. ( 7), affecting temperature response of CH 4 oxidation.
Gas transport-related parameters 10. λ root (m) controls how the root mass is distributed; see Eq. ( 2).
12. τ (m m −1 ) is the root tortuosity parameter affecting the root conductance K R .A tortuosity of 1 means that the roots are not decreasing the conductance via their curvedness.The equation for the conductance is where π(z) is the root mass density as a function of depth, over which the sum of the density is 1, and m is the total root mass per square meter, set to be proportional to LAI.
13. f D,a (-) is the fraction of the diffusion rate in air-filled peat divided by the diffusion rate in free air.The parameter affects the diffusion and the plant transport fluxes in the model: the higher this parameter is, the more diffusion there is, as it takes a shorter time for the CH 4 to exit the peat, reducing the possibility of oxidation and increasing the concentration gradient driving diffusion.The equation is where D air is the diffusion rate in air-filled peat, D 273 air is the diffusion base rate at 273 K, and T is the temperature.The effect on plant transport comes via Eq.(8).
14. f D,w (-) is the same as above but in water.The equation describing the peat-water diffusion rate is where the terms are analogous to the ones in Eq. ( 9).

The sqHIMMELI model equations
The version of HIMMELI presented here describes processes for CH 4 production and transport.It differs from the version presented in Raivonen et al. (2017) in that the model presented there does not contain the processes for anaerobic respiration but rather takes them as input, the idea being that such input would be available when using HIMMELI as a part of a larger model.Hence, the equations presented in Sect.3.5.2are specific to the version used in this study.
The other difference between the models is the difference between the root distributions described in Sect.3.2.

Governing equations
The gas concentrations of CH 4 , carbon dioxide, and oxygen in the peat column are governed by the equations where T X (t, z) describes transport of gas X containing the diffusion, ebullition, and plant transport components, and R stands for production or consumption.The different terms in the equations are described below.

Anaerobic respiration producing CH 4
The equations presented in this section are specific to the version of HIMMELI used in this study.The version in Raivonen et al. (2017) takes the rate of anaerobic decomposition of carbon as input and does not treat the different sources of that carbon separately.The carbon for methane production in this model version comes from two sources: root exudates and anaerobic peat decomposition.The methane production from anaerobic respiration of that carbon is given by the terms R exu CH 4 and R peat CH 4 , described by where, in Eq. ( 15), ν is the decay rate of root exudates from Eq. ( 3), η is an oxygen inhibition parameter, C O 2 (z) is the oxygen concentration at depth z, and π(z) is the normalized proportion of the total anaerobic root mass, also at depth z, given in an unnormalized form in Eq. ( 2).The decay rate of root exudates does not depend on the peat column thickness.The parameter f exu CH 4 (-) determines what fraction of root exudates in anaerobic conditions will turn into CH 4 .Equation ( 15) is only used below the water table.The anoxic peat decomposition described by Eq. ( 16) depends on the amount of peat and its temperature, among others.The factor g Q 10 m (-) is the proportion of the anaerobic peat decomposition process producing CH 4 , ρ cato is the peat density in the catotelm, f C cato is the fraction of carbon in catotelm peat, and M C is the molar mass of carbon.The parameter k cato = Q (T −273.15) 10 10 /τ cato is described in Eq. ( 5), and is zero above water table.The equations for CO 2 are similar: and the meanings of the symbols are analogous to the ones in equations for CH 4 .

Peat respiration and methane oxidation
Peat respiration (aerobic respiration) is described with an equation of the Michaelis-Menten form: where C w O 2 is the oxygen concentration in water.Above the water table, we assume a water phase that is in equilibrium with the gas phase, i.e., C w O 2 = αC a O 2 .The parameter α is a dimensionless Henry solubility constant for oxygen.Parameter K R is the Michaelis-Menten constant of the process, and V R (z) is given by Eq. ( 6).Methane oxidation is controlled by dual-substrate Michaelis-Menten kinetics, and here the terms are analogous to those in Eq. ( 19), except for that the term V O (z) is described by Eq. ( 7).

CH 4 transport
The transport term T X (t, z) in Eq. ( 11) consist of the following terms: The first of these is the diffusion, where the diffusion coefficients D are given by Eqs. ( 9) and (10), and "medium" refers to either air or water.The second equation is for plant transport, with ρ (m 2 kg −1 ) and τ (m m −1 ) described in context of Eq. ( 8), π(z) the normalized root distribution mentioned above, and C atm X referring to the atmospheric partial pressure of gas X.LAI stands for the leaf area index, given as input, and SLA is the specific leaf area.The third equation is the ebullition component of the gas transport, where pp i refers to the partial pressure of different gases indexed with i, R is the universal gas constant, k is an ebullition rate constant, and σ is the peat porosity.The parameters P atm and P hyd (z) refer to the atmospheric pressure and hydrostatic pressure at depth z, respectively.

Model calibration
The model calibration consists of several steps but can be summarized as first estimating the posterior with MCMC and then based on those results, recalibrating the objective func-  2003) * For importance resampling, the hierarchical modeled parameters' (Q 10 (-) and ζexu (-)) priors were relaxed by a factor of 3 to allow for a more data-constrained resampling and to accommodate the low values of Q 10 reported by Szafranek-Nakonieczna and Stepniewska (2014).Note that the values of the prior for these two parameters were sampled at each iteration with Gibbs sampling.tion and using this new formulation for importance resampling.Importance resampling is typically used for obtaining posterior distributions from minor changes to the objective function descriptions (Gelman et al., 2013).This is also its purpose here.
In more detail, first, a posterior estimate was drawn running 500 000 iterations of sqHIMMELI simulations with the adaptive Metropolis Markov chain Monte Carlo algorithm with a Laplace-distributed error description and a first-order autoregressive model, AR(1), for the residuals.Second, for defining the more refined cost function for importance resampling, the optimal order for an autoregressive moving average (ARMA) time series model for the model residuals was identified from the maximum a posteriori estimate by minimizing the Akaike and Bayesian information criteria with respect to the model order.The third step was drawing a random sample of size 50 from the posterior estimate obtained with MCMC, with which the error model parameters α and γ , described in conjunction to the details of the error model in Eq. (A3), were calibrated by minimizing the Kullback-Leibler divergence (Kullback and Leibler, 1951) with respect to the standard Laplace distribution for the methane and carbon dioxide separately.The median of the obtained parameters was chosen for the second cost function used in the importance resampling.Fourth, a random sample of size 10 000 was drawn from the MCMC posterior and importance resampling was performed by drawing a subsample of size 1500 utilizing weights calculated with the new cost function values obtained from the abovementioned error model calibration as described by, e.g., Gelman et al. (2013).
The need for the importance resampling arises from the fact that the error-model-transformed methane and carbon dioxide residuals emerging from the maximum a posteriori and posterior mean estimates from the calibration with the AR(1) model are not fully independent and identically distributed.The recalibration of the error model, and resampling from the simulated posterior using importance resampling, remedies this problem, as can be seen in the residual histogram and autocorrelation functions in Fig. 2.

Hierarchical description of parameters
In order to be able to assess the annual parameter and CH 4 transport pathway changes, a hierarchical description for two of the parameters was used.These parameters were Q 10 (-) controlling the temperature dependence of the peat decomposition rate, and ζ exu (-) regulating the production of root exudates from NPP.
The "hyperparameters" are the means and variances defining the Gaussian priors of the hierarchical parameters Q 10 (-) and ζ exu (-).They were updated using fixed Gaussian "hyperpriors" with Gibbs sampling.The sampling distribution depends on the current values of the hyperparameters.The role of the hyperprior is to constrain the distribution from which the hyperparameters are sampled.
Technically, a "Metropolis-within-Gibbs" method (Gelman et al., 2013) for sampling the hierarchical parameters, non-hierarchical parameters, and the hyperparameters was used, presented briefly in Appendix C. The model parameters (i.e., everything except the hyperparameters) were sampled with the adaptive Metropolis (AM) MCMC algorithm (Haario et al., 2001), which uses a Gaussian proposal distribution, whose covariance matrix is adapted as the chain evolves, and over time the acceptance rate gets closer to an optimal value, which is 0.23 for Gaussian targets in large dimensions (Roberts et al., 1997).If the algorithm proposes  values outside the hard parameter limits listed in Table 3, the model will not be evaluated and the value is rejected.
Our empirical data for the hierarchical model were the 9 years from 2006 to 2014, meaning that for each of these years there were corresponding ζ exu (-) and Q 10 (-) parameters in the optimization.The model was spun up for each annual flux estimation in order to have a realistic column of gas concentrations available.For this reason, the previous year was always also simulated, and for the likelihood only the residuals from the latter year were included in the calculations.Therefore, the year 2005 did not contribute directly to the values of the objective function.The different years were run in parallel to save execution time.

Objective functions for MCMC and importance resampling
As in many practical uncertainty quantification applications, a major part of the parameter estimation problem is the proper definition of the objective function.For MCMC, it is defined here based on a priori information about the measurement uncertainties, based on information from the model residuals, and based on additional prior information.For the importance resampling, we modify the error model for the CO 2 and CH 4 residual components of the objective function based on an analysis of the MCMC results.

Model residuals and error model
The form of the objective function is the same for both MCMC and importance resampling.The first two components of the objective function contain the contributions from the modeled differences to the daily CH 4 and CO 2 flux measurements.In the MCMC objective function, it is assumed that the daily flux estimate uncertainties are dependent on approximately a fraction α of the flux measurement (Richardson et al., 2006) and some constant error, γ (e.g., measurement device precision).The model error is expected to follow a similar form, and hence α and γ contain the contributions from both the model and measurement errors.For importance resampling, the description is the same except for that a 14-day running mean of the interannual variability is used for α.These parameters are set independently for both CH 4 and CO 2 .
When determining the parameters γ and α, the resulting residuals end up being autocorrelated.Therefore, they are treated as such with the AR(1) model for MCMC and with the ARMA(2,1) model for the importance resampling, described, e.g., in Chatfield (1989).
Since the primary interest is in the methane fluxes, the carbon dioxide residuals are scaled down to a fifth in the importance resampling cost function, which is enough to guide the parameter values since several years of CO 2 flux data are used.Furthermore, as the model does not contain descriptions for the effects of snow and ice on the fluxes, the fit cannot be expected to be very good in the winter months.Therefore, we further only consider 20 % of the contribution of the residuals in the winter season from December to February.The obtained residuals, denoted by the terms in the objective function, Eq. ( 24), are treated as Laplace distributed.The flux observation errors are reported to follow a distribution of this type, rather than a Gaussian distribution (Richardson et al., 2006).The error model is explained in more detail in Appendix A.

Prior information
The parameters affecting the CH 4 production of the wetland model are not known well, but despite this, not setting any   4.
prior distributions on parameters can lead to nonphysical parameter values in the posterior distribution.The parameter priors are set to zero outside prescribed bounds.Within these bounds, the parameters are assigned Gaussian priors, with the exception of one parameter whose prior is set to be flat.The prior values are based on both lit-erature and expert knowledge, and the information regarding the parameter values is summarized in Table 3.

The objective function
The "objective function" for the parameter optimization, J (θ ), is the negative logarithm of the value of the unnormalized posterior probability density function at θ .It combines 2 .

.
3 .our statistical knowledge of the flux observations and parameter priors presented in Sect.4.2.1-4.2.2 and is given by Here, • t are the AR(1)-or ARMA(2,1)-transformed, Laplace-distributed residuals, and the last term is the prior contribution, where θ k is the proposed parameter value, µ k is the prior mean, and σ 2 k is its variance.For further technical details, see Appendix A.

Results and discussion
The Markov chain Monte Carlo simulations yielded a chain of 500 000 samples.From these, 70 % from the start of the chain were discarded as a warm up (Fig. 3).A revised posterior distribution, obtained by first sampling 10 000 entries randomly from the chain, and after that obtaining 1500 entries from those with importance resampling, is shown in Fig. 4, and the correlation features are shown in the upper triangle of that figure.For the different processes, Fig. 5 shows an example of the posteriors and the process correlations.Three different parameter estimates obtained from the posterior distribution were used to look at its features and fluxes: the maximum a posteriori (MAP) estimate, posterior mean estimate, and a non-hierarchical posterior mean estimate, where the mean values of the parameters ζ exu (-) and Q 10 (-) over the different years were used.The "default" parameters in the text and figures refer to values adapted from Raivonen et al. (2017).If not stated otherwise, the maximum a posteriori and posterior mean estimates refer to the values obtained from the importance resampling, not from the MCMC.

Parameter values
The parameter values used in the analyses are shown in Table 4.The MAP and posterior mean estimates agree on the value of the water diffusion rate coefficient f D,w (-), and the posteriors shown in Fig. 6k show that the estimates are close to the middle of the marginal distribution and slightly above the prior value.In tests with a shallower peat column, smaller values of this variable were obtained (not shown).
Contrary to this, the air diffusion rate coefficient, f D,a (-), finds its best values lower, and the variability of the parameter is larger than that for the diffusion rate coefficient in water-filled peat.
The root distribution parameter, λ root , is optimized larger than expected, and again the MAP estimate is close to the posterior mean.This implies that the model optimizes best when the CH 4 produced from the photosynthesis-induced exudate production goes relatively far below the surface: with a value of 0.3, 49% of the roots are deeper than 25 cm, 15 % of the roots are deeper than 50 cm, and just 2.5 % are deeper than 75 cm; see Fig. 1.In relation to these numbers, the water table depth is most of the time above the depth of −20 cm.Additionally, a larger λ root will facilitate the emission of the CH 4 produced by peat decomposition in the catotelm.
The values of the exudate pool turnover time τ exu are close to the default value of 2 weeks, with the MAP estimate at a little under 14 days and the posterior mean at 2.5 days more.The results from the importance resampling show that the spread is around 3 days around this posterior mean value.However, the value of ζ exu controlling amount of exudates produced from photosynthesis is smaller than the default value at roughly 0.15-0.45,with the MAP and posterior mean estimates at 0.343 and 0.292, respectively.In contrast to this, and balancing the effect of a relatively low ζ exu , the parameter f exu CH 4 (-), controlling how much methane is produced from anaerobic decomposition of exudates, has a  skewed posterior marginal distribution with most of the mass above the value of 0.7, as can be seen in Fig. 6.The non-hierarchically optimized parameter, V O0 (mol m −3 s −1 ), controlling the amount of CH 4 oxidation taking place is close to the minimum allowed value at onefifth of the default value.This is also true for the parameter controlling heterotrophic respiration, V R0 (mol m −3 s −1 ), all of whose optimized estimates reside close to its minimum value, reducing the amount of heterotrophic respiration taking place.The posteriors are very narrow.In contrast to these narrow posteriors, the parameters E oxid (J mol − 1) and E R (J mol −1 ), which are present in the same equations as the V O0 and V R0 parameters, have slightly wider posterior distributions, with the former slightly under and the latter slightly above the default values.
Table 4 shows that the hierarchically optimized parameter Q 10 (-), controlling the temperature dependence of the CH 4 production from peat decomposition, has slightly different values for the MAP and posterior mean estimates, with the Gibbs-sampled mean value (mean of those values in the case of the posterior mean) at 5.72 and 4.43, respectively.
The parameter τ cato (y), also controlling the peat decomposition rate in the catotelm, compensates for the differences of Q 10 between the MAP and posterior mean estimates by having a faster turnover time for the posterior mean than the MAP estimate.That parameter has a wide posterior, ranging from around 10 000 to 30 000, which was the value used by Raivonen et al. (2017) and the upper limit of the parameters in our work.Our posterior density goes to zero towards the higher limit, and the posterior mean is found at the value of 22 690 years.
The interannual variability of Q 10 (-) is mostly similar for both MAP and posterior mean estimates.For instance, the Almost all ebullition takes place when the water table is below the peat surface, and hence it is emitted to the atmosphere as part of the diffusion flux.Plant transport is not shown, as it is very close to the complement of the diffusive flux: together, these two streams add up to more than 98 % of the total flux.Plant transport variation is very close to that of diffusion.On the right side of the figure, the average annual errors are shown for the interannual variation of the fluxes.The results of the cross validation of the regression modeling of the hierarchically varying parameters, discussed in Sect.5.6, are drawn in orange.The "default" parameters produce carbon dioxide fluxes that are above the upper limit of the chart.for the MAP estimate than for the posterior mean estimate, indicating a better fit in terms of the error model.In Fig. 7b, the non-hierarchical posterior estimate shows a large variance of the annual errors, with early years having a positive bias, and later years having a negative bias.Incidentally, the average discrepancy from observations over the whole period for the non-hierarchical posterior mean is small for both methane and carbon dioxide, as Fig. 8 indicates.However, the variation for methane is the largest, implying that the annual variation is not reflected well.The model estimates of the annual fluxes are good in that the variance of the errors is small for both MAP and posterior mean experiments, especially, even though the estimates show a negative bias of 25 %.Compared to the default parameters, which strongly underestimate methane emissions (and even more overestimate the carbon dioxide emissions), the flux estimates are much improved.This is to be expected as the results shown are not for an independent validation dataset.Rather, the motivation with the MAP and posterior mean estimates is to see what the model fit looks like for optimized parameters and how the features differ from the unoptimized ones.It is, however, worth noting that the target objective function did not aim at minimizing annual discrepancies but daily residuals that were considered correlated.

Cost function values and model fit
A cross validation of the regression modeling in terms of the annual errors is shown in Figs.7b and 8.While the annual estimates are not on average better than the ones from the simulation with the non-hierarchically obtained posterior mean, the spread of the errors are acceptable, particularly if the strong negative bias in 2007, which is mostly due to lack of observations during the season, is disregarded.Additionally, the overall biases are surprisingly slightly better than with the optimized parameters, due to effects of the prior, different data resolution in the cost function, and the nontrivial error model used.The cross validation is described in Sect.5.6.
The positive bias in the CO 2 may partly be due to the assumption that 70 % of the NPP comes from the aerenchymatous plants, and this affected the data that the sqHIMMELI model results were matched with.
All years of hierarchically optimized experiments show at least a small negative annual bias in the methane flux when compared to the available observations.This can be due to the high day-to-day variability of the summertime fluxes, which dominate year-round total fluxes, and the fact that the model can not, without data about the fine structure and heterogeneity of the wetland, match the high variability fluxes.The proportional model-data residual error component αy t (Appendix A) allows the model to underestimate the high peaks more than the low flux values.The error model favors the baseline of the lower values during periods when observed variance is very high, for instance, in the peak emission season of 2010.This is also true for periods of increased ebullition, and such fluxes are very difficult to fit into.These periods contribute to both the cost function values and the underestimation of the total methane flux.Any temporal shifts of peaks of seasons are penalized heavily, and the optimized parameter values rather produce less peaks than right size peaks at a slightly wrong time.-c), total ebullition taking place (d), CH 4 production (e-f), CH 4 oxidation (g), and model residuals (h) as functions of water table depth.Shaded areas show the 5th and 95th percentiles.To look at the effect of the optimization, compare the black and the blue/red lines.
Another reason is that the carbon dioxide fluxes are overestimated by the model, leading to need to balance between the two, and as methane production in the wetland also produces carbon dioxide, the optimization algorithm will find a middle ground between the conflicting needs of minimizing carbon dioxide and maximizing methane production.
Additionally, the wintertime methane fluxes are underestimated systematically, and the emissions start slightly late in early summer, which produces a negative bias to the total flux even though visually the fit is good, as can be seen in Fig. 9.This figure also reveals that the observations for the vast majority fall within the confidence margins suggested by the ARMA model for the residual.The variation from the full posterior is higher because the uncertainty shown in Fig. 9 does not take the parameter variations into account.
The carbon dioxide time series against flux observations are shown in Fig. 10.This figure reveals that sqHIMMELI and the error model most of the time are able to explain the carbon dioxide fluxes well, even though some of the largest deviations are not captured.Since in an observational time series outliers can come from an underlying process that is not well explained by these models, having a small number of such deviations is not surprising.
The input data have a role in affecting the model fit to the data, and since NPP is a modeled quantity, there is some additional uncertainty stemming from that modeling involved.For LAI, we note that even though in reality it is not identical every year, in the model, it follows the same pattern (see Appendix E).The parameter calibration must then favor parameters producing a good fit in terms of average model performance.

Parameter values and processes in sqHIMMELI
The sqHIMMELI model produces the CH 4 from anaerobic respiration that originates from peat decay and the decay of root exudates.These production components, along with the different output pathways, CH 4 oxidation, and model residuals, are plotted as functions of water table depth in Fig. 11 for the MAP, posterior mean, non-hierarchical posterior mean, and default parameter values.The process correlations and covariances are shown for the year 2012 in Fig. 5.
In the following, "all ebullition" refers to any ebullition in the peat column regardless of whether the bubbles reach the peat column surface."Ebullition" refers to the part of all ebullition which reaches the surface.Most of the time, the water table is under the peat surface, and at those times ebullition is zero, although all ebullition can be substantial.In that case, the ebullition flux does not go directly into the atmosphere, but into the first air-filled peat layer above the water table level, and continues from there via other pathways.The reason for this separation comes from implementation details of HIMMELI.In all experiments, ebullition reaching the surface is a minor fraction of the total CH 4 emission.
For the posterior mean estimate, the flux components and oxidation are shown as time series in Fig. 12. Optimizing the model leads to increased production of methane from peat decay, as can be seen in Fig. 11f.A similar effect is seen also in the plant transport component in Fig. 11b.
Comparing results from simulations with optimized parameters to results using the default parameter values (shown in Table 4) shows that the optimization somewhat decreases the role of the plant transport pathway in favor of the diffusion pathway, especially for the years 2010, 2011, and 2013.Diffusion and all-ebullition fluxes are closely tied to each other, as can be seen in Fig. 7a, in that in many years (2007)(2008)(2012)(2013)(2014)   estimates.This is also visible in the flux component time series in Fig. 12.

Methane production and oxidation
Figures 13 and 5 show that there is considerable interannual variation in the production of CH 4 from both of the production processes.The year 2007 has a high amount of production from peat decomposition, whereas the year 2006 shows a lot less, even though the ζ exu -controlled proportion does not change equally much.Generally, though, in years of high emissions, the amount of CH 4 from both of the production sources is increased.The shape of the NPP input, shown in Fig. 9, does not change remarkably from year to year, but the emissions change considerably, as the model state and input affect the production non-linearly.For example, in times of low WTD in the peak emission season, the root exudates do not contribute to CH 4 production as much as during slightly wetter times, as much of the roots are located in the dry part of the peat column and the exudates are deposited there (Fig. 11e).Another explanation for changes in CH 4 production comes through the production-determining parameters, whose variation is in Sect.5.6 found to be related to the springtime temperature and NPP.The NPP-based CH 4 production controlled by the parameter ζ exu (-) is not strongly constrained by its hyperprior as can be seen in Fig. 6b and the MAP and posterior mean estimates.The posterior means in Table 4 are between 0.182 and 0.323 for the different years.For the MAP values, the values are slightly higher, leading to a larger input to the root exudates pool.The effect of ζ exu on the exudate pool sizes can be seen by comparing the posterior mean values to the exudate pool sizes in Fig. 9.The values obtained here are in line with values reported by Walker et al. (2003), who give a range of roughly 0.15-0.65 in terms of our ζ exu parameter, when also considering the mean value of the f CH 4 exu .This parameter finds its maximum a posteriori value at 0.729, which is close to the prescribed upper limit of 0.77.The posterior mean is at  0.736.From these results, we can conclude that a relatively large portion of the photosynthesized sugar is respired into methane.
The parameter f CH 4 exu is only affecting the part of the anaerobic respiration generated from root exudates.The two sources of anaerobic respiration (peat decomposition and root exudates) are in sqHIMMELI controlled by two different processes having different sets of parameters.The parameter controlling the peat decomposition, g Q 10 CH 4 , appearing in Eq. ( 16) and functioning analogously to f CH 4 exu , is set at the value 0.4 based on prior information, and this parameter was not part of the calibration.The discrepancy between the g Q 10 CH 4 and f CH 4 exu parameters is after the optimization rather large, and therefore, in any future calibration of the sqHIMMELI model with flux data from another site or with data from several sites, including this parameter could be also considered.If the value of 0.4 for g Q 10 CH 4 is an underestimate, the model produces too much carbon dioxide and too little methane from the peat decomposition component.However, since the production processes are correlated in the posterior distribution, as shown in Fig. 5, increasing the value of g Q 10 CH 4 would also be reflected in decreasing the production of methane from root exudates and increasing the production of carbon dioxide correspondingly.According to Fig. 5, methane oxidation would also be affected by changes to methane production from the root exudate component.Hence, excluding the parameter g Q 10 CH 4 from the optimization does not effect the total CO 2 and CH 4 fluxes in a major way, but the balance of the production processes and methane oxidation can be slightly affected.
The year-to-year variation of the posterior distributions of the ζ exu parameter, shown in Fig. 14, is large, and this difference has an important role in driving the annual CH 4 production.Especially for the years 2007, 2008, 2012, and 2014, the importance resampling has the effect of increasing the value of the parameter, correspondingly increasing the production of methane.This effect is not visible for the other hierarchically modeled production-related parameter, Q 10 , whose posterior is not affected by the resampling despite the more permissive prior.
The methane produced by the action of ζ exu is distributed according to the root distribution, whose form is determined by λ root (m).The posterior means reveal that the contribution of the prior component of λ root to the cost function is large.Its values might well be larger with a wider prior and more permissive prior, but in regard to how root distributions are in reality (Fig. 1), larger values for the parameter would make its interpretation difficult.This parameter affects both how exudates are allocated in the column and how deep the fast plant transportation reaches.Clearly, there is a need to reach further down, implying that the model performs more optimally when it transports CH 4 faster to the atmosphere.
The exudate pool size follows the net primary production in Fig. 9 with a delay, as one could expect.According to the modeling, the pool sizes are up to 0.5 mol m 2 , and the exudate pool is depleted from December until the start of the growing season.
The methane production from decomposition of peat in anaerobic conditions is aided by the rather strongly correwww.geosci-model-dev.net/11/1199/2018/Geosci.Model Dev., 11, 1199-1228, 2018 lated parameters Q 10 (-) and the catotelm carbon decay halflife τ cato (y) as seen in Fig. 4. The prior means of Q 10 (-) are mostly inside the 1σ bounds of the hyperprior, and the temperature dependence of the anaerobic respiration from peat decomposition is close to what was a priori expected.The MCMC utilized a rather strict prior, which constrained the parameter exploration somewhat.Despite this, also very low values were proposed.Methane oxidation is quite steady between the different estimates as can be seen in Fig. 13 -except for the default parameters values, with which the amount of oxidation is several tens of percent more.However, there is considerable interannual variability, which seems to be related to the varying production from exudates, as seems to be suggested in Fig. 5 and also in Fig. 13.
The stronger oxidation with the default parameter values can be for its part also linked to the larger V O0 (mol m −3 s −1 ) parameter, despite the other parameter determining oxidation in Eq. ( 7), E oxid , being slightly lower (50 000 vs. 53 580 for MAP and 55 750 for posterior mean).
The process correlation figure (Fig. 5) also shows that the exudate-and peat-decomposition-based methane production terms are negatively correlated, and that the exudatebased production is roughly 50 % stronger than the peat decay source.
The hard prior bounds of V O0 (mol m −3 s −1 ) were tight; for example, Segers (1998) reports that potential CH 4 oxidation can vary across 3 orders of magnitude.Hence, also lower proportions of CH 4 oxidation could have been seen with a more permissive prior.This would have then also altered the posteriors of the weakly co-varying parameters, most notably λ root .
The parameter V R0 (mol m −3 s −1 ) controlling heterotrophic respiration correlates positively with CH 4 production via τ exu (s) (smaller value enhances methane production), but the correlations with Q 10 and τ cato seem to cancel out each other.The correlations of ζ exu are weak, implying that process is well constrained by the combined CO 2 and CH 4 data.There is also a weak anticorrelation between V R0 and E R , which is to be expected based on Eq. (6).

Plant transport
The amount of plant transport in the calibrated models, shown in Fig. 7a, is between 75 and 95 %, which is just slightly higher than the range of 68-85 % reported in Wania et al. (2010) in a study simulating CH 4 emissions for seven boreal peatlands.The high optimized share of plant transport is mainly due to the high values of the root depth controlling parameter λ root (m) and some of the difference between the MAP and posterior mean estimates in Fig. 7a may be explained by the higher root-ending cross-sectional area in the MAP estimate, controlled by parameter ρ (m 2 kg −1 ).Wania et al. (2010) used the parameterization from Eq. ( 1) with λ root = 0.2517, and the root distribution from the pos-terior mean estimate is shown alongside that distribution in Fig. 1.Compared with measurements from Saarinen (1996), the amount of roots at 20-60 cm is exaggerated by all of the optimized parameter values.The model provides a better fit to the data when the root conductance is high.However, the posterior distribution of the root tortuosity parameter in Fig. 6 is almost identical to the prior, so obviously there is no need to maximize plant transport at any cost.
Since the parameters ρ (m 2 kg −1 ) and τ (m m −1 ) both affect plant transport and are included in Eq. ( 8), one could expect them to be tightly coupled.In the posterior, however, they are only slightly correlated, with the correlation coefficient of only 0.12 in Fig. 4.This might be due to ρ having the tendency to be close to its the lower limit.The rootending area parameter ρ has a notable negative correlation with the air diffusion coefficient f D,a (-).This follows directly from the fact that increased root-ending area increases root conductance, as does faster diffusion through the airfilled aerenchyma cells, via Eq.(8).

Diffusion
The masses of the diffusion coefficient parameters f D,a (-) and f D,w (-) in the posterior distributions (Fig. 6j and k) are within the rather permissive priors having the value of 0.8.The parameter f D,w is optimized close to the upper limit of 1. Kaiser et al. (2016) note that these parameters are not well known and use for both of them the value of 0.8.Constraining the model with the CO 2 flux measurements results in the diffusion component not correlating with the amount of methane produced via anaerobic peat decomposition.

Ebullition
Ebullition is very strongly tied to diffusion in the flux estimates with parameters from the posterior, as is shown in Fig. 5.The flux component time series in Fig. 12 shows that ebullition to the surface is a small fraction (approximately 0-3 % with optimized parameters) of the total flux.Similarly, Wania et al. (2010) report almost virtually no ebullition to the surface.This result is highly dependent on the type of the wetland; for instance, Kaiser et al. (2016) report high ebullition fluxes for a polygonal tundra in the Siberian permafrost region, where the ice-free soil layer reaches only about 30 cm depth during summer.Variation between different sites is very large and depends on whether the water reaches the surface at times of high CH 4 emission.
Contrasting with this, in the simulations with the nonhierarchically optimized parameters, a major part of the diffusive flux, which comprises around 30 % of the total flux for most years, is transported by ebullition (Fig. 8) and diffusion is a major flux component, even though ebullition to the surface accounts for only 5 % of the total flux.Since ebullition is a fast timescale process, it was not directly constrained in the optimization with parameters, as preliminary tests re-vealed that daily data resolution would not be sufficient for this.While finer time resolution data would have been available, using them would not have been feasible, as there is not enough knowledge about the fine structure of the wetland and micrometeorological conditions affecting the footprint area of the flux tower.It is reasonable to believe that the deviations from the daily averaged fluxes at a finer time resolution would only look like noise in the residuals, not improving our parameter posterior.Despite this, ebullition is controlled indirectly by letting CH 4 production and transport parameters control when the water column has enough CH 4 available for ebullition.This happens when the sum of the partial pressures of dissolved gases is larger than the sum of atmospheric and hydrostatic pressures as shown in Eq. ( 23).The high ebullition-related proportion of the diffusive flux strengthens the argument that the likelihood formulation results in the model optimizing towards parameter values that support rapid CH 4 transport.

Parameter and process identifiability
The priors of the hierarchical CH 4 production-related parameters Q 10 (-) and ζ exu (-) in Fig. 6b and d are constrained by the data, as are the hierarchical parameters themselves, shown in Fig. 14.The priors of these distributions are wider than their posteriors, which is also the case for the other production-related parameters τ exu (s) and τ cato (y).Both process descriptions for obtaining the anaerobic respiration are clearly needed for a good model fit, because the parameter posteriors do not have remarkable mass in the regions minimizing either of these processes (hierarchical parameters at the lower bounds or turnover rate parameters τ exu and τ cato at the upper bound).The covariances in Figs. 4 and 5 show that the two production processes covary slightly, with correlation coefficient −0.32, and hence they are to that extent interchangeable.Reasonable identifiability of the Q 10 parameters is not obvious; for example, Müller et al. (2015) optimized a corresponding parameter to end up with the parameter at the lower bound of their prescribed range.However, half of the mass of the production terms in the process correlation plot, Fig. 5, lies within a region that for production from exudates is roughly 10 % of the total production and for the production from peat decay of the order of 35 %, and hence the production processes can be said to be well constrained.
The posterior distributions of V R0 (mol m −3 s −1 ) show that sqHIMMELI performs better when the heterotrophic respiration is close to being minimized, which is also aided by a posterior mean value of E R (J mol −1 ) that is lower than the prior mean.For the oxidation parameters V O0 (mol m −3 s −1 ) and E oxid , the situation is different: the former has the tendency of being very small, but the temperature response has the tendency of being stronger with posterior mean and MAP values above the prior mean.
Whereas the fraction of plant transport is stable and high, but still constrained, not all the parameters affecting root con-ductivity are constrained by the data, as the root tortuosity posterior distribution follows very closely the prior form.The root-ending cross-sectional area, however, is constrained to its lower side despite there being mass also above the prior mean value.For this parameter, the importance resampling resulted in a changed posterior in that there is a lot more mass at the higher end of the distribution, as can be seen in Fig. 6h.In addition to this difference, the effects of the resampling were mostly minor.Still, the resampling informed that the roots should reside slightly higher in the peat column than suggested by the MCMC, and that the f exu CH 4 is constrained to a higher value by the data than suggested by the initial MCMC run.
The transport pathways are well identified, as can be seen in the ranges of variation in the transport characteristics in Fig. 5. Notably, the transport processes do not strongly anticorrelate implying that they are not obviously interchangeable with each other.The correlation between oxidation and plant transport suggests that uncertainty in oxidation is a major part of the uncertainty in the plant transport portion.On the other hand, there is uncertainty in the absolute magnitude of the total flux (in terms of the posterior uncertainty) and this is reflected in the strong positive correlation between plant transport and the total flux.Similar but weaker positive correlations exist between the total flux and diffusion and ebullition, which is to be expected.The variation of oxidation is around 10 % of the total flux.

Low WTD in 2006, 2010, and 2011
The calibrated sqHIMMELI model is able to describe the CH 4 flux correctly in times of low water table, which is not obvious, as other studies have indicated the challenges in parameterizations of emission models with respect to the water table depth (e.g., Zhu et al., 2014).Figure 11 shows how the model processes are described under water stress.In times of a very low water table, the plant transport component and methane production from root exudates are decreased somewhat, as is methane oxidation.This results directly from how the model is constructed, as exudate deposition to the peat column is allocated depth-wise according to the root density profile.The fact that the model continues to perform well during these years implies that this method of regulating methane emissions during dry seasons is realistic.The residuals in Fig. 11h further show that there is a only a slight positive emission bias at the times of the lowest water table levels.

Predicting emissions with sqHIMMELI
Modeled CH 4 flux estimates may have large errors, as was shown in Fig. 8 with the default parameter set.The negative biases in the calibration phase that were found with the maximum a posteriori and posterior mean estimates are reasonable since the quality of the modeled input data from, e.g., a www.geosci-model-dev.net/11/1199/2018/Geosci.Model Dev., 11, 1199-1228, 2018 land surface scheme will also contribute to the uncertainty in the model predictions.Additionally, a known constant bias can be relatively easily accounted for if the interannual variability is correctly modeled.
Compared to the estimate with the optimized annual variations of the methane-production-related parameters, the nonhierarchical posterior mean estimate produces reasonable flux estimates over the assessment period, with twice the variability in fluxes compared to the posterior mean estimate, even though the average of the errors is closer to zero.The variability is seen in Fig. 7.The hierarchical posterior mean, on the other hand, does produce very steady estimates of the CH 4 flux compared with observations even though there is a downward bias of 23 %, and the smaller interannual variance implies better predictive skill.The same is true to a lesser extent also for the maximum a posteriori estimate.
In order to be able to utilize the information regarding the annual variability in the posterior mean estimate for the future prediction of CH 4 emissions, the values of the hierarchical parameters need to be estimated for the simulation years.A simple regression analysis of the hierarchical variables with respect to relevant input data was performed in order to find out if such estimation is possible.As the explaining variables, means, minimums, and maximums of NPP, water table depth, and soil temperature at different depths and over different periods of time were looked at.These time periods were June, July, August, and various different amounts of days from the start of the year.
The analysis revealed that the mean soil temperature of the first 10 weeks (70 days) of the year at the depth of 30-40 cm, denoted here by T 70 30−40 , is the best single-variable predictor of the Q 10 value for that year, and for ζ exu , it is the sum of NPP from the first 130 days of the year, denoted by NPP 130 .This is hardly surprising, since the peat decomposition process regulated by the parameter Q 10 is driven by soil temperature, and the anaerobic respiration from exudates controlled by the parameter ζ exu is driven by the NPP input.These variables also indicate that the timing of the start of the growing season might play a role in determining the parameters.Possible mechanisms could include, e.g., effects of the start of growing season on development of the microbe populations in the spring.However, further analysis would be needed to confirm this.
The p values summarizing the reliabilities of the regressions and the r 2 values, which are the coefficients of determination of the fit, are presented in Table 5.The r 2 values explain what fraction of the variance of the dependent (predicted) variable is explained by the independent (explaining) variables.The p and r 2 values uncover that the hierarchical modeling reveals a clear-cut reliable relationship between the early NPP and the optimal ζ exu parameter (p = 5 × 10 −6 , r 2 = 0.957).This provides new insight into future model development and exemplifies why such a hierarchical description of variables is valuable in Bayesian optimization in a geophysical model context.
For the other interannually changing parameter, Q 10 , the soil temperatures explain only slightly over half of the variation (p = 0.0185, r 2 = 0.571).Since the effect of this parameter is very important for the total methane flux, this result leaves lots of room for further analysis.The hierarchical parameters Q 10 and ζ exu for each year can be estimated with A leave-one-out cross validation (LOO-CV; see, e.g., Gelman et al., 2013) of the regression modeling was performed by optimizing the hierarchical parameters with respect to the cost function in Eq. ( 24) leaving one year at a time out, calculating the estimates for the hierarchical parameters based on the results obtained for other years, and predicting the CH 4 emissions for the year that was left out.The results of the cross validation are shown in Figs.7b and 8.The crossvalidated results are comparable in terms of annual performance to the non-hierarchical posterior mean.Despite the relatively good performance of the non-hierarchical posterior mean simulation, we note that the cross-validated result should be relied on more for prediction, since the wellpredictable ζ exu parameters contain useful information that is not available in the non-hierarchical posterior mean estimate.A hybrid between these approaches could be also used, using the regression-modeled values for the ζ exu parameters and the mean for Q 10 , to minimize the risk of major annual biases due to unsuccessful prediction of the Q 10 parameters.
As Fig. 7b shows, much of the error in the cross validation actually comes from challenges estimating the year 2007, which is missing the peak season observations, and therefore the error percentage (in terms of the annual observed flux) is easily high, especially as the start of season is modeled with a delay, which is readily apparent in Fig. 9, and in this sense the negative bias in Fig. 7 gives an unnecessarily pessimistic view of the model performance.For the CO 2 fluxes, it can be noted that there is a persistent positive bias of some tens of percent, but the observations are very noisy and due to the processing for the use in the cost function, they might have biases.The effect of a small bias on the parameter posterior distribution is, however, minor, since the carbon dioxide observations were given less weight in the cost function than the methane observations.Hence, given their uncertainty, the optimized fit to the measurement data can, also in the cross validation as in the other experiments, be seen as acceptable.

Conclusions
In this study, Bayesian calibration of a new process-based wetland CH 4 emission model, sqHIMMELI, was performed using Markov chain Monte Carlo methods, hierarchical statistical modeling of methane production related parameters, Box-Jenkins-type time series modeling, and importance resampling against daily methane and carbon dioxide flux data from the Siikaneva flux measurement site in Finland.The results show that the modeled processes and the estimated parameters are identifiable with the flux data.The parameter correlations and process correlations from random sampling the posterior reveal that there are no redundant processes in the model description.However, a few strong correlations between parameters exist, reminding about the difficulty of strictly interpreting parameter values to be connected to isolated physical processes.The optimized model fits well to the data in that the modeled fluxes fit within a range from the data that is expected based on the error modeling.
Preliminary results obtained also suggest that estimation of the annual variation of the parameters controlling methane production from anaerobic respiration of root exudates is feasible and may help to improve the future estimates of the boreal wetland CH 4 emissions.
For future studies, combining observations from several sites and optimizing them together with the methods presented here in conjunction with independent validation can provide valuable information about the uncertainties related to wetland emission modeling and about how to best improve the quality of predicting wetland methane emissions in land surface schemes of climate models.
Code and data availability.The HIMMELI source code is available as a supplement to the publication of Raivonen et al. (2017).The sqHIMMELI model code is available as a supplement to this publication.
The model input data and the flux measurement data are available upon a reasonable request to the lead author.

Figure 2 .
Figure2.Residual histograms and autocorrelation functions of the error terms t in the objective function, Eq. (24), show that neither the CO 2 nor the CH 4 residuals are autocorrelated and that they closely follow the Laplace distribution.The results shown are for the residuals from the posterior mean estimate.

Figure 3 .
Figure 3. MCMC chains showing a thinned sample of the half million values in the chain.The first 70 % was discarded for the analyses as a warm up and is grayed out in the figures.The hierarchical parameters in panels (b) and (d) show the mean value in the middle as a black mass, and the colorful surroundings are the values of the parameters for the individual years.Panel (o) shows the value of the objective function.

Figure 4 .
Figure 4. Posterior distributions of the parameters from the importance sampling.The two-dimensional marginal distributions of the posterior distribution are shown in the triangle on the lower left (labels on the left and at the bottom), and the correlations between parameters are shown in the upper triangle on the right (labels on the left and at the top).The images in the lower left triangle show the 90 % (black), 50 % (red), and 10 % (blue) contours, and points from a random sample of the posterior (black dots).On the upper right, each plot shows correlation coefficients between parameters, color coded to show negative correlations in blue and positive in red.The units are listed in Table4.

Figure 5 .
Figure 5. Posterior distributions and correlations of the annual means of the output from the modeled processes for the year 2012.The dynamics for the other years are mostly similar, but the strengths of the correlations vary somewhat.The results shown are based on 1000 random samples from the parameter posterior distribution.The two-dimensional marginal distributions in the triangle on the lower left have their labels on the left and at the bottom, and the correlations between the processes in the upper triangle on the right have their labels on the left and at the top.The images in the lower left triangle show the 90 % (black), 50 % (red), and 10 % (blue) contours.The all-ebullition and diffusion fluxes correlate almost fully, showing that the "diffusion" flux has a strong contribution from underground ebullition.

Figure 6 .
Figure6.Posterior marginal and prior distributions from MCMC and importance resampling for all parameters: panels (a-d) and (n) are the production-related, (e-f) and (l-m) the respiration-and oxidation-related, and (g-k) the gas-transport-related parameters.The blue and orange curves shown are smoothed slightly using Gaussian kernel estimates for readability.To make these figures, 70 % from the start of the MCMC chain was discarded as a warm up (orange line).The dotted vertical lines show the prior mean values and the sample means from both MCMC and importance sampling.For the parameters ζ exu (b) and Q 10 (d), the prior distribution drawn is the hyperprior.

Figure 7
Figure 7. (a) Proportions of flux components as a function of the year.Diamonds are for plant transport, balls for the diffusion flux, and crosses describe the total ebullition taking place.The figure on the right shows the annual model-observation mismatch in percent for the methane flux, where only residuals from days with observation data available have been taken into account.The data in panel (a) have been spread slightly for readability in the x-axis direction.The orange line in panel (b) represents the results from the cross validation discussed in Sect.5.6.Note that the optimization target was not to directly fit annual emissions.

Figure 8 .
Figure 8. Fractions of the annual diffusive fluxes of the total fluxes.Means and 1σ error bars are shown.Almost all ebullition takes place when the water table is below the peat surface, and hence it is emitted to the atmosphere as part of the diffusion flux.Plant transport is not shown, as it is very close to the complement of the diffusive flux: together, these two streams add up to more than 98 % of the total flux.Plant transport variation is very close to that of diffusion.On the right side of the figure, the average annual errors are shown for the interannual variation of the fluxes.The results of the cross validation of the regression modeling of the hierarchically varying parameters, discussed in Sect.5.6, are drawn in orange.The "default" parameters produce carbon dioxide fluxes that are above the upper limit of the chart.

Figure 9 .
Figure 9. Output CH 4 flux (red dots) with parameters from the posterior mean.Methane observations (black crosses) and predicted fluxes with confidence intervals from ARMA(2,1) modeling of a set of 1000 residual time series are shown, as are the input net primary production (green dots) and the exudate pool sizes (brown line).Most of the observations are inside the confidence intervals, but note that the effects of the parameter variations in the posterior are not part of these confidence intervals.The constituents of the total flux are shown in Fig. 12.

Figure 10 .
Figure10.Output CO 2 flux (red dots) with parameters from the posterior mean.Carbon dioxide observations (black crosses) and modelpredicted fluxes with confidence intervals from the ARMA(2,1) modeling of a set of 1000 residual time series are also shown.As with methane, most of the observations are inside the confidence intervals.The parameter variations in the posterior probability distribution are not reflected in these confidence intervals.

Figure 11 .
Figure11.Means of total CH 4 emission (a), its components (b-c), total ebullition taking place (d), CH 4 production (e-f), CH 4 oxidation (g), and model residuals (h) as functions of water table depth.Shaded areas show the 5th and 95th percentiles.To look at the effect of the optimization, compare the black and the blue/red lines.

Figure 12 .Figure 13 .
Figure 12.Diffusion, plant transport, ebullition, CH 4 production, and CH 4 oxidation time series for parameter values from the posterior mean estimate.The figure shows how only a minor part of ebullition in the end comes to the surface as ebullition.The total flux and the observations are shown in Fig. 9.

Figure 14 .
Figure 14.Posterior marginal distributions of the hierarchical parameters from both MCMC and importance sampling, along with the hyperpriors.Panels (a1)-(a9) are for the parameters ζ , and (b1)-(b9) for Q 10 .The curves shown are smoothed slightly using Gaussian kernel estimates for readability.To make these figures, 70 % from the start of the MCMC chain was discarded as a warm up.The dotted vertical lines show the default parameter values and the mean values of the posterior distributions.Importance resampling had the tendency of moving the posteriors of the ζ parameters slightly higher, despite the weaker prior used for that step.

Table 1 .
Description of the data used. , The parameter g Q 10 CH 4 was left out in favor of parameter τ cato , despite their functions regarding CO 2 being different but trusting the prior value.

Table 3 .
Parameter limits and prior distribution parameters.The priors are truncated Gaussian, with mean values µ and standard deviations σ , truncated at the values in the columns "low" and "high".

Table 4 .
Parameter values obtained in the optimization of the sqHIMMELI model with importance resampling.The maximum a posteriori, posterior mean, non-hierarchical mean (mean values used for hierarchically varying parameters), and values from Raivonen et al. (2017) are shown.The horizontal line in the middle separates the hierarchically optimized parameters (including their priors) from the others.

Table 4
lists the cost function values for the MAP and posterior mean estimates, and the annual errors for the MAP, posterior mean, and non-hierarchical posterior mean estimates and default parameter values are shown for each parameter set in Fig. 7.The cost function value is unsurprisingly lower www.geosci-model-dev.net/11/1199/2018/Geosci.Model Dev., 11, 1199-1228, 2018

Table 5 .
The p and r 2 values of the regressions of the Q 10 (-) parameters against the mean soil temperature of the first 10 weeks of the year at the depth of 35 cm, and the ζ exu parameters against the sum of the net primary production of the first 130 days of the year.