Forest soil carbon stock estimates in a nationwide inventory : evaluating performance of the ROMULv and Yasso 07 models in Finland

Dynamic soil models are needed for estimating impact of weather and climate change on soil carbon stocks and fluxes. Here, we evaluate performance of Yasso07 and ROMULv models against forest soil carbon stock measurements. More specifically, we ask if litter quantity, litter quality and weather data are sufficient drivers for soil carbon stock estimation. We also test whether inclusion of soil water holding capacity improves reliability of modelled soil carbon stock estimates. Litter input of trees was estimated from stem volume maps provided by the National Forest Inventory, while understorey vegetation was estimated using new biomass models. The litter production rates of trees were based on earlier research, while for understorey biomass they were estimated from measured data. We applied Yasso07 and ROMULv models across Finland and ran those models into steady state; thereafter, measured soil carbon stocks were compared with model estimates. We found that the role of understorey litter input was underestimated when the Yasso07 model was parameterised, especially in northern Finland. We also found that the inclusion of soil water holding capacity in the ROMULv model improved predictions, especially in southern Finland. Our simulations and measurements show that models using only litter quality, litter quantity and weather data underestimate soil carbon stock in southern Finland, and this underestimation is due to omission of the impact of droughts to the decomposition of organic layers. Our results also imply that the ecosystem modelling community and greenhouse gas inventories should improve understorey litter estimation in the northern latitudes.


Introduction
Soil carbon is a significant component of terrestrial carbon stocks and understanding its dynamics under changing climate is crucial.However, the significance and interactions of different mechanisms for long-term carbon accumulation are still unknown and are therefore often lacking in models.If we want to understand the relationship of different abiotic and environmental factors to soil carbon stocks and their dynamics, we have to combine experimental research with process-based models.One way forward is to establish soil carbon inventories in order to quantify soil carbon stocks and their change.Conventional soil inventories measuring various nutrients, carbon contents, bulk densities and stoniness (e.g.Gamfeldt et al., 2013) allow us to study the distribution of soil carbon across landscapes and correlations between different soil properties.Generally, it is shown that soil carbon inventories are able to produce soil maps and covariates between soil carbon quantities and other variables, such as various nutrients, although the sample size of these inventories is usually not adequate for national-level soil carbon stock change assessment, with few exceptions (e.g. in Swe-den and Germany) (Gamfeldt et al., 2013;Grüneberg et al., 2014).
According to the United Nations climate convention, countries are requested to report annual carbon stock changes of soils under different land uses and under land-use change.The majority of countries apply soil carbon models, like Yasso07 (Tuomi et al., 2011) and CENTURY (Parton et al., 1987) to estimate soil carbon stock changes.These annual submission are reviewed annually and methods should be transparent and verifiable, favouring simple soil models instead of complex ones.
The scientific community is also aiming to predict future soil-climate change feedbacks on a global scale using Earth system models (ESMs).The ESMs are tested against soil carbon measurements in order to evaluate model performance but unfortunately results have been poor.Guenet et al. (2013) present a test where soil carbon stocks predicted by the OR-CHIDEE model were plotted at a plot level against measurements and failed to display any correlation.Similarly, Todd-Brown et al. (2013) concluded that most ESMs are not able to produce measured soil carbon stocks at a grid level.This finding is somewhat alarming due to fact that it is essential to have correct initial carbon stocks with soil models, because carbon stock change estimates depend on them.Initial soil carbon stock estimates are particularly important when carbon stock changes of deforestation events are modelled.
Individual soil carbon models are tested against repeated soil inventories and it has been found that models are able to estimate soil carbon stock change of the same magnitude as was measured (Ortiz et al., 2009;Rantakari et al., 2012).The limitation of this conclusion was that uncertainties of both measurements and model estimates are often higher than actual estimates (Ortiz et al., 2013;Rantakari et al., 2012).While the uncertainties between model output and real measurements reveal whether models agree with data or not, they put less emphasis on whether all the most important soil carbon stock drivers were included in these models.
Simplistic soil carbon models like Yasso07 (Tuomi et al., 2011) are driven only by weather conditions and by litter input; while more complex models, like ROMUL (Chertov et al., 2001) also include the nitrogen cycle and the impact of soil water holding capacity on decomposition.It is evident that soil properties affect soil carbon stocks (Schimel et al., 1994;Six et al., 2002), and therefore they are explicitly included in complex models.For example, in the CEN-TURY model, clay content limits decomposition (Parton et al., 1987) and due to the low specific surface area of clay minerals in the model, clay-rich soils have larger passive soil carbon stocks and lower C : N ratios (Parfitt et al., 1997).In simplistic models, soil properties are omitted.For example, in Yasso07, the effect of soil properties on soil carbon stock accumulation is included only implicitly through the model calibration with large datasets (Tuomi et al., 2011).Although the simpler models lack some predominant drivers of soil carbon accumulation, the strength of these models lies in their easier calibration with data; however, the impact of soil properties, especially nutrient status, on the accuracy of estimated soil carbon stock estimates needs to be re-evaluated in both the CENTURY and Yasso07 models ( Ťupek et al., 2016).
It is well known that decomposition and soil respiration is controlled by water content, whereby in dry soils, lack of water slows down decomposition, while excess water reduces it by limiting oxygen diffusion (Skopp et al., 1990).For boreal forest conditions, Pumpanen et al. (2003) proposed a model whereby maximum soil respiration drops after the relative water content of soil reaches a level of 60 %.The limiting effect of soil moisture on decomposition in dry or water-saturated soils is widely included in models, although the degree of this dependency varies widely between models (Sierra et al., 2015).
In addition to the model structure, precise and accurate estimates of litter input quantity and quality are also essential for successful model applications.Stand-alone soil models rely on forest inventory data or other external estimates for getting correct litter inputs, while ecosystem models utilise plant submodels to describe vegetation productivity and litter inputs.A common feature of ecosystem models and standalone soil models is that often understorey vegetation is neglected during the calibration and application of models, resulting in biased litter inputs.This omission is more critical in boreal landscapes where the contribution of understorey vegetation increases with northerly latitude.For example, Yuan et al. (2014) report that the bryophyte biomass contributes 20-60 % of the total normalized difference vegetation index (NDVI) in high northern latitudes.
In order to assess the necessary drivers for soil carbon models in Finland, we tested the performance of Yasso07 and ROMULv models against measured nation-wide soil C data.The ROMULv refers to a modified ROMUL model version where decomposition rate functions are derived from the model presented by Pumpanen et al. (2003), and a simple volumetric soil water model (Linkosalo et al., 2013) is applied to drive those decomposition functions.
We were specifically interested if the additional complexity of processes introduced by ROMULv improves the soil C stock estimate over Yasso07 model that is presently used by greenhouse gas (GHG) inventories of several countries.The Yasso07 and ROMULv models differed in their time steps (annually vs. daily), determination of the litter quality (litter solubility vs. nitrogen content and litter source) and complexity of drivers (ROMULv including the impact of nitrogen and soil water holding capacity to decomposition).Specifically with these models, the following details were tested.
1. We investigated whether the litter quantity, speciesspecific litter quality and weather data are enough to estimate spatial trends of carbon stocks in the upland soils of Finland.We hypothesise that by accounting for soil water holding capacity, ROMULv would outper- 2. We hypothesised whether improving the estimation of understorey litter input would positively affect accuracy of predicted soil carbon stocks.
We use Yasso07 and ROMULv soil models to estimate steady-state carbon stocks for upland soils in Finland on a spatial 10 × 10 km 2 grid.We ran the Yasso07 model with parameters based on Scandinavian data (Rantakari et al., 2012) and also with parameters based on global data (Tuomi et al., 2011).The parameterisation for the ROMULv model was the same as in the original ROMUL model (Chertov et al., 2001), except for decomposition rate functions depending on soil water content derived from Linkosalo et al. (2013).Yasso07 and ROMULv models run with identically estimated litter quantity, quality and climate data.In addition, the ROMULv model was tested with both constant soil water holding capacity as well as variable water holding capacity based on digital soil maps of Lilja and Nevalainen (2006).Furthermore, we developed new models for the understorey litter input and apply them alongside soil carbon models for improved estimates of spatial variation of litter inputs.Simulated soil carbon stocks are evaluated against measured soil carbon stocks.

Material and methods
Soil carbon simulations were performed on a 10 × 10 km 2 grid across Finland.This grid is used for meteorological data prediction (Venäläinen et al., 2005).Litter input was estimated for the same grid.From the grid, only locations that were on upland soils and forest (according to the Food and Agriculture Organization of the United Nations (FAO) forest definitions) were chosen.This classification is based on Multisource National Forest Inventory products, which combine digital maps (including, e.g.land use and peatlands), forest inventory data and Landsat images (Tomppo et al., 2008).

Tree biomass
Firstly, stem volume maps by tree species from the National Forest Inventory 9 (NFI9, 1996(NFI9, -2003) ) were used, according to Tomppo et al. (2011), to account for large-scale variation of stem volume across Finland.These variations are primarily driven by soil properties, climate, site productivity, forest management techniques and tree species distribution.Secondly, biomass expansion factors (BEFs, Mg m −3 ) were estimated for main tree species groups, namely Scots pine (Pinus sylvestris), Norway spruce (Picea abies) and for broadleaved species (mainly Betula sp.).These BEFs were estimated for each biomass component (foliage, branches, bark, stemwood, stump and woody roots).Biomass was estimated for sample trees in NFI10-based biomass models by Repola (2008Repola ( , 2009)); thereafter, mean BEFs were estimated at a cluster level (a cluster is formed by 10 to 15 field plots) by dividing the sum of given biomass components by the estimated sum of stem volumes.Biomass estimates for trees were based on biomass models, where diameter at breast height, tree height, crown height, an increment of 5 years and bark thickness are used as predictors.To upscale BEFs across Finland, we applied collocated co-kriging (Bivand et al., 2008) by species group, to account for large-scale spatial correlation and co-variation of tree allometry.We used the "gstat" (Pebesma, 2004) package of R (R Core Team, 2014) for estimation.For Scots pine and Norway spruce, we removed linear trends of latitude and longitude (using uniform coordinate system of Finland, YKJ), while for deciduous trees only trends of latitude were removed.For all species groups and components we assumed spherical variogram functions.For the details of the used biomass models, see Appendix 7c by Statistics Finland (2014).Biomass components for each grid point were obtained by multiplying stem volume maps by species with component speciesspecific BEF estimates that were estimated via co-kriging for the same grid.
Biomass of harvest residues and natural mortality were estimated based on forest statistics and NFI data (Ylitalo, 2013).From statistics, we attained an estimate of the stem volume of annual logging and natural mortality by region (forestry centres) and these were subsequently converted to biomass with BEFs.These BEFs were based on a subset of permanent sample plots of NFI9 (1996NFI9 ( -2003) ) and NFI10 (2004)(2005)(2006)(2007)(2008).BEFs for logging were estimated separately from the subset of logged plots and these logging-specific BEFs were used in the estimation of biomass of harvest residues.Furthermore, energy use of stumps and harvest residues were deducted from regional soil inputs, based on regional wood energy use (Ylitalo, 2013).For biomass of natural mortality, BEFs were estimated based on data from those trees that died on permanent sample plots between measurements.Thereafter, the volume of natural mortality was multiplied with corresponding BEFs.This procedure followed principles of Finnish GHG inventory (Statistics Finland, 2014).
Fine root biomass was estimated based on the work of Lehtonen et al. (2016).We selected a simple model formulation, where a natural logarithm of fine root mass was estimated as a function of the natural logarithm of stem volume.See model 1 in Lehtonen et al. (2016) for details (the study has 95 sites with fine root data, and model 1 has an adjusted R 2 of 0.217).This model was used to approximate general fine root mass levels as a function of stem volume for each 10 × 10 km 2 grid point.

Understorey vegetation biomass
In order to estimate the litter input of understorey vegetation to soils, we developed models for vegetation biomass.The relationship between the visually estimated percentage cover of plant species including vascular plants, bryophytes and lichens (projected onto the forest floor using 30×30 cm 2 frames) and their living biomass was studied in 18 forest stands across Finland (Table 1).The plots are part of the International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects on Forests (ICP) intensive monitoring plot network (e.g.Merilä et al., 2014).Altogether, 28 systematically selected biomass samples were taken once from each plot during the period 2002-2009.In total, the sample size was 504 of 0.3×0.3m 2 vegetation plots having detailed biomass measurements by species.Vascular plants were divided into aboveground (shoots) and belowground (rhizomes and roots in organic layer) parts.The study was carried out at the time of maximum biomass and vegetation growth between the end of July and end of August.Half of the plots were located in the north (five in Pinus sylvestris, three in Picea abies and one in Betula pubescens dominated stands) and the other half were in the south (four in Pinus sylvestris, four in Picea abies and one in Betula pendula dominated stands).The site types of the plots ranged from poor to rich fertility level (Table 1).
In order to predict the vegetation biomass, we built linear mixed models based on the relative coverage of five functional plant groups (dwarf shrubs, grasses, herbs, bryophytes and lichens).The biomass of understorey vegetation was estimated using models that correlate vegetation coverage with measured biomass.These models were estimated for the five aforementioned main species groups and for their belowground parts, if applicable.These models were estimated separately for southern and northern Finland.We used a linear mixed model with plot-level random effects using the "lme" command in the "nlme" package (Pinheiro et al., 2012) of the R environment for the estimation (R Core Team, 2014).
Each model was weighted according to the land area of different site types in southern and northern Finland using the weights option in lme.This was done to ensure that the sample of understorey biomass plots did not underestimate the weight of the most common site types (those of medium fertility).The weighting was performed on land areas based on NFI11 data (Ylitalo, 2013).Models were linearised by taking natural logarithms from biomass and coverage.For model parameters, bias correction and their uncertainties, see the Supplement (Table S1, Figs.S1 and S2).
To quantify the biomass of understorey vegetation, we used species-specific coverage measurements of 2501 permanent sample plots from 1995 forest inventory data (Mäkipää and Heikkinen, 2003).We applied collocated cokriging methods to account for the correlation between species groups and thereafter we generalised measured understorey coverage on a 10 × 10 km 2 grid on upland soils across Finland (Bivand et al., 2008).The co-kriging method was similar to that of BEF estimation described above.First, we de-trended all species group coverage (shrubs, herbs and grasses, lichens and bryophytes) by linear trends of latitude and longitude (using uniform coordinate system of Finland, YKJ).After that, we estimated variograms and crossvariograms and applied co-kriging methods to predict coverage for understorey coverage for Finland by using the gstat package of R (R Core Team, 2014) (for R code, variograms and cross-variograms, see the Supplement).For all species groups, we assumed spherical variograms with a common range of 100 km.

Litter input estimation
To estimate litter input from living biomass components to the soil, litter turnover rates were used (Table 2).For needles, we used the detailed 10 × 10 km 2 grid litter turnover rates reported by Tupek et al. (2015).For fine tree roots, we assumed that the life span was 1.18 years for southern Finland and 2 years for northern Finland (Kleja et al., 2008;Leppälammi-Kujansuu et al., 2014).In order to interpolate fine root turnover rates for Finland, we assumed a linear dependency for turnover rate and mean temperature sum (mean for 1981-2010).For areas with a temperature sum (5 • C threshold) higher than 1200 • C, a turnover rate of 85 % was used, and for areas where temperature sums were less than 700 • C, a rate of 50 % was used.Turnover rates were interpolated between temperature sums of 700 and 1200 • C (see the Supplement).
For branch and coarse root litter, the constant turnover rates of the branches were used across Finland (Lehtonen et al., 2004;Muukkonen and Lehtonen, 2004).For bark litter, we assumed a constant ratio to stem biomass according to Viro (1955) and Mälkönen (1977).For fellings and natural mortality, we assumed that all biomass was left to the site, excluding stems for wood used and harvesting residues for energy use.The quantity of wood use was based on regional forestry statistics (Ylitalo, 2013).
The turnover rates of different functional plant groups were partly based on literature (Table 2) but we also estimated some rates from our own biomass samples of understorey vegetation described above.We calculated the proportion of the current-year growth out of the total living biomass of dwarf shrub shoots to estimate the turnover rates for aboveground plant parts.We used the average values of deciduous and evergreen dwarf shrubs.Most of the grasses growing in boreal forests (like Deschampsia flexuosa) are perennial, and we used the ratio between the living and dead biomass to estimate the turnover rate of those species.The aboveground parts of the most herb species (like Maianthemum bifolium) are annual, so we used 100 % for their turnover rate.Furthermore, we used the number of annual growth segments in the green upper parts of the bryophytes to estimate their rate.We assumed that over a long time period the estimate for annual biomass growth corresponds to the amount of litter input to the soil from understorey vegetation.
Nitrogen content of all litter components was derived from Komarov et al. (2007); for details, see the Supplement.

Application of soil carbon models
Carbon stocks were estimated by running Yasso07 and RO-MULv models into a steady state (i.e. a state where carbon input for the model equals carbon flux due to decomposition).For a dynamic model, a steady state is a state for which the model aims with given inputs.Here, model inputs were average climatic conditions  and litter inputs for each grid point, depending on dominant tree species and understorey vegetation coverage plus regional estimates of litter from harvesting and natural mortality (sc.forestry centres).If we assume that this average level of inputs and climate has remained steady over centuries, then our soils should approach steady-state conditions.Litter inputs were given to these models as 10 × 10 km 2 grid points and weather data of that given grid point were then used for model input.For Yasso07, steady state was simulated by running the model 10 000 years, after which relative change of carbon stock was less than 1 : 10 000.Similarly, for ROMULv the model was run for thousands of years, and then we compared consecutive carbon stocks and steady-state stock was obtained when relative change was less than 1 : 10 000.
When evaluating model results for Yasso07, fresh woody litter and recently dead wood litter were excluded from model state variables, while all non-woody material and humus boxes (including more heavily decomposed material from fine-wood and coarse-wood litter) were accounted as soil carbon stock.This separation was done as Yasso07 slows down decomposition of logs of high diameter, which increases soil steady-state carbon stock estimates.Data from Biosoil do not include dead wood masses in soil carbon stocks.For ROMULv, we included all model state variables to the comparison, noting that log size does not affect decomposition.

Climate data
Daily weather data for steady-state simulations are available as kriging estimates since 1961 at a 10 × 10 km 2 resolution across Finland (Venäläinen et al., 2005).Weather data for the grid points on upland soils and on forested land, according to multisource NFI, were included (Tomppo et al., 2008).
Daily weather predictions for a 10 × 10 km 2 grid were aggregated for the Yasso07 model from 1961 to 2012.The average temperature, temperature amplitude and precipitation of each grid cell were provided to Yasso07 to estimate the impact of climate on soil carbon stock steady states.The RO-MULv model was driven by mean daily temperature and precipitation for each grid cell.

Yasso07
The Yasso07 soil carbon model (Tuomi et al., 2011) is driven by litter quantity, litter quality and weather (temperature and precipitation).This model is widely used in the greenhouse gas inventories of European countries (e.g.Finland, Norway and Switzerland).This model has been also successfully coupled with the climate-carbon cycle model ECHAM5/JSBACH (Thum et al., 2011).
The Yasso07 is a simple dynamic model with fluxes and state variables.The model has five compartments: acid, water, ethanol, non-soluble and humus boxes.Division of the litter input according to the litter solubility was species specific and followed that of Finnish GHG inventory, based on the appendix of the Yasso07 manual by Liski et al. (2009).Organic matter flows between these boxes and to the atmosphere variably according to weather conditions (Fig. 1).The model builds on the assumption that organic matter solubility defines litter quality, while decomposition rates during fluxes are estimated numerically, without a priori assumptions.The model was calibrated using a large database of litter and wood decomposition measurements and measurements of age chronosequences of soil carbon stocks (Liski et al., 2005(Liski et al., , 1998;;Rantakari et al., 2012;Tuomi et al., 2011).The Yasso07 soil carbon model parameter values were estimated with Markov chain Monte Carlo (MCMC) methods for which the source code is publicly available through the model website (http://www.syke.fi)(Tuomi et al., 2011).Here, we used the Yasso07 model with Scandinavian parameters (Rantakari et al., 2012) and with global parameters, this being a preliminary version from Tuomi et al. (2011) parameters (practically identical).Yasso07 was run with total litter input (from trees and understorey vegetation) and also with litter excluding that of understorey vegetation.The model was applied with annual time steps.Model has been calibrated with litter input estimates and soil carbon measurements down to 1 m soil depth.See the Supplement for more details on Yasso07.

ROMULv
ROMUL is a soil organic matter decomposition model developed by Chertov et al. (2001).It describes the flux of organic matter through the decomposition process, separated in cohorts of different litter origins: leaves, shoots, trunks, coarse roots, fine roots and ground vegetation.Litter from above the ground (leaves, shoots and stems) falls onto the forest floor, whereas the root litter decomposes in the mineral soil layer.In the final stages, all decomposing matter ends up in a common storage of semi-stable humus residing in the mineral soil layer (Fig. 1).Decomposition rates of each cohort depend on the nitrogen and ash content of the specific litter type, and for all cohorts the soil temperature and water content modify the decomposition rate.Here, we used species-specific nitrogen ratios for different litter fractions; these ratios were the same across the country in a given fraction of the given species.
In this paper, we utilised a version of the ROMUL model where we adopt decomposition rate functions depending on soil water content and the model for soil water dynamics as described in Linkosalo et al. (2013).We call this version of the model ROMULv (where the "v" refers to decomposition rates based on volumetric soil water measures).The soil water holding capacity data were obtained from digital soil maps of Finland (Lilja and Nevalainen, 2006).Total soil water holding capacity data were extracted to a 10×10 km 2 grid, when available.To evaluate the impact of soil water holding capacity variability, we also repeated the ROMULv simulation assuming a constant soil water holding capacity for the whole simulation area.As the ROMULv model segregates the organic and mineral soil layers, we assumed that 20 % of the total soil water holding capacity was in the organic layer and 80 % in mineral soil layer.See the Supplement for more details on ROMULv, the source code and parameter estimates.
We applied the ROMULv model using daily time steps of the environmental variables impacting the decomposition.The estimated annual litter input was evenly distributed for each day of the year.The ROMULv model explicitly simulates the flux of nitrogen through the decomposition process and therefore produces estimates for mineralised N in addition to the soil organic matter and N storages and CO 2 release from the soil.All model parameters, except the decomposition rate, which is dependent on soil water content, were taken from the Chertov et al. ( 2001) paper.Although the ROMULv model does not explicitly define the depth of soil layers, the soil water layer of 1 m has been assumed implicitly when gravimetric water content is estimated.Therefore, we assume that decomposition occurs in that same layer of 1 m.

Biosoil-soil carbon measurements
The Biosoil dataset resulted from EU-funded Forest Focus monitoring program and includes measurements of soil carbon and various other ecosystem variables from 2006.Data consist of 521 sample plots located across Finland.These plots were established as a subset of 3009 permanent sample plots from 1985 to 1986 (see Mäkipää and Heikkinen, 2003).
Biosoil sample plots have radii of 11.28 m (400 m 2 ).Trees and understorey vegetation coverage were measured.Soil carbon samples were taken separately from the litter and humus layers and mineral soil layers at 0-10, 11-20, 21-40 and 41-80 cm depth.In total, 10 or 20 subsamples were taken from the organic layers with a cylinder (diameter = 60 mm) and 10 tubes (diameter = 23 mm).Alternatively, five spade subsamples were taken from the 0-10 cm mineral soil layer, another five spade subsamples were taken from the 11-20 and 21-40 cm layers and one subsample from the 41-80 cm layer.The Viro (1952) rod penetration method was used to quantify the volume of stones and boulders in the surface soil layer.Several properties were analysed from soil samples in the laboratory: soil texture, cation exchange capacity (CEC), base saturation (BS), pH, organic carbon and total nitrogen concentrations.Aqua regia extractable P, Ca, K, Mg, Mn, Pb, Na, Ni, Fe and S were also measured (Derome et al., 2007).Soil carbon stocks were estimated down to 1 m depth by extrapolation of 40-80 cm measurements for the 80-100 cm layer.
Measured soil carbon stocks were grouped into different latitudinal bands forming a gradient in Finland to which we www.geosci-model-dev.net/9/4169/2016/Geosci.Model Dev., 9, 4169-4183, 2016 compared soil carbon estimates from Yasso07 and ROMULv models against measured data.Comparisons were made with two different widths of latitude bands: firstly we used bands of 100 km and thereafter bands of 20 km.Bands were included in the analysis when there were four or more Biosoil observations in a band.We also identified those Biosoil plots that had been under the Littorina Sea 7000-8000 BP (Miettinen, 2004;Sohlenius et al., 1996) using map products produced by the Geological Survey of Finland (Eronen, 1974).We used root mean square error (RMSE) to rank these model applications due to their ability to take into account both accuracy and precision when comparing model estimates and data.
We also studied differences between mean stand and soil properties for two regions in southern Finland (Region 1 with a latitude below 60 • 46 N, and Region 2 with a latitude between 60 • 46 and 62 • 34 N).

Performance of the soil carbon models
The soil carbon stock estimates by Yasso07 model using global parameterisation (Tuomi et al., 2011) were systematically larger than measured data across Finland, whereas those obtained with Scandinavian parameterisation (Rantakari et al., 2012) were in the same magnitude as the Biosoil data (Fig. 2a and b).In addition to the realistic level of soil carbon stock, the model that was based on parameterisation with the Scandinavian data also reproduced decreasing soil carbon stock trend from south to north, as displayed in the Biosoil data.Excluding understorey litter from the model input improved the match between Yasso07 soil carbon stocks simulated using global parameters and Biosoil data, whereas for Yasso07 using Scandinavian parameters, the same exclusion resulted in underestimation of soil carbon stocks, especially in northern Finland (Fig. 2c and d).The ROMULv model predictions generally agreed with Biosoil data when soil water holding capacity was taken into account.The inclusion of soil water holding capacity in RO-MULv introduced high variation in soil carbon stocks between dry and moist grid points in southern Finland (Fig. 2e).When the ROMULv model was driven with constant soil water holding capacity, it was unable to reproduce decreasing soil carbon stocks across Finland and the model underestimated carbon stocks, especially in the south (Fig. 2e and f).Large deviations between the data and the model estimates were also seen for the largest Biosoil soil carbon stocks of the southernmost plots (Fig. 2).
The ROMULv model using the soil water holding capacity was the only model able to reproduce the increase in measured soil carbon stocks at the southernmost plots.All other models predicted a substantial decrease in the soil carbon stocks for the southern region, which was not observed in Table 3. Mean soil and forest stand properties for R1 and R2 and 1.96 times their standard error of mean (SEM).R1 is south of 60 • 46 N, while R2 lies between 60 • 46 and 62 • 34 N northern latitude.P values based on a two-sided t test, where group variances are assumed to be different variables where p value < 0.05 are in bold.Sample sizes were 31 and 127 plots, for southern (R1) and northern (R2) areas, respectively.For mean temperature and precipitation, sample sizes were more than 150 based on the Finnish Meteorological Institute (FMI) weather grid (Venäläinen et al., 2005).Here, sand, silt and clay content are given as a percentage.Base saturation (BS), cation exchange capacity (CEC) and C : N ratios are given separately for organic layer (O) and for mineral soil layer (M).pH is measured in water.Abundance of ground vegetation given as percentage coverage.Basal area of trees (G) and relative shares of deciduous species and Norway spruce from basal area.Lito indicates the proportions of sites that were under the Littorina Sea 8000-7000 BP.Aqua regia extractable K, P, Na, Ni, Ca, Mg, Fe, S and Mn unit mg kg −1 .Topographical wetness index (TWI).Altitude of plots based on 10 m resolution map layer.Amount of carbon in organic and mineral soil horizon.measurements (Fig. 4).Soil properties of the southernmost plots (Region 1, R1, with a latitude below 60 • 46 N) were different compared to soil properties of forests further north (Region 2, R2, with a latitude between 60 • 46 and 62 • 34 N).The southern coast had lower silt content and higher sand content, but simultaneously higher carbon stocks in the organic layer than the other region (Table 3).Also understorey vegetation differed between these regions and the southern coast had lower Sphagnum species and herbs species cover-age, indicating that soils there were drier.We also tested site type distributions between R1 and R2 but the p value from a chi-square test was greater than 0.4, indicating that site fertilities did not differ.This is also supported by the measured C : N ratios, which did not differ between these regions (Table 3).These southern sites were also younger and 29 % of plots were under the Littorina Sea ∼ 7000-8000 BP, while the share of younger soils was only 6 % for the slightly more northern region.We found that Yasso07 model applications excluding litter input from understorey vegetation had the best agreement with the observed latitudinal gradient when evaluating using one-to-one plots (Fig. 3).The modelled and measured soil carbon stock fits were poor for the Yasso07 application with global parameters, including the understorey vegetation litter input and for the ROMULv application not using soil water holding capacity data.These model applications showed no correlation with measurements and also failed to map the south to north soil carbon stock decrease (Fig. 3).The lowest RMSE was obtained with the ROMULv model applied with soil water holding capacity data indicating the best performance among these model applications.

Improved understorey litter input
The amount of litterfall from trees and understorey vegetation showed opposite trends with latitude.Total litter input of trees and other vegetation in southern Finland was 4 times higher than that in northern Finland following the temperature gradient (Fig. 4).However, the understorey vegetation at northern sites displayed much higher litter inputs than those in southern Finland.According to our results, the fall of understorey litter in eastern and northern Lapland was equal to that from trees (Fig. 4).We estimated mean litterfall from understorey vegetation to equate approximately to 0.0473 and 0.0863 kg m −2 of carbon for southern and northern Finland, respectively.Our estimate of mean litter input from trees was 0.1962 and 0.0903 kg m −2 of carbon for southern and northern Finland, respectively.

Discussion
We tested whether litter quality, litter quantity and mean climate are sufficient for estimating spatial trends in soil carbon stocks and if soil texture with low water holding capacity introduces water limitation to decomposition.While testing our hypothesis, we found that Yasso07 and ROMULv models were able to predict carbon stocks of the same magnitude as that of measurements for Finland; however, these models experienced more challenges when their performance was evaluated for smaller regions, like the southern coast of Finland.
We also found that litter input from understorey vegetation is equal to that from trees in northern Finland.This emphasises the large role of understorey vegetation in compensating trees in an ecosystem carbon cycle, especially under lightand nutrient-limited growth conditions.Total litter input varied from 0.4 to 0.01 kg m −2 yr −1 of carbon from southern to northern Finland.These values agree with both net primary production (NPP) estimates for trees that vary between 0.2 and 0.5 kg m −2 yr −1 of carbon for southern Finland by Härkönen et al. (2010) and with measured litterfall values that range from 0.23 to 0.3 kg m −2 yr −1 for the Hyytiälä eddy covariance site (Ilvesniemi et al., 2009).
To sum up, litter input from understorey vegetation has to be quantified properly when soil carbon models are parameterised.This means that model developers should have unbiased estimates of total carbon inputs into the soil system when they use litter estimates and soil carbon stocks to calibrate parameters that affect long-term carbon accumulation.The results of our biomass models for the understorey agreed with previous estimates for southern Finland but for northern Finland our estimates were substantially larger.Yasso07 was parameterised using understorey litter inputs ranging from 40-60 g m −2 of carbon and therefore Yasso07, with updated understorey litter estimates, overestimated the level of soil carbon stocks.Our data for understorey vegetation biomass were mostly from stands that were over 60 years old, and therefore more data for younger sites are needed.Furthermore, the litter input of belowground understorey vegetation was uncertain due to limited data on the life span of roots.More research is therefore needed to confirm the high contribution of dwarf shrub vegetation to total belowground litter input.
When evaluating modelled and measured soil carbon stocks against northern latitude, we could see that two out of six simulations failed to map measured soil carbon stock decreases towards the north (Fig. 4).Five model simulations showed that the southern coast had less carbon than the next region further north, while the Biosoil data showed the opposite.Only the ROMULv model, using soil water holding capacity data from Lilja and Nevalainen (2006) were able to estimate the largest soil carbon stocks for southern Finland, similar to Biosoil measurements (Fig. 2).This indicated that litter quality, litter quantity and climate data are not sufficient when estimating spatial trends of soil carbon stocks.When we evaluated Biosoil data for these regions, the southern sites were better drained and drier that those in the north.Better drainage was indicated by higher sand content, significantly lower silt content and significantly lower Sphagnum and herb vegetation coverage (Table 3).We also found out that the southern coast experiences a 1.3 • C higher mean annual temperature than the next region to the north.Therefore, the organic layer was larger for the southern coast likely due to limited decomposition during dry spells (Table 3).This result indicated that the quantity of precipitation alone was not a sufficient modifier for decomposition but when complemented with soil water holding capacity, average latitudinal results improved.This finding supports the use of models including soil texture and water holding capacity.
One major driver for soil carbon stocks is land-use change.Recent land-use changes could have caused discrepancies between model and data estimates for soil carbon stocks.Therefore, we re-analysed Biosoil plot data (measured in 2006), which were established on permanent sample plots during 1985-1986(Mäkipää and Heikkinen, 2003).According to the permanent sample plot measurements of 1985-1986 only 2 out of 521 soil carbon plots have had a land-use change during the 1970s or early 1980s, indicating that land use has been forestry on 99.5 % of plots at least for the previous 40-50 years.This confirms that recent land-use changes do not invalidate our results.
Our results also support earlier findings, where the time step of the simulation plays a critical role; running models with monthly and annual time steps excludes extreme conditions and may produce biased estimates of carbon stocks simply due to fact that non-linear models are run with mean conditions (Dalsgaard et al., 2016;Rantakari et al., 2012).This is especially critical with soil moisture, which has a bell-shaped relation to decomposition (Sierra et al., 2015;Skopp et al., 1990).Compared to daily time steps of RO-MUL, model simulations with longer time steps (e.g. a year as in Yasso07) exclude both extreme dry and extreme moist conditions, leading to underestimation of steady-state soil carbon stocks.On the other hand, reductions of the decomposition rate during limited and excess water conditions are not well known.Sierra et al. (2015) compared the effect of moisture on decomposition in different models with the largest variability for dry and saturated conditions.They showed that a moisture index of 0.2 can have a decomposition modifier between 0.2 and 1, and similarly with a moisture index of 1 this modifier can be anywhere between 0 and 1, depending on the model.These large discrepancies indicate differences between soil properties of sites used for model calibration and between conceptualisations of soil moisture.
The Yasso07 and ROMULv model simulations with litter input and climate data alone were not enough to reproduce the observed soil carbon stocks.Model application without soil moisture impacts on decomposition failed under conditions where soil drainage played a significant role in limiting decomposition.The use of water holding capacity was critical for accurate soil carbon stock estimation by the RO-MULv model, while Yasso07 performed best when the variation in litter input correlated with that of the observed soil carbon stocks (Fig. 3).The fact that Yasso07 soil carbon stocks (Tuomi et al., 2011) were more accurate when litter input of understorey vegetation was omitted from simulations suggests that the model calibration did not account for the whole range of understorey litter input.This implies that in the future, improved estimates for understorey litter input are needed when Yasso07 or similar models will be parameterised for boreal conditions.In order to improve model performance, shorter time steps complemented with more detailed topography and soil properties are needed to map the impact of extreme events to soil carbon decomposition (e.g.droughts and water logging conditions).However, three model simulations out of six produced relatively accurate estimates of soil carbon stocks compared to the measurement means of smaller regions having an RMSE of less than 1.1 for soil carbon stocks (Fig. 3).This suggests that improved calibration with updated understorey litter and accounting for the soil properties as with ROMULv (using data on water holding capacity) produces model estimates that agree regionally with data.
Our simulations and measurements show that models using only litter quality, litter quantity and weather data underestimate soil carbon stock in regions like southern Finland and this underestimation is due to omission of the impact of droughts to the decomposition of organic layers.Thus, we conclude that GHG inventory methods and soil modules of Earth system models need to be improved by incorporating the impact of soil texture and soil moisture to decomposition.This is a prerequisite for unbiased soil carbon stock and stock change estimates.

Code availability
The Fortran code of the ROMULv is given in the Supplement, while Fortran code for Yasso07 is available through the website http://code.google.com/p/yasso07ui/;for details, see the Supplement.
The Supplement related to this article is available online at doi:10.5194/gmd-9-4169-2016-supplement.

Figure 1 .
Figure 1.Schematic illustration of Yasso07 (left) and ROMULv (right) soil carbon models.Solid arrows indicate fluxes of organic matter, while dashed arrows indicate CO 2 fluxes to atmosphere.

Figure 2 .
Figure 2. Latitudinal trends of measured and modelled soil carbon stocks across Finland.The x axis is the north coordinate according to Finnish YKJ system and y axis is the soil carbon stock Mg ha −1 .Grey dots are individual model estimates and the red line is a second-order LOESS fit.Panel (a) is Yasso07 with Rantakari et al. (2012) parameters, while panel (b) is with Tuomi et al. (2011) parameters.Panels (c) and (d) are same as panels (a) and (b), but without understorey litter input.Panels (e) and (f) show the ROMULv model where (e) includes soil water holding capacity data and (f) includes constant soil water holding capacity.Black dots are means from Biosoil data for each latitude band and whiskers are 1.96 times the standard error of the mean.

Figure 3 .
Figure 3. One-to-one plots for mean model estimates and mean Biosoil measurements of soil carbon stocks for 41 latitudinal bands across Finland.Panel (a) is Yasso07 with Rantakari et al. (2012) parameters, while panel (b) is with Tuomi et al. (2011) parameters.Panels (c) and (d) are same as panels (a) and (b) but without understorey litter input.Panels (e) and (f) show the ROMULv model where (e) is with soil water holding capacity data and (f) is with constant soil water holding capacity.R 2 , slope of the regression and the slope standard error are reported.RMSE is based on the difference between model estimate and measurements.

Table 1 .
Site description of ICP Forests Level II plots studied for understorey vegetation biomass.
Salemaa et al. (2008)orthern Finland, 2 indicates southern Finland); tree species (1 indicates Scots pine, 2 indicates Norway spruce and 3 indicates deciduous trees); site type (1-5, from rich to poor fertility level); forest site types (abbreviations explained in Table1ofSalemaa et al. (2008)); plant species groups in biomass samples: DF indicates dwarf shrubs, GR indicates grasses, HE indicates herbs, BR indicates bryophytes and LI indicates lichens.The "x" indicates biomass sample includes both aboveground and belowground (in organic soil layer) part of vegetation, the "a" indicates sample includes only aboveground vegetation, "-" indicates plant group does not grow in the plot.n indicates number of samples (each sized 30 × 30 cm 2 ) for aboveground and belowground biomass.

Table 2 .
Litter turnover rates for tree and understorey biomass by component, species and region.