Volume 10, issue 9
Geosci. Model Dev., 10, 3189-3206, 2017
© Author(s) 2017. This work is distributed under
the Creative Commons Attribution 3.0 License.

Model description paper 31 Aug 2017

eddy4R 0.2.0: a DevOps model for community-extensible processing and analysis of eddy-covariance data based on R, Git, Docker, and HDF5

Stefan Metzger1,2, David Durden1, Cove Sturtevant1, Hongyan Luo1, Natchaya Pingintha-Durden1, Torsten Sachs3, Andrei Serafimovich3, Jörg Hartmann4, Jiahong Li5, Ke Xu2, and Ankur R. Desai2
  • 1National Ecological Observatory Network, Battelle, 1685 38th Street, Boulder, CO 80301, USA
  • 2University of Wisconsin-Madison, Dept. of Atmospheric and Oceanic Sciences, 1225 West Dayton Street, Madison, WI 53706, USA
  • 3GFZ German Research Centre for Geosciences, Telegrafenberg, 14473 Potsdam, Germany
  • 4Alfred Wegener Institute – Helmholtz Centre for Polar and Marine Research, Am Handelshafen 12, 27570 Bremerhaven, Germany
  • 5LI-COR Biosciences, 4647 Superior Street, Lincoln, NE 68504, USA

Abstract. Large differences in instrumentation, site setup, data format, and operating system stymie the adoption of a universal computational environment for processing and analyzing eddy-covariance (EC) data. This results in limited software applicability and extensibility in addition to often substantial inconsistencies in flux estimates. Addressing these concerns, this paper presents the systematic development of portable, reproducible, and extensible EC software achieved by adopting a development and systems operation (DevOps) approach. This software development model is used for the creation of the eddy4R family of EC code packages in the open-source R language for statistical computing. These packages are community developed, iterated via the Git distributed version control system, and wrapped into a portable and reproducible Docker filesystem that is independent of the underlying host operating system. The HDF5 hierarchical data format then provides a streamlined mechanism for highly compressed and fully self-documented data ingest and output.

The usefulness of the DevOps approach was evaluated for three test applications. First, the resultant EC processing software was used to analyze standard flux tower data from the first EC instruments installed at a National Ecological Observatory Network (NEON) field site. Second, through an aircraft test application, we demonstrate the modular extensibility of eddy4R to analyze EC data from other platforms. Third, an intercomparison with commercial-grade software showed excellent agreement (R² = 1.0 for CO₂ flux). In conjunction with this study, a Docker image containing the first two eddy4R packages and an executable example workflow, as well as the first NEON EC data products, are released publicly. We conclude by describing the work remaining to arrive at the automated generation of science-grade EC fluxes and the benefits to the science community at large.

This software development model is applicable beyond EC and more generally builds the capacity to deploy complex algorithms developed by scientists in an efficient and scalable manner. In addition, modularity permits meeting project milestones while retaining extensibility with time.
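For readers less familiar with the quantities behind the software intercomparison mentioned in the abstract, the following is a minimal sketch in Python, not eddy4R code. It assumes the textbook definition of an eddy-covariance flux as the covariance between vertical wind speed and a scalar over one averaging period, and it assumes that the agreement statistic is the squared Pearson correlation; the abstract does not state how R² was computed. The function names `ec_flux` and `r_squared` are hypothetical, and real processing involves further steps (for example coordinate rotation, lag correction, and density corrections) that are omitted here.

```python
def ec_flux(w, c):
    """Kinematic flux as the covariance of w and c over one averaging period.

    w: vertical wind speed samples, c: scalar samples (e.g. CO2 concentration).
    Assumption: simple block averaging defines the mean state.
    """
    if len(w) != len(c) or len(w) == 0:
        raise ValueError("w and c must be non-empty and of equal length")
    n = len(w)
    w_mean = sum(w) / n
    c_mean = sum(c) / n
    # Mean product of the fluctuations w' = w - mean(w) and c' = c - mean(c)
    return sum((wi - w_mean) * (ci - c_mean) for wi, ci in zip(w, c)) / n


def r_squared(x, y):
    """Squared Pearson correlation between two series of flux estimates."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Synthetic illustration only; these numbers are not from the paper.
    flux = ec_flux([1.0, -1.0, 1.0, -1.0], [2.0, 0.0, 2.0, 0.0])
    print("flux:", flux)
    print("R^2:", r_squared([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Note that a squared correlation close to 1 shows that two packages rank and scale fluxes consistently, but it would not by itself reveal a constant offset or gain difference, so slope and intercept are usually inspected alongside it.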

Short summary
We apply the development and systems operations (DevOps) software development model to create eddy4R–Docker, an open-source, flexible, and modular eddy-covariance data processing environment. Test applications to aircraft and tower data, as well as a software cross-validation, demonstrate its efficiency and consistency. Key improvements in accessibility, extensibility, and reproducibility build the foundation for deploying complex scientific algorithms in an effective and scalable manner.