Geoscientific Model Development An interactive open-access journal of the European Geosciences Union

Volume 8, issue 4
Geosci. Model Dev., 8, 1033–1046, 2015
https://doi.org/10.5194/gmd-8-1033-2015
© Author(s) 2015. This work is distributed under
the Creative Commons Attribution 3.0 License.

Development and technical paper 08 Apr 2015

An approach to enhance pnetCDF performance in environmental modeling applications

D. C. Wong1,**, C. E. Yang2,**, J. S. Fu2, K. Wong2, and Y. Gao2,*
  • 1U.S. Environmental Protection Agency, Research Triangle Park, NC, USA
  • 2University of Tennessee, Knoxville, TN, USA
  • *now at: Pacific Northwest National Laboratory, Richland, WA, USA
  • **These authors contributed equally to this work.

Abstract. Data-intensive simulations are often limited by their I/O (input/output) performance, and "novel" techniques need to be developed to overcome this limitation. The software package pnetCDF (parallel network Common Data Form), which works with parallel file systems, was developed to address this issue by providing parallel I/O capability. This study examines the performance of an application-level data aggregation approach that aggregates data along either the row or the column dimension of MPI (Message Passing Interface) processes on a spatially decomposed domain and then applies the pnetCDF parallel I/O paradigm. The tests used three domain sizes, representing small, moderately large, and large data domains, with a small-scale Community Multiscale Air Quality model (CMAQ) mock-up code. The examination compares the I/O performance of four approaches: the traditional serial I/O technique, straight application of pnetCDF, and data aggregation along the row or the column dimension before applying pnetCDF. From this comparison, "optimal" I/O configurations of the application-level data aggregation approach were quantified. Data aggregation along the row dimension (pnetCDFcr) works better than along the column dimension (pnetCDFcc), although it may perform slightly worse than the straight pnetCDF method with a small number of processors. As the number of processors grows, pnetCDFcr outperforms pnetCDF significantly. If the number of processors keeps increasing, pnetCDF reaches a point where its performance is even worse than that of the serial I/O technique. This new technique has also been tested in a real application, where it performs two times better than the straight pnetCDF paradigm.
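
The aggregation step described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code and makes no MPI or pnetCDF calls. It assumes that "aggregation along the row dimension" means the processes in one row of the process grid merge their tiles onto a single writer, and likewise for columns. All function names (decompose, aggregate_rows, aggregate_cols) are hypothetical, and the domain is assumed to split evenly across the process grid.

```python
# Sketch only: plain-Python simulation of application-level data aggregation.
# No MPI or pnetCDF is used; every name here is hypothetical.

def decompose(domain, npr, npc):
    """Split a 2-D domain (list of rows) into an npr x npc grid of tiles,
    one tile per simulated process. An even split is assumed."""
    nrows, ncols = len(domain), len(domain[0])
    assert nrows % npr == 0 and ncols % npc == 0, "even split assumed"
    th, tw = nrows // npr, ncols // npc  # tile height and width
    return [
        [
            [row[pc * tw:(pc + 1) * tw] for row in domain[pr * th:(pr + 1) * th]]
            for pc in range(npc)
        ]
        for pr in range(npr)
    ]

def aggregate_rows(tiles):
    """Merge the tiles of each process-grid row side by side.
    Returns one strip per process-grid row (one writer per row)."""
    strips = []
    for tile_row in tiles:
        height = len(tile_row[0])
        strip = []
        for r in range(height):
            merged = []
            for tile in tile_row:
                merged.extend(tile[r])  # append this tile's part of row r
            strip.append(merged)
        strips.append(strip)
    return strips

def aggregate_cols(tiles):
    """Stack the tiles of each process-grid column top to bottom.
    Returns one strip per process-grid column (one writer per column)."""
    npr, npc = len(tiles), len(tiles[0])
    strips = []
    for pc in range(npc):
        strip = []
        for pr in range(npr):
            strip.extend(tiles[pr][pc])  # append the tile's rows below
        strips.append(strip)
    return strips

# Example: a 4 x 6 domain on a 2 x 3 process grid (6 simulated processes).
domain = [[r * 6 + c for c in range(6)] for r in range(4)]
tiles = decompose(domain, npr=2, npc=3)
row_strips = aggregate_rows(tiles)   # 2 writers, each holding 2 x 6 values
col_strips = aggregate_cols(tiles)   # 3 writers, each holding 4 x 2 values
```

In this example, row aggregation reduces the number of writing processes from 6 to 2 and column aggregation reduces it to 3. Each remaining writer holds a larger block to pass to the parallel write call.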
