Volume 11, issue 8
Geosci. Model Dev., 11, 3447-3464, 2018
https://doi.org/10.5194/gmd-11-3447-2018
© Author(s) 2018. This work is distributed under
the Creative Commons Attribution 4.0 License.

Development and technical paper | 27 Aug 2018

Portable multi- and many-core performance for finite-difference or finite-element codes – application to the free-surface component of NEMO (NEMOLite2D 1.0)

Andrew R. Porter1, Jeremy Appleyard2, Mike Ashworth1, Rupert W. Ford1, Jason Holt3, Hedong Liu3, and Graham D. Riley4
  • 1Science and Technology Facilities Council, Daresbury Laboratory, Warrington, UK
  • 2NVIDIA Corporation, Green Park, Reading, UK
  • 3National Oceanography Centre, Liverpool, UK
  • 4School of Computer Science, University of Manchester, Manchester, UK

Abstract. We present an approach, which we call PSyKAl, designed to achieve portable performance for parallel finite-difference, finite-volume, and finite-element earth-system models. In PSyKAl the code related to the underlying science is formally separated from the code related to parallelization and single-core optimization. This separation of concerns allows scientists to write their science code independently of the underlying hardware architecture, and allows optimization specialists to tailor the code for a particular machine independently of the science. We have taken the free-surface part of the NEMO ocean model and created a new shallow-water model named NEMOLite2D; the result is a code of manageable size that nevertheless incorporates key elements of full ocean models (input/output, boundary conditions, etc.). We have then manually constructed a PSyKAl version of this code and investigated the transformations that must be applied to the middle (PSy) layer in order to achieve good serial and parallel performance. We have produced versions of the PSy layer parallelized with both OpenMP and OpenACC; in both cases we were able to leave the natural-science parts of the code unchanged while achieving good performance on both multi-core CPUs and GPUs. To quantify whether the obtained performance is good, we also consider the limitations of the basic roofline model and improve on it by generating kernel-specific CPU ceilings.
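The separation of concerns described in the abstract can be made concrete with a short sketch. Below is a minimal, schematic Fortran illustration of the three PSyKAl layers for a continuity-like update; all routine and variable names here (continuity_code, invoke_continuity, flux_u, flux_v, area, rdt) are hypothetical stand-ins, not the actual NEMOLite2D source.

    module psykal_sketch_mod
      implicit none
    contains

      ! Kernel layer: single-point science code. It contains no loops over
      ! the horizontal domain and no knowledge of parallelism.
      subroutine continuity_code(ji, jj, ssha, sshn, flux_u, flux_v, area, rdt)
        integer, intent(in)    :: ji, jj
        real(8), intent(inout) :: ssha(:,:)
        real(8), intent(in)    :: sshn(:,:), flux_u(:,:), flux_v(:,:), area(:,:)
        real(8), intent(in)    :: rdt
        ! Update sea-surface height at one point from the divergence of
        ! (illustrative) volume fluxes.
        ssha(ji,jj) = sshn(ji,jj) + rdt / area(ji,jj) *            &
             ( flux_u(ji-1,jj) - flux_u(ji,jj)                     &
             + flux_v(ji,jj-1) - flux_v(ji,jj) )
      end subroutine continuity_code

      ! PSy layer: owns the iteration space, so all parallelization and
      ! single-core optimization (here a simple OpenMP loop) is confined
      ! to this layer and can be tailored per machine.
      subroutine invoke_continuity(ssha, sshn, flux_u, flux_v, area, rdt)
        real(8), intent(inout) :: ssha(:,:)
        real(8), intent(in)    :: sshn(:,:), flux_u(:,:), flux_v(:,:), area(:,:)
        real(8), intent(in)    :: rdt
        integer :: ji, jj
        !$omp parallel do default(shared), private(ji,jj), schedule(static)
        do jj = 2, size(ssha, 2)
          do ji = 2, size(ssha, 1)
            call continuity_code(ji, jj, ssha, sshn, flux_u, flux_v, area, rdt)
          end do
        end do
        !$omp end parallel do
      end subroutine invoke_continuity

    end module psykal_sketch_mod

    ! Algorithm layer: the scientist's view of the model, which calls the
    ! invoke routine and never sees loops, OpenMP, or OpenACC:
    !   call invoke_continuity(ssha, sshn, flux_u, flux_v, area, rdt)

Swapping the OpenMP directive in invoke_continuity for an OpenACC one (or a different loop schedule, blocking, etc.) changes nothing in continuity_code or in the algorithm layer, which is the portability the abstract claims.

For context, the basic roofline model referred to in the abstract bounds the attainable performance of a kernel with operational intensity I (flops per byte of memory traffic) as

    P(I) <= min( P_peak, beta * I ),

where P_peak is the machine's peak floating-point rate and beta its peak memory bandwidth. The paper's refinement replaces the single machine-wide ceiling P_peak with ceilings generated per kernel, giving tighter, kernel-specific bounds.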

Short summary
Developing computer models in the earth-system domain is a complex and expensive process whose duration can be measured in years. The supercomputers required to run these models, however, are evolving rapidly, with a proliferation of technologies and associated programming models. As a result, models need to be "performance portable" between different supercomputers. This paper investigates a way of achieving this through a separation of the concerns of performance and natural science.