1Sony Computer Science Laboratory, Paris, France
2Met Office, Exeter, UK
3University of Oxford, Oxford, UK
*now at: Laboratoire de Recherche en Informatique, Orsay, France
**now at: Laboratoire de Météorologie Dynamique, IPSL, CNRS/UPMC, Paris, France
***now at: School of Geography, Politics and Sociology, Newcastle University, Newcastle, UK
Received: 10 May 2011 – Published in Geosci. Model Dev. Discuss.: 17 Jun 2011
Revised: 10 Sep 2011 – Accepted: 12 Sep 2011 – Published: 27 Sep 2011
Abstract. We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to C and restructuring the algorithm around the computation of a single air column. Instead of the existing MPI-based domain decomposition, we used a task queue and a thread pool to schedule the computation of individual columns on the available processors. Finally, four air columns are packed together in a single data structure and computed simultaneously using Single Instruction Multiple Data operations.
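The scheduling scheme described in the abstract can be illustrated with a minimal C sketch. It is not the FAMOUS code: the names `ColumnPack`, `TaskQueue`, `compute_pack` and `run_radiation`, the number of levels, and the placeholder arithmetic are all assumptions made for illustration. A shared counter over an array stands in for the task queue, POSIX threads stand in for the thread pool, and the four-wide inner loops are written so that a compiler may map them onto SIMD instructions (no explicit intrinsics are used).

```c
#include <pthread.h>
#include <stddef.h>

#define LANES 4    /* four air columns packed together, as in the abstract */
#define LEVELS 8   /* hypothetical number of vertical levels */

/* Four columns stored level by level, so that the lane index is innermost. */
typedef struct {
    float temp[LEVELS][LANES];  /* hypothetical input field */
    float flux[LANES];          /* hypothetical output, one value per column */
} ColumnPack;

/* Placeholder "physics": accumulates 0.5 * temp over the levels of each column. */
static void compute_pack(ColumnPack *p) {
    for (int l = 0; l < LANES; l++) p->flux[l] = 0.0f;
    for (int k = 0; k < LEVELS; k++)
        for (int l = 0; l < LANES; l++)   /* lane loop: candidate for SIMD */
            p->flux[l] += 0.5f * p->temp[k][l];
}

/* Simplified task queue: the next unprocessed pack index, guarded by a mutex. */
typedef struct {
    ColumnPack *packs;
    size_t count;
    size_t next;
    pthread_mutex_t lock;
} TaskQueue;

/* Each pool thread repeatedly takes the next pack until none are left. */
static void *worker(void *arg) {
    TaskQueue *q = arg;
    for (;;) {
        pthread_mutex_lock(&q->lock);
        size_t i = (q->next < q->count) ? q->next++ : q->count;
        pthread_mutex_unlock(&q->lock);
        if (i == q->count) return NULL;
        compute_pack(&q->packs[i]);
    }
}

/* Processes all packs with up to nthreads workers.
   Returns 0 if at least one worker ran (all packs are then done), -1 otherwise. */
int run_radiation(ColumnPack *packs, size_t count, int nthreads) {
    enum { MAX_THREADS = 64 };
    pthread_t tid[MAX_THREADS];
    TaskQueue q = { packs, count, 0 };
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS) return -1;
    if (pthread_mutex_init(&q.lock, NULL) != 0) return -1;
    for (; started < nthreads; started++)
        if (pthread_create(&tid[started], NULL, worker, &q) != 0) break;
    for (int t = 0; t < started; t++) pthread_join(tid[t], NULL);
    pthread_mutex_destroy(&q.lock);
    return (started > 0) ? 0 : -1;
}
```

A caller would fill an array of `ColumnPack` values, call `run_radiation(packs, count, nthreads)`, and read the `flux` fields afterwards; with GCC or Clang the file is typically built with `-pthread`. Pulling work from a shared queue, rather than assigning a fixed block of columns to each processor, lets faster threads take on more packs when the cost per column varies.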
The modified algorithm runs more than 50 times faster on the CELL's Synergistic Processing Element than on its main PowerPC processing element. On Intel-compatible processors, the new radiation code runs 4 times faster. On the tested graphics processor, using OpenCL, we find a speed-up of more than 2.5 times compared with the original code on the main CPU. Because the radiation code takes more than 60 % of the total CPU time, FAMOUS executes more than twice as fast. Our version of the algorithm returns bit-wise identical results, which demonstrates the robustness of our approach. We estimate that this project required around two and a half man-years of work.
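The link between the radiation speed-up and the overall speed-up can be made explicit with Amdahl's law. The symbols below are assumptions introduced for illustration and do not appear in the abstract: $p$ is the fraction of total CPU time spent in the radiation code, $s$ is the speed-up of that code alone, and $S$ is the resulting speed-up of the whole model.

```latex
\[
  S = \frac{1}{(1 - p) + \dfrac{p}{s}}
\]
% Illustrative values only; the abstract gives "more than 60 %" and "4 times", not exact figures.
\[
  p = 0.6,\; s = 4:\quad S = \frac{1}{0.4 + 0.15} \approx 1.82,
  \qquad
  p = 0.6,\; s \to \infty:\quad S \to \frac{1}{0.4} = 2.5
\]
\[
  s = 4,\; S \ge 2 \;\Longleftrightarrow\; 1 - \tfrac{3}{4}\,p \le \tfrac{1}{2}
  \;\Longleftrightarrow\; p \ge \tfrac{2}{3}
\]
```

Under these assumptions, a fourfold radiation speed-up more than doubles the overall speed only if radiation accounts for at least about two thirds of the run time, which is compatible with the stated "more than 60 %". A larger radiation speed-up lowers that threshold, but the overall gain remains bounded by $1/(1-p)$.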
Citation: Hanappe, P., Beurivé, A., Laguzet, F., Steels, L., Bellouin, N., Boucher, O., Yamazaki, Y. H., Aina, T., and Allen, M.: FAMOUS, faster: using parallel computing techniques to accelerate the FAMOUS/HadCM3 climate model with a focus on the radiative transfer algorithm, Geosci. Model Dev., 4, 835-844, doi:10.5194/gmd-4-835-2011, 2011.