Geoscientific Model Development An interactive open-access journal of the European Geosciences Union
Geosci. Model Dev., 11, 2033-2048, 2018
© Author(s) 2018. This work is distributed under
the Creative Commons Attribution 4.0 License.
Methods for assessment of models
04 Jun 2018
Cluster-based analysis of multi-model climate ensembles
Richard Hyde1, Ryan Hossaini1, and Amber A. Leeson2,1
1Lancaster Environment Centre, Lancaster University, Lancaster, LA1 4WA, UK
2Data Science Institute, Lancaster University, Lancaster, LA1 4WA, UK
Abstract. Clustering – the automated grouping of similar data – can provide powerful and unique insight into large and complex data sets, in a fast and computationally efficient manner. While clustering has been used in a variety of fields (from medical image processing to economics), its application within atmospheric science has been fairly limited to date, and the potential benefits of applying advanced clustering techniques to climate data (both model output and observations) have yet to be fully realised. In this paper, we explore the specific application of clustering to a multi-model climate ensemble. We hypothesise that clustering techniques can provide (a) a flexible, data-driven method of testing model–observation agreement and (b) a mechanism with which to identify model development priorities. We focus our analysis on chemistry–climate model (CCM) output of tropospheric ozone – an important greenhouse gas – from the recent Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Tropospheric column ozone from the ACCMIP ensemble was clustered using the Data Density based Clustering (DDC) algorithm. We find that a multi-model mean (MMM) calculated using members of the most-populous cluster identified at each location offers a reduction of up to ∼20 % in the global absolute mean bias between the MMM and an observed satellite-based tropospheric ozone climatology, with respect to a simple, all-model MMM. On a spatial basis, the bias is reduced at ∼62 % of all locations, with the largest bias reductions occurring in the Northern Hemisphere, where ozone concentrations are relatively large. However, the bias is unchanged at 9 % of all locations and increases at 29 %, particularly in the Southern Hemisphere. The latter demonstrates that although cluster-based subsampling acts to remove outlier model data, such data may in fact be closer to observed values in some locations.
We further demonstrate that clustering can provide a viable and useful framework in which to assess and visualise model spread, offering insight into geographical areas of agreement among models and a measure of diversity across an ensemble. Finally, we discuss caveats of the clustering techniques and note that while we have focused on tropospheric ozone, the principles underlying the cluster-based MMMs are applicable to other prognostic variables from climate models.
Citation: Hyde, R., Hossaini, R., and Leeson, A. A.: Cluster-based analysis of multi-model climate ensembles, Geosci. Model Dev., 11, 2033-2048, 2018.
Short summary
Clustering, the automated grouping of similar data, can provide powerful insight into large and complex data sets. We demonstrate the benefits of clustering applied to output from climate model inter-comparison initiatives. We focus on modelled tropospheric ozone from the ACCMIP project. Cluster-based subsampling of the model ensemble can (i) remove outlier data on a grid-cell basis, reducing model–observation bias, and (ii) provide a useful framework in which to investigate and visualise model diversity.
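The idea of a cluster-based multi-model mean can be illustrated with a minimal sketch. It is not the DDC algorithm used in the paper: as a stand-in, models at each grid cell are grouped by a simple one-dimensional gap rule (a new cluster starts wherever neighbouring sorted values differ by more than `radius`). The function names, the `radius` parameter, and the toy ozone columns are all assumptions made for illustration, not values from ACCMIP or the satellite climatology.

```python
import numpy as np


def largest_cluster_mask(values, radius):
    """Boolean mask of the members of the most-populous 1-D cluster.

    Stand-in grouping rule (NOT DDC): sort the values and start a new
    cluster wherever the gap between sorted neighbours exceeds `radius`.
    """
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    gaps = np.diff(values[order])
    # Cluster label of each sorted value: increments at every large gap.
    labels_sorted = np.concatenate(([0], np.cumsum(gaps > radius)))
    counts = np.bincount(labels_sorted)
    best = np.argmax(counts)  # ties resolve to the lowest-valued cluster
    mask = np.zeros(values.size, dtype=bool)
    mask[order[labels_sorted == best]] = True
    return mask


def cluster_based_mmm(ensemble, radius):
    """Per-cell mean over the most-populous cluster.

    ensemble: array of shape (n_models, n_cells).
    """
    ensemble = np.asarray(ensemble, dtype=float)
    mmm = np.empty(ensemble.shape[1])
    for j in range(ensemble.shape[1]):
        column = ensemble[:, j]
        mmm[j] = column[largest_cluster_mask(column, radius)].mean()
    return mmm


# Hypothetical tropospheric column ozone (made-up numbers):
# 5 models (rows) at 2 grid cells (columns), plus made-up "observations".
ensemble = np.array([
    [30.0, 20.0],
    [31.0, 21.0],
    [32.0, 35.0],
    [45.0, 22.0],
    [31.0, 21.0],
])
observed = np.array([31.5, 20.5])

all_model_mmm = ensemble.mean(axis=0)
cluster_mmm = cluster_based_mmm(ensemble, radius=3.0)

# Absolute mean bias of each MMM against the observations.
bias_all = np.abs(all_model_mmm - observed).mean()
bias_cluster = np.abs(cluster_mmm - observed).mean()
print(all_model_mmm, cluster_mmm, bias_all, bias_cluster)
```

In this toy case one outlier model per cell is excluded, so the cluster-based mean sits closer to the made-up observations. As the abstract notes, the opposite can happen where the excluded model is in fact the one nearest to the observed value.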