Identifying weather patterns that frequently lead to extreme weather events
is a crucial first step in understanding how they may vary under different
climate change scenarios. Here, we propose an automated method for
recognizing atmospheric rivers (ARs) in climate data using topological data
analysis and machine learning. The method provides useful information about
topological features (shape characteristics) and statistics of ARs. We
illustrate this method by applying it to output of version 5.1 of the
Community Atmosphere Model (CAM5.1) and the MERRA-2 reanalysis product.
The importance of understanding the behavior of extreme weather events in
a changing climate cannot be overstated. A first step towards this
challenging goal is to identify extreme events in large datasets. Identifying
such events remains an important challenge for the climate science community
for the following reasons:
(i) The identification process is critical in calculating statistics, including the frequency,
location and intensity of extreme weather events under different climate change scenarios. (ii) It is the first step in evaluating how well a climate model captures physical features
of extreme events and characterizing their changes under global warming. (iii) As high-performance computational technology continues to advance, there is an ever-increasing amount
of data from climate model output, reanalysis products and observations that demands
rapid and automated detection and characterization of extreme events.
This study is part of ongoing efforts to provide automated methods that are
able to identify extreme weather and climate events in large climate datasets.
Extreme precipitation events in midlatitudes are often associated with
atmospheric rivers (ARs). Since the early 1990s, there has been a growing
interest in studying ARs
Sample images of atmospheric rivers – long filamentary structures
reaching the west coast of the United States:
The first challenge in extreme event detection is to construct a quantitative
definition of the event
Some recent efforts focus on alternative approaches to characterize and
detect extreme events, such as deep learning methods for pattern recognition
In this paper, we present an alternative approach to AR pattern recognition
based on topological data analysis (TDA)
Data sources (climate model and reanalysis datasets).
The key contributions of this paper are as follows: (i) we propose a novel method to identify ARs that is free from threshold selection, and (ii) we show that the framework of using TDA to extract topological feature descriptors and a ML classifier (SVM) provides high accuracy in recognizing AR patterns in both climate model output and reanalysis datasets across a range of spatial and temporal resolutions.
Block diagram of the two-stage AR pattern recognition method.
The rest of the paper is organized as follows: Sect. 2 describes datasets, the topological feature descriptors of ARs and non-ARs, the TDA algorithm and SVM classifier in more detail; Sect. 3 shows the results obtained with discussion; and Sect. 4 presents conclusions and future work.
In this study, we use both climate model simulation output generated by
version 5.1 of the Community Atmosphere Model (CAM5.1) and the Modern-Era
Retrospective analysis for Research and Applications, version 2 (MERRA-2)
reanalysis product. CAM5.1 data are provided by the Lawrence Berkeley
National Laboratory (LBNL), Berkeley, and the National Energy Research
Scientific Computing Center. MERRA-2 data are provided by the University of
California, San Diego (UCSD), Center for Western Weather and Water Extremes
(CW3E).
The CAM5.1 climate model output is available online. The variable used for
AR identification is the vertically integrated water vapor (IWV): for
CAM5.1, this variable is called TMQ; for the MERRA-2 reanalysis data, it is
called IWV; and in the CF metadata conventions, it is called prw. ML models
tend to perform better when they have more training data.
Training a machine learning classifier, such as an SVM (see Sect.
This subsection describes the two stages of the atmospheric river pattern
recognition method (see Fig.
Topology is the
branch of mathematics studying properties of geometric objects (e.g., 2-D
grids) that are preserved under continuous deformations.
An illustration of the four-connected neighborhood defined on the
latitude–longitude grid in the plane with real coordinates. For example, each
of the nodes
The aim of this stage is to automatically characterize AR and non-AR events
in raw climate data. Most existing methods have been designed to use
thresholds for identification of ARs
An illustration of the connected regions in the superlevel set
(defined in Eq. 2) that are split into three pieces at value
An illustration of finding connected AR regions over a specified
search sector. In this example, the search for ARs is bounded by the latitude
of the Hawaiian Islands (yellow line) and the west coast of North America
(green line).
Climate model output or reanalysis data may be represented as a mapping from
the grid to a set of real values, which in our case is the IWV over
Every node (grid point) has four neighbors in the grid (except boundary
nodes). In terms of point coordinates in the plane, the node at
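As a concrete illustration (this is our sketch, not the authors' released code), the four-connected neighborhood of a node at row i and column j of an n_lat-by-n_lon grid can be written as:

```python
def four_neighbors(i, j, n_lat, n_lon):
    """Return the four-connected (up, down, left, right) neighbors of grid
    node (i, j); boundary nodes have fewer than four neighbors."""
    candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(a, b) for (a, b) in candidates
            if 0 <= a < n_lat and 0 <= b < n_lon]
```

A corner node therefore has two neighbors and an interior node has four, matching the description above.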
Following the threshold-free approach in TDA, the evolution of connected
regions in a superlevel set is monitored at every value
Suppose there are three connected regions
The approach of tracking connected regions discussed above can be
implemented by a TDA algorithm based on the union-find (U-F) data structure.
There are five main operations used in our TDA algorithm: (i) form a new
connected region and add the region to the data structure; (ii) assign the
correct connected region to a given grid point; (iii) check whether the
connected regions intersect a specified geographical location on the grid
(e.g., we examine connected regions that intersect the west coast of North
America and the latitude of the Hawaiian Islands, as shown in Fig. );
(iv) merge two regions containing at least one common node into one new
connected region, as shown in Fig. ; and (v) track the evolution of a
connected region (the number of grid points in it) as IWV is varied.
The extracted information about the evolution of connected regions is
encoded in evolution plots. The plots show the recorded number of grid
points in each region as the value of IWV is systematically decreased, as in
Fig.
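To make the sweep concrete, here is a minimal sketch (our illustration under stated assumptions, not the authors' implementation) of a union-find pass over decreasing IWV thresholds: grid points enter the superlevel set in order of decreasing value, are merged with four-connected neighbors already present, and the size of the largest region is recorded at each threshold as a simple evolution descriptor.

```python
import numpy as np

def evolution_plot(iwv, thresholds):
    """Track connected regions of the superlevel set {IWV >= t} with a
    union-find structure as the threshold t decreases; return the size of
    the largest region at each threshold."""
    n_lat, n_lon = iwv.shape
    parent, size = {}, {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra       # union by size
            size[ra] += size[rb]

    # Visit grid points in order of decreasing IWV (threshold-free sweep).
    order = sorted(((i, j) for i in range(n_lat) for j in range(n_lon)),
                   key=lambda p: -iwv[p])
    sizes, k = [], 0
    for t in thresholds:          # thresholds must be given in decreasing order
        while k < len(order) and iwv[order[k]] >= t:
            p = order[k]
            parent[p], size[p] = p, 1
            i, j = p
            # Merge with four-connected neighbors already in the superlevel set.
            for q in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if q in parent:
                    union(p, q)
            k += 1
        sizes.append(max(size[find(p)] for p in parent) if parent else 0)
    return sizes
```

Because points are only ever added as the threshold decreases, the recorded size of the largest region is non-decreasing along the sweep, which is the qualitative shape of the evolution plots described above.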
Creating an input matrix for the machine learning method: the evolution
plots are mapped into
The support vector machine (SVM) is a widely used machine learning method
for binary classification (recognition) tasks
Assume a training set of instance-label pairs
The penalty parameter of the error term takes only values greater than zero.
An example of linear SVM that finds the optimal hyperplane
For this study, a radial basis function (RBF) kernel is chosen as it has been
shown to achieve the best results in many applications. The RBF is defined as
follows
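The standard RBF kernel is K(x, x') = exp(−γ‖x − x'‖²), where γ > 0 controls the kernel width. A minimal NumPy sketch of the kernel matrix (our illustration; in practice a library implementation such as scikit-learn's SVC with kernel="rbf" would be used) is:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.1):
    """RBF (Gaussian) kernel matrix: K[i, j] = exp(-gamma * ||x_i - y_j||^2)."""
    # Squared Euclidean distances via ||x||^2 - 2 x.y + ||y||^2.
    sq = (np.sum(X**2, axis=1)[:, None]
          - 2.0 * X @ Y.T
          + np.sum(Y**2, axis=1)[None, :])
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negatives from round-off
```

Note that K(x, x) = 1 for every point and the matrix is symmetric and positive semidefinite, which is what allows the SVM to operate implicitly in a high-dimensional feature space.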
In this subsection, we define the evaluation metrics that we use to assess the
reliability of our AR pattern recognition method: classification accuracy
score, confusion matrix, precision score and sensitivity score. Also, we
explain the preprocessing step of the input to the SVM
classifier to address the issues of imbalanced data
Classification accuracy score is the ratio of correct predictions of
ARs to total predictions made by the machine learning classifier (in
percent). Training accuracy is the classification accuracy obtained
by applying the classifier on the training data, while testing
accuracy is the classification accuracy for the testing data. We present the
classification accuracy scores for our method in
Sect.
A confusion matrix (error matrix) is a clear way to present the
classification results of ARs with regard to the testing accuracy of the
machine learning classifier. The matrix has two rows and two columns, as
shown in Table , and reports the number of (i) false positives: cases when the model indicates that an AR exists, when it does not in the ground truth; (ii) false negatives: cases when the model indicates that an AR does not exist, while in fact it does in the ground truth; (iii) true positives: cases when the model indicates that an AR exists, when it does in the ground truth; and (iv) true negatives: cases when the model indicates that an AR does not exist, when it does not in the ground truth.
Precision score is a measure of the classifier's repeatability or
reproducibility of ARs and can be computed using a confusion matrix. The
score is the ratio of true positives to the sum of true
positives and false positives. It is shown in
Table
Sensitivity score is the proportion of actual ARs that are
correctly identified as ARs by the classifier. The score is the ratio of
true positives to the sum of true positives and
false negatives. It is shown in Table
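The three scores defined above follow directly from the confusion-matrix counts. A small self-contained sketch (function name is ours, for illustration) is:

```python
def classification_metrics(y_true, y_pred):
    """Accuracy (percent), precision and sensitivity (recall) for a binary
    AR (1) / non-AR (0) classification, from confusion-matrix counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = 100.0 * (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, sensitivity
```

For example, with ground truth [1, 1, 1, 0, 0, 0] and predictions [1, 1, 0, 0, 0, 1] there are two true positives, one false positive and one false negative, giving precision and sensitivity of 2/3 each.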
Data normalization (standardization) is a way of adjusting measured
values to a common scale (i.e.,
Balancing the data is motivated by the
imbalanced class problem, which is that each class of event (ARs and non-ARs)
is not equally represented in the dataset. This poses a problem because SVMs
tend to overfit the majority class. We circumvent this problem by
resampling
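The two preprocessing steps can be sketched as follows (a minimal illustration, assuming random undersampling of the majority class; function names are ours):

```python
import random

import numpy as np

def standardize(X):
    """Scale each feature column to zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant features
    return (X - mu) / sigma

def undersample(X, y, seed=0):
    """Randomly undersample the majority class so that ARs (label 1) and
    non-ARs (label 0) are equally represented in the training set."""
    rng = random.Random(seed)
    idx0 = [i for i, v in enumerate(y) if v == 0]
    idx1 = [i for i, v in enumerate(y) if v == 1]
    n = min(len(idx0), len(idx1))
    keep = sorted(rng.sample(idx0, n) + rng.sample(idx1, n))
    return X[keep], [y[i] for i in keep]
```

Balancing before training prevents the SVM from trivially favoring the majority class, and standardization keeps the RBF kernel's distance computation from being dominated by large-valued features.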
This section presents results from applying the proposed AR recognition
method on test datasets. The tests have been done on CAM5.1 simulation output
and the MERRA-2 reanalysis product. A summary of the data and their spatial and
temporal resolutions is in Table
TDA provides a unique way of characterizing
weather events in a dataset. Figure
An example of normalized plots of averaged topological feature
descriptors for 200 km spatial resolution and daily temporal resolution of
the CAM5.1 simulation data. Note that the averaged plots for the ARs and
non-ARs are very similar and it is hard to differentiate them by eye.
However, a ML model can be trained to distinguish these two categories of
events, by transforming the data into a high-dimensional space where a unique
hyperplane exists that cleanly separates the two event categories (see
Sect.
Figures
Normalized evolution plots of averaged (red curves) and 100
arbitrarily selected topological feature descriptors of ARs (blue curves;
Normalized evolution plots of averaged (red curves) and 100
arbitrarily selected topological feature descriptors of ARs (blue curves;
The same analyses using topological feature descriptors have been done for
all other datasets listed in Table
We now evaluate the performance and reliability of the proposed AR
recognition method by measuring the classification accuracy (as defined in
Sect.
Training accuracy measures how well the model learns from training data (25 % of dataset), i.e., ground truth data labeled with ARs and non-ARs. Testing accuracy measures how well the method performs on a “held-out” dataset (75 % of dataset).
Classification accuracy score estimated by the SVM classifier for 3-hourly temporal resolution of the CAM5.1 model with three different spatial resolutions. The table also shows the number of snapshots (number of events for both categories: ARs and non-ARs).
Classification accuracy score estimated by the SVM classifier for daily temporal resolution of the CAM5.1 model with three different spatial resolutions. The table also shows the number of snapshots (number of events for both categories: ARs and non-ARs).
Classification accuracy score estimated by the SVM classifier for 3-hourly temporal resolution and 50 km spatial resolution of the MERRA-2 reanalysis. The table also shows the number of snapshots (number of events for both categories: ARs and non-ARs).
Table
Sample images of events from the testing set showing a typical
failure mode of the proposed method: examples of ARs misclassified as non-ARs
(false negatives). The figure shows IWV (
Sample images of events from the testing set showing a typical
failure mode of the proposed method: examples of non-ARs misclassified as ARs
(false positives). The figure shows IWV (
In Table
Confusion matrix of the method for the testing set – the CAM5.1 data (3-hourly, 25 km), which shows the number of correctly classified events (diagonal) and incorrectly classified events (off-diagonal).
Precision and sensitivity scores (described in
Sect.
Table
In summary, the model has consistently high classification accuracy for ARs (77 %–91 %) across a broad set of spatial and temporal resolutions, illustrating that the combination of topological data analysis and machine learning is an effective and efficient threshold-free strategy for detecting ARs in large climate datasets. We note that the ML model is biased by the ground truth data produced by TECA using the threshold-based criteria for AR identification. Characterizing the influence of using different ground truth data is beyond the scope of this study.
In this section, we examine some limitations of the proposed method. We investigate some typical failure modes further by examining snapshots of misclassified events. Then, we use the confusion matrix along with precision and sensitivity scores to quantify how accurately and precisely the model is able to classify events by comparing against ground truth data.
Figure
Figure
In Appendix B, we present confusion matrices of the method for different spatial and temporal resolutions of the CAM5.1 model and MERRA-2 reanalysis product.
Table
In this paper, we propose a novel and automated method for recognizing AR patterns in large climate datasets. The method combines TDA with ML, two powerful tools that remain relatively underused in the climate science community.
We show that the proposed method is reliable, robust and performs well by
testing it on a wide range of spatial and temporal resolutions of CAM5.1
climate model output as well as the MERRA-2 reanalysis product. The ground
truth labels are obtained using TECA
Despite background noise, low-intensity AR signals and the existence of other events within the 2-D snapshots, our method is shown to work well. The method tends to perform better for lower-resolution data; we speculate that this is because high-resolution simulations produce noisier spatial patterns, which confuse the machine learning model more easily than the smoother patterns of low-resolution simulations.
The key advantage of the topological feature descriptors used in this work is that they are threshold-free and succinctly encapsulate the most important topological features of ARs. Because the method is threshold-free (there is no need to determine any threshold criteria for the TDA step), we anticipate that when the spatial resolution of the climate model changes, no parameter retuning is required, unlike in the case of the heuristic criteria used by most other AR-detection methods. An application of this method to different climate change scenarios without any tuning will be explored in future work.
Further, it is a much faster method than, for example, using convolutional
neural networks
In future work, we will test our method on direct observations via satellite
images. We also plan to test the proposed method in different climate
scenarios, in order to test the method's sensitivity to biases in the
training data. Further, we anticipate that the method can be made more robust
by (i) employing a full “persistence” concept from TDA and (ii) training
SVM on ground truth data that are not biased by fixed threshold criteria.
This study shows that the TDA and ML framework could be an effective way to
characterize and identify a wide range of other weather and climate
phenomena, such as blocking events and jet streams. As the TDA step is not
restricted to a 2-D scalar field on a grid, it is also possible to apply it to
higher-dimensional or multivariate fields. A similar TDA-based approach has
successfully been applied to data skeletonization
Source code is available at GitHub:
This Appendix contains additional evolution plots mentioned in
Sect.
Normalized evolution plots of averaged (red curves) and 100
arbitrarily selected topological feature descriptors of ARs (blue curves;
This Appendix includes the rest of the confusion matrices (tables) that were
considered in Sect.
Confusion matrix of the method for the testing set – the MERRA-2 data (3-hourly, 50 km). It shows the number of correctly recognized (the diagonal) and the number of incorrectly classified events (off-diagonal).
Confusion matrix of the method for the testing set – the CAM5.1 data (3-hourly, 100 km). It shows the number of correctly recognized (the diagonal) and the number of incorrectly classified events (off-diagonal).
Confusion matrix of the method for the testing set – the CAM5.1 data (3-hourly, 200 km). It shows the number of correctly recognized (the diagonal) and the number of incorrectly classified events (off-diagonal).
Confusion matrix of the method for the testing set – the CAM5.1 data (daily, 25 km). It shows the number of correctly recognized (the diagonal) and the number of incorrectly classified events (off-diagonal).
Confusion matrix of the method for the testing set – the CAM5.1 data (daily, 100 km). It shows the number of correctly recognized (the diagonal) and the number of incorrectly classified events (off-diagonal).
Confusion matrix of the method for the testing set – the CAM5.1 data (daily, 200 km). It shows the number of correctly recognized (the diagonal) and the number of incorrectly classified events (off-diagonal).
The supplement related to this article is available online at:
GM conceived the method, performed the computations, analyzed the data and wrote the manuscript. KK coordinated the work and assisted in the development of the overall content included in this article. MW assisted in the development of the overall content included in this article and contributed to the interpretation of the results. VK assisted in the development of the overall content included in this article and shared his expertise in topological data analysis. P supervised the work, assisted in the development of the overall content included in this article and shared his expertise in machine learning.
The authors declare that they have no conflict of interest.
This document was prepared as an account of work partially sponsored by the United States Government. While this document is believed to contain correct information, neither the United States Government nor any agency thereof, nor the Regents of the University of California, nor any of their employees, makes any warranty, express or implied, or assumes any legal responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by its trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof, or the Regents of the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof or the Regents of the University of California.
Grzegorz Muszynski and Vitaliy Kurlin would like to acknowledge Intel for supporting the IPCC at the University of Liverpool. We also thank Dmitriy Morozov and Burlen Loring from the Computational Research Division at Lawrence Berkeley National Laboratory for valuable discussions and sharing their expertise on computational mathematics.
Karthik Kashinath was supported by the Intel Big Data Center, and Michael Wehner was supported by the Regional and Global Climate Modeling Program of the Office of Biological and Environmental Research in the Department of Energy Office of Science under contract no. DE-AC02-05CH11231. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the US Department of Energy under contract no. DE-AC02-05CH11231.

Edited by: James Annan
Reviewed by: Soulivanh Thao and one anonymous referee