-
A Convolutional Neural Network-based Ensemble Post-processing with Data Augmentation for Tropical Cyclone Precipitation Forecasts
Authors:
Sing-Wen Chen,
Joyce Juang,
Charlotte Wang,
Hui-Ling Chang,
Jing-Shan Hong,
Chuhsing Kate Hsiao
Abstract:
Heavy precipitation from tropical cyclones (TCs) may result in disasters, such as floods and landslides, leading to substantial economic damage and loss of life. Prediction of TC precipitation based on ensemble post-processing procedures using machine learning (ML) approaches has received considerable attention for its flexibility in modeling and its computational power in managing complex models. However, when applying ML techniques to TC precipitation for a specific area, the available observation data are typically insufficient for comprehensive training, validation, and testing of the ML model, primarily due to the rapid movement of TCs. We propose using a convolutional neural network (CNN), a deep ML model, to leverage the spatial information of precipitation. The proposed model has three distinct features that differentiate it from traditional CNNs applied in meteorology. First, it utilizes data augmentation to alleviate challenges posed by the small sample size. Second, it contains geographical and dynamic variables to account for area-specific features and the relative distance between the study area and the moving TC. Third, it applies unequal weights to accommodate the temporal structure in the training data when calculating the objective function. The proposed CNN-all model is then illustrated with TC Soudelor's impact on Taiwan. Soudelor was the strongest TC of the 2015 Pacific typhoon season. The results show that the inclusion of augmented data and dynamic variables improves the prediction of heavy precipitation. The proposed CNN-all outperforms traditional CNN models, based on the continuous ranked probability skill score (CRPSS), probability plots, and reliability diagrams. The proposed model has the potential to be utilized in a wide range of meteorological studies.
Submitted 15 September, 2024;
originally announced September 2024.
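The abstract above evaluates forecasts with the CRPSS. As a rough illustration only, the sketch below computes the standard empirical CRPS of an ensemble against a single observation and the skill score relative to a reference forecast. The function names `ensemble_crps` and `crpss` are hypothetical, and nothing here is taken from the paper's own code or evaluation setup.

```python
from itertools import product


def ensemble_crps(members, obs):
    """Empirical CRPS of one ensemble forecast against a scalar observation.

    Uses the common estimator: mean |x_i - y| - 0.5 * mean |x_i - x_j|.
    Lower is better; for a single member it reduces to the absolute error.
    """
    m = len(members)
    # Mean absolute error of the members against the observation.
    accuracy = sum(abs(x - obs) for x in members) / m
    # Mean absolute difference over all ordered member pairs (ensemble spread).
    spread = sum(abs(a - b) for a, b in product(members, repeat=2)) / (m * m)
    return accuracy - 0.5 * spread


def crpss(crps_model, crps_reference):
    """Skill score: 1 is perfect, 0 matches the reference, negative is worse."""
    return 1.0 - crps_model / crps_reference


# Illustrative numbers only (assumed, not from the paper).
model_score = ensemble_crps([1.0, 3.0], obs=2.0)   # 0.5
reference_score = ensemble_crps([2.0], obs=3.0)    # 1.0
skill = crpss(model_score, reference_score)        # 0.5
```

In practice the CRPS would be averaged over many grid points and forecast times before forming the skill score; the single-case values here only show the arithmetic.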
-
Multi-criteria Similarity-based Anomaly Detection using Pareto Depth Analysis
Authors:
Ko-Jen Hsiao,
Kevin S. Xu,
Jeff Calder,
Alfred O. Hero III
Abstract:
We consider the problem of identifying patterns in a data set that exhibit anomalous behavior, often referred to as anomaly detection. Similarity-based anomaly detection algorithms detect abnormally large amounts of similarity or dissimilarity, e.g., as measured by nearest neighbor Euclidean distances between a test sample and the training samples. In many application domains there may not exist a single dissimilarity measure that captures all possible anomalous patterns. In such cases, multiple dissimilarity measures can be defined, including non-metric measures, and one can test for anomalies by scalarizing using a non-negative linear combination of them. If the relative importance of the different dissimilarity measures is not known in advance, as in many anomaly detection applications, the anomaly detection algorithm may need to be executed multiple times with different choices of weights in the linear combination. In this paper, we propose a method for similarity-based anomaly detection using a novel multi-criteria dissimilarity measure, the Pareto depth. The proposed Pareto depth analysis (PDA) anomaly detection algorithm uses the concept of Pareto optimality to detect anomalies under multiple criteria without having to run an algorithm multiple times with different choices of weights. The proposed PDA approach is provably better than using linear combinations of the criteria and shows superior performance on experiments with synthetic and real data sets.
Submitted 20 August, 2015;
originally announced August 2015.
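To make the notion of Pareto depth in the abstract above more concrete, here is a minimal sketch that peels successive Pareto fronts from a set of criterion vectors (smaller values assumed better) and records the front index of each vector. This shows only the generic non-dominated sorting step, not the full PDA anomaly score; the names `dominates` and `pareto_depths` are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every criterion and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_depths(points):
    """Return the 1-based Pareto front index (depth) of each point.

    Front 1 holds the non-dominated points; front 2 holds the points that
    become non-dominated once front 1 is removed, and so on.
    """
    depths = [0] * len(points)
    remaining = set(range(len(points)))
    depth = 0
    while remaining:
        depth += 1
        # Points not dominated by any other point still in play.
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in front:
            depths[i] = depth
        remaining -= front
    return depths


# Toy two-criterion dissimilarity vectors (assumed values for illustration).
vectors = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
print(pareto_depths(vectors))  # [1, 1, 1, 2, 3]
```

This quadratic-per-front version is kept simple for readability; faster non-dominated sorting routines exist for larger data sets.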
-
Pareto-depth for Multiple-query Image Retrieval
Authors:
Ko-Jen Hsiao,
Jeff Calder,
Alfred O. Hero III
Abstract:
Most content-based image retrieval systems consider either one single query, or multiple queries that include the same object or represent the same semantic information. In this paper we consider the content-based image retrieval problem for multiple query images corresponding to different image semantics. We propose a novel multiple-query information retrieval algorithm that combines the Pareto front method (PFM) with efficient manifold ranking (EMR). We show that our proposed algorithm outperforms state-of-the-art multiple-query retrieval algorithms on real-world image databases. We attribute this performance improvement to concavity properties of the Pareto fronts, and prove a theoretical result that characterizes the asymptotic concavity of the fronts.
Submitted 20 February, 2014;
originally announced February 2014.
-
Multi-criteria Anomaly Detection using Pareto Depth Analysis
Authors:
Ko-Jen Hsiao,
Kevin S. Xu,
Jeff Calder,
Alfred O. Hero III
Abstract:
We consider the problem of identifying patterns in a data set that exhibit anomalous behavior, often referred to as anomaly detection. In most anomaly detection algorithms, the dissimilarity between data samples is calculated by a single criterion, such as Euclidean distance. However, in many cases there may not exist a single dissimilarity measure that captures all possible anomalous patterns. In such a case, multiple criteria can be defined, and one can test for anomalies by scalarizing the multiple criteria using a linear combination of them. If the importance of the different criteria is not known in advance, the algorithm may need to be executed multiple times with different choices of weights in the linear combination. In this paper, we introduce a novel non-parametric multi-criteria anomaly detection method using Pareto depth analysis (PDA). PDA uses the concept of Pareto optimality to detect anomalies under multiple criteria without having to run an algorithm multiple times with different choices of weights. The proposed PDA approach scales linearly in the number of criteria and is provably better than linear combinations of the criteria.
Submitted 7 January, 2013; v1 submitted 17 October, 2011;
originally announced October 2011.
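One way to see why scalarizing with linear weights can miss structure that Pareto optimality captures is a toy case with a non-convex front. The sketch below, using made-up two-criterion values and a hypothetical `dominates` helper, checks that a Pareto-optimal point is never the minimizer of any sampled linear combination. It illustrates the general motivation only and is not the paper's proof or algorithm.

```python
def dominates(a, b):
    """True if a is no worse than b in every criterion and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Three toy criterion vectors (smaller is better); the middle one sits in a
# non-convex part of the front. Values are assumed for illustration.
points = [(0.0, 4.0), (2.5, 2.5), (4.0, 0.0)]

# Pareto-optimal points: those not dominated by any other point.
pareto = [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Minimizers of w * c1 + (1 - w) * c2 over a grid of weights w in [0, 1].
weights = [k / 100 for k in range(101)]
linear_winners = {
    min(points, key=lambda p: w * p[0] + (1 - w) * p[1]) for w in weights
}

print(pareto)                  # all three points are Pareto-optimal
print(sorted(linear_winners))  # only the two extreme points ever win
```

For every weight, one of the two extreme points scores at most 2.0, while the middle point always scores 2.5, so no choice of linear weights selects it even though nothing dominates it.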