Search | arXiv e-print repository

Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection

Authors: Ali Karami, Thi Kieu Khanh Ho, Narges Armanfard

Abstract: Skeleton-based video anomaly detection (SVAD) is a crucial task in computer vision. Accurately identifying abnormal patterns or events enables operators to promptly detect suspicious activities, thereby enhancing safety. Achieving this demands a comprehensive understanding of human motions, both at body and region levels, while also accounting for the wide variations of performing a single action. However, existing studies fail to simultaneously address these crucial properties. This paper introduces a novel, practical and lightweight framework, namely Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection (GiCiSAD) to overcome the challenges associated with SVAD. GiCiSAD consists of three novel modules: the Graph Attention-based Forecasting module to capture the spatio-temporal dependencies inherent in the data, the Graph-level Jigsaw Puzzle Maker module to distinguish subtle region-level discrepancies between normal and abnormal motions, and the Graph-based Conditional Diffusion model to generate a wide spectrum of human motions. Extensive experiments on four widely used skeleton-based video datasets show that GiCiSAD outperforms existing methods with significantly fewer training parameters, establishing it as the new state-of-the-art.

Submitted 30 August, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: Accepted at the Winter Conference on Applications of Computer Vision (WACV). 17 pages, 6 figures, 6 tables

arXiv:2310.12294 [pdf, other]

Open-Set Multivariate Time-Series Anomaly Detection

Authors: Thomas Lai, Thi Kieu Khanh Ho, Narges Armanfard

Abstract: Numerous methods for time-series anomaly detection (TSAD) have emerged in recent years, most of which are unsupervised and assume that only normal samples are available during the training phase, due to the challenge of obtaining abnormal data in real-world scenarios. Still, limited samples of abnormal data are often available, albeit they are far from representative of all possible anomalies. Supervised methods can be utilized to classify normal and seen anomalies, but they tend to overfit to the seen anomalies present during training, hence, they fail to generalize to unseen anomalies. We propose the first algorithm to address the open-set TSAD problem, called Multivariate Open-Set Time-Series Anomaly Detector (MOSAD), that leverages only a few shots of labeled anomalies during the training phase in order to achieve superior anomaly detection performance compared to both supervised and unsupervised TSAD algorithms. MOSAD is a novel multi-head TSAD framework with a shared representation space and specialized heads, including the Generative head, the Discriminative head, and the Anomaly-Aware Contrastive head. The latter produces a superior representation space for anomaly detection compared to conventional supervised contrastive learning. Extensive experiments on three real-world datasets establish MOSAD as a new state-of-the-art in the TSAD field.

Submitted 7 August, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

Comments: Accepted to ECAI-2024

arXiv:2308.12563 [pdf, other]

Contaminated Multivariate Time-Series Anomaly Detection with Spatio-Temporal Graph Conditional Diffusion Models

Authors: Thi Kieu Khanh Ho, Narges Armanfard

Abstract: Mainstream unsupervised anomaly detection algorithms often excel in academic datasets, yet their real-world performance is restricted due to the controlled experimental conditions involving clean training data. Addressing the challenge of training with noise, a prevalent issue in practical anomaly detection, is frequently overlooked. In a pioneering endeavor, this study delves into the realm of label-level noise within sensory time-series anomaly detection (TSAD). This paper presents a novel and practical end-to-end unsupervised TSAD method for the case where the training data is contaminated with anomalies. The introduced approach, called TSAD-C, has no access to abnormality labels during the training phase. TSAD-C encompasses three core modules: a Decontaminator to rectify anomalies (aka noise) present during training, a Long-range Variable Dependency Modeling module to capture long-term intra- and inter-variable dependencies within the decontaminated data that is considered as a surrogate of the pure normal data, and an Anomaly Scoring module to detect anomalies of all types. Our extensive experiments conducted on four reliable and diverse datasets conclusively demonstrate that TSAD-C surpasses existing methodologies, thus establishing a new state-of-the-art in the TSAD field.

Submitted 9 May, 2025; v1 submitted 24 August, 2023; originally announced August 2023.

Comments: Accepted to The Conference on Uncertainty in Artificial Intelligence (UAI 2025)

arXiv:2302.00058 [pdf, other]

Graph Anomaly Detection in Time Series: A Survey

Authors: Thi Kieu Khanh Ho, Ali Karami, Narges Armanfard

Abstract: With the recent advances in technology, a wide range of systems continue to collect a large amount of data over time and thus generate time series. Time-Series Anomaly Detection (TSAD) is an important task in various time-series applications such as e-commerce, cybersecurity, vehicle maintenance, and healthcare monitoring. However, this task is very challenging as it requires considering both the intra-variable dependency (relationships within a variable over time) and the inter-variable dependency (relationships between multiple variables) existing in time-series data. Recent graph-based approaches have made impressive progress in tackling the challenges of this field. In this survey, we conduct a comprehensive and up-to-date review of TSAD using graphs, referred to as G-TSAD. First, we explore the significant potential of graph representation for time-series data and its contributions to facilitating anomaly detection. Then, we review state-of-the-art graph anomaly detection techniques, mostly leveraging deep learning architectures, in the context of time series. For each method, we discuss its strengths, limitations, and the specific applications where it excels. Finally, we address both the technical and application challenges currently facing the field, and suggest potential future directions for advancing research and improving practical outcomes.

Submitted 29 April, 2025; v1 submitted 31 January, 2023; originally announced February 2023.

Comments: Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)

arXiv:2209.00525 [pdf]

Complexity of Representations in Deep Learning

Authors: Tin Kam Ho

Abstract: Deep neural networks use multiple layers of functions to map an object represented by an input vector progressively to different representations, and with sufficient training, eventually to a single score for each class that is the output of the final decision function. Ideally, in this output space, the objects of different classes achieve maximum separation. Motivated by the need to better understand the inner workings of a deep neural network, we analyze the effectiveness of the learned representations in separating the classes from a data complexity perspective. Using a simple complexity measure, a popular benchmarking task, and a well-known architecture design, we show how the data complexity evolves through the network, how it changes during training, and how it is impacted by the network design and the availability of training samples. We discuss the implications of the observations and the potential for further studies.

Submitted 1 September, 2022; originally announced September 2022.

Journal ref: Proceedings of the 26th International Conference on Pattern Recognition (ICPR 2022), August 21-25, 2022, Montréal, Québec, Canada

arXiv:2208.07448 [pdf, other]

Self-Supervised Learning for Anomalous Channel Detection in EEG Graphs: Application to Seizure Analysis

Authors: Thi Kieu Khanh Ho, Narges Armanfard

Abstract: Electroencephalogram (EEG) signals are effective tools for seizure analysis, where one of the most important challenges is accurate detection of seizure events and of the brain regions in which a seizure happens or initiates. However, all existing machine learning-based algorithms for seizure analysis require access to labeled seizure data, while acquiring labeled data is very labor-intensive, expensive, and clinician-dependent given the subjective nature of the visual qualitative interpretation of EEG signals. In this paper, we propose to detect seizure channels and clips in a self-supervised manner where no access to the seizure data is needed. The proposed method considers local structural and contextual information embedded in EEG graphs by employing positive and negative sub-graphs. We train our method by minimizing contrastive and generative losses. The use of local EEG sub-graphs makes the algorithm an appropriate choice when access to all EEG channels is impossible due to complications such as skull fractures. We conduct an extensive set of experiments on the largest seizure dataset and demonstrate that our proposed framework outperforms the state-of-the-art methods in the EEG-based seizure study. The proposed method is the only study that requires no access to the seizure data in its training phase, yet establishes a new state-of-the-art in the field, and outperforms all related supervised methods.

Submitted 15 January, 2023; v1 submitted 15 August, 2022; originally announced August 2022.

Comments: Accepted at AAAI-23 (Oral)

arXiv:2205.05173 [pdf, other]

doi 10.1016/j.neunet.2024.106106

Self-Supervised Anomaly Detection in Computer Vision and Beyond: A Survey and Outlook

Authors: Hadi Hojjati, Thi Kieu Khanh Ho, Narges Armanfard

Abstract: Anomaly detection (AD) plays a crucial role in various domains, including cybersecurity, finance, and healthcare, by identifying patterns or events that deviate from normal behaviour. In recent years, significant progress has been made in this field due to the remarkable growth of deep learning models. Notably, the advent of self-supervised learning has sparked the development of novel AD algorithms that outperform the existing state-of-the-art approaches by a considerable margin. This paper aims to provide a comprehensive review of the current methodologies in self-supervised anomaly detection. We present technical details of the standard methods and discuss their strengths and drawbacks. We also compare the performance of these models against each other and other state-of-the-art anomaly detection models. Finally, the paper concludes with a discussion of future directions for self-supervised anomaly detection, including the development of more effective and efficient algorithms and the integration of these techniques with other related fields, such as multi-modal learning.

Submitted 23 January, 2024; v1 submitted 10 May, 2022; originally announced May 2022.

Comments: 18 pages, 4 figures, 5 tables

Journal ref: Neural Networks, Volume 172, April 2024, 106106

arXiv:2002.01412 [pdf, other]

Iterative Data Programming for Expanding Text Classification Corpora

Authors: Neil Mallinar, Abhishek Shah, Tin Kam Ho, Rajendra Ugrani, Ayush Gupta

Abstract: Real-world text classification tasks often require many labeled training examples that are expensive to obtain. Recent advancements in machine teaching, specifically the data programming paradigm, facilitate the creation of training data sets quickly via a general framework for building weak models, also known as labeling functions, and denoising them through ensemble learning techniques. We present a fast, simple data programming method for augmenting text data sets by generating neighborhood-based weak models with minimal supervision. Furthermore, our method employs an iterative procedure to identify sparsely distributed examples from large volumes of unlabeled data. The iterative data programming techniques improve newer weak models as more labeled data is confirmed with a human in the loop. We show empirical results on sentence classification tasks, including those from a task of improving intent recognition in conversational agents.

Submitted 4 February, 2020; originally announced February 2020.

Comments: 6 pages, 2 figures, In Proceedings of the AAAI Conference on Artificial Intelligence 2020 (IAAI Technical Track: Emerging Papers)

arXiv:1812.06176 [pdf, other]

Bootstrapping Conversational Agents With Weak Supervision

Authors: Neil Mallinar, Abhishek Shah, Rajendra Ugrani, Ayush Gupta, Manikandan Gurusankar, Tin Kam Ho, Q. Vera Liao, Yunfeng Zhang, Rachel K. E. Bellamy, Robert Yates, Chris Desmarais, Blake McGregor

Abstract: Many conversational agents in the market today follow a standard bot development framework which requires training intent classifiers to recognize user input. The need to create a proper set of training examples is often the bottleneck in the development process. On many occasions agent developers have access to historical chat logs that can provide a good quantity as well as coverage of training examples. However, the cost of labeling them with tens to hundreds of intents often prohibits taking full advantage of these chat logs. In this paper, we present a framework called search, label, and propagate (SLP) for bootstrapping intents from existing chat logs using weak supervision. The framework reduces hours to days of labeling effort down to minutes of work by using a search engine to find examples, then relies on a data programming approach to automatically expand the labels. We report on a user study that shows positive user feedback for this new approach to building conversational agents, and demonstrates the effectiveness of using data programming for auto-labeling. While the system is developed for training conversational agents, the framework has broader application in significantly reducing labeling effort for training text classifiers.

Submitted 14 December, 2018; originally announced December 2018.

Comments: 6 pages, 3 figures, 1 table, Accepted for publication in IAAI 2019

arXiv:1808.03591 [pdf, other]

How Complex is your classification problem? A survey on measuring classification complexity

Authors: Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, Tin K. Ho

Abstract: Characteristics extracted from the training datasets of classification problems have proven to be effective predictors in a number of meta-analyses. Among them, measures of classification complexity can be used to estimate the difficulty in separating the data points into their expected classes. Descriptors of the spatial distribution of the data and estimates of the shape and size of the decision boundary are among the known measures for this characterization. This information can support the formulation of new data-driven pre-processing and pattern recognition techniques, which can in turn be focused on challenges highlighted by such characteristics of the problems. This paper surveys and analyzes measures which can be extracted from the training datasets in order to characterize the complexity of the respective classification problems. Their use in recent literature is also reviewed and discussed, identifying opportunities for future work in the area. Finally, descriptions are given of an R package named Extended Complexity Library (ECoL) that implements a set of complexity measures and is made publicly available.

Submitted 30 December, 2020; v1 submitted 10 August, 2018; originally announced August 2018.

Comments: Survey paper

arXiv:cs/0402021 [pdf, ps, other]

A Numerical Example on the Principles of Stochastic Discrimination

Authors: Tin Kam Ho

Abstract: Studies on ensemble methods for classification suffer from the difficulty of modeling the complementary strengths of the components. Kleinberg's theory of stochastic discrimination (SD) addresses this rigorously via mathematical notions of enrichment, uniformity, and projectability of an ensemble. We explain these concepts via a very simple numerical example that captures the basic principles of the SD theory and method. We focus on a fundamental symmetry in point set covering that is the key observation leading to the foundation of the theory. We believe a better understanding of the SD method will lead to developments of better tools for analyzing other ensemble methods.

Submitted 11 February, 2004; originally announced February 2004.

Comments: Proceedings of the 7th Course on Ensemble Methods for Learning Machines at the International School on Neural Nets "E.R. Caianiello"

ACM Class: I.5.0

arXiv:cs/0402020 [pdf, ps, other]

Geometrical Complexity of Classification Problems

Authors: Tin Kam Ho

Abstract: Despite encouraging recent progress in ensemble approaches, classification methods seem to have reached a plateau in development. Further advances depend on a better understanding of geometrical and topological characteristics of point sets in high-dimensional spaces, the preservation of such characteristics under feature transformations and sampling processes, and their interaction with geometrical models used in classifiers. We discuss an attempt to measure such properties from data sets and relate them to classifier accuracies.

Submitted 11 February, 2004; originally announced February 2004.

Comments: Proceedings of the 7th Course on Ensemble Methods for Learning Machines at the International School on Neural Nets "E.R. Caianiello"

ACM Class: I.5.0

Showing 1–12 of 12 results for author: Ho, T K