-
Vacant Holes for Unsupervised Detection of the Outliers in Compact Latent Representation
Authors:
Misha Glazunov,
Apostolis Zarras
Abstract:
Detection of outliers is pivotal for any machine learning model deployed and operated in the real world. It is essential for Deep Neural Networks, which have been shown to be overconfident on such inputs. Moreover, even deep generative models that allow estimation of the probability density of the input fail at this task. In this work, we concentrate on a specific type of these models: Variational Autoencoders (VAEs). First, we unveil a significant theoretical flaw in the assumption of the classical VAE model. Second, we enforce an accommodating topological property on the image of the deep neural mapping to the latent space, namely compactness, to alleviate the flaw and obtain the means to provably bound the image within determined limits by squeezing both inliers and outliers together. We enforce compactness using two approaches: (i) the Alexandroff extension and (ii) a fixed Lipschitz continuity constant on the mapping of the encoder of the VAE. Finally, and most importantly, we discover that anomalous inputs predominantly tend to land on the vacant latent holes within the compact space, enabling their successful identification. For that reason, we introduce a specifically devised score for hole detection and evaluate the solution against several baseline benchmarks, achieving promising results.
Submitted 16 June, 2023;
originally announced June 2023.
-
Do Bayesian Variational Autoencoders Know What They Don't Know?
Authors:
Misha Glazunov,
Apostolis Zarras
Abstract:
The problem of detecting Out-of-Distribution (OoD) inputs is of paramount importance for Deep Neural Networks. It has been previously shown that even Deep Generative Models that allow estimating the density of the inputs may not be reliable and often tend to make over-confident predictions for OoD inputs, assigning them a higher density than the in-distribution data. This over-confidence in a single model can potentially be mitigated with Bayesian inference over the model parameters, which takes epistemic uncertainty into account. This paper investigates three approaches to Bayesian inference: stochastic gradient Markov chain Monte Carlo, Bayes by Backpropagation, and Stochastic Weight Averaging-Gaussian. The inference is implemented over the weights of the deep neural networks that parameterize the likelihood of the Variational Autoencoder. We empirically evaluate the approaches against several benchmarks that are often used for OoD detection: estimation of the marginal likelihood utilizing a sampled model ensemble, the typicality test, the disagreement score, and the Watanabe-Akaike Information Criterion. Finally, we introduce two simple scores that demonstrate state-of-the-art performance.
Submitted 29 December, 2022;
originally announced December 2022.
-
#MeTooMaastricht: Building a chatbot to assist survivors of sexual harassment
Authors:
Tobias Bauer,
Emre Devrim,
Misha Glazunov,
William Lopez Jaramillo,
Balaganesh Mohan,
Gerasimos Spanakis
Abstract:
Inspired by the recent social movement of #MeToo, we are building a chatbot to assist survivors of sexual harassment cases (designed for the city of Maastricht but easily extended to other cities). The motivation behind this work is twofold: to properly assist survivors of such events by directing them to appropriate institutions that can offer them help, and to increase incident documentation so as to gather more data about harassment cases, which are currently underreported. We break down the problem into three data science/machine learning components: harassment type identification (treated as a classification problem), spatio-temporal information extraction (treated as a Named Entity Recognition problem), and dialogue with the users (treated as a slot-filling based chatbot). We achieve a success rate of more than 98% for identifying whether a case is harassment or not and around 80% for identifying the specific harassment type. Locations and dates are identified with more than 90% accuracy, while time occurrences prove more challenging, at almost 80%. Finally, initial validation of the chatbot shows great potential for the further development and deployment of a tool so beneficial to society as a whole.
Submitted 6 September, 2019;
originally announced September 2019.
-
Foundations of scientific research (Foundations of Research Activities)
Authors:
N. M. Glazunov
Abstract:
From 2008 to 2011 the author gave several courses on Foundations of Scientific Research at the Computer Science Faculty of the National Aviation University in Kiev. This text presents the lecture material for those courses. It consists of 18 sections, and some ideas of the manual can be seen from their titles. These include: General notions about scientific research. Ontologies and upper ontologies. Ontologies of object domains. Examples of Research Activity. Some Notions of the Theory of Finite and Discrete Sets. Algebraic Operations and Algebraic Structures. Elements of the Theory of Graphs and Nets. Scientific activity on the example of Information and its investigation. Scientific research in Artificial Intelligence. Compilers and compilation. Objective, Concepts and History of Computer security. Methodological and categorical apparatus of scientific research. Methodology and methods of scientific research. Scientific idea and significance of scientific research. Forms of scientific knowledge organization and principles of scientific research. Theoretical study, applied study and creativity. Types of scientific research: theoretical study, applied study. Types of scientific research: forms of representation of material. Some sections of the text contain enough material for lectures, but in some cases these are sketches without references to Foundations of Research Activities. This is the first version of the manual, and the author plans to edit, modify, and extend it. Several reasons led the author to post it as an e-print. The author compiled the material from many sources and hopes that it gives various points of view on Foundations of Research Activities.
Submitted 1 December, 2012;
originally announced December 2012.