Gradient-based Discrete Sampling with Automatic Cyclical Scheduling
Authors:
Patrick Pynadath,
Riddhiman Bhattacharya,
Arun Hariharan,
Ruqi Zhang
Abstract:
Discrete distributions, particularly in high-dimensional deep models, are often highly multimodal due to inherent discontinuities. While gradient-based discrete sampling has proven effective, it is susceptible to becoming trapped in local modes due to the gradient information. To tackle this challenge, we propose an automatic cyclical scheduling, designed for efficient and accurate sampling in mul…
▽ More
Discrete distributions, particularly in high-dimensional deep models, are often highly multimodal due to inherent discontinuities. While gradient-based discrete sampling has proven effective, it is susceptible to becoming trapped in local modes due to the gradient information. To tackle this challenge, we propose an automatic cyclical scheduling, designed for efficient and accurate sampling in multimodal discrete distributions. Our method contains three key components: (1) a cyclical step size schedule where large steps discover new modes and small steps exploit each mode; (2) a cyclical balancing schedule, ensuring "balanced" proposals for given step sizes and high efficiency of the Markov chain; and (3) an automatic tuning scheme for adjusting the hyperparameters in the cyclical schedules, allowing adaptability across diverse datasets with minimal tuning. We prove the non-asymptotic convergence and inference guarantee for our method in general discrete distributions. Extensive experiments demonstrate the superiority of our method in sampling complex multimodal discrete distributions.
△ Less
Submitted 24 October, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
CAMLPAD: Cybersecurity Autonomous Machine Learning Platform for Anomaly Detection
Authors:
Ayush Hariharan,
Ankit Gupta,
Trisha Pal
Abstract:
As machine learning and cybersecurity continue to explode in the context of the digital ecosystem, the complexity of cybersecurity data combined with complicated and evasive machine learning algorithms leads to vast difficulties in designing an end to end system for intelligent, automatic anomaly classification. On the other hand, traditional systems use elementary statistics techniques and are of…
▽ More
As machine learning and cybersecurity continue to explode in the context of the digital ecosystem, the complexity of cybersecurity data combined with complicated and evasive machine learning algorithms leads to vast difficulties in designing an end to end system for intelligent, automatic anomaly classification. On the other hand, traditional systems use elementary statistics techniques and are often inaccurate, leading to weak centralized data analysis platforms. In this paper, we propose a novel system that addresses these two problems, titled CAMLPAD, for Cybersecurity Autonomous Machine Learning Platform for Anomaly Detection. The CAMLPAD systems streamlined, holistic approach begins with retrieving a multitude of different species of cybersecurity data in real time using elasticsearch, then running several machine learning algorithms, namely Isolation Forest, Histogram Based Outlier Score (HBOS), Cluster Based Local Outlier Factor (CBLOF), and K Means Clustering, to process the data. Next, the calculated anomalies are visualized using Kibana and are assigned an outlier score, which serves as an indicator for whether an alert should be sent to the system administrator that there are potential anomalies in the network. After comprehensive testing of our platform in a simulated environment, the CAMLPAD system achieved an adjusted rand score of 95 percent, exhibiting the reliable accuracy and precision of the system. All in all, the CAMLPAD system provides an accurate, streamlined approach to real time cybersecurity anomaly detection, delivering a novel solution that has the potential to revolutionize the cybersecurity sector.
△ Less
Submitted 23 July, 2019;
originally announced July 2019.