-
Adaptive Additive Parameter Updates of Vision Transformers for Few-Shot Continual Learning
Authors:
Kyle Stein,
Andrew Arash Mahyari,
Guillermo Francia III,
Eman El-Sheikh
Abstract:
Integrating new class information without losing previously acquired knowledge remains a central challenge in artificial intelligence, often referred to as catastrophic forgetting. Few-shot class incremental learning (FSCIL) addresses this by first training a model on a robust dataset of base classes and then incrementally adapting it in successive sessions using only a few labeled examples per novel class. However, this approach is prone to overfitting on the limited new data, which can compromise overall performance and exacerbate forgetting. In this work, we propose a simple yet effective novel FSCIL framework that leverages a frozen Vision Transformer (ViT) backbone augmented with parameter-efficient additive updates. Our approach freezes the pre-trained ViT parameters and selectively injects trainable weights into the self-attention modules via an additive update mechanism. This design updates only a small subset of parameters to accommodate new classes without sacrificing the representations learned during the base session. By fine-tuning a limited number of parameters, our method preserves the generalizable features in the frozen ViT while reducing the risk of overfitting. Furthermore, as most parameters remain fixed, the model avoids overwriting previously learned knowledge when small novel data batches are introduced. Extensive experiments on benchmark datasets demonstrate that our approach yields state-of-the-art performance compared to baseline FSCIL methods.
Submitted 3 May, 2025; v1 submitted 11 April, 2025;
originally announced April 2025.
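The additive update described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny 2x2 weight matrix, the squared-error loss, and the helper names `effective_weight` and `sgd_step_on_delta` are all assumptions made for illustration. The only idea taken from the abstract is that the pre-trained weights stay frozen while a separate additive term is trained.

```python
from typing import List

Matrix = List[List[float]]


def effective_weight(frozen: Matrix, delta: Matrix) -> Matrix:
    """Return frozen + delta elementwise; the frozen matrix is never modified."""
    return [[w + d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(frozen, delta)]


def matvec(m: Matrix, x: List[float]) -> List[float]:
    """Multiply matrix m by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def sgd_step_on_delta(frozen: Matrix, delta: Matrix, x: List[float],
                      target: List[float], lr: float = 0.1) -> Matrix:
    """One gradient step on 0.5 * ||(frozen + delta) x - target||^2.

    Only delta is updated (hypothetical toy loss, chosen for illustration).
    """
    y = matvec(effective_weight(frozen, delta), x)
    err = [yi - ti for yi, ti in zip(y, target)]
    # Gradient of the loss with respect to delta[i][j] is err[i] * x[j].
    return [[d - lr * err[i] * x[j] for j, d in enumerate(row)]
            for i, row in enumerate(delta)]


frozen = [[1.0, 0.0], [0.0, 1.0]]   # stands in for a pre-trained weight
delta = [[0.0, 0.0], [0.0, 0.0]]    # trainable additive term, starts at zero
x = [1.0, 2.0]
target = [2.0, 2.0]
new_delta = sgd_step_on_delta(frozen, delta, x, target)
```

Because `frozen` is only ever read, whatever it encodes from the base session cannot be overwritten by updates driven by a handful of new samples; all adaptation is confined to `new_delta`.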
-
Transductive One-Shot Learning Meet Subspace Decomposition
Authors:
Kyle Stein,
Andrew A. Mahyari,
Guillermo Francia III,
Eman El-Sheikh
Abstract:
One-shot learning focuses on adapting pretrained models to recognize newly introduced and unseen classes based on a single labeled image. While variations of few-shot and zero-shot learning exist, one-shot learning remains a challenging yet crucial problem due to its ability to generalize knowledge to unseen classes from just one human-annotated image. In this paper, we introduce a transductive one-shot learning approach that employs subspace decomposition to utilize the information from labeled images in the support set and unlabeled images in the query set. These images are decomposed into a linear combination of latent variables representing primitives captured by smaller subspaces. By representing images in the query set as linear combinations of these latent primitives, we can propagate the label from a single image in the support set to query images that share similar combinations of primitives. Through a comprehensive quantitative analysis across various neural network feature extractors and datasets, we demonstrate that our approach can effectively generalize to novel classes from just one labeled image.
Submitted 20 May, 2025; v1 submitted 31 March, 2025;
originally announced April 2025.
-
Visual Adaptive Prompting for Compositional Zero-Shot Learning
Authors:
Kyle Stein,
Arash Mahyari,
Guillermo Francia,
Eman El-Sheikh
Abstract:
Vision-Language Models (VLMs) have demonstrated impressive capabilities in learning joint representations of visual and textual data, making them powerful tools for tasks such as Compositional Zero-Shot Learning (CZSL). CZSL requires models to generalize to novel combinations of visual primitives, such as attributes and objects, that were not explicitly encountered during training. Recent works in prompting for CZSL have focused on modifying inputs for the text encoder, often using static prompts that do not change across varying visual contexts. However, these approaches struggle to fully capture varying visual contexts, as they focus on text adaptation rather than leveraging visual features for compositional reasoning. To address this, we propose the Visual Adaptive Prompting System (VAPS), which leverages a learnable visual prompt repository and a similarity-based retrieval mechanism within the framework of VLMs to bridge the gap between semantic and visual features. Our method introduces a dynamic visual prompt repository mechanism that selects the most relevant attribute and object prompts based on the visual features of the image. Our proposed system includes a visual prompt adapter that encourages the model to learn a more generalizable embedding space. Experiments on three CZSL benchmarks, across both closed and open-world scenarios, demonstrate state-of-the-art results.
Submitted 2 May, 2025; v1 submitted 27 February, 2025;
originally announced February 2025.
-
Proactive Adversarial Defense: Harnessing Prompt Tuning in Vision-Language Models to Detect Unseen Backdoored Images
Authors:
Kyle Stein,
Andrew Arash Mahyari,
Guillermo Francia,
Eman El-Sheikh
Abstract:
Backdoor attacks pose a critical threat by embedding hidden triggers into inputs, causing models to misclassify them into target labels. While extensive research has focused on mitigating these attacks in object recognition models through weight fine-tuning, much less attention has been given to detecting backdoored samples directly. Given the vast datasets used in training, manual inspection for backdoor triggers is impractical, and even state-of-the-art defense mechanisms fail to fully neutralize their impact. To address this gap, we introduce a groundbreaking method to detect unseen backdoored images during both training and inference. Leveraging the transformative success of prompt tuning in Vision Language Models (VLMs), our approach trains learnable text prompts to differentiate clean images from those with hidden backdoor triggers. Experiments demonstrate the exceptional efficacy of this method, achieving an impressive average accuracy of 86% across two renowned datasets for detecting unseen backdoor triggers, establishing a new standard in backdoor defense.
Submitted 7 April, 2025; v1 submitted 11 December, 2024;
originally announced December 2024.
-
Toward a Real-Time Digital Twin Framework for Infection Mitigation During Air Travel
Authors:
Ashok Srinivasan,
Satkkeerthi Sriram,
Sirish Namilae,
Andrew Arash Mahyari
Abstract:
Pedestrian dynamics simulates the fine-scaled trajectories of individuals in a crowd. It has been used to suggest public health interventions to reduce infection risk in important components of air travel, such as during boarding and in airport security lines. Due to inherent variability in human behavior, it is difficult to generalize simulation results to new geographic, cultural, or temporal contexts. A digital twin, relying on real-time data, such as video feeds, can resolve this limitation. This paper addresses the following critical gaps in knowledge required for a digital twin. (1) Pedestrian dynamics models currently lack accurate representations of collision avoidance behavior when two moving pedestrians try to avoid collisions. (2) It is not known whether data assimilation techniques designed for physical systems are effective for pedestrian dynamics. We address the first limitation by training a model with data from offline video feeds of collision avoidance to simulate these trajectories realistically, using symbolic regression to identify unknown functional forms. We address the second limitation by showing that pedestrian dynamics with data assimilation can predict pedestrian trajectories with sufficient accuracy. These results promise to enable the development of a digital twin for pedestrian movement in airports that can help with real-time crowd management to reduce health risks.
Submitted 17 October, 2024;
originally announced October 2024.
-
Packet Inspection Transformer: A Self-Supervised Journey to Unseen Malware Detection with Few Samples
Authors:
Kyle Stein,
Arash Mahyari,
Guillermo Francia III,
Eman El-Sheikh
Abstract:
As networks continue to expand and become more interconnected, the need for novel malware detection methods becomes more pronounced. Traditional security measures are increasingly inadequate against the sophistication of modern cyber attacks. Deep Packet Inspection (DPI) has been pivotal in enhancing network security, offering an in-depth analysis of network traffic that surpasses conventional monitoring techniques. DPI not only examines the metadata of network packets, but also dives into the actual content being carried within the packet payloads, providing a comprehensive view of the data flowing through networks. While the integration of advanced deep learning techniques with DPI has introduced modern methodologies into malware detection and network traffic classification, state-of-the-art supervised learning approaches are limited by their reliance on large amounts of annotated data and their inability to generalize to novel, unseen malware threats. To address these limitations, this paper leverages the recent advancements in self-supervised learning (SSL) and few-shot learning (FSL). Our proposed self-supervised approach trains a transformer via SSL to learn the embedding of packet content, including payload, from vast amounts of unlabeled data by masking portions of packets, leading to a learned representation that generalizes to various downstream tasks. Once the representations are extracted from the packets, they are used to train a malware detection algorithm. The representation obtained from the transformer is then used to adapt the malware detector to novel types of attacks using few-shot learning approaches. Our experimental results demonstrate that our method achieves classification accuracies of up to 94.76% on the UNSW-NB15 dataset and 83.25% on the CIC-IoT23 dataset.
Submitted 21 February, 2025; v1 submitted 26 September, 2024;
originally announced September 2024.
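The abstract above mentions self-supervised training "by masking portions of packets". A minimal sketch of that input-preparation step is given below, under stated assumptions: the byte-level tokenization, the 15% mask ratio, the `MASK_ID` value, and the function name `mask_payload` are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Tuple

MASK_ID = 256      # hypothetical mask token id, outside the byte range 0..255
IGNORE = -1        # hypothetical label for positions that are not masked


def mask_payload(payload: bytes, mask_ratio: float = 0.15,
                 seed: int = 0) -> Tuple[List[int], List[int]]:
    """Replace a random subset of payload bytes with MASK_ID.

    Returns (tokens, labels): tokens is the masked input sequence, and
    labels holds the original byte at each masked position, IGNORE elsewhere.
    """
    rng = random.Random(seed)          # seeded for reproducibility
    tokens = list(payload)             # each byte becomes an integer token
    n_mask = max(1, int(len(tokens) * mask_ratio)) if tokens else 0
    positions = rng.sample(range(len(tokens)), n_mask)
    labels = [IGNORE] * len(tokens)
    for p in positions:
        labels[p] = tokens[p]          # remember what the model must predict
        tokens[p] = MASK_ID            # hide it from the model
    return tokens, labels


original = b"GET /index.html HTTP/1.1"
masked, labels = mask_payload(original)
```

A model trained on such pairs would be asked to predict `labels` at the masked positions from the surrounding bytes in `masked`, which requires no malware annotations.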
-
Towards Novel Malicious Packet Recognition: A Few-Shot Learning Approach
Authors:
Kyle Stein,
Andrew A. Mahyari,
Guillermo Francia III,
Eman El-Sheikh
Abstract:
As the complexity and connectivity of networks increase, the need for novel malware detection approaches becomes imperative. Traditional security defenses are becoming less effective against the advanced tactics of today's cyberattacks. Deep Packet Inspection (DPI) has emerged as a key technology in strengthening network security, offering detailed analysis of network traffic that goes beyond simple metadata analysis. DPI examines not only the packet headers but also the payload content within, offering a thorough insight into the data traversing the network. This study proposes a novel approach that leverages a large language model (LLM) and few-shot learning to accurately recognize novel, unseen malware types with few labeled samples. Our proposed approach uses an LLM pretrained on known malware types to extract the embeddings from packets. The embeddings are then used alongside a few labeled samples of an unseen malware type. This technique is designed to acclimate the model to different malware representations, further enabling it to generate robust embeddings for both trained and unseen classes. Following the extraction of embeddings from the LLM, few-shot learning is utilized to enhance performance with minimal labeled data. Our evaluation, which utilized two renowned datasets, focused on identifying malware types within network traffic and Internet of Things (IoT) environments. Our approach shows promising results with an average accuracy of 86.35% and F1-Score of 86.40% on different malware types across the two datasets.
Submitted 17 September, 2024;
originally announced September 2024.
-
Harnessing the Power of LLMs in Source Code Vulnerability Detection
Authors:
Andrew A Mahyari
Abstract:
Software vulnerabilities, caused by unintentional flaws in source code, are a primary root cause of cyberattacks. Static analysis of source code has been widely used to detect these unintentional defects introduced by software developers. Large Language Models (LLMs) have demonstrated human-like conversational abilities due to their capacity to capture complex patterns in sequential data, such as natural languages. In this paper, we harness LLMs' capabilities to analyze source code and detect known vulnerabilities. To ensure the proposed vulnerability detection method is universal across multiple programming languages, we convert source code to LLVM IR and train LLMs on these intermediate representations. We conduct extensive experiments on various LLM architectures and compare their accuracy. Our comprehensive experiments on real-world and synthetic codes from NVD and SARD demonstrate high accuracy in identifying source code vulnerabilities.
Submitted 6 August, 2024;
originally announced August 2024.
-
A Transformer-Based Framework for Payload Malware Detection and Classification
Authors:
Kyle Stein,
Arash Mahyari,
Guillermo Francia III,
Eman El-Sheikh
Abstract:
As malicious cyber threats become more sophisticated in breaching computer networks, the need for effective intrusion detection systems (IDSs) becomes crucial. Techniques such as Deep Packet Inspection (DPI) have been introduced to allow IDSs to analyze the content of network packets, providing more context for identifying potential threats. IDSs traditionally rely on anomaly-based and signature-based detection techniques to detect unrecognized and suspicious activity. Deep learning techniques have shown great potential in DPI for IDSs due to their efficiency in learning intricate patterns from the packet content being transmitted through the network. In this paper, we propose a revolutionary DPI algorithm based on transformers adapted for the purpose of detecting malicious traffic with a classifier head. Transformers learn the complex content of sequence data and generalize well to similar scenarios thanks to their self-attention mechanism. Our proposed method uses the raw payload bytes that represent the packet contents and is deployed as a man-in-the-middle. The payload bytes are used to detect malicious packets and classify their types. Experimental results on the UNSW-NB15 and CIC-IOT23 datasets demonstrate that our transformer-based model is effective in distinguishing malicious from benign traffic in the test dataset, attaining an average accuracy of 79% using binary classification and 72% on the multi-classification experiment, both using solely payload bytes.
Submitted 26 March, 2024;
originally announced March 2024.
-
A Hierarchical Deep Neural Network for Detecting Lines of Codes with Vulnerabilities
Authors:
Arash Mahyari
Abstract:
Software vulnerabilities, caused by unintentional flaws in source code, are the main root cause of cyberattacks. Source code static analysis has been used extensively to detect the unintentional defects, i.e., vulnerabilities, introduced into source code by software developers. In this paper, we propose a deep learning approach to detect vulnerabilities from their LLVM IR representations based on techniques that have been used in natural language processing. The proposed approach uses a hierarchical process to first identify source code with vulnerabilities, and then it identifies the lines of code that contribute to the vulnerability within the detected source code. This proposed two-step approach reduces false alarms when detecting vulnerable lines. Our extensive experiments on real-world and synthetic code collected in NVD and SARD show high accuracy (about 98%) in detecting source code vulnerabilities.
Submitted 15 November, 2022;
originally announced November 2022.
-
Real-Time Learning from An Expert in Deep Recommendation Systems with Marginal Distance Probability Distribution
Authors:
Arash Mahyari,
Peter Pirolli,
Jacqueline A. LeBlanc
Abstract:
Recommendation systems play an important role in today's digital world. They have found applications in various domains such as music platforms, e.g., Spotify, and movie streaming services, e.g., Netflix. Less research effort has been devoted to physical exercise recommendation systems. Sedentary lifestyles have become the major driver of several diseases as well as healthcare costs. In this paper, we develop a recommendation system that recommends daily exercise activities to users based on their history, profile, and similar users. The developed recommendation system uses a deep recurrent neural network with user-profile attention and temporal attention mechanisms.
Moreover, exercise recommendation systems are significantly different from streaming recommendation systems in that we are not able to collect click feedback from the participants in exercise recommendation systems. Thus, we propose a real-time, expert-in-the-loop active learning procedure. The active learner calculates the uncertainty of the recommender at each time step for each user and asks an expert for a recommendation when the certainty is low. In this paper, we derive the probability distribution function of marginal distance, and use it to determine when to ask experts for feedback. Our experimental results on an mHealth dataset show improved accuracy after incorporating the real-time active learner with the recommendation system.
Submitted 4 April, 2022; v1 submitted 12 October, 2021;
originally announced October 2021.
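The expert-in-the-loop rule in the abstract above (ask an expert when the recommender's certainty is low) can be sketched as follows. This is an illustration under loud assumptions, not the paper's method: the paper derives a probability distribution of the marginal distance to decide when to query, whereas this sketch substitutes a fixed threshold on the gap between the two highest predicted probabilities. The names `margin` and `should_ask_expert` and the threshold 0.2 are hypothetical.

```python
from typing import List


def margin(probs: List[float]) -> float:
    """Gap between the two most probable recommendations."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def should_ask_expert(probs: List[float], threshold: float = 0.2) -> bool:
    """Query the expert when the top two candidates are nearly tied.

    The fixed threshold is an assumption standing in for the
    distribution-based rule described in the abstract.
    """
    return margin(probs) < threshold


uncertain = should_ask_expert([0.40, 0.35, 0.25])   # top two are close
confident = should_ask_expert([0.80, 0.10, 0.10])   # clear winner
```

A small margin means the recommender cannot separate its best two candidates, so an expert label at that time step is likely to be more informative than one at a step where the model is already decisive.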
-
Policy Augmentation: An Exploration Strategy for Faster Convergence of Deep Reinforcement Learning Algorithms
Authors:
Arash Mahyari
Abstract:
Despite advancements in deep reinforcement learning algorithms, developing an effective exploration strategy is still an open problem. Most existing exploration strategies either are based on simple heuristics, or require the model of the environment, or train additional deep neural networks to generate imagination-augmented paths. In this paper, a revolutionary algorithm, called Policy Augmentation, is introduced. Policy Augmentation is based on a newly developed inductive matrix completion method. The proposed algorithm augments the values of unexplored state-action pairs, helping the agent take actions that will result in high-value returns while the agent is in the early episodes. Training deep reinforcement learning algorithms with high-value rollouts leads to the faster convergence of deep reinforcement learning algorithms. Our experiments show the superior performance of Policy Augmentation. The code can be found at: https://github.com/arashmahyari/PolicyAugmentation.
Submitted 9 February, 2021;
originally announced February 2021.
-
Physical Exercise Recommendation and Success Prediction Using Interconnected Recurrent Neural Networks
Authors:
Arash Mahyari,
Peter Pirolli
Abstract:
Unhealthy behaviors, e.g., physical inactivity and unhealthful food choices, are the primary healthcare cost drivers in developed countries. Pervasive computational, sensing, and communication technologies provided by smartphones and smartwatches have made it possible to support individuals in their everyday lives to develop healthier lifestyles. In this paper, we propose an exercise recommendation system that also predicts individual success rates. The system, consisting of two interconnected recurrent neural networks (RNNs), uses the history of workouts to recommend the next workout activity for each individual. The system then predicts the probability of successful completion of the predicted activity by the individual. The prediction accuracy of this interconnected-RNN model is assessed on previously published data from a four-week mobile health experiment and is shown to improve upon previous predictions from a computational cognitive model.
Submitted 27 January, 2021; v1 submitted 1 October, 2020;
originally announced October 2020.
-
Domain Adaptation for Robot Predictive Maintenance Systems
Authors:
Arash Golibagh Mahyari,
Thomas Locker
Abstract:
Industrial robots play an increasingly important role in a growing number of fields. For example, robotics is used to increase productivity while reducing costs in various aspects of manufacturing. Since robots are often set up in production lines, the breakdown of a single robot has a negative impact on the entire process, in the worst case bringing the whole line to a halt until the issue is resolved, leading to substantial financial losses due to the unforeseen downtime. Therefore, predictive maintenance systems based on the internal signals of robots have gained attention as an essential component of robotics service offerings. The main shortcoming of existing predictive maintenance algorithms is that the extracted features typically differ significantly from the learnt model when the operation of the robot changes, incurring false alarms. In order to mitigate this problem, predictive maintenance algorithms require the model to be retrained with normal data of the new operation. In this paper, we propose a novel solution based on transfer learning to pass the knowledge of the trained model from one operation to another in order to prevent the need for retraining and to eliminate such false alarms. The deployment of the proposed unsupervised transfer learning algorithm on real-world datasets demonstrates that the algorithm not only distinguishes between operation changes and mechanical condition changes, but also yields a sharper deviation from the trained model in case of a mechanical condition change and thus detects mechanical issues with higher confidence.
Submitted 24 February, 2020; v1 submitted 23 September, 2018;
originally announced September 2018.
-
Simultaneous Sparse Approximation and Common Component Extraction using Fast Distributed Compressive Sensing
Authors:
Arash Golibagh Mahyari,
Selin Aviyente
Abstract:
Simultaneous sparse approximation is a generalization of the standard sparse approximation, for simultaneously representing a set of signals using a common sparsity model. Generalizing the compressive sensing concept to the simultaneous sparse approximation yields distributed compressive sensing (DCS). DCS finds the sparse representation of multiple correlated signals using the common + innovation signal model. However, DCS is not efficient for joint recovery of a large number of signals since it requires large memory and computational time. In this paper, we propose a new hierarchical algorithm to implement the jointly sparse recovery framework of DCS more efficiently. The proposed algorithm is applied to the video background extraction problem, where the background corresponds to the common sparse activity across frames.
Submitted 10 April, 2016; v1 submitted 10 October, 2015;
originally announced October 2015.
-
Identification of Dynamic functional brain network states Through Tensor Decomposition
Authors:
Arash Golibagh Mahyari,
Selin Aviyente
Abstract:
With the advances in high-resolution neuroimaging, there has been a growing interest in the detection of functional brain connectivity. Complex network theory has been proposed as an attractive mathematical representation of functional brain networks. However, most of the current studies of functional brain networks have focused on the computation of graph theoretic indices for static networks, i.e., long-time averages of connectivity networks. It is well known that functional connectivity is a dynamic process and that the construction and reorganization of the networks are key to understanding human cognition. Therefore, there is a growing need to track dynamic functional brain networks and identify time intervals over which the network is quasi-stationary. In this paper, we present a tensor-decomposition-based method to identify temporally invariant 'network states' and find a common topographic representation for each state. The proposed methods are applied to electroencephalogram (EEG) data during the study of error-related negativity (ERN).
Submitted 1 October, 2014;
originally announced October 2014.