-
ASVSim (AirSim for Surface Vehicles): A High-Fidelity Simulation Framework for Autonomous Surface Vehicle Research
Authors:
Bavo Lesy,
Siemen Herremans,
Robin Kerstens,
Jan Steckel,
Walter Daems,
Siegfried Mercelis,
Ali Anwar
Abstract:
The transport industry has recently shown significant interest in unmanned surface vehicles (USVs), specifically for port and inland waterway transport. These systems can improve operational efficiency and safety, which is especially relevant in the European Union, where initiatives such as the Green Deal are driving a shift towards increased use of inland waterways. At the same time, a shortage o…
▽ More
The transport industry has recently shown significant interest in unmanned surface vehicles (USVs), specifically for port and inland waterway transport. These systems can improve operational efficiency and safety, which is especially relevant in the European Union, where initiatives such as the Green Deal are driving a shift towards increased use of inland waterways. At the same time, a shortage of qualified personnel is accelerating the adoption of autonomous solutions. However, there is a notable lack of open-source, high-fidelity simulation frameworks and datasets for developing and evaluating such solutions. To address these challenges, we introduce AirSim For Surface Vehicles (ASVSim), an open-source simulation framework specifically designed for autonomous shipping research in inland and port environments. The framework combines simulated vessel dynamics with marine sensor simulation capabilities, including radar and camera systems and supports the generation of synthetic datasets for training computer vision models and reinforcement learning agents. Built upon Cosys-AirSim, ASVSim provides a comprehensive platform for developing autonomous navigation algorithms and generating synthetic datasets. The simulator supports research of both traditional control methods and deep learning-based approaches. Through limited experiments, we demonstrate the potential of the simulator in these research areas. ASVSim is provided as an open-source project under the MIT license, making autonomous navigation research accessible to a larger part of the ocean engineering community.
△ Less
Submitted 27 June, 2025;
originally announced June 2025.
-
TDiR: Transformer based Diffusion for Image Restoration Tasks
Authors:
Abbas Anwar,
Mohammad Shullar,
Ali Arshad Nasir,
Mudassir Masood,
Saeed Anwar
Abstract:
Images captured in challenging environments often experience various forms of degradation, including noise, color cast, blur, and light scattering. These effects significantly reduce image quality, hindering their applicability in downstream tasks such as object detection, mapping, and classification. Our transformer-based diffusion model was developed to address image restoration tasks, aiming to…
▽ More
Images captured in challenging environments often experience various forms of degradation, including noise, color cast, blur, and light scattering. These effects significantly reduce image quality, hindering their applicability in downstream tasks such as object detection, mapping, and classification. Our transformer-based diffusion model was developed to address image restoration tasks, aiming to improve the quality of degraded images. This model was evaluated against existing deep learning methodologies across multiple quality metrics for underwater image enhancement, denoising, and deraining on publicly available datasets. Our findings demonstrate that the diffusion model, combined with transformers, surpasses current methods in performance. The results of our model highlight the efficacy of diffusion models and transformers in improving the quality of degraded images, consequently expanding their utility in downstream tasks that require high-fidelity visual data.
△ Less
Submitted 25 June, 2025;
originally announced June 2025.
-
ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations
Authors:
Jiahui Zhang,
Yusen Luo,
Abrar Anwar,
Sumedh Anand Sontakke,
Joseph J Lim,
Jesse Thomason,
Erdem Biyik,
Jesse Zhang
Abstract:
We introduce ReWiND, a framework for learning robot manipulation tasks solely from language instructions without per-task demonstrations. Standard reinforcement learning (RL) and imitation learning methods require expert supervision through human-designed reward functions or demonstrations for every new task. In contrast, ReWiND starts from a small demonstration dataset to learn: (1) a data-effici…
▽ More
We introduce ReWiND, a framework for learning robot manipulation tasks solely from language instructions without per-task demonstrations. Standard reinforcement learning (RL) and imitation learning methods require expert supervision through human-designed reward functions or demonstrations for every new task. In contrast, ReWiND starts from a small demonstration dataset to learn: (1) a data-efficient, language-conditioned reward function that labels the dataset with rewards, and (2) a language-conditioned policy pre-trained with offline RL using these rewards. Given an unseen task variation, ReWiND fine-tunes the pre-trained policy using the learned reward function, requiring minimal online interaction. We show that ReWiND's reward model generalizes effectively to unseen tasks, outperforming baselines by up to 2.4x in reward generalization and policy alignment metrics. Finally, we demonstrate that ReWiND enables sample-efficient adaptation to new tasks, beating baselines by 2x in simulation and improving real-world pretrained bimanual policies by 5x, taking a step towards scalable, real-world robot learning. See website at https://rewind-reward.github.io/.
△ Less
Submitted 16 May, 2025;
originally announced May 2025.
-
Safety Aware Task Planning via Large Language Models in Robotics
Authors:
Azal Ahmad Khan,
Michael Andrev,
Muhammad Ali Murtaza,
Sergio Aguilera,
Rui Zhang,
Jie Ding,
Seth Hutchinson,
Ali Anwar
Abstract:
The integration of large language models (LLMs) into robotic task planning has unlocked better reasoning capabilities for complex, long-horizon workflows. However, ensuring safety in LLM-driven plans remains a critical challenge, as these models often prioritize task completion over risk mitigation. This paper introduces SAFER (Safety-Aware Framework for Execution in Robotics), a multi-LLM framewo…
▽ More
The integration of large language models (LLMs) into robotic task planning has unlocked better reasoning capabilities for complex, long-horizon workflows. However, ensuring safety in LLM-driven plans remains a critical challenge, as these models often prioritize task completion over risk mitigation. This paper introduces SAFER (Safety-Aware Framework for Execution in Robotics), a multi-LLM framework designed to embed safety awareness into robotic task planning. SAFER employs a Safety Agent that operates alongside the primary task planner, providing safety feedback. Additionally, we introduce LLM-as-a-Judge, a novel metric leveraging LLMs as evaluators to quantify safety violations within generated task plans. Our framework integrates safety feedback at multiple stages of execution, enabling real-time risk assessment, proactive error correction, and transparent safety evaluation. We also integrate a control framework using Control Barrier Functions (CBFs) to ensure safety guarantees within SAFER's task planning. We evaluated SAFER against state-of-the-art LLM planners on complex long-horizon tasks involving heterogeneous robotic agents, demonstrating its effectiveness in reducing safety violations while maintaining task efficiency. We also verify the task planner and safety planner through actual hardware experiments involving multiple robots and a human.
△ Less
Submitted 19 March, 2025;
originally announced March 2025.
-
FLStore: Efficient Federated Learning Storage for non-training workloads
Authors:
Ahmad Faraz Khan,
Samuel Fountain,
Ahmed M. Abdelmoniem,
Ali R. Butt,
Ali Anwar
Abstract:
Federated Learning (FL) is an approach for privacy-preserving Machine Learning (ML), enabling model training across multiple clients without centralized data collection. With an aggregator server coordinating training, aggregating model updates, and storing metadata across rounds. In addition to training, a substantial part of FL systems are the non-training workloads such as scheduling, personali…
▽ More
Federated Learning (FL) is an approach for privacy-preserving Machine Learning (ML), enabling model training across multiple clients without centralized data collection. With an aggregator server coordinating training, aggregating model updates, and storing metadata across rounds. In addition to training, a substantial part of FL systems are the non-training workloads such as scheduling, personalization, clustering, debugging, and incentivization. Most existing systems rely on the aggregator to handle non-training workloads and use cloud services for data storage. This results in high latency and increased costs as non-training workloads rely on large volumes of metadata, including weight parameters from client updates, hyperparameters, and aggregated updates across rounds, making the situation even worse. We propose FLStore, a serverless framework for efficient FL non-training workloads and storage. FLStore unifies the data and compute planes on a serverless cache, enabling locality-aware execution via tailored caching policies to reduce latency and costs. Per our evaluations, compared to cloud object store based aggregator server FLStore reduces per request average latency by 71% and costs by 92.45%, with peak improvements of 99.7% and 98.8%, respectively. Compared to an in-memory cloud cache based aggregator server, FLStore reduces average latency by 64.6% and costs by 98.83%, with peak improvements of 98.8% and 99.6%, respectively. FLStore integrates seamlessly with existing FL frameworks with minimal modifications, while also being fault-tolerant and highly scalable.
△ Less
Submitted 28 February, 2025;
originally announced March 2025.
-
LADs: Leveraging LLMs for AI-Driven DevOps
Authors:
Ahmad Faraz Khan,
Azal Ahmad Khan,
Anas Mohamed,
Haider Ali,
Suchithra Moolinti,
Sabaat Haroon,
Usman Tahir,
Mattia Fazzini,
Ali R. Butt,
Ali Anwar
Abstract:
Automating cloud configuration and deployment remains a critical challenge due to evolving infrastructures, heterogeneous hardware, and fluctuating workloads. Existing solutions lack adaptability and require extensive manual tuning, leading to inefficiencies and misconfigurations. We introduce LADs, the first LLM-driven framework designed to tackle these challenges by ensuring robustness, adaptabi…
▽ More
Automating cloud configuration and deployment remains a critical challenge due to evolving infrastructures, heterogeneous hardware, and fluctuating workloads. Existing solutions lack adaptability and require extensive manual tuning, leading to inefficiencies and misconfigurations. We introduce LADs, the first LLM-driven framework designed to tackle these challenges by ensuring robustness, adaptability, and efficiency in automated cloud management. Instead of merely applying existing techniques, LADs provides a principled approach to configuration optimization through in-depth analysis of what optimization works under which conditions. By leveraging Retrieval-Augmented Generation, Few-Shot Learning, Chain-of-Thought, and Feedback-Based Prompt Chaining, LADs generates accurate configurations and learns from deployment failures to iteratively refine system settings. Our findings reveal key insights into the trade-offs between performance, cost, and scalability, helping practitioners determine the right strategies for different deployment scenarios. For instance, we demonstrate how prompt chaining-based adaptive feedback loops enhance fault tolerance in multi-tenant environments and how structured log analysis with example shots improves configuration accuracy. Through extensive evaluations, LADs reduces manual effort, optimizes resource utilization, and improves system reliability. By open-sourcing LADs, we aim to drive further innovation in AI-powered DevOps automation.
△ Less
Submitted 28 February, 2025;
originally announced February 2025.
-
Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing
Authors:
Qi Le,
Enmao Diao,
Ziyan Wang,
Xinran Wang,
Jie Ding,
Li Yang,
Ali Anwar
Abstract:
We introduce Probe Pruning (PP), a novel framework for online, dynamic, structured pruning of Large Language Models (LLMs) applied in a batch-wise manner. PP leverages the insight that not all samples and tokens contribute equally to the model's output, and probing a small portion of each batch effectively identifies crucial weights, enabling tailored dynamic pruning for different batches. It comp…
▽ More
We introduce Probe Pruning (PP), a novel framework for online, dynamic, structured pruning of Large Language Models (LLMs) applied in a batch-wise manner. PP leverages the insight that not all samples and tokens contribute equally to the model's output, and probing a small portion of each batch effectively identifies crucial weights, enabling tailored dynamic pruning for different batches. It comprises three main stages: probing, history-informed pruning, and full inference. In the probing stage, PP selects a small yet crucial set of hidden states, based on residual importance, to run a few model layers ahead. During the history-informed pruning stage, PP strategically integrates the probing states with historical states. Subsequently, it structurally prunes weights based on the integrated states and the PP importance score, a metric developed specifically to assess the importance of each weight channel in maintaining performance. In the final stage, full inference is conducted on the remaining weights. A major advantage of PP is its compatibility with existing models, as it operates without requiring additional neural network modules or fine-tuning. Comprehensive evaluations of PP on LLaMA-2/3 and OPT models reveal that even minimal probing-using just 1.5% of FLOPs-can substantially enhance the efficiency of structured pruning of LLMs. For instance, when evaluated on LLaMA-2-7B with WikiText2, PP achieves a 2.56 times lower ratio of performance degradation per unit of runtime reduction compared to the state-of-the-art method at a 40% pruning ratio. Our code is available at https://github.com/Qi-Le1/Probe_Pruning.
△ Less
Submitted 21 February, 2025;
originally announced February 2025.
-
Efficient Evaluation of Multi-Task Robot Policies With Active Experiment Selection
Authors:
Abrar Anwar,
Rohan Gupta,
Zain Merchant,
Sayan Ghosh,
Willie Neiswanger,
Jesse Thomason
Abstract:
Evaluating learned robot control policies to determine their physical task-level capabilities costs experimenter time and effort. The growing number of policies and tasks exacerbates this issue. It is impractical to test every policy on every task multiple times; each trial requires a manual environment reset, and each task change involves re-arranging objects or even changing robots. Naively sele…
▽ More
Evaluating learned robot control policies to determine their physical task-level capabilities costs experimenter time and effort. The growing number of policies and tasks exacerbates this issue. It is impractical to test every policy on every task multiple times; each trial requires a manual environment reset, and each task change involves re-arranging objects or even changing robots. Naively selecting a random subset of tasks and policies to evaluate is a high-cost solution with unreliable, incomplete results. In this work, we formulate robot evaluation as an active testing problem. We propose to model the distribution of robot performance across all tasks and policies as we sequentially execute experiments. Tasks often share similarities that can reveal potential relationships in policy behavior, and we show that natural language is a useful prior in modeling these relationships between tasks. We then leverage this formulation to reduce the experimenter effort by using a cost-aware expected information gain heuristic to efficiently select informative trials. Our framework accommodates both continuous and discrete performance outcomes. We conduct experiments on existing evaluation data from real robots and simulations. By prioritizing informative trials, our framework reduces the cost of calculating evaluation metrics for robot policies across many tasks.
△ Less
Submitted 13 February, 2025;
originally announced February 2025.
-
M3PT: A Transformer for Multimodal, Multi-Party Social Signal Prediction with Person-aware Blockwise Attention
Authors:
Yiming Tang,
Abrar Anwar,
Jesse Thomason
Abstract:
Understanding social signals in multi-party conversations is important for human-robot interaction and artificial social intelligence. Social signals include body pose, head pose, speech, and context-specific activities like acquiring and taking bites of food when dining. Past work in multi-party interaction tends to build task-specific models for predicting social signals. In this work, we addres…
▽ More
Understanding social signals in multi-party conversations is important for human-robot interaction and artificial social intelligence. Social signals include body pose, head pose, speech, and context-specific activities like acquiring and taking bites of food when dining. Past work in multi-party interaction tends to build task-specific models for predicting social signals. In this work, we address the challenge of predicting multimodal social signals in multi-party settings in a single model. We introduce M3PT, a causal transformer architecture with modality and temporal blockwise attention masking to simultaneously process multiple social cues across multiple participants and their temporal interactions. We train and evaluate M3PT on the Human-Human Commensality Dataset (HHCD), and demonstrate that using multiple modalities improves bite timing and speaking status prediction. Source code: https://github.com/AbrarAnwar/masked-social-signals/.
△ Less
Submitted 2 February, 2025; v1 submitted 23 January, 2025;
originally announced January 2025.
-
RDD4D: 4D Attention-Guided Road Damage Detection And Classification
Authors:
Asma Alkalbani,
Muhammad Saqib,
Ahmed Salim Alrawahi,
Abbas Anwar,
Chandarnath Adak,
Saeed Anwar
Abstract:
Road damage detection and assessment are crucial components of infrastructure maintenance. However, current methods often struggle with detecting multiple types of road damage in a single image, particularly at varying scales. This is due to the lack of road datasets with various damage types having varying scales. To overcome this deficiency, first, we present a novel dataset called Diverse Road…
▽ More
Road damage detection and assessment are crucial components of infrastructure maintenance. However, current methods often struggle with detecting multiple types of road damage in a single image, particularly at varying scales. This is due to the lack of road datasets with various damage types having varying scales. To overcome this deficiency, first, we present a novel dataset called Diverse Road Damage Dataset (DRDD) for road damage detection that captures the diverse road damage types in individual images, addressing a crucial gap in existing datasets. Then, we provide our model, RDD4D, that exploits Attention4D blocks, enabling better feature refinement across multiple scales. The Attention4D module processes feature maps through an attention mechanism combining positional encoding and "Talking Head" components to capture local and global contextual information. In our comprehensive experimental analysis comparing various state-of-the-art models on our proposed, our enhanced model demonstrated superior performance in detecting large-sized road cracks with an Average Precision (AP) of 0.458 and maintained competitive performance with an overall AP of 0.445. Moreover, we also provide results on the CrackTinyNet dataset; our model achieved around a 0.21 increase in performance. The code, model weights, dataset, and our results are available on \href{https://github.com/msaqib17/Road_Damage_Detection}{https://github.com/msaqib17/Road\_Damage\_Detection}.
△ Less
Submitted 6 January, 2025;
originally announced January 2025.
-
A comprehensive review of datasets and deep learning techniques for vision in Unmanned Surface Vehicles
Authors:
Linh Trinh,
Siegfried Mercelis,
Ali Anwar
Abstract:
Unmanned Surface Vehicles (USVs) have emerged as a major platform in maritime operations, capable of supporting a wide range of applications. USVs can help reduce labor costs, increase safety, save energy, and allow for difficult unmanned tasks in harsh maritime environments. With the rapid development of USVs, many vision tasks such as detection and segmentation become increasingly important. Dat…
▽ More
Unmanned Surface Vehicles (USVs) have emerged as a major platform in maritime operations, capable of supporting a wide range of applications. USVs can help reduce labor costs, increase safety, save energy, and allow for difficult unmanned tasks in harsh maritime environments. With the rapid development of USVs, many vision tasks such as detection and segmentation become increasingly important. Datasets play an important role in encouraging and improving the research and development of reliable vision algorithms for USVs. In this regard, a large number of recent studies have focused on the release of vision datasets for USVs. Along with the development of datasets, a variety of deep learning techniques have also been studied, with a focus on USVs. However, there is a lack of a systematic review of recent studies in both datasets and vision techniques to provide a comprehensive picture of the current development of vision on USVs, including limitations and trends. In this study, we provide a comprehensive review of both USV datasets and deep learning techniques for vision tasks. Our review was conducted using a large number of vision datasets from USVs. We elaborate several challenges and potential opportunities for research and development in USV vision based on a thorough analysis of current datasets and deep learning techniques.
△ Less
Submitted 2 December, 2024;
originally announced December 2024.
-
Evaluating Robustness of Reinforcement Learning Algorithms for Autonomous Shipping
Authors:
Bavo Lesy,
Ali Anwar,
Siegfried Mercelis
Abstract:
Recently, there has been growing interest in autonomous shipping due to its potential to improve maritime efficiency and safety. The use of advanced technologies, such as artificial intelligence, can address the current navigational and operational challenges in autonomous shipping. In particular, inland waterway transport (IWT) presents a unique set of challenges, such as crowded waterways and va…
▽ More
Recently, there has been growing interest in autonomous shipping due to its potential to improve maritime efficiency and safety. The use of advanced technologies, such as artificial intelligence, can address the current navigational and operational challenges in autonomous shipping. In particular, inland waterway transport (IWT) presents a unique set of challenges, such as crowded waterways and variable environmental conditions. In such dynamic settings, the reliability and robustness of autonomous shipping solutions are critical factors for ensuring safe operations. This paper examines the robustness of benchmark deep reinforcement learning (RL) algorithms, implemented for IWT within an autonomous shipping simulator, and their ability to generate effective motion planning policies. We demonstrate that a model-free approach can achieve an adequate policy in the simulator, successfully navigating port environments never encountered during training. We focus particularly on Soft-Actor Critic (SAC), which we show to be inherently more robust to environmental disturbances compared to MuZero, a state-of-the-art model-based RL algorithm. In this paper, we take a significant step towards developing robust, applied RL frameworks that can be generalized to various vessel types and navigate complex port- and inland environments and scenarios.
△ Less
Submitted 7 November, 2024;
originally announced November 2024.
-
MAP: Multi-Human-Value Alignment Palette
Authors:
Xinran Wang,
Qi Le,
Ammar Ahmed,
Enmao Diao,
Yi Zhou,
Nathalie Baracaldo,
Jie Ding,
Ali Anwar
Abstract:
Ensuring that generative AI systems align with human values is essential but challenging, especially when considering multiple human values and their potential trade-offs. Since human values can be personalized and dynamically change over time, the desirable levels of value alignment vary across different ethnic groups, industry sectors, and user cohorts. Within existing frameworks, it is hard to…
▽ More
Ensuring that generative AI systems align with human values is essential but challenging, especially when considering multiple human values and their potential trade-offs. Since human values can be personalized and dynamically change over time, the desirable levels of value alignment vary across different ethnic groups, industry sectors, and user cohorts. Within existing frameworks, it is hard to define human values and align AI systems accordingly across different directions simultaneously, such as harmlessness, helpfulness, and positiveness. To address this, we develop a novel, first-principle approach called Multi-Human-Value Alignment Palette (MAP), which navigates the alignment across multiple human values in a structured and reliable way. MAP formulates the alignment problem as an optimization task with user-defined constraints, which define human value targets. It can be efficiently solved via a primal-dual approach, which determines whether a user-defined alignment target is achievable and how to achieve it. We conduct a detailed theoretical analysis of MAP by quantifying the trade-offs between values, the sensitivity to constraints, the fundamental connection between multi-value alignment and sequential alignment, and proving that linear weighted rewards are sufficient for multi-value alignment. Extensive experiments demonstrate MAP's ability to align multiple values in a principled manner while delivering strong empirical performance across various tasks.
△ Less
Submitted 24 October, 2024;
originally announced October 2024.
-
In-Context Ensemble Learning from Pseudo Labels Improves Video-Language Models for Low-Level Workflow Understanding
Authors:
Moucheng Xu,
Evangelos Chatzaroulas,
Luc McCutcheon,
Abdul Ahad,
Hamzah Azeem,
Janusz Marecki,
Ammar Anwar
Abstract:
A Standard Operating Procedure (SOP) defines a low-level, step-by-step written guide for a business software workflow. SOP generation is a crucial step towards automating end-to-end software workflows. Manually creating SOPs can be time-consuming. Recent advancements in large video-language models offer the potential for automating SOP generation by analyzing recordings of human demonstrations. Ho…
▽ More
A Standard Operating Procedure (SOP) defines a low-level, step-by-step written guide for a business software workflow. SOP generation is a crucial step towards automating end-to-end software workflows. Manually creating SOPs can be time-consuming. Recent advancements in large video-language models offer the potential for automating SOP generation by analyzing recordings of human demonstrations. However, current large video-language models face challenges with zero-shot SOP generation. In this work, we first explore in-context learning with video-language models for SOP generation. We then propose an exploration-focused strategy called In-Context Ensemble Learning, to aggregate pseudo labels of multiple possible paths of SOPs. The proposed in-context ensemble learning as well enables the models to learn beyond its context window limit with an implicit consistency regularisation. We report that in-context learning helps video-language models to generate more temporally accurate SOP, and the proposed in-context ensemble learning can consistently enhance the capabilities of the video-language models in SOP generation.
△ Less
Submitted 20 October, 2024; v1 submitted 24 September, 2024;
originally announced September 2024.
-
ReMEmbR: Building and Reasoning Over Long-Horizon Spatio-Temporal Memory for Robot Navigation
Authors:
Abrar Anwar,
John Welsh,
Joydeep Biswas,
Soha Pouya,
Yan Chang
Abstract:
Navigating and understanding complex environments over extended periods of time is a significant challenge for robots. People interacting with the robot may want to ask questions like where something happened, when it occurred, or how long ago it took place, which would require the robot to reason over a long history of their deployment. To address this problem, we introduce a Retrieval-augmented…
▽ More
Navigating and understanding complex environments over extended periods of time is a significant challenge for robots. People interacting with the robot may want to ask questions like where something happened, when it occurred, or how long ago it took place, which would require the robot to reason over a long history of their deployment. To address this problem, we introduce a Retrieval-augmented Memory for Embodied Robots, or ReMEmbR, a system designed for long-horizon video question answering for robot navigation. To evaluate ReMEmbR, we introduce the NaVQA dataset where we annotate spatial, temporal, and descriptive questions to long-horizon robot navigation videos. ReMEmbR employs a structured approach involving a memory building and a querying phase, leveraging temporal information, spatial information, and images to efficiently handle continuously growing robot histories. Our experiments demonstrate that ReMEmbR outperforms LLM and VLM baselines, allowing ReMEmbR to achieve effective long-horizon reasoning with low latency. Additionally, we deploy ReMEmbR on a robot and show that our approach can handle diverse queries. The dataset, code, videos, and other material can be found at the following link: https://nvidia-ai-iot.github.io/remembr
△ Less
Submitted 20 September, 2024;
originally announced September 2024.
-
Towards Scalable Quantum Networks
Authors:
Connor Howe,
Mohsin Aziz,
Ali Anwar
Abstract:
This paper presents a comprehensive study on the scalability challenges and opportunities in quantum communication networks, with the goal of determining parameters that impact networks most as well as the trends that appear when scaling networks. We design simulations of quantum networks comprised of router nodes made up of trapped-ion qubits, separated by quantum repeaters in the form of Bell St…
▽ More
This paper presents a comprehensive study on the scalability challenges and opportunities in quantum communication networks, with the goal of determining parameters that impact networks most as well as the trends that appear when scaling networks. We design simulations of quantum networks comprised of router nodes made up of trapped-ion qubits, separated by quantum repeaters in the form of Bell State Measurement (BSM) nodes. Such networks hold the promise of securely sharing quantum information and enabling high-power distributed quantum computing. Despite the promises, quantum networks encounter scalability issues due to noise and operational errors. Through a modular approach, our research aims to surmount these challenges, focusing on effects from scaling node counts and separation distances while monitoring low-quality communication arising from decoherence effects. We aim to pinpoint the critical features within networks essential for advancing scalable, large-scale quantum computing systems. Our findings underscore the impact of several network parameters on scalability, highlighting a critical insight into the trade-offs between the number of repeaters and the quality of entanglement generated. This paper lays the groundwork for future explorations into optimized quantum network designs and protocols.
△ Less
Submitted 12 September, 2024;
originally announced September 2024.
-
HERL: Tiered Federated Learning with Adaptive Homomorphic Encryption using Reinforcement Learning
Authors:
Jiaxang Tang,
Zeshan Fayyaz,
Mohammad A. Salahuddin,
Raouf Boutaba,
Zhi-Li Zhang,
Ali Anwar
Abstract:
Federated Learning is a well-researched approach for collaboratively training machine learning models across decentralized data while preserving privacy. However, integrating Homomorphic Encryption to ensure data confidentiality introduces significant computational and communication overheads, particularly in heterogeneous environments where clients have varying computational capacities and securi…
▽ More
Federated Learning is a well-researched approach for collaboratively training machine learning models across decentralized data while preserving privacy. However, integrating Homomorphic Encryption to ensure data confidentiality introduces significant computational and communication overheads, particularly in heterogeneous environments where clients have varying computational capacities and security needs. In this paper, we propose HERL, a Reinforcement Learning-based approach that uses Q-Learning to dynamically optimize encryption parameters, specifically the polynomial modulus degree, $N$, and the coefficient modulus, $q$, across different client tiers. Our proposed method involves first profiling and tiering clients according to the chosen clustering approach, followed by dynamically selecting the most suitable encryption parameters using an RL-agent. Experimental results demonstrate that our approach significantly reduces the computational overhead while maintaining utility and a high level of security. Empirical results show that HERL improves utility by 17%, reduces the convergence time by up to 24%, and increases convergence efficiency by up to 30%, with minimal security loss.
△ Less
Submitted 11 September, 2024;
originally announced September 2024.
-
Personalized Federated Learning Techniques: Empirical Analysis
Authors:
Azal Ahmad Khan,
Ahmad Faraz Khan,
Haider Ali,
Ali Anwar
Abstract:
Personalized Federated Learning (pFL) holds immense promise for tailoring machine learning models to individual users while preserving data privacy. However, achieving optimal performance in pFL often requires a careful balancing act between memory overhead costs and model accuracy. This paper delves into the trade-offs inherent in pFL, offering valuable insights for selecting the right algorithms…
▽ More
Personalized Federated Learning (pFL) holds immense promise for tailoring machine learning models to individual users while preserving data privacy. However, achieving optimal performance in pFL often requires a careful balancing act between memory overhead costs and model accuracy. This paper delves into the trade-offs inherent in pFL, offering valuable insights for selecting the right algorithms for diverse real-world scenarios. We empirically evaluate ten prominent pFL techniques across various datasets and data splits, uncovering significant differences in their performance. Our study reveals interesting insights into how pFL methods that utilize personalized (local) aggregation exhibit the fastest convergence due to their efficiency in communication and computation. Conversely, fine-tuning methods face limitations in handling data heterogeneity and potential adversarial attacks while multi-objective learning methods achieve higher accuracy at the cost of additional training and resource consumption. Our study emphasizes the critical role of communication efficiency in scaling pFL, demonstrating how it can significantly affect resource usage in real-world deployments.
△ Less
Submitted 10 September, 2024;
originally announced September 2024.
-
DynamicFL: Federated Learning with Dynamic Communication Resource Allocation
Authors:
Qi Le,
Enmao Diao,
Xinran Wang,
Vahid Tarokh,
Jie Ding,
Ali Anwar
Abstract:
Federated Learning (FL) is a collaborative machine learning framework that allows multiple users to train models utilizing their local data in a distributed manner. However, considerable statistical heterogeneity in local data across devices often leads to suboptimal model performance compared with independently and identically distributed (IID) data scenarios. In this paper, we introduce DynamicF…
▽ More
Federated Learning (FL) is a collaborative machine learning framework that allows multiple users to train models utilizing their local data in a distributed manner. However, considerable statistical heterogeneity in local data across devices often leads to suboptimal model performance compared with independently and identically distributed (IID) data scenarios. In this paper, we introduce DynamicFL, a new FL framework that investigates the trade-offs between global model performance and communication costs for two widely adopted FL methods: Federated Stochastic Gradient Descent (FedSGD) and Federated Averaging (FedAvg). Our approach allocates diverse communication resources to clients based on their data statistical heterogeneity, considering communication resource constraints, and attains substantial performance enhancements compared to uniform communication resource allocation. Notably, our method bridges the gap between FedSGD and FedAvg, providing a flexible framework leveraging communication heterogeneity to address statistical heterogeneity in FL. Through extensive experiments, we demonstrate that DynamicFL surpasses current state-of-the-art methods with up to a 10% increase in model accuracy, demonstrating its adaptability and effectiveness in tackling data statistical heterogeneity challenges.
△ Less
Submitted 8 September, 2024;
originally announced September 2024.
-
Addressing Heterogeneity in Federated Learning: Challenges and Solutions for a Shared Production Environment
Authors:
Tatjana Legler,
Vinit Hegiste,
Ahmed Anwar,
Martin Ruskowski
Abstract:
Federated learning (FL) has emerged as a promising approach to training machine learning models across decentralized data sources while preserving data privacy, particularly in manufacturing and shared production environments. However, the presence of data heterogeneity variations in data distribution, quality, and volume across different or clients and production sites, poses significant challeng…
▽ More
Federated learning (FL) has emerged as a promising approach to training machine learning models across decentralized data sources while preserving data privacy, particularly in manufacturing and shared production environments. However, the presence of data heterogeneity variations in data distribution, quality, and volume across different or clients and production sites, poses significant challenges to the effectiveness and efficiency of FL. This paper provides a comprehensive overview of heterogeneity in FL within the context of manufacturing, detailing the types and sources of heterogeneity, including non-independent and identically distributed (non-IID) data, unbalanced data, variable data quality, and statistical heterogeneity. We discuss the impact of these types of heterogeneity on model training and review current methodologies for mitigating their adverse effects. These methodologies include personalized and customized models, robust aggregation techniques, and client selection techniques. By synthesizing existing research and proposing new strategies, this paper aims to provide insight for effectively managing data heterogeneity in FL, enhancing model robustness, and ensuring fair and efficient training across diverse environments. Future research directions are also identified, highlighting the need for adaptive and scalable solutions to further improve the FL paradigm in the context of Industry 4.0.
△ Less
Submitted 18 August, 2024;
originally announced August 2024.
-
Empathic Responding for Digital Interpersonal Emotion Regulation via Content Recommendation
Authors:
Akriti Verma,
Shama Islam,
Valeh Moghaddam,
Adnan Anwar,
Sharon Horwood
Abstract:
Interpersonal communication plays a key role in managing people's emotions, especially on digital platforms. Studies have shown that people use social media and consume online content to regulate their emotions and find support for rest and recovery. However, these platforms are not designed for emotion regulation, which limits their effectiveness in this regard. To address this issue, we propose…
▽ More
Interpersonal communication plays a key role in managing people's emotions, especially on digital platforms. Studies have shown that people use social media and consume online content to regulate their emotions and find support for rest and recovery. However, these platforms are not designed for emotion regulation, which limits their effectiveness in this regard. To address this issue, we propose an approach to enhance Interpersonal Emotion Regulation (IER) on online platforms through content recommendation. The objective is to empower users to regulate their emotions while actively or passively engaging in online platforms by crafting media content that aligns with IER strategies, particularly empathic responding. The proposed recommendation system is expected to blend system-initiated and user-initiated emotion regulation, paving the way for real-time IER practices on digital media platforms. To assess the efficacy of this approach, a mixed-method research design is used, including the analysis of text-based social media data and a user survey. Digital applications has served as facilitators in this process, given the widespread recognition of digital media applications for Digital Emotion Regulation (DER). The study collects 37.5K instances of user posts and interactions on Reddit over a year to design a Contextual Multi-Armed Bandits (CMAB) based recommendation system using features from user activity and preferences. The experimentation shows that the empathic recommendations generated by the proposed recommendation system are preferred by users over widely accepted ER strategies such as distraction and avoidance.
△ Less
Submitted 5 August, 2024;
originally announced August 2024.
-
FedAD-Bench: A Unified Benchmark for Federated Unsupervised Anomaly Detection in Tabular Data
Authors:
Ahmed Anwar,
Brian Moser,
Dayananda Herurkar,
Federico Raue,
Vinit Hegiste,
Tatjana Legler,
Andreas Dengel
Abstract:
The emergence of federated learning (FL) presents a promising approach to leverage decentralized data while preserving privacy. Furthermore, the combination of FL and anomaly detection is particularly compelling because it allows for detecting rare and critical anomalies (usually also rare in locally gathered data) in sensitive data from multiple sources, such as cybersecurity and healthcare. Howe…
▽ More
The emergence of federated learning (FL) presents a promising approach to leverage decentralized data while preserving privacy. Furthermore, the combination of FL and anomaly detection is particularly compelling because it allows for detecting rare and critical anomalies (usually also rare in locally gathered data) in sensitive data from multiple sources, such as cybersecurity and healthcare. However, benchmarking the performance of anomaly detection methods in FL environments remains an underexplored area. This paper introduces FedAD-Bench, a unified benchmark for evaluating unsupervised anomaly detection algorithms within the context of FL. We systematically analyze and compare the performance of recent deep learning anomaly detection models under federated settings, which were typically assessed solely in centralized settings. FedAD-Bench encompasses diverse datasets and metrics to provide a holistic evaluation. Through extensive experiments, we identify key challenges such as model aggregation inefficiencies and metric unreliability. We present insights into FL's regularization effects, revealing scenarios in which it outperforms centralized approaches due to its inherent ability to mitigate overfitting. Our work aims to establish a standardized benchmark to guide future research and development in federated anomaly detection, promoting reproducibility and fair comparison across studies.
△ Less
Submitted 8 August, 2024;
originally announced August 2024.
-
Enhancing Cognitive Workload Classification Using Integrated LSTM Layers and CNNs for fNIRS Data Analysis
Authors:
Mehshan Ahmed Khan,
Houshyar Asadi,
Mohammad Reza Chalak Qazani,
Adetokunbo Arogbonlo,
Siamak Pedrammehr,
Adnan Anwar,
Asim Bhatti,
Saeid Nahavandi,
Chee Peng Lim
Abstract:
Functional near-infrared spectroscopy (fNIRS) is employed as a non-invasive method to monitor functional brain activation by capturing changes in the concentrations of oxygenated haemoglobin (HbO) and deoxygenated haemo-globin (HbR). Various machine learning classification techniques have been utilized to distinguish cognitive states. However, conventional machine learning methods, although simple…
▽ More
Functional near-infrared spectroscopy (fNIRS) is employed as a non-invasive method to monitor functional brain activation by capturing changes in the concentrations of oxygenated haemoglobin (HbO) and deoxygenated haemo-globin (HbR). Various machine learning classification techniques have been utilized to distinguish cognitive states. However, conventional machine learning methods, although simpler to implement, undergo a complex pre-processing phase before network training and demonstrate reduced accuracy due to inadequate data preprocessing. Additionally, previous research in cog-nitive load assessment using fNIRS has predominantly focused on differ-sizeentiating between two levels of mental workload. These studies mainly aim to classify low and high levels of cognitive load or distinguish between easy and difficult tasks. To address these limitations associated with conven-tional methods, this paper conducts a comprehensive exploration of the im-pact of Long Short-Term Memory (LSTM) layers on the effectiveness of Convolutional Neural Networks (CNNs) within deep learning models. This is to address the issues related to spatial features overfitting and lack of tem-poral dependencies in CNN in the previous studies. By integrating LSTM layers, the model can capture temporal dependencies in the fNIRS data, al-lowing for a more comprehensive understanding of cognitive states. The primary objective is to assess how incorporating LSTM layers enhances the performance of CNNs. The experimental results presented in this paper demonstrate that the integration of LSTM layers with Convolutional layers results in an increase in the accuracy of deep learning models from 97.40% to 97.92%.
△ Less
Submitted 22 July, 2024;
originally announced July 2024.
-
Decentralized Federated Anomaly Detection in Smart Grids: A P2P Gossip Approach
Authors:
Muhammad Akbar Husnoo,
Adnan Anwar,
Md Enamul Haque,
A. N. Mahmood
Abstract:
The increasing security and privacy concerns in the Smart Grid sector have led to a significant demand for robust intrusion detection systems within critical smart grid infrastructure. To address the challenges posed by privacy preservation and decentralized power system zones with distinct data ownership, Federated Learning (FL) has emerged as a promising privacy-preserving solution which facilit…
▽ More
The increasing security and privacy concerns in the Smart Grid sector have led to a significant demand for robust intrusion detection systems within critical smart grid infrastructure. To address the challenges posed by privacy preservation and decentralized power system zones with distinct data ownership, Federated Learning (FL) has emerged as a promising privacy-preserving solution which facilitates collaborative training of attack detection models without necessitating the sharing of raw data. However, FL presents several implementation limitations in the power system domain due to its heavy reliance on a centralized aggregator and the risks of privacy leakage during model update transmission. To overcome these technical bottlenecks, this paper introduces a novel decentralized federated anomaly detection scheme based on two main gossip protocols namely Random Walk and Epidemic. Our findings indicate that the Random Walk protocol exhibits superior performance compared to the Epidemic protocol, highlighting its efficacy in decentralized federated learning environments. Experimental validation of the proposed framework utilizing publicly available industrial control systems datasets demonstrates superior attack detection accuracy while safeguarding data confidentiality and mitigating the impact of communication latency and stragglers. Furthermore, our approach yields a notable 35% improvement in training time compared to conventional FL, underscoring the efficacy and robustness of our decentralized learning method.
△ Less
Submitted 9 January, 2025; v1 submitted 20 July, 2024;
originally announced July 2024.
-
Improving classification of road surface conditions via road area extraction and contrastive learning
Authors:
Linh Trinh,
Ali Anwar,
Siegfried Mercelis
Abstract:
Maintaining roads is crucial to economic growth and citizen well-being because roads are a vital means of transportation. In various countries, the inspection of road surfaces is still done manually, however, to automate it, research interest is now focused on detecting the road surface defects via the visual data. While, previous research has been focused on deep learning methods which tend to pr…
▽ More
Maintaining roads is crucial to economic growth and citizen well-being because roads are a vital means of transportation. In various countries, the inspection of road surfaces is still done manually, however, to automate it, research interest is now focused on detecting the road surface defects via the visual data. While, previous research has been focused on deep learning methods which tend to process the entire image and leads to heavy computational cost. In this study, we focus our attention on improving the classification performance while keeping the computational cost of our solution low. Instead of processing the whole image, we introduce a segmentation model to only focus the downstream classification model to the road surface in the image. Furthermore, we employ contrastive learning during model training to improve the road surface condition classification. Our experiments on the public RTK dataset demonstrate a significant improvement in our proposed method when compared to previous works.
△ Less
Submitted 19 July, 2024;
originally announced July 2024.
-
Data selection method for assessment of autonomous vehicles
Authors:
Linh Trinh,
Ali Anwar,
Siegfried Mercelis
Abstract:
As the popularity of autonomous vehicles has grown, many standards and regulators, such as ISO, NHTSA, and Euro NCAP, require safety validation to ensure a sufficient level of safety before deploying them in the real world. Manufacturers gather a large amount of public road data for this purpose. However, the majority of these validation activities are done manually by humans. Furthermore, the dat…
▽ More
As the popularity of autonomous vehicles has grown, many standards and regulators, such as ISO, NHTSA, and Euro NCAP, require safety validation to ensure a sufficient level of safety before deploying them in the real world. Manufacturers gather a large amount of public road data for this purpose. However, the majority of these validation activities are done manually by humans. Furthermore, the data used to validate each driving feature may differ. As a result, it is essential to have an efficient data selection method that can be used flexibly and dynamically for verification and validation while also accelerating the validation process. In this paper, we present a data selection method that is practical, flexible, and efficient for assessment of autonomous vehicles. Our idea is to optimize the similarity between the metadata distribution of the selected data and a predefined metadata distribution that is expected for validation. Our experiments on the large dataset BDD100K show that our method can perform data selection tasks efficiently. These results demonstrate that our methods are highly reliable and can be used to select appropriate data for the validation of various safety functions.
△ Less
Submitted 28 October, 2024; v1 submitted 16 July, 2024;
originally announced July 2024.
-
Multiple data sources and domain generalization learning method for road surface defect classification
Authors:
Linh Trinh,
Ali Anwar,
Siegfried Mercelis
Abstract:
Roads are an essential mode of transportation, and maintaining them is critical to economic growth and citizen well-being. With the continued advancement of AI, road surface inspection based on camera images has recently been extensively researched and can be performed automatically. However, because almost all of the deep learning methods for detecting road surface defects were optimized for a sp…
▽ More
Roads are an essential mode of transportation, and maintaining them is critical to economic growth and citizen well-being. With the continued advancement of AI, road surface inspection based on camera images has recently been extensively researched and can be performed automatically. However, because almost all of the deep learning methods for detecting road surface defects were optimized for a specific dataset, they are difficult to apply to a new, previously unseen dataset. Furthermore, there is a lack of research on training an efficient model using multiple data sources. In this paper, we propose a method for classifying road surface defects using camera images. In our method, we propose a scheme for dealing with the invariance of multiple data sources while training a model on multiple data sources. Furthermore, we present a domain generalization training algorithm for developing a generalized model that can work with new, completely unseen data sources without requiring model updates. We validate our method using an experiment with six data sources corresponding to six countries from the RDD2022 dataset. The results show that our method can efficiently classify road surface defects on previously unseen data.
△ Less
Submitted 14 July, 2024;
originally announced July 2024.
-
Generating Contextually-Relevant Navigation Instructions for Blind and Low Vision People
Authors:
Zain Merchant,
Abrar Anwar,
Emily Wang,
Souti Chattopadhyay,
Jesse Thomason
Abstract:
Navigating unfamiliar environments presents significant challenges for blind and low-vision (BLV) individuals. In this work, we construct a dataset of images and goals across different scenarios such as searching through kitchens or navigating outdoors. We then investigate how grounded instruction generation methods can provide contextually-relevant navigational guidance to users in these instance…
▽ More
Navigating unfamiliar environments presents significant challenges for blind and low-vision (BLV) individuals. In this work, we construct a dataset of images and goals across different scenarios such as searching through kitchens or navigating outdoors. We then investigate how grounded instruction generation methods can provide contextually-relevant navigational guidance to users in these instances. Through a sighted user study, we demonstrate that large pretrained language models can produce correct and useful instructions perceived as beneficial for BLV users. We also conduct a survey and interview with 4 BLV users and observe useful insights on preferences for different instructions based on the scenario.
△ Less
Submitted 11 July, 2024;
originally announced July 2024.
-
Contrast Sets for Evaluating Language-Guided Robot Policies
Authors:
Abrar Anwar,
Rohan Gupta,
Jesse Thomason
Abstract:
Robot evaluations in language-guided, real world settings are time-consuming and often sample only a small space of potential instructions across complex scenes. In this work, we introduce contrast sets for robotics as an approach to make small, but specific, perturbations to otherwise independent, identically distributed (i.i.d.) test instances. We investigate the relationship between experimente…
▽ More
Robot evaluations in language-guided, real world settings are time-consuming and often sample only a small space of potential instructions across complex scenes. In this work, we introduce contrast sets for robotics as an approach to make small, but specific, perturbations to otherwise independent, identically distributed (i.i.d.) test instances. We investigate the relationship between experimenter effort to carry out an evaluation and the resulting estimated test performance as well as the insights that can be drawn from performance on perturbed instances. We use the relative performance change of different contrast set perturbations to characterize policies at reduced experimenter effort in both a simulated manipulation task and a physical robot vision-and-language navigation task. We encourage the use of contrast set evaluations as a more informative alternative to small scale, i.i.d. demonstrations on physical robots, and as a scalable alternative to industry-scale real world evaluations.
△ Less
Submitted 25 October, 2024; v1 submitted 19 June, 2024;
originally announced June 2024.
-
MedCalc-Bench: Evaluating Large Language Models for Medical Calculations
Authors:
Nikhil Khandekar,
Qiao Jin,
Guangzhi Xiong,
Soren Dunn,
Serina S Applebaum,
Zain Anwar,
Maame Sarfo-Gyamfi,
Conrad W Safranek,
Abid A Anwar,
Andrew Zhang,
Aidan Gilson,
Maxwell B Singer,
Amisha Dave,
Andrew Taylor,
Aidong Zhang,
Qingyu Chen,
Zhiyong Lu
Abstract:
As opposed to evaluating computation and logic-based reasoning, current benchmarks for evaluating large language models (LLMs) in medicine are primarily focused on question-answering involving domain knowledge and descriptive reasoning. While such qualitative capabilities are vital to medical diagnosis, in real-world scenarios, doctors frequently use clinical calculators that follow quantitative e…
▽ More
As opposed to evaluating computation and logic-based reasoning, current benchmarks for evaluating large language models (LLMs) in medicine are primarily focused on question-answering involving domain knowledge and descriptive reasoning. While such qualitative capabilities are vital to medical diagnosis, in real-world scenarios, doctors frequently use clinical calculators that follow quantitative equations and rule-based reasoning paradigms for evidence-based decision support. To this end, we propose MedCalc-Bench, a first-of-its-kind dataset focused on evaluating the medical calculation capability of LLMs. MedCalc-Bench contains an evaluation set of over 1000 manually reviewed instances from 55 different medical calculation tasks. Each instance in MedCalc-Bench consists of a patient note, a question requesting to compute a specific medical value, a ground truth answer, and a step-by-step explanation showing how the answer is obtained. While our evaluation results show the potential of LLMs in this area, none of them are effective enough for clinical settings. Common issues include extracting the incorrect entities, not using the correct equation or rules for a calculation task, or incorrectly performing the arithmetic for the computation. We hope our study highlights the quantitative knowledge and reasoning gaps in LLMs within medical settings, encouraging future improvements of LLMs for various clinical calculation tasks.
△ Less
Submitted 30 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Robust Model-Based Reinforcement Learning with an Adversarial Auxiliary Model
Authors:
Siemen Herremans,
Ali Anwar,
Siegfried Mercelis
Abstract:
Reinforcement learning has demonstrated impressive performance in various challenging problems such as robotics, board games, and classical arcade games. However, its real-world applications can be hindered by the absence of robustness and safety in the learned policies. More specifically, an RL agent that trains in a certain Markov decision process (MDP) often struggles to perform well in nearly…
▽ More
Reinforcement learning has demonstrated impressive performance in various challenging problems such as robotics, board games, and classical arcade games. However, its real-world applications can be hindered by the absence of robustness and safety in the learned policies. More specifically, an RL agent that trains in a certain Markov decision process (MDP) often struggles to perform well in nearly identical MDPs. To address this issue, we employ the framework of Robust MDPs (RMDPs) in a model-based setting and introduce a novel learned transition model. Our method specifically incorporates an auxiliary pessimistic model, updated adversarially, to estimate the worst-case MDP within a Kullback-Leibler uncertainty set. In comparison to several existing works, our work does not impose any additional conditions on the training environment, such as the need for a parametric simulator. To test the effectiveness of the proposed pessimistic model in enhancing policy robustness, we integrate it into a practical RL algorithm, called Robust Model-Based Policy Optimization (RMBPO). Our experimental results indicate a notable improvement in policy robustness on high-dimensional MuJoCo control tasks, with the auxiliary model enhancing the performance of the learned policy in distorted MDPs. We further explore the learned deviation between the proposed auxiliary world model and the nominal model, to examine how pessimism is achieved. By learning a pessimistic world model and demonstrating its role in improving policy robustness, our research contributes towards making (model-based) RL more robust.
△ Less
Submitted 1 July, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
-
Federated Learning for Blind Image Super-Resolution
Authors:
Brian B. Moser,
Ahmed Anwar,
Federico Raue,
Stanislav Frolov,
Andreas Dengel
Abstract:
Traditional blind image SR methods need to model real-world degradations precisely. Consequently, current research struggles with this dilemma by assuming idealized degradations, which leads to limited applicability to actual user data. Moreover, the ideal scenario - training models on data from the targeted user base - presents significant privacy concerns. To address both challenges, we propose…
▽ More
Traditional blind image SR methods need to model real-world degradations precisely. Consequently, current research struggles with this dilemma by assuming idealized degradations, which leads to limited applicability to actual user data. Moreover, the ideal scenario - training models on data from the targeted user base - presents significant privacy concerns. To address both challenges, we propose to fuse image SR with federated learning, allowing real-world degradations to be directly learned from users without invading their privacy. Furthermore, it enables optimization across many devices without data centralization. As this fusion is underexplored, we introduce new benchmarks specifically designed to evaluate new SR methods in this federated setting. By doing so, we employ known degradation modeling techniques from SR research. However, rather than aiming to mirror real degradations, our benchmarks use these degradation models to simulate the variety of degradations found across clients within a distributed user base. This distinction is crucial as it circumvents the need to precisely model real-world degradations, which limits contemporary blind image SR research. Our proposed benchmarks investigate blind image SR under new aspects, namely differently distributed degradation types among users and varying user numbers. We believe new methods tested within these benchmarks will perform more similarly in an application, as the simulated scenario addresses the variety while federated learning enables the training on actual degradations.
△ Less
Submitted 26 April, 2024;
originally announced April 2024.
-
Can a Machine be Conscious? Towards Universal Criteria for Machine Consciousness
Authors:
Nur Aizaan Anwar,
Cosmin Badea
Abstract:
As artificially intelligent systems become more anthropomorphic and pervasive, and their potential impact on humanity more urgent, discussions about the possibility of machine consciousness have significantly intensified, and it is sometimes seen as 'the holy grail'. Many concerns have been voiced about the ramifications of creating an artificial conscious entity. This is compounded by a marked la…
▽ More
As artificially intelligent systems become more anthropomorphic and pervasive, and their potential impact on humanity more urgent, discussions about the possibility of machine consciousness have significantly intensified, and it is sometimes seen as 'the holy grail'. Many concerns have been voiced about the ramifications of creating an artificial conscious entity. This is compounded by a marked lack of consensus around what constitutes consciousness and by an absence of a universal set of criteria for determining consciousness. By going into depth on the foundations and characteristics of consciousness, we propose five criteria for determining whether a machine is conscious, which can also be applied more generally to any entity. This paper aims to serve as a primer and stepping stone for researchers of consciousness, be they in philosophy, computer science, medicine, or any other field, to further pursue this holy grail of philosophy, neuroscience and artificial intelligence.
△ Less
Submitted 30 April, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
Fin-Fed-OD: Federated Outlier Detection on Financial Tabular Data
Authors:
Dayananda Herurkar,
Sebastian Palacio,
Ahmed Anwar,
Joern Hees,
Andreas Dengel
Abstract:
Anomaly detection in real-world scenarios poses challenges due to dynamic and often unknown anomaly distributions, requiring robust methods that operate under an open-world assumption. This challenge is exacerbated in practical settings, where models are employed by private organizations, precluding data sharing due to privacy and competitive concerns. Despite potential benefits, the sharing of an…
▽ More
Anomaly detection in real-world scenarios poses challenges due to dynamic and often unknown anomaly distributions, requiring robust methods that operate under an open-world assumption. This challenge is exacerbated in practical settings, where models are employed by private organizations, precluding data sharing due to privacy and competitive concerns. Despite potential benefits, the sharing of anomaly information across organizations is restricted. This paper addresses the question of enhancing outlier detection within individual organizations without compromising data confidentiality. We propose a novel method leveraging representation learning and federated learning techniques to improve the detection of unknown anomalies. Specifically, our approach utilizes latent representations obtained from client-owned autoencoders to refine the decision boundary of inliers. Notably, only model parameters are shared between organizations, preserving data privacy. The efficacy of our proposed method is evaluated on two standard financial tabular datasets and an image dataset for anomaly detection in a distributed setting. The results demonstrate a strong improvement in the classification of unknown outliers during the inference phase for each organization's model.
△ Less
Submitted 23 April, 2024;
originally announced April 2024.
-
ColA: Collaborative Adaptation with Gradient Learning
Authors:
Enmao Diao,
Qi Le,
Suya Wu,
Xinran Wang,
Ali Anwar,
Jie Ding,
Vahid Tarokh
Abstract:
A primary function of back-propagation is to compute both the gradient of hidden representations and parameters for optimization with gradient descent. Training large models requires high computational costs due to their vast parameter sizes. While Parameter-Efficient Fine-Tuning (PEFT) methods aim to train smaller auxiliary models to save computational space, they still present computational over…
▽ More
A primary function of back-propagation is to compute both the gradient of hidden representations and parameters for optimization with gradient descent. Training large models requires high computational costs due to their vast parameter sizes. While Parameter-Efficient Fine-Tuning (PEFT) methods aim to train smaller auxiliary models to save computational space, they still present computational overheads, especially in Fine-Tuning as a Service (FTaaS) for numerous users. We introduce Collaborative Adaptation (ColA) with Gradient Learning (GL), a parameter-free, model-agnostic fine-tuning approach that decouples the computation of the gradient of hidden representations and parameters. In comparison to PEFT methods, ColA facilitates more cost-effective FTaaS by offloading the computation of the gradient to low-cost devices. We also provide a theoretical analysis of ColA and experimentally demonstrate that ColA can perform on par or better than existing PEFT methods on various benchmarks.
△ Less
Submitted 21 April, 2024;
originally announced April 2024.
-
Training a Vision Language Model as Smartphone Assistant
Authors:
Nicolai Dorka,
Janusz Marecki,
Ammar Anwar
Abstract:
Addressing the challenge of a digital assistant capable of executing a wide array of user tasks, our research focuses on the realm of instruction-based mobile device control. We leverage recent advancements in large language models (LLMs) and present a visual language model (VLM) that can fulfill diverse tasks on mobile devices. Our model functions by interacting solely with the user interface (UI…
▽ More
Addressing the challenge of a digital assistant capable of executing a wide array of user tasks, our research focuses on the realm of instruction-based mobile device control. We leverage recent advancements in large language models (LLMs) and present a visual language model (VLM) that can fulfill diverse tasks on mobile devices. Our model functions by interacting solely with the user interface (UI). It uses the visual input from the device screen and mimics human-like interactions, encompassing gestures such as tapping and swiping. This generality in the input and output space allows our agent to interact with any application on the device. Unlike previous methods, our model operates not only on a single screen image but on vision-language sentences created from sequences of past screenshots along with corresponding actions. Evaluating our method on the challenging Android in the Wild benchmark demonstrates its promising efficacy and potential.
△ Less
Submitted 12 April, 2024;
originally announced April 2024.
-
Augmented Reality based Simulated Data (ARSim) with multi-view consistency for AV perception networks
Authors:
Aqeel Anwar,
Tae Eun Choe,
Zian Wang,
Sanja Fidler,
Minwoo Park
Abstract:
Detecting a diverse range of objects under various driving scenarios is essential for the effectiveness of autonomous driving systems. However, the real-world data collected often lacks the necessary diversity presenting a long-tail distribution. Although synthetic data has been utilized to overcome this issue by generating virtual scenes, it faces hurdles such as a significant domain gap and the…
▽ More
Detecting a diverse range of objects under various driving scenarios is essential for the effectiveness of autonomous driving systems. However, the real-world data collected often lacks the necessary diversity presenting a long-tail distribution. Although synthetic data has been utilized to overcome this issue by generating virtual scenes, it faces hurdles such as a significant domain gap and the substantial efforts required from 3D artists to create realistic environments. To overcome these challenges, we present ARSim, a fully automated, comprehensive, modular framework designed to enhance real multi-view image data with 3D synthetic objects of interest. The proposed method integrates domain adaptation and randomization strategies to address covariate shift between real and simulated data by inferring essential domain attributes from real data and employing simulation-based randomization for other attributes. We construct a simplified virtual scene using real data and strategically place 3D synthetic assets within it. Illumination is achieved by estimating light distribution from multiple images capturing the surroundings of the vehicle. Camera parameters from real data are employed to render synthetic assets in each frame. The resulting augmented multi-view consistent dataset is used to train a multi-camera perception network for autonomous vehicles. Experimental results on various AV perception tasks demonstrate the superior performance of networks trained on the augmented dataset.
△ Less
Submitted 22 March, 2024;
originally announced March 2024.
-
Feel the Bite: Robot-Assisted Inside-Mouth Bite Transfer using Robust Mouth Perception and Physical Interaction-Aware Control
Authors:
Rajat Kumar Jenamani,
Daniel Stabile,
Ziang Liu,
Abrar Anwar,
Katherine Dimitropoulou,
Tapomayukh Bhattacharjee
Abstract:
Robot-assisted feeding can greatly enhance the lives of those with mobility limitations. Modern feeding systems can pick up and position food in front of a care recipient's mouth for a bite. However, many with severe mobility constraints cannot lean forward and need direct inside-mouth food placement. This demands precision, especially for those with restricted mouth openings, and appropriately re…
▽ More
Robot-assisted feeding can greatly enhance the lives of those with mobility limitations. Modern feeding systems can pick up and position food in front of a care recipient's mouth for a bite. However, many with severe mobility constraints cannot lean forward and need direct inside-mouth food placement. This demands precision, especially for those with restricted mouth openings, and appropriately reacting to various physical interactions - incidental contacts as the utensil moves inside, impulsive contacts due to sudden muscle spasms, deliberate tongue maneuvers by the person being fed to guide the utensil, and intentional bites. In this paper, we propose an inside-mouth bite transfer system that addresses these challenges with two key components: a multi-view mouth perception pipeline robust to tool occlusion, and a control mechanism that employs multimodal time-series classification to discern and react to different physical interactions. We demonstrate the efficacy of these individual components through two ablation studies. In a full system evaluation, our system successfully fed 13 care recipients with diverse mobility challenges. Participants consistently emphasized the comfort and safety of our inside-mouth bite transfer system, and gave it high technology acceptance ratings - underscoring its transformative potential in real-world scenarios. Supplementary materials and videos can be found at http://emprise.cs.cornell.edu/bitetransfer/ .
△ Less
Submitted 6 March, 2024;
originally announced March 2024.
-
MeanCache: User-Centric Semantic Caching for LLM Web Services
Authors:
Waris Gill,
Mohamed Elidrisi,
Pallavi Kalapatapu,
Ammar Ahmed,
Ali Anwar,
Muhammad Ali Gulzar
Abstract:
Large Language Models (LLMs) like ChatGPT and Llama have revolutionized natural language processing and search engine dynamics. However, these models incur exceptionally high computational costs. For instance, GPT-3 consists of 175 billion parameters, where inference demands billions of floating-point operations. Caching is a natural solution to reduce LLM inference costs on repeated queries, whic…
▽ More
Large Language Models (LLMs) like ChatGPT and Llama have revolutionized natural language processing and search engine dynamics. However, these models incur exceptionally high computational costs. For instance, GPT-3 consists of 175 billion parameters, where inference demands billions of floating-point operations. Caching is a natural solution to reduce LLM inference costs on repeated queries, which constitute about 31% of the total queries. However, existing caching methods are incapable of finding semantic similarities among LLM queries nor do they operate on contextual queries, leading to unacceptable false hit-and-miss rates. This paper introduces MeanCache, a user-centric semantic cache for LLM-based services that identifies semantically similar queries to determine cache hit or miss. Using MeanCache, the response to a user's semantically similar query can be retrieved from a local cache rather than re-querying the LLM, thus reducing costs, service provider load, and environmental impact. MeanCache leverages Federated Learning (FL) to collaboratively train a query similarity model without violating user privacy. By placing a local cache in each user's device and using FL, MeanCache reduces the latency and costs and enhances model performance, resulting in lower false hit rates. MeanCache also encodes context chains for every cached query, offering a simple yet highly effective mechanism to discern contextual query responses from standalone. Our experiments benchmarked against the state-of-the-art caching method, reveal that MeanCache attains an approximately 17% higher F-score and a 20% increase in precision during semantic cache hit-and-miss decisions while performing even better on contextual queries. It also reduces the storage requirement by 83% and accelerates semantic cache hit-and-miss decisions by 11%.
△ Less
Submitted 7 March, 2025; v1 submitted 5 March, 2024;
originally announced March 2024.
-
Everything You Always Wanted to Know About Storage Compressibility of Pre-Trained ML Models but Were Afraid to Ask
Authors:
Zhaoyuan Su,
Ammar Ahmed,
Zirui Wang,
Ali Anwar,
Yue Cheng
Abstract:
As the number of pre-trained machine learning (ML) models is growing exponentially, data reduction tools are not catching up. Existing data reduction techniques are not specifically designed for pre-trained model (PTM) dataset files. This is largely due to a lack of understanding of the patterns and characteristics of these datasets, especially those relevant to data reduction and compressibility.…
▽ More
As the number of pre-trained machine learning (ML) models is growing exponentially, data reduction tools are not catching up. Existing data reduction techniques are not specifically designed for pre-trained model (PTM) dataset files. This is largely due to a lack of understanding of the patterns and characteristics of these datasets, especially those relevant to data reduction and compressibility.
This paper presents the first, exhaustive analysis to date of PTM datasets on storage compressibility. Our analysis spans different types of data reduction and compression techniques, from hash-based data deduplication, data similarity detection, to dictionary-coding compression. Our analysis explores these techniques at three data granularity levels, from model layers, model chunks, to model parameters. We draw new observations that indicate that modern data reduction tools are not effective when handling PTM datasets. There is a pressing need for new compression methods that take into account PTMs' data characteristics for effective storage reduction.
Motivated by our findings, we design ELF, a simple yet effective, error-bounded, lossy floating-point compression method. ELF transforms floating-point parameters in such a way that the common exponent field of the transformed parameters can be completely eliminated to save storage space. We develop Elves, a compression framework that integrates ELF along with several other data reduction methods. Elves uses the most effective method to compress PTMs that exhibit different patterns. Evaluation shows that Elves achieves an overall compression ratio of $1.52\times$, which is $1.31\times$, $1.32\times$ and $1.29\times$ higher than a general-purpose compressor (zstd), an error-bounded lossy compressor (SZ3), and the uniform model quantization, respectively, with negligible model accuracy loss.
△ Less
Submitted 20 February, 2024;
originally announced February 2024.
-
Decision Theory-Guided Deep Reinforcement Learning for Fast Learning
Authors:
Zelin Wan,
Jin-Hee Cho,
Mu Zhu,
Ahmed H. Anwar,
Charles Kamhoua,
Munindar P. Singh
Abstract:
This paper introduces a novel approach, Decision Theory-guided Deep Reinforcement Learning (DT-guided DRL), to address the inherent cold start problem in DRL. By integrating decision theory principles, DT-guided DRL enhances agents' initial performance and robustness in complex environments, enabling more efficient and reliable convergence during learning. Our investigation encompasses two primary…
▽ More
This paper introduces a novel approach, Decision Theory-guided Deep Reinforcement Learning (DT-guided DRL), to address the inherent cold start problem in DRL. By integrating decision theory principles, DT-guided DRL enhances agents' initial performance and robustness in complex environments, enabling more efficient and reliable convergence during learning. Our investigation encompasses two primary problem contexts: the cart pole and maze navigation challenges. Experimental results demonstrate that the integration of decision theory not only facilitates effective initial guidance for DRL agents but also promotes a more structured and informed exploration strategy, particularly in environments characterized by large and intricate state spaces. The results of experiment demonstrate that DT-guided DRL can provide significantly higher rewards compared to regular DRL. Specifically, during the initial phase of training, the DT-guided DRL yields up to an 184% increase in accumulated reward. Moreover, even after reaching convergence, it maintains a superior performance, ending with up to 53% more reward than standard DRL in large maze problems. DT-guided DRL represents an advancement in mitigating a fundamental challenge of DRL by leveraging functions informed by human (designer) knowledge, setting a foundation for further research in this promising interdisciplinary domain.
△ Less
Submitted 8 February, 2024;
originally announced February 2024.
-
TraceFL: Interpretability-Driven Debugging in Federated Learning via Neuron Provenance
Authors:
Waris Gill,
Ali Anwar,
Muhammad Ali Gulzar
Abstract:
In Federated Learning, clients train models on local data and send updates to a central server, which aggregates them into a global model using a fusion algorithm. This collaborative yet privacy-preserving training comes at a cost. FL developers face significant challenges in attributing global model predictions to specific clients. Localizing responsible clients is a crucial step towards (a) excl…
▽ More
In Federated Learning, clients train models on local data and send updates to a central server, which aggregates them into a global model using a fusion algorithm. This collaborative yet privacy-preserving training comes at a cost. FL developers face significant challenges in attributing global model predictions to specific clients. Localizing responsible clients is a crucial step towards (a) excluding clients primarily responsible for incorrect predictions and (b) encouraging clients who contributed high-quality models to continue participating in the future. Existing ML debugging approaches are inherently inapplicable as they are designed for single-model, centralized training.
We introduce TraceFL, a fine-grained neuron provenance capturing mechanism that identifies clients responsible for a global model's prediction by tracking the flow of information from individual clients to the global model. Since inference on different inputs activates a different set of neurons of the global model, TraceFL dynamically quantifies the significance of the global model's neurons in a given prediction, identifying the most crucial neurons in the global model. It then maps them to the corresponding neurons in every participating client to determine each client's contribution, ultimately localizing the responsible client. We evaluate TraceFL on six datasets, including two real-world medical imaging datasets and four neural networks, including advanced models such as GPT. TraceFL achieves 99% accuracy in localizing the responsible client in FL tasks spanning both image and text classification tasks. At a time when state-of-the-artML debugging approaches are mostly domain-specific (e.g., image classification only), TraceFL is the first technique to enable highly accurate automated reasoning across a wide range of FL applications.
△ Less
Submitted 17 January, 2025; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Autonomous Port Navigation With Ranging Sensors Using Model-Based Reinforcement Learning
Authors:
Siemen Herremans,
Ali Anwar,
Arne Troch,
Ian Ravijts,
Maarten Vangeneugden,
Siegfried Mercelis,
Peter Hellinckx
Abstract:
Autonomous shipping has recently gained much interest in the research community. However, little research focuses on inland - and port navigation, even though this is identified by countries such as Belgium and the Netherlands as an essential step towards a sustainable future. These environments pose unique challenges, since they can contain dynamic obstacles that do not broadcast their location,…
▽ More
Autonomous shipping has recently gained much interest in the research community. However, little research focuses on inland - and port navigation, even though this is identified by countries such as Belgium and the Netherlands as an essential step towards a sustainable future. These environments pose unique challenges, since they can contain dynamic obstacles that do not broadcast their location, such as small vessels, kayaks or buoys. Therefore, this research proposes a navigational algorithm which can navigate an inland vessel in a wide variety of complex port scenarios using ranging sensors to observe the environment. The proposed methodology is based on a machine learning approach that has recently set benchmark results in various domains: model-based reinforcement learning. By randomizing the port environments during training, the trained model can navigate in scenarios that it never encountered during training. Furthermore, results show that our approach outperforms the commonly used dynamic window approach and a benchmark model-free reinforcement learning algorithm. This work is therefore a significant step towards vessels that can navigate autonomously in complex port scenarios.
△ Less
Submitted 17 November, 2023;
originally announced December 2023.
-
SeaDSC: A video-based unsupervised method for dynamic scene change detection in unmanned surface vehicles
Authors:
Linh Trinh,
Ali Anwar,
Siegfried Mercelis
Abstract:
Recently, there has been an upsurge in the research on maritime vision, where a lot of works are influenced by the application of computer vision for Unmanned Surface Vehicles (USVs). Various sensor modalities such as camera, radar, and lidar have been used to perform tasks such as object detection, segmentation, object tracking, and motion planning. A large subset of this research is focused on t…
▽ More
Recently, there has been an upsurge in the research on maritime vision, where a lot of works are influenced by the application of computer vision for Unmanned Surface Vehicles (USVs). Various sensor modalities such as camera, radar, and lidar have been used to perform tasks such as object detection, segmentation, object tracking, and motion planning. A large subset of this research is focused on the video analysis, since most of the current vessel fleets contain the camera's onboard for various surveillance tasks. Due to the vast abundance of the video data, video scene change detection is an initial and crucial stage for scene understanding of USVs. This paper outlines our approach to detect dynamic scene changes in USVs. To the best of our understanding, this work represents the first investigation of scene change detection in the maritime vision application. Our objective is to identify significant changes in the dynamic scenes of maritime video data, particularly those scenes that exhibit a high degree of resemblance. In our system for dynamic scene change detection, we propose completely unsupervised learning method. In contrast to earlier studies, we utilize a modified cutting-edge generative picture model called VQ-VAE-2 to train on multiple marine datasets, aiming to enhance the feature extraction. Next, we introduce our innovative similarity scoring technique for directly calculating the level of similarity in a sequence of consecutive frames by utilizing grid calculation on retrieved features. The experiments were conducted using a nautical video dataset called RoboWhaler to showcase the efficient performance of our technique.
△ Less
Submitted 20 November, 2023;
originally announced November 2023.
-
Safety Aware Autonomous Path Planning Using Model Predictive Reinforcement Learning for Inland Waterways
Authors:
Astrid Vanneste,
Simon Vanneste,
Olivier Vasseur,
Robin Janssens,
Mattias Billast,
Ali Anwar,
Kevin Mets,
Tom De Schepper,
Siegfried Mercelis,
Peter Hellinckx
Abstract:
In recent years, interest in autonomous shipping in urban waterways has increased significantly due to the trend of keeping cars and trucks out of city centers. Classical approaches such as Frenet frame based planning and potential field navigation often require tuning of many configuration parameters and sometimes even require a different configuration depending on the situation. In this paper, w…
▽ More
In recent years, interest in autonomous shipping in urban waterways has increased significantly due to the trend of keeping cars and trucks out of city centers. Classical approaches such as Frenet frame based planning and potential field navigation often require tuning of many configuration parameters and sometimes even require a different configuration depending on the situation. In this paper, we propose a novel path planning approach based on reinforcement learning called Model Predictive Reinforcement Learning (MPRL). MPRL calculates a series of waypoints for the vessel to follow. The environment is represented as an occupancy grid map, allowing us to deal with any shape of waterway and any number and shape of obstacles. We demonstrate our approach on two scenarios and compare the resulting path with path planning using a Frenet frame and path planning based on a proximal policy optimization (PPO) agent. Our results show that MPRL outperforms both baselines in both test scenarios. The PPO based approach was not able to reach the goal in either scenario while the Frenet frame approach failed in the scenario consisting of a corner with obstacles. MPRL was able to safely (collision free) navigate to the goal in both of the test scenarios.
△ Less
Submitted 16 November, 2023;
originally announced November 2023.
-
Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding
Authors:
Chancharik Mitra,
Abrar Anwar,
Rodolfo Corona,
Dan Klein,
Trevor Darrell,
Jesse Thomason
Abstract:
When connecting objects and their language referents in an embodied 3D environment, it is important to note that: (1) an object can be better characterized by leveraging comparative information between itself and other objects, and (2) an object's appearance can vary with camera position. As such, we present the Multi-view Approach to Grounding in Context (MAGiC), which selects an object referent…
▽ More
When connecting objects and their language referents in an embodied 3D environment, it is important to note that: (1) an object can be better characterized by leveraging comparative information between itself and other objects, and (2) an object's appearance can vary with camera position. As such, we present the Multi-view Approach to Grounding in Context (MAGiC), which selects an object referent based on language that distinguishes between two similar objects. By pragmatically reasoning over both objects and across multiple views of those objects, MAGiC improves over the state-of-the-art model on the SNARE object reference task with a relative error reduction of 12.9\% (representing an absolute improvement of 2.7\%). Ablation studies show that reasoning jointly over object referent candidates and multiple views of each object both contribute to improved accuracy. Code: https://github.com/rcorona/magic_snare/
△ Less
Submitted 6 April, 2024; v1 submitted 11 November, 2023;
originally announced November 2023.
-
Honeypot Allocation for Cyber Deception in Dynamic Tactical Networks: A Game Theoretic Approach
Authors:
Md Abu Sayed,
Ahmed H. Anwar,
Christopher Kiekintveld,
Charles Kamhoua
Abstract:
Honeypots play a crucial role in implementing various cyber deception techniques as they possess the capability to divert attackers away from valuable assets. Careful strategic placement of honeypots in networks should consider not only network aspects but also attackers' preferences. The allocation of honeypots in tactical networks under network mobility is of great interest. To achieve this obje…
▽ More
Honeypots play a crucial role in implementing various cyber deception techniques as they possess the capability to divert attackers away from valuable assets. Careful strategic placement of honeypots in networks should consider not only network aspects but also attackers' preferences. The allocation of honeypots in tactical networks under network mobility is of great interest. To achieve this objective, we present a game-theoretic approach that generates optimal honeypot allocation strategies within an attack/defense scenario. Our proposed approach takes into consideration the changes in network connectivity. In particular, we introduce a two-player dynamic game model that explicitly incorporates the future state evolution resulting from changes in network connectivity. The defender's objective is twofold: to maximize the likelihood of the attacker hitting a honeypot and to minimize the cost associated with deception and reconfiguration due to changes in network topology. We present an iterative algorithm to find Nash equilibrium strategies and analyze the scalability of the algorithm. Finally, we validate our approach and present numerical results based on simulations, demonstrating that our game model successfully enhances network security. Additionally, we have proposed additional enhancements to improve the scalability of the proposed approach.
△ Less
Submitted 18 September, 2023; v1 submitted 22 August, 2023;
originally announced August 2023.
-
Blockchain-Based and Fuzzy Logic-Enabled False Data Discovery for the Intelligent Autonomous Vehicular System
Authors:
Ziaur Rahman,
Xun Yi,
Ibrahim Khalil,
Adnan Anwar,
Shantanu Pal
Abstract:
Since the beginning of this decade, several incidents report that false data injection attacks targeting intelligent connected vehicles cause huge industrial damage and loss of lives. Data Theft, Flooding, Fuzzing, Hijacking, Malware Spoofing and Advanced Persistent Threats have been immensely growing attack that leads to end-user conflict by abolishing trust on autonomous vehicle. Looking after t…
▽ More
Since the beginning of this decade, several incidents report that false data injection attacks targeting intelligent connected vehicles cause huge industrial damage and loss of lives. Data Theft, Flooding, Fuzzing, Hijacking, Malware Spoofing and Advanced Persistent Threats have been immensely growing attack that leads to end-user conflict by abolishing trust on autonomous vehicle. Looking after those sensitive data that contributes to measure the localisation factors of the vehicle, conventional centralised techniques can be misused to update the legitimate vehicular status maliciously. As investigated, the existing centralized false data detection approach based on state and likelihood estimation has a reprehensible trade-off in terms of accuracy, trust, cost, and efficiency. Blockchain with Fuzzy-logic Intelligence has shown its potential to solve localisation issues, trust and false data detection challenges encountered by today's autonomous vehicular system. The proposed Blockchain-based fuzzy solution demonstrates a novel false data detection and reputation preservation technique. The illustrated proposed model filters false and anomalous data based on the vehicles' rules and behaviours. Besides improving the detection accuracy and eliminating the single point of failure, the contributions include appropriating fuzzy AI functions within the Road-side Unit node before authorizing status data by a Blockchain network. Finally, thorough experimental evaluation validates the effectiveness of the proposed model.
△ Less
Submitted 17 August, 2023;
originally announced August 2023.
-
Digital Emotion Regulation on Social Media
Authors:
Akriti Verma,
Shama Islam,
Valeh Moghaddam,
Adnan Anwar
Abstract:
Emotion regulation is the process of consciously altering one's affective state, that is the underlying emotional state such as happiness, confidence, guilt, anger etc. The ability to effectively regulate emotions is necessary for functioning efficiently in everyday life. Today, the pervasiveness of digital technology is being purposefully employed to modify our affective states, a process known a…
▽ More
Emotion regulation is the process of consciously altering one's affective state, that is the underlying emotional state such as happiness, confidence, guilt, anger etc. The ability to effectively regulate emotions is necessary for functioning efficiently in everyday life. Today, the pervasiveness of digital technology is being purposefully employed to modify our affective states, a process known as digital emotion regulation. Understanding digital emotion regulation can help support the rise of ethical technology design, development, and deployment. This article presents an overview of digital emotion regulation in social media applications, as well as a synthesis of recent research on emotion regulation interventions for social media. We share our findings from analysing state-of-the-art literature on how different social media applications are utilised at different stages in the process of emotion regulation.
△ Less
Submitted 24 July, 2023;
originally announced July 2023.
-
Cyber Deception against Zero-day Attacks: A Game Theoretic Approach
Authors:
Md Abu Sayed,
Ahmed H. Anwar,
Christopher Kiekintveld,
Branislav Bosansky,
Charles Kamhoua
Abstract:
Reconnaissance activities precedent other attack steps in the cyber kill chain. Zero-day attacks exploit unknown vulnerabilities and give attackers the upper hand against conventional defenses. Honeypots have been used to deceive attackers by misrepresenting the true state of the network. Existing work on cyber deception does not model zero-day attacks. In this paper, we address the question of "H…
▽ More
Reconnaissance activities precedent other attack steps in the cyber kill chain. Zero-day attacks exploit unknown vulnerabilities and give attackers the upper hand against conventional defenses. Honeypots have been used to deceive attackers by misrepresenting the true state of the network. Existing work on cyber deception does not model zero-day attacks. In this paper, we address the question of "How to allocate honeypots over the network?" to protect its most valuable assets. To this end, we develop a two-player zero-sum game theoretic approach to study the potential reconnaissance tracks and attack paths that attackers may use. However, zero-day attacks allow attackers to avoid placed honeypots by creating new attack paths. Therefore, we introduce a sensitivity analysis to investigate the impact of different zero-day vulnerabilities on the performance of the proposed deception technique. Next, we propose several mitigating strategies to defend the network against zero-day attacks based on this analysis. Finally, our numerical results validate our findings and illustrate the effectiveness of the proposed defense approach.
△ Less
Submitted 25 July, 2023; v1 submitted 24 July, 2023;
originally announced July 2023.