-
ECAM: A Contrastive Learning Approach to Avoid Environmental Collision in Trajectory Forecasting
Authors:
Giacomo Rosin,
Muhammad Rameez Ur Rahman,
Sebastiano Vascon
Abstract:
Human trajectory forecasting is crucial in applications such as autonomous driving, robotics and surveillance. Accurate forecasting requires models to consider various factors, including social interactions, multi-modal predictions, pedestrian intention and environmental context. While existing methods account for these factors, they often overlook the impact of the environment, which leads to col…
▽ More
Human trajectory forecasting is crucial in applications such as autonomous driving, robotics and surveillance. Accurate forecasting requires models to consider various factors, including social interactions, multi-modal predictions, pedestrian intention and environmental context. While existing methods account for these factors, they often overlook the impact of the environment, which leads to collisions with obstacles. This paper introduces ECAM (Environmental Collision Avoidance Module), a contrastive learning-based module to enhance collision avoidance ability with the environment. The proposed module can be integrated into existing trajectory forecasting models, improving their ability to generate collision-free predictions. We evaluate our method on the ETH/UCY dataset and quantitatively and qualitatively demonstrate its collision avoidance capabilities. Our experiments show that state-of-the-art methods significantly reduce (-40/50%) the collision rate when integrated with the proposed module. The code is available at https://github.com/CVML-CFU/ECAM.
△ Less
Submitted 11 June, 2025;
originally announced June 2025.
-
LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers
Authors:
Md Abtahi Majeed Chowdhury,
Md Rifat Ur Rahman,
Akil Ahmad Taki
Abstract:
Positional embeddings (PE) play a crucial role in Vision Transformers (ViTs) by providing spatial information otherwise lost due to the permutation invariant nature of self attention. While absolute positional embeddings (APE) have shown theoretical advantages over relative positional embeddings (RPE), particularly due to the ability of sinusoidal functions to preserve spatial inductive biases lik…
▽ More
Positional embeddings (PE) play a crucial role in Vision Transformers (ViTs) by providing spatial information otherwise lost due to the permutation invariant nature of self attention. While absolute positional embeddings (APE) have shown theoretical advantages over relative positional embeddings (RPE), particularly due to the ability of sinusoidal functions to preserve spatial inductive biases like monotonicity and shift invariance, a fundamental challenge arises when mapping a 2D grid to a 1D sequence. Existing methods have mostly overlooked or never explored the impact of patch ordering in positional embeddings. To address this, we propose LOOPE, a learnable patch-ordering method that optimizes spatial representation for a given set of frequencies, providing a principled approach to patch order optimization. Empirical results show that our PE significantly improves classification accuracy across various ViT architectures. To rigorously evaluate the effectiveness of positional embeddings, we introduce the "Three Cell Experiment", a novel benchmarking framework that assesses the ability of PEs to retain relative and absolute positional information across different ViT architectures. Unlike standard evaluations, which typically report a performance gap of 4 to 6% between models with and without PE, our method reveals a striking 30 to 35% difference, offering a more sensitive diagnostic tool to measure the efficacy of PEs. Our experimental analysis confirms that the proposed LOOPE demonstrates enhanced effectiveness in retaining both relative and absolute positional information.
△ Less
Submitted 19 April, 2025;
originally announced April 2025.
-
RAG for Effective Supply Chain Security Questionnaire Automation
Authors:
Zaynab Batool Reza,
Abdul Rafay Syed,
Omer Iqbal,
Ethel Mensah,
Qian Liu,
Maxx Richard Rahman,
Wolfgang Maass
Abstract:
In an era where digital security is crucial, efficient processing of security-related inquiries through supply chain security questionnaires is imperative. This paper introduces a novel approach using Natural Language Processing (NLP) and Retrieval-Augmented Generation (RAG) to automate these responses. We developed QuestSecure, a system that interprets diverse document formats and generates preci…
▽ More
In an era where digital security is crucial, efficient processing of security-related inquiries through supply chain security questionnaires is imperative. This paper introduces a novel approach using Natural Language Processing (NLP) and Retrieval-Augmented Generation (RAG) to automate these responses. We developed QuestSecure, a system that interprets diverse document formats and generates precise responses by integrating large language models (LLMs) with an advanced retrieval system. Our experiments show that QuestSecure significantly improves response accuracy and operational efficiency. By employing advanced NLP techniques and tailored retrieval mechanisms, the system consistently produces contextually relevant and semantically rich responses, reducing cognitive load on security teams and minimizing potential errors. This research offers promising avenues for automating complex security management tasks, enhancing organizational security processes.
△ Less
Submitted 18 December, 2024;
originally announced December 2024.
-
Incorporating Metabolic Information into LLMs for Anomaly Detection in Clinical Time-Series
Authors:
Maxx Richard Rahman,
Ruoxuan Liu,
Wolfgang Maass
Abstract:
Anomaly detection in clinical time-series holds significant potential in identifying suspicious patterns in different biological parameters. In this paper, we propose a targeted method that incorporates the clinical domain knowledge into LLMs to improve their ability to detect anomalies. We introduce the Metabolism Pathway-driven Prompting (MPP) method, which integrates the information about metab…
▽ More
Anomaly detection in clinical time-series holds significant potential in identifying suspicious patterns in different biological parameters. In this paper, we propose a targeted method that incorporates the clinical domain knowledge into LLMs to improve their ability to detect anomalies. We introduce the Metabolism Pathway-driven Prompting (MPP) method, which integrates the information about metabolic pathways to better capture the structural and temporal changes in biological samples. We applied our method for doping detection in sports, focusing on steroid metabolism, and evaluated using real-world data from athletes. The results show that our method improves anomaly detection performance by leveraging metabolic context, providing a more nuanced and accurate prediction of suspicious samples in athletes' profiles.
△ Less
Submitted 24 November, 2024; v1 submitted 2 October, 2024;
originally announced October 2024.
-
Probabilistic Classification of Near-Surface Shallow-Water Sediments using A Portable Free-Fall Penetrometer
Authors:
Md Rejwanur Rahman,
Adrian Rodriguez-Marek,
Nina Stark,
Grace Massey,
Carl Friedrichs,
Kelly M. Dorgan
Abstract:
The geotechnical evaluation of seabed sediments is important for engineering projects and naval applications, offering valuable insights into sediment properties, behavior, and strength. Obtaining high-quality seabed samples can be a challenging task, making in-situ testing an essential part of site characterization. Free Fall Penetrometers (FFP) have emerged as robust tools for rapidly profiling…
▽ More
The geotechnical evaluation of seabed sediments is important for engineering projects and naval applications, offering valuable insights into sediment properties, behavior, and strength. Obtaining high-quality seabed samples can be a challenging task, making in-situ testing an essential part of site characterization. Free Fall Penetrometers (FFP) have emerged as robust tools for rapidly profiling seabed surface sediments, even in energetic nearshore or estuarine conditions and shallow as well as deep depths. While methods for interpretation of traditional offshore Cone Penetration Testing (CPT) data are well-established, their adaptation to FFP data is still an area of research. In this study, we introduce an innovative approach that utilizes machine learning algorithms to create a sediment behavior classification system based on portable free fall penetrometer (PFFP) data. The proposed model leverages PFFP measurements obtained from locations such as Sequim Bay (Washington), the Potomac River, and the York River (Virginia). The result shows 91.1\% accuracy in the class prediction, with the classes representing cohesionless sediment with little to no plasticity, cohesionless sediment with some plasticity, cohesive sediment with low plasticity, and cohesive sediment with high plasticity. The model prediction not only provides the predicted class but also yields an estimate of inherent uncertainty associated with the prediction, which can provide valuable insight about different sediment behaviors. These uncertainties typically range from very low to very high, with lower uncertainties being more common, but they can increase significantly dpending on variations in sediment composition, environmental conditions, and operational techniques. By quantifying uncertainty, the model offers a more comprehensive and informed approach to sediment classification.
△ Less
Submitted 30 September, 2024;
originally announced October 2024.
-
Container Data Item: An Abstract Datatype for Efficient Container-based Edge Computing
Authors:
Md Rezwanur Rahman,
Tarun Annapareddy,
Shirin Ebadi,
Varsha Natarajan,
Adarsh Srinivasan,
Eric Keller,
Shivakant Mishra
Abstract:
We present Container Data Item (CDI), an abstract datatype that allows multiple containers to efficiently operate on a common data item while preserving their strong security and isolation semantics. Application developers can use CDIs to enable multiple containers to operate on the same data, synchronize execution among themselves, and control the ownership of the shared data item during runtime.…
▽ More
We present Container Data Item (CDI), an abstract datatype that allows multiple containers to efficiently operate on a common data item while preserving their strong security and isolation semantics. Application developers can use CDIs to enable multiple containers to operate on the same data, synchronize execution among themselves, and control the ownership of the shared data item during runtime. These containers may reside on the same server or different servers. CDI is designed to support microservice based applications comprised of a set of interconnected microservices, each implemented by a separate dedicated container. CDI preserves the important isolation semantics of containers by ensuring that exactly one container owns a CDI object at any instant and the ownership of a CDI object may be transferred from one container to another only by the current CDI object owner. We present three different implementations of CDI that allow different containers residing on the same server as well containers residing on different servers to use CDI for efficiently operating on a common data item. The paper provides an extensive performance evaluation of CDI along with two representative applications, an augmented reality application and a decentralized workflow orchestrator.
△ Less
Submitted 1 September, 2024;
originally announced September 2024.
-
Modularity in Transformers: Investigating Neuron Separability & Specialization
Authors:
Nicholas Pochinkov,
Thomas Jones,
Mohammed Rashidur Rahman
Abstract:
Transformer models are increasingly prevalent in various applications, yet our understanding of their internal workings remains limited. This paper investigates the modularity and task specialization of neurons within transformer architectures, focusing on both vision (ViT) and language (Mistral 7B) models. Using a combination of selective pruning and MoEfication clustering techniques, we analyze…
▽ More
Transformer models are increasingly prevalent in various applications, yet our understanding of their internal workings remains limited. This paper investigates the modularity and task specialization of neurons within transformer architectures, focusing on both vision (ViT) and language (Mistral 7B) models. Using a combination of selective pruning and MoEfication clustering techniques, we analyze the overlap and specialization of neurons across different tasks and data subsets. Our findings reveal evidence of task-specific neuron clusters, with varying degrees of overlap between related tasks. We observe that neuron importance patterns persist to some extent even in randomly initialized models, suggesting an inherent structure that training refines. Additionally, we find that neuron clusters identified through MoEfication correspond more strongly to task-specific neurons in earlier and later layers of the models. This work contributes to a more nuanced understanding of transformer internals and offers insights into potential avenues for improving model interpretability and efficiency.
△ Less
Submitted 30 August, 2024;
originally announced August 2024.
-
OpenNav: Efficient Open Vocabulary 3D Object Detection for Smart Wheelchair Navigation
Authors:
Muhammad Rameez ur Rahman,
Piero Simonetto,
Anna Polato,
Francesco Pasti,
Luca Tonin,
Sebastiano Vascon
Abstract:
Open vocabulary 3D object detection (OV3D) allows precise and extensible object recognition crucial for adapting to diverse environments encountered in assistive robotics. This paper presents OpenNav, a zero-shot 3D object detection pipeline based on RGB-D images for smart wheelchairs. Our pipeline integrates an open-vocabulary 2D object detector with a mask generator for semantic segmentation, fo…
▽ More
Open vocabulary 3D object detection (OV3D) allows precise and extensible object recognition crucial for adapting to diverse environments encountered in assistive robotics. This paper presents OpenNav, a zero-shot 3D object detection pipeline based on RGB-D images for smart wheelchairs. Our pipeline integrates an open-vocabulary 2D object detector with a mask generator for semantic segmentation, followed by depth isolation and point cloud construction to create 3D bounding boxes. The smart wheelchair exploits these 3D bounding boxes to identify potential targets and navigate safely. We demonstrate OpenNav's performance through experiments on the Replica dataset and we report preliminary results with a real wheelchair. OpenNav improves state-of-the-art significantly on the Replica dataset at mAP25 (+9pts) and mAP50 (+5pts) with marginal improvement at mAP. The code is publicly available at this link: https://github.com/EasyWalk-PRIN/OpenNav.
△ Less
Submitted 25 August, 2024;
originally announced August 2024.
-
OVOSE: Open-Vocabulary Semantic Segmentation in Event-Based Cameras
Authors:
Muhammad Rameez Ur Rahman,
Jhony H. Giraldo,
Indro Spinelli,
Stéphane Lathuilière,
Fabio Galasso
Abstract:
Event cameras, known for low-latency operation and superior performance in challenging lighting conditions, are suitable for sensitive computer vision tasks such as semantic segmentation in autonomous driving. However, challenges arise due to limited event-based data and the absence of large-scale segmentation benchmarks. Current works are confined to closed-set semantic segmentation, limiting the…
▽ More
Event cameras, known for low-latency operation and superior performance in challenging lighting conditions, are suitable for sensitive computer vision tasks such as semantic segmentation in autonomous driving. However, challenges arise due to limited event-based data and the absence of large-scale segmentation benchmarks. Current works are confined to closed-set semantic segmentation, limiting their adaptability to other applications. In this paper, we introduce OVOSE, the first Open-Vocabulary Semantic Segmentation algorithm for Event cameras. OVOSE leverages synthetic event data and knowledge distillation from a pre-trained image-based foundation model to an event-based counterpart, effectively preserving spatial context and transferring open-vocabulary semantic segmentation capabilities. We evaluate the performance of OVOSE on two driving semantic segmentation datasets DDD17, and DSEC-Semantic, comparing it with existing conventional image open-vocabulary models adapted for event-based data. Similarly, we compare OVOSE with state-of-the-art methods designed for closed-set settings in unsupervised domain adaptation for event-based semantic segmentation. OVOSE demonstrates superior performance, showcasing its potential for real-world applications. The code is available at https://github.com/ram95d/OVOSE.
△ Less
Submitted 18 August, 2024;
originally announced August 2024.
-
Heterogeneous Peridynamic Neural Operators: Discover Biotissue Constitutive Law and Microstructure From Digital Image Correlation Measurements
Authors:
Siavash Jafarzadeh,
Stewart Silling,
Lu Zhang,
Colton Ross,
Chung-Hao Lee,
S. M. Rakibur Rahman,
Shuodao Wang,
Yue Yu
Abstract:
Human tissues are highly organized structures with collagen fiber arrangements varying from point to point. Anisotropy of the tissue arises from the natural orientation of the fibers, resulting in location-dependent anisotropy. Heterogeneity also plays an important role in tissue function. It is therefore critical to discover and understand the distribution of fiber orientations from experimental…
▽ More
Human tissues are highly organized structures with collagen fiber arrangements varying from point to point. Anisotropy of the tissue arises from the natural orientation of the fibers, resulting in location-dependent anisotropy. Heterogeneity also plays an important role in tissue function. It is therefore critical to discover and understand the distribution of fiber orientations from experimental mechanical measurements such as digital image correlation (DIC) data. To this end, we introduce the Heterogeneous Peridynamic Neural Operator (HeteroPNO) approach for data-driven constitutive modeling of heterogeneous anisotropic materials. Our goal is to learn a nonlocal constitutive law together with the material microstructure, in the form of a heterogeneous fiber orientation field, from load-displacement field measurements. We propose a two-phase learning approach. Firstly, we learn a homogeneous constitutive law in the form of a neural network-based kernel function and a nonlocal bond force, to capture complex homogeneous material responses from data. Then, in the second phase we reinitialize the learnt bond force and the kernel function, and training them together with a fiber orientation field for each material point. Owing to the state-based peridynamic skeleton, our HeteroPNO-learned material models are objective and have the balance of linear and angular momentum guaranteed. Moreover, the effects from heterogeneity and nonlinear constitutive relationship are captured by the kernel function and the bond force respectively, enabling physical interpretability. As a result, our HeteroPNO architecture can learn a constitutive model for a biological tissue with anisotropic heterogeneous response undergoing large deformation regime. Moreover, the framework is capable to provide displacement and stress field predictions for new and unseen loading instances.
△ Less
Submitted 19 July, 2024; v1 submitted 27 March, 2024;
originally announced March 2024.
-
PureConnect: A Localized Social Media System to Increase Awareness and Connectedness in Environmental Justice Communities
Authors:
Omar Hammad,
Md Rezwanur Rahman,
Gopala Krishna Vasanth Kanugo,
Nicholas Clements,
Shelly Miller,
Shivakant Mishra,
Esther Sullivan
Abstract:
Frequent disruptions like highway constructions are common now-a-days, often impacting environmental justice communities (communities with low socio-economic status with disproportionately high and adverse human health and environmental effects) that live nearby. Based on our interactions via focus groups with the members of four environmental justice communities impacted by a major highway constr…
▽ More
Frequent disruptions like highway constructions are common now-a-days, often impacting environmental justice communities (communities with low socio-economic status with disproportionately high and adverse human health and environmental effects) that live nearby. Based on our interactions via focus groups with the members of four environmental justice communities impacted by a major highway construction, a common concern is a sense of uncertainty about project activities and loss of social connectedness, leading to increased stress, depression, anxiety and diminished well-being. This paper addresses this concern by developing a localized social media system called PureConnect with a goal to raise the level of awareness about the project and increase social connectedness among the community members. PureConnect has been designed using active engagement with four environmental justice communities affected by a major highway construction. It has been deployed in the real world among the members of the four environmental justice communities, and a detailed analysis of the data collected from this deployment as well as surveys show that PureConnect is potentially useful in improving community members' well-being and the members appreciate the functionalities it provides.
△ Less
Submitted 20 March, 2024;
originally announced March 2024.
-
PureNav: A Personalized Navigation Service for Environmental Justice Communities Impacted by Planned Disruptions
Authors:
Omar Hammad,
Md Rezwanur Rahman,
Nicholas Clements,
Shivakant Mishra,
Shelly Miller,
Esther Sullivan
Abstract:
Planned disruptions such as highway constructions are commonplace nowadays and the communities living near these disruptions generally tend to be environmental justice communities -- low socioeconomic status with disproportionately high and adverse human health and environmental effects. A major concern is that such activities negatively impact people's well-being by disrupting their daily commute…
▽ More
Planned disruptions such as highway constructions are commonplace nowadays and the communities living near these disruptions generally tend to be environmental justice communities -- low socioeconomic status with disproportionately high and adverse human health and environmental effects. A major concern is that such activities negatively impact people's well-being by disrupting their daily commutes via frequent road closures and increased dust and air pollution. This paper addresses this concern by developing a personalized navigation service called PureNav to mitigate the negative impacts of disruptions in daily commutes on people's well-being. PureNav has been designed using active engagement with four environmental justice communities affected by major highway construction. It has been deployed in the real world among the members of the four communities, and a detailed analysis of the data collected from this deployment as well as surveys show that PureNav is potentially useful in improving people's well-being. The paper describes the design, implementation, and evaluation of PureNav, and offers suggestions for further improving its efficacy.
△ Less
Submitted 16 February, 2024;
originally announced February 2024.
-
LLMRS: Unlocking Potentials of LLM-Based Recommender Systems for Software Purchase
Authors:
Angela John,
Theophilus Aidoo,
Hamayoon Behmanush,
Irem B. Gunduz,
Hewan Shrestha,
Maxx Richard Rahman,
Wolfgang Maaß
Abstract:
Recommendation systems are ubiquitous, from Spotify playlist suggestions to Amazon product suggestions. Nevertheless, depending on the methodology or the dataset, these systems typically fail to capture user preferences and generate general recommendations. Recent advancements in Large Language Models (LLM) offer promising results for analyzing user queries. However, employing these models to capt…
▽ More
Recommendation systems are ubiquitous, from Spotify playlist suggestions to Amazon product suggestions. Nevertheless, depending on the methodology or the dataset, these systems typically fail to capture user preferences and generate general recommendations. Recent advancements in Large Language Models (LLM) offer promising results for analyzing user queries. However, employing these models to capture user preferences and efficiency remains an open question. In this paper, we propose LLMRS, an LLM-based zero-shot recommender system where we employ pre-trained LLM to encode user reviews into a review score and generate user-tailored recommendations. We experimented with LLMRS on a real-world dataset, the Amazon product reviews, for software purchase use cases. The results show that LLMRS outperforms the ranking-based baseline model while successfully capturing meaningful information from product reviews, thereby providing more reliable recommendations.
△ Less
Submitted 12 January, 2024;
originally announced January 2024.
-
Mining Temporal Attack Patterns from Cyberthreat Intelligence Reports
Authors:
Md Rayhanur Rahman,
Brandon Wroblewski,
Quinn Matthews,
Brantley Morgan,
Tim Menzies,
Laurie Williams
Abstract:
Defending from cyberattacks requires practitioners to operate on high-level adversary behavior. Cyberthreat intelligence (CTI) reports on past cyberattack incidents describe the chain of malicious actions with respect to time. To avoid repeating cyberattack incidents, practitioners must proactively identify and defend against recurring chain of actions - which we refer to as temporal attack patter…
▽ More
Defending from cyberattacks requires practitioners to operate on high-level adversary behavior. Cyberthreat intelligence (CTI) reports on past cyberattack incidents describe the chain of malicious actions with respect to time. To avoid repeating cyberattack incidents, practitioners must proactively identify and defend against recurring chain of actions - which we refer to as temporal attack patterns. Automatically mining the patterns among actions provides structured and actionable information on the adversary behavior of past cyberattacks. The goal of this paper is to aid security practitioners in prioritizing and proactive defense against cyberattacks by mining temporal attack patterns from cyberthreat intelligence reports. To this end, we propose ChronoCTI, an automated pipeline for mining temporal attack patterns from cyberthreat intelligence (CTI) reports of past cyberattacks. To construct ChronoCTI, we build the ground truth dataset of temporal attack patterns and apply state-of-the-art large language models, natural language processing, and machine learning techniques. We apply ChronoCTI on a set of 713 CTI reports, where we identify 124 temporal attack patterns - which we categorize into nine pattern categories. We identify that the most prevalent pattern category is to trick victim users into executing malicious code to initiate the attack, followed by bypassing the anti-malware system in the victim network. Based on the observed patterns, we advocate organizations to train users about cybersecurity best practices, introduce immutable operating systems with limited functionalities, and enforce multi-user authentications. Moreover, we advocate practitioners to leverage the automated mining capability of ChronoCTI and design countermeasures against the recurring attack patterns.
△ Less
Submitted 3 January, 2024;
originally announced January 2024.
-
Attackers reveal their arsenal: An investigation of adversarial techniques in CTI reports
Authors:
Md Rayhanur Rahman,
Setu Kumar Basak,
Rezvan Mahdavi Hezaveh,
Laurie Williams
Abstract:
Context: Cybersecurity vendors often publish cyber threat intelligence (CTI) reports, referring to the written artifacts on technical and forensic analysis of the techniques used by the malware in APT attacks. Objective: The goal of this research is to inform cybersecurity practitioners about how adversaries form cyberattacks through an analysis of adversarial techniques documented in cyberthreat…
▽ More
Context: Cybersecurity vendors often publish cyber threat intelligence (CTI) reports, referring to the written artifacts on technical and forensic analysis of the techniques used by the malware in APT attacks. Objective: The goal of this research is to inform cybersecurity practitioners about how adversaries form cyberattacks through an analysis of adversarial techniques documented in cyberthreat intelligence reports. Dataset: We use 594 adversarial techniques cataloged in MITRE ATT\&CK. We systematically construct a set of 667 CTI reports that MITRE ATT\&CK used as citations in the descriptions of the cataloged adversarial techniques. Methodology: We analyze the frequency and trend of adversarial techniques, followed by a qualitative analysis of the implementation of techniques. Next, we perform association rule mining to identify pairs of techniques recurring in APT attacks. We then perform qualitative analysis to identify the underlying relations among the techniques in the recurring pairs. Findings: The set of 667 CTI reports documents 10,370 techniques in total, and we identify 19 prevalent techniques accounting for 37.3\% of documented techniques. We also identify 425 statistically significant recurring pairs and seven types of relations among the techniques in these pairs. The top three among the seven relationships suggest that techniques used by the malware inter-relate with one another in terms of (a) abusing or affecting the same system assets, (b) executing in sequences, and (c) overlapping in their implementations. Overall, the study quantifies how adversaries leverage techniques through malware in APT attacks based on publicly reported documents. We advocate organizations prioritize their defense against the identified prevalent techniques and actively hunt for potential malicious intrusion based on the identified pairs of techniques.
△ Less
Submitted 3 January, 2024;
originally announced January 2024.
-
Best Practices for 2-Body Pose Forecasting
Authors:
Muhammad Rameez Ur Rahman,
Luca Scofano,
Edoardo De Matteis,
Alessandro Flaborea,
Alessio Sampieri,
Fabio Galasso
Abstract:
The task of collaborative human pose forecasting stands for predicting the future poses of multiple interacting people, given those in previous frames. Predicting two people in interaction, instead of each separately, promises better performance, due to their body-body motion correlations. But the task has remained so far primarily unexplored.
In this paper, we review the progress in human pose…
▽ More
The task of collaborative human pose forecasting stands for predicting the future poses of multiple interacting people, given those in previous frames. Predicting two people in interaction, instead of each separately, promises better performance, due to their body-body motion correlations. But the task has remained so far primarily unexplored.
In this paper, we review the progress in human pose forecasting and provide an in-depth assessment of the single-person practices that perform best for 2-body collaborative motion forecasting. Our study confirms the positive impact of frequency input representations, space-time separable and fully-learnable interaction adjacencies for the encoding GCN and FC decoding. Other single-person practices do not transfer to 2-body, so the proposed best ones do not include hierarchical body modeling or attention-based interaction encoding.
We further contribute a novel initialization procedure for the 2-body spatial interaction parameters of the encoder, which benefits performance and stability. Altogether, our proposed 2-body pose forecasting best practices yield a performance improvement of 21.9% over the state-of-the-art on the most recent ExPI dataset, whereby the novel initialization accounts for 3.5%. See our project page at https://www.pinlab.org/bestpractices2body
△ Less
Submitted 12 April, 2023;
originally announced April 2023.
-
PACMAN Attack: A Mobility-Powered Attack in Private 5G-Enabled Industrial Automation System
Authors:
Md Rashedur Rahman,
Moinul Hossain,
Jiang Xie
Abstract:
3GPP has introduced Private 5G to support the next-generation industrial automation system (IAS) due to the versatility and flexibility of 5G architecture. Besides the 3.5GHz CBRS band, unlicensed spectrum bands, like 5GHz, are considered as an additional medium because of their free and abundant nature. However, while utilizing the unlicensed band, industrial equipment must coexist with incumbent…
▽ More
3GPP has introduced Private 5G to support the next-generation industrial automation system (IAS) due to the versatility and flexibility of 5G architecture. Besides the 3.5GHz CBRS band, unlicensed spectrum bands, like 5GHz, are considered as an additional medium because of their free and abundant nature. However, while utilizing the unlicensed band, industrial equipment must coexist with incumbents, e.g., Wi-Fi, which could introduce new security threats and resuscitate old ones. In this paper, we propose a novel attack strategy conducted by a mobility-enabled malicious Wi-Fi access point (mmAP), namely \textit{PACMAN} attack, to exploit vulnerabilities introduced by heterogeneous coexistence. A mmAP is capable of moving around the physical surface to identify mission-critical devices, hopping through the frequency domain to detect the victim's operating channel, and launching traditional MAC layer-based attacks. The multi-dimensional mobility of the attacker makes it impervious to state-of-the-art detection techniques that assume static adversaries. In addition, we propose a novel Markov Decision Process (MDP) based framework to intelligently design an attacker's multi-dimensional mobility in space and frequency. Mathematical analysis and extensive simulation results exhibit the adverse effect of the proposed mobility-powered attack.
△ Less
Submitted 16 February, 2023;
originally announced February 2023.
-
An investigation of security controls and MITRE ATT\&CK techniques
Authors:
Md Rayhanur Rahman,
Laurie Williams
Abstract:
Attackers utilize a plethora of adversarial techniques in cyberattacks to compromise the confidentiality, integrity, and availability of the target organizations and systems. Information security standards such as NIST, ISO/IEC specify hundreds of security controls that organizations can enforce to protect and defend the information systems from adversarial techniques. However, implementing all th…
▽ More
Attackers utilize a plethora of adversarial techniques in cyberattacks to compromise the confidentiality, integrity, and availability of the target organizations and systems. Information security standards such as NIST, ISO/IEC specify hundreds of security controls that organizations can enforce to protect and defend the information systems from adversarial techniques. However, implementing all the available controls at the same time can be infeasible and security controls need to be investigated in terms of their mitigation ability over adversarial techniques used in cyberattacks as well. The goal of this research is to aid organizations in making informed choices on security controls to defend against cyberthreats through an investigation of adversarial techniques used in current cyberattacks. In this study, we investigated the extent of mitigation of 298 NIST SP800-53 controls over 188 adversarial techniques used in 669 cybercrime groups and malware cataloged in the MITRE ATT\&CK framework based upon an existing mapping between the controls and techniques. We identify that, based on the mapping, only 101 out of 298 control are capable of mitigating adversarial techniques. However, we also identify that 53 adversarial techniques cannot be mitigated by any existing controls, and these techniques primarily aid adversaries in bypassing system defense and discovering targeted system information. We identify a set of 20 critical controls that can mitigate 134 adversarial techniques, and on average, can mitigate 72\% of all techniques used by 98\% of the cataloged adversaries in MITRE ATT\&CK. We urge organizations, that do not have any controls enforced in place, to implement the top controls identified in the study.
△ Less
Submitted 11 November, 2022;
originally announced November 2022.
-
Investigating co-occurrences of MITRE ATT\&CK Techniques
Authors:
Md Rayhanur Rahman,
Laurie Williams
Abstract:
Cyberattacks use adversarial techniques to bypass system defenses, persist, and eventually breach systems. The MITRE ATT\&CK framework catalogs a set of adversarial techniques and maps between adversaries and their used techniques and tactics. Understanding how adversaries deploy techniques in conjunction is pivotal for learning adversary behavior, hunting potential threats, and formulating a proa…
▽ More
Cyberattacks use adversarial techniques to bypass system defenses, persist, and eventually breach systems. The MITRE ATT\&CK framework catalogs a set of adversarial techniques and maps between adversaries and their used techniques and tactics. Understanding how adversaries deploy techniques in conjunction is pivotal for learning adversary behavior, hunting potential threats, and formulating a proactive defense. The goal of this research is to aid cybersecurity practitioners and researchers in choosing detection and mitigation strategies through co-occurrence analysis of adversarial techniques reported in MITRE ATT&CK. We collect the adversarial techniques of 115 cybercrime groups and 484 malware from the MITRE ATT\&CK. We apply association rule mining and network analysis to investigate how adversarial techniques co-occur. We identify that adversaries pair T1059: Command and scripting interface and T1105: Ingress tool transfer techniques with a relatively large number of ATT\&CK techniques. We also identify adversaries using the T1082: System Information Discovery technique to determine their next course of action. We observe adversaries deploy the highest number of techniques from the TA0005: Defense evasion and TA0007: Discovery tactics. Based on our findings on co-occurrence, we identify six detection, six mitigation strategies, and twelve adversary behaviors. We urge defenders to prioritize primarily the detection of TA0007: Discovery and mitigation of TA0005: Defense evasion techniques. Overall, this study approximates how adversaries leverage techniques based on publicly reported documents. We advocate organizations investigate adversarial techniques in their environment and make the findings available for a more precise and actionable understanding.
△ Less
Submitted 11 November, 2022;
originally announced November 2022.
-
From Threat Reports to Continuous Threat Intelligence: A Comparison of Attack Technique Extraction Methods from Textual Artifacts
Authors:
Md Rayhanur Rahman,
Laurie Williams
Abstract:
The cyberthreat landscape is continuously evolving. Hence, continuous monitoring and sharing of threat intelligence have become a priority for organizations. Threat reports, published by cybersecurity vendors, contain detailed descriptions of attack Tactics, Techniques, and Procedures (TTP) written in an unstructured text format. Extracting TTP from these reports aids cybersecurity practitioners a…
▽ More
The cyberthreat landscape is continuously evolving. Hence, continuous monitoring and sharing of threat intelligence have become a priority for organizations. Threat reports, published by cybersecurity vendors, contain detailed descriptions of attack Tactics, Techniques, and Procedures (TTP) written in an unstructured text format. Extracting TTP from these reports aids cybersecurity practitioners and researchers learn and adapt to evolving attacks and in planning threat mitigation. Researchers have proposed TTP extraction methods in the literature, however, not all of these proposed methods are compared to one another or to a baseline. \textit{The goal of this study is to aid cybersecurity researchers and practitioners choose attack technique extraction methods for monitoring and sharing threat intelligence by comparing the underlying methods from the TTP extraction studies in the literature.} In this work, we identify ten existing TTP extraction studies from the literature and implement five methods from the ten studies. We find two methods, based on Term Frequency-Inverse Document Frequency(TFIDF) and Latent Semantic Indexing (LSI), outperform the other three methods with a F1 score of 84\% and 83\%, respectively. We observe the performance of all methods in F1 score drops in the case of increasing the class labels exponentially. We also implement and evaluate an oversampling strategy to mitigate class imbalance issues. Furthermore, oversampling improves the classification performance of TTP extraction. We provide recommendations from our findings for future cybersecurity researchers, such as the construction of a benchmark dataset from a large corpus; and the selection of textual features of TTP. Our work, along with the dataset and implementation source code, can work as a baseline for cybersecurity researchers to test and compare the performance of future TTP extraction methods.
△ Less
Submitted 5 October, 2022;
originally announced October 2022.
-
AI-based approach for improving the detection of blood doping in sports
Authors:
Maxx Richard Rahman,
Jacob Bejder,
Thomas Christian Bonne,
Andreas Breenfeldt Andersen,
Jesús RodrÃguez Huertas,
Reid Aikin,
Nikolai Baastrup Nordsborg,
Wolfgang Maaß
Abstract:
Sports officials around the world are facing incredible challenges due to the unfair means of practices performed by the athletes to improve their performance in the game. It includes the intake of hormonal based drugs or transfusion of blood to increase their strength and the result of their training. However, the current direct test of detection of these cases includes the laboratory-based metho…
▽ More
Sports officials around the world are facing incredible challenges due to the unfair means of practices performed by the athletes to improve their performance in the game. It includes the intake of hormonal based drugs or transfusion of blood to increase their strength and the result of their training. However, the current direct test of detection of these cases includes the laboratory-based method, which is limited because of the cost factors, availability of medical experts, etc. This leads us to seek for indirect tests. With the growing interest of Artificial Intelligence in healthcare, it is important to propose an algorithm based on blood parameters to improve decision making. In this paper, we proposed a statistical and machine learning-based approach to identify the presence of doping substance rhEPO in blood samples.
△ Less
Submitted 9 February, 2022;
originally announced March 2022.
-
Intent-Aware Permission Architecture: A Model for Rethinking Informed Consent for Android Apps
Authors:
Md Rashedur Rahman,
Elizabeth Miller,
Moinul Hossain,
Aisha Ali-Gombe
Abstract:
As data privacy continues to be a crucial human-right concern as recognized by the UN, regulatory agencies have demanded developers obtain user permission before accessing user-sensitive data. Mainly through the use of privacy policies statements, developers fulfill their legal requirements to keep users abreast of the requests for their data. In addition, platforms such as Android enforces explic…
▽ More
As data privacy continues to be a crucial human-right concern as recognized by the UN, regulatory agencies have demanded developers obtain user permission before accessing user-sensitive data. Mainly through the use of privacy policies statements, developers fulfill their legal requirements to keep users abreast of the requests for their data. In addition, platforms such as Android enforces explicit permission request using the permission model. Nonetheless, recent research has shown that service providers hardly make full disclosure when requesting data in these statements. Neither is the current permission model designed to provide adequate informed consent. Often users have no clear understanding of the reason and scope of usage of the data request. This paper proposes an unambiguous, informed consent process that provides developers with a standardized method for declaring Intent. Our proposed Intent-aware permission architecture extends the current Android permission model with a precise mechanism for full disclosure of purpose and scope limitation. The design of which is based on an ontology study of data requests purposes. The overarching objective of this model is to ensure end-users are adequately informed before making decisions on their data. Additionally, this model has the potential to improve trust between end-users and developers.
△ Less
Submitted 14 February, 2022;
originally announced February 2022.
-
What are the attackers doing now? Automating cyber threat intelligence extraction from text on pace with the changing threat landscape: A survey
Authors:
Md Rayhanur Rahman,
Rezvan Mahdavi-Hezaveh,
Laurie Williams
Abstract:
Cybersecurity researchers have contributed to the automated extraction of CTI from textual sources, such as threat reports and online articles, where cyberattack strategies, procedures, and tools are described. The goal of this article is to aid cybersecurity researchers understand the current techniques used for cyberthreat intelligence extraction from text through a survey of relevant studies in…
▽ More
Cybersecurity researchers have contributed to the automated extraction of CTI from textual sources, such as threat reports and online articles, where cyberattack strategies, procedures, and tools are described. The goal of this article is to aid cybersecurity researchers understand the current techniques used for cyberthreat intelligence extraction from text through a survey of relevant studies in the literature. We systematically collect "CTI extraction from text"-related studies from the literature and categorize the CTI extraction purposes. We propose a CTI extraction pipeline abstracted from these studies. We identify the data sources, techniques, and CTI sharing formats utilized in the context of the proposed pipeline. Our work finds ten types of extraction purposes, such as extraction indicators of compromise extraction, TTPs (tactics, techniques, procedures of attack), and cybersecurity keywords. We also identify seven types of textual sources for CTI extraction, and textual data obtained from hacker forums, threat reports, social media posts, and online news articles have been used by almost 90% of the studies. Natural language processing along with both supervised and unsupervised machine learning techniques such as named entity recognition, topic modelling, dependency parsing, supervised classification, and clustering are used for CTI extraction. We observe the technical challenges associated with these studies related to obtaining available clean, labelled data which could assure replication, validation, and further extension of the studies. As we find the studies focusing on CTI information extraction from text, we advocate for building upon the current CTI extraction work to help cybersecurity practitioners with proactive decision making such as threat prioritization, automated threat modelling to utilize knowledge from past cybersecurity incidents.
△ Less
Submitted 14 September, 2021;
originally announced September 2021.
-
Decentralized Policy Information Points for Multi-Domain Environments
Authors:
M Ridwanur Rahman,
Ahmad Salehi Shahraki,
Carsten Rudolph
Abstract:
Access control models have been developed to control authorized access to sensitive resources. This control of access is important as there is now a need for collaborative resource sharing between multiple organizations over open environments like the internet. Although there are multiple access control models that are being widely used, these models are providing access control within a closed en…
▽ More
Access control models have been developed to control authorized access to sensitive resources. This control of access is important as there is now a need for collaborative resource sharing between multiple organizations over open environments like the internet. Although there are multiple access control models that are being widely used, these models are providing access control within a closed environment i.e. within the organization using it. These models have restricted capabilities in providing access control in open environments. Attribute-Based Access Control (ABAC) has emerged as a powerful access control model to bring fine-grained authorization to organizations that possess sensitive data and resources and want to collaborate over open environments. In an ABAC system, access to resources that an organization possess can be controlled by applying policies on attributes of the users. These policies are conditions that need to be satisfied by the requester in order to gain access to the resource. In this paper, we provide an introduction to ABAC and by carrying forward the architecture of ABAC, we propose a Decentralized Policy Information Point (PIP) model. Our model proposes the decentralization of PIP, which is an entity of the ABAC model that allows the storage and query of user attributes and enforces fine-grained access control for controlling the access of sensitive resources over multiple domains. Our model makes use of the concept of a cryptographic primitive called Attribute-Based Signature (ABS) to keep the identities of the users involved, private. Our model can be used for collaborative resource sharing over the internet. The evaluation of our model is also discussed to reflect the application of the proposed decentralized PIP model.
△ Less
Submitted 18 August, 2021;
originally announced August 2021.
-
Adversarial Branch Architecture Search for Unsupervised Domain Adaptation
Authors:
Luca Robbiano,
Muhammad Rameez Ur Rahman,
Fabio Galasso,
Barbara Caputo,
Fabio Maria Carlucci
Abstract:
Unsupervised Domain Adaptation (UDA) is a key issue in visual recognition, as it allows to bridge different visual domains enabling robust performances in the real world. To date, all proposed approaches rely on human expertise to manually adapt a given UDA method (e.g. DANN) to a specific backbone architecture (e.g. ResNet). This dependency on handcrafted designs limits the applicability of a giv…
▽ More
Unsupervised Domain Adaptation (UDA) is a key issue in visual recognition, as it allows to bridge different visual domains enabling robust performances in the real world. To date, all proposed approaches rely on human expertise to manually adapt a given UDA method (e.g. DANN) to a specific backbone architecture (e.g. ResNet). This dependency on handcrafted designs limits the applicability of a given approach in time, as old methods need to be constantly adapted to novel backbones.
Existing Neural Architecture Search (NAS) approaches cannot be directly applied to mitigate this issue, as they rely on labels that are not available in the UDA setting. Furthermore, most NAS methods search for full architectures, which precludes the use of pre-trained models, essential in a vast range of UDA settings for reaching SOTA results. To the best of our knowledge, no prior work has addressed these aspects in the context of NAS for UDA. Here we tackle both aspects with an Adversarial Branch Architecture Search for UDA (ABAS): i. we address the lack of target labels by a novel data-driven ensemble approach for model selection; and ii. we search for an auxiliary adversarial branch, attached to a pre-trained backbone, which drives the domain alignment.
We extensively validate ABAS to improve two modern UDA techniques, DANN and ALDA, on three standard visual recognition datasets (Office31, Office-Home and PACS). In all cases, ABAS robustly finds the adversarial branch architectures and parameters which yield best performances.
△ Less
Submitted 22 October, 2021; v1 submitted 12 February, 2021;
originally announced February 2021.
-
Privacy-preserving Decentralized Aggregation for Federated Learning
Authors:
Beomyeol Jeon,
S. M. Ferdous,
Muntasir Raihan Rahman,
Anwar Walid
Abstract:
Federated learning is a promising framework for learning over decentralized data spanning multiple regions. This approach avoids expensive central training data aggregation cost and can improve privacy because distributed sites do not have to reveal privacy-sensitive data. In this paper, we develop a privacy-preserving decentralized aggregation protocol for federated learning. We formulate the dis…
▽ More
Federated learning is a promising framework for learning over decentralized data spanning multiple regions. This approach avoids expensive central training data aggregation cost and can improve privacy because distributed sites do not have to reveal privacy-sensitive data. In this paper, we develop a privacy-preserving decentralized aggregation protocol for federated learning. We formulate the distributed aggregation protocol with the Alternating Direction Method of Multiplier (ADMM) and examine its privacy weakness. Unlike prior work that use Differential Privacy or homomorphic encryption for privacy, we develop a protocol that controls communication among participants in each round of aggregation to minimize privacy leakage. We establish its privacy guarantee against an honest-but-curious adversary. We also propose an efficient algorithm to construct such a communication pattern, inspired by combinatorial block design theory. Our secure aggregation protocol based on this novel group communication pattern design leads to an efficient algorithm for federated training with privacy guarantees. We evaluate our federated training algorithm on image classification and next-word prediction applications over benchmark datasets with 9 and 15 distributed sites. Evaluation results show that our algorithm performs comparably to the standard centralized federated learning method while preserving privacy; the degradation in test accuracy is only up to 0.73%.
△ Less
Submitted 28 December, 2020; v1 submitted 13 December, 2020;
originally announced December 2020.
-
Crime Prediction Using Multiple-ANFIS Architecture and Spatiotemporal Data
Authors:
Mashnoon Islam,
Redwanul Karim,
Kalyan Roy,
Saif Mahmood,
Sadat Hossain,
M. Rashedur Rahman
Abstract:
Statistical values alone cannot bring the whole scenario of crime occurrences in the city of Dhaka. We need a better way to use these statistical values to predict crime occurrences and make the city a safer place to live. Proper decision-making for the future is key in reducing the rate of criminal offenses in an area or a city. If the law enforcement bodies can allocate their resources efficient…
▽ More
Statistical values alone cannot bring the whole scenario of crime occurrences in the city of Dhaka. We need a better way to use these statistical values to predict crime occurrences and make the city a safer place to live. Proper decision-making for the future is key in reducing the rate of criminal offenses in an area or a city. If the law enforcement bodies can allocate their resources efficiently for the future, the rate of crime in Dhaka can be brought down to a minimum. In this work, we have made an initiative to provide an effective tool with which law enforcement officials and detectives can predict crime occurrences ahead of time and take better decisions easily and quickly. We have used several Fuzzy Inference Systems (FIS) and Adaptive Neuro-Fuzzy Inference Systems (ANFIS) to predict the type of crime that is highly likely to occur at a certain place and time.
△ Less
Submitted 7 November, 2020;
originally announced November 2020.
-
Visual Summarization of Lecture Video Segments for Enhanced Navigation
Authors:
Mohammad Rajiur Rahman,
Jaspal Subhlok,
Shishir Shah
Abstract:
Lecture videos are an increasingly important learning resource for higher education. However, the challenge of quickly finding the content of interest in a lecture video is an important limitation of this format. This paper introduces visual summarization of lecture video segments to enhance navigation. A lecture video is divided into segments based on the frame-to-frame similarity of content. The…
▽ More
Lecture videos are an increasingly important learning resource for higher education. However, the challenge of quickly finding the content of interest in a lecture video is an important limitation of this format. This paper introduces visual summarization of lecture video segments to enhance navigation. A lecture video is divided into segments based on the frame-to-frame similarity of content. The user navigates the lecture video content by viewing a single frame visual and textual summary of each segment. The paper presents a novel methodology to generate the visual summary of a lecture video segment by computing similarities between images extracted from the segment and employing a graph-based algorithm to identify the subset of most representative images. The results from this research are integrated into a real-world lecture video management portal called Videopoints. To collect ground truth for evaluation, a survey was conducted where multiple users manually provided visual summaries for 40 lecture video segments. The users also stated whether any images were not selected for the summary because they were similar to other selected images. The graph based algorithm for identifying summary images achieves 78% precision and 72% F1-measure with frequently selected images as the ground truth, and 94% precision and 72% F1-measure with the union of all user selected images as the ground truth. For 98% of algorithm selected visual summary images, at least one user also selected that image for their summary or considered it similar to another image they selected. Over 65% of automatically generated summaries were rated as good or very good by the users on a 4-point scale from poor to very good. Overall, the results establish that the methodology introduced in this paper produces good quality visual summaries that are practically useful for lecture video navigation.
△ Less
Submitted 3 June, 2020;
originally announced June 2020.
-
Deep Reinforcement Learning for Network Slicing with Heterogeneous Resource Requirements and Time Varying Traffic Dynamics
Authors:
Jaehoon Koo,
Veena B. Mendiratta,
Muntasir Raihan Rahman,
Anwar Walid
Abstract:
Efficient network slicing is vital to deal with the highly variable and dynamic characteristics of network traffic generated by a varied range of applications. The problem is made more challenging with the advent of new technologies such as 5G and new architectures such as SDN and NFV. Network slicing addresses a challenging dynamic network resource allocation problem where a single network infras…
▽ More
Efficient network slicing is vital to deal with the highly variable and dynamic characteristics of network traffic generated by a varied range of applications. The problem is made more challenging with the advent of new technologies such as 5G and new architectures such as SDN and NFV. Network slicing addresses a challenging dynamic network resource allocation problem where a single network infrastructure is divided into (virtual) multiple slices to meet the demands of different users with varying requirements, the main challenges being --- the traffic arrival characteristics and the job resource requirements (e.g., compute, memory and bandwidth resources) for each slice can be highly dynamic. Traditional model-based optimization or queueing theoretic modeling becomes intractable with the high reliability, and stringent bandwidth and latency requirements imposed by 5G technologies. In addition these approaches lack adaptivity in dynamic environments. We propose a deep reinforcement learning approach to address this dynamic coupled resource allocation problem. Model evaluation using both synthetic simulation data and real workload driven traces demonstrates that our deep reinforcement learning solution improves overall resource utilization, latency performance, and demands satisfied as compared to a baseline equal-slicing strategy.
△ Less
Submitted 8 August, 2019;
originally announced August 2019.
-
Security Smells in Ansible and Chef Scripts: A Replication Study
Authors:
Akond Rahman,
Md. Rayhanur Rahman,
Chris Parnin,
Laurie Williams
Abstract:
Context: Security smells are recurring coding patterns that are indicative of security weakness, and require further inspection. As infrastructure as code (IaC) scripts, such as Ansible and Chef scripts, are used to provision cloud-based servers and systems at scale, security smells in IaC scripts could be used to enable malicious users to exploit vulnerabilities in the provisioned systems. Goal:…
▽ More
Context: Security smells are recurring coding patterns that are indicative of security weakness, and require further inspection. As infrastructure as code (IaC) scripts, such as Ansible and Chef scripts, are used to provision cloud-based servers and systems at scale, security smells in IaC scripts could be used to enable malicious users to exploit vulnerabilities in the provisioned systems. Goal: The goal of this paper is to help practitioners avoid insecure coding practices while developing infrastructure as code scripts through an empirical study of security smells in Ansible and Chef scripts. Methodology: We conduct a replication study where we apply qualitative analysis with 1,956 IaC scripts to identify security smells for IaC scripts written in two languages: Ansible and Chef. We construct a static analysis tool called Security Linter for Ansible and Chef scripts (SLAC) to automatically identify security smells in 50,323 scripts collected from 813 open source software repositories. We also submit bug reports for 1,000 randomly-selected smell occurrences. Results: We identify two security smells not reported in prior work: missing default in case statement and no integrity check. By applying SLAC we identify 46,600 occurrences of security smells that include 7,849 hard-coded passwords. We observe agreement for 65 of the responded 94 bug reports, which suggests the relevance of security smells for Ansible and Chef scripts amongst practitioners. Conclusion: We observe security smells to be prevalent in Ansible and Chef scripts, similar to that of the Puppet scripts. We recommend practitioners to rigorously inspect the presence of the identified security smells in Ansible and Chef scripts using (i) code review, and (ii) static analysis tools.
△ Less
Submitted 20 June, 2020; v1 submitted 16 July, 2019;
originally announced July 2019.
-
MobiCoMonkey - Context Testing of Android Apps
Authors:
Amit Seal Ami,
Md. Mehedi Hasan,
Md. Rayhanur Rahman,
Kazi Sakib
Abstract:
The functionality of many mobile applications is dependent on various contextual, external factors. Depending on unforeseen scenarios, mobile apps can even malfunction or crash. In this paper, we have introduced MobiCoMonkey - automated tool that allows a developer to test app against custom or auto generated contextual scenarios and help detect possible bugs through the emulator. Moreover, it rep…
▽ More
The functionality of many mobile applications is dependent on various contextual, external factors. Depending on unforeseen scenarios, mobile apps can even malfunction or crash. In this paper, we have introduced MobiCoMonkey - automated tool that allows a developer to test app against custom or auto generated contextual scenarios and help detect possible bugs through the emulator. Moreover, it reports the connection between the bugs and contextual factors so that the bugs can later be reproduced. It utilizes the tools offered by Android SDK and logcat to inject events and capture traces of the app execution.
△ Less
Submitted 7 April, 2018;
originally announced April 2018.
-
Inferring Formal Properties of Production Key-Value Stores
Authors:
Edgar Pek,
Pranav Garg,
Muntasir Raihan Rahman,
Karl Palmskog,
Indranil Gupta,
P. Madhusudan
Abstract:
Production distributed systems are challenging to formally verify, in particular when they are based on distributed protocols that are not rigorously described or fully understood. In this paper, we derive models and properties for two core distributed protocols used in eventually consistent production key-value stores such as Riak and Cassandra. We propose a novel modeling called certified progra…
▽ More
Production distributed systems are challenging to formally verify, in particular when they are based on distributed protocols that are not rigorously described or fully understood. In this paper, we derive models and properties for two core distributed protocols used in eventually consistent production key-value stores such as Riak and Cassandra. We propose a novel modeling called certified program models, where complete distributed systems are captured as programs written in traditional systems languages such as concurrent C. Specifically, we model the read-repair and hinted-handoff recovery protocols as concurrent C programs, test them for conformance with real systems, and then verify that they guarantee eventual consistency, modeling precisely the specification as well as the failure assumptions under which the results hold.
△ Less
Submitted 28 December, 2017;
originally announced December 2017.
-
Characterizing and Adapting the Consistency-Latency Tradeoff in Distributed Key-value Stores
Authors:
Muntasir Raihan Rahman,
Lewis Tseng,
Son Nguyen,
Indranil Gupta,
Nitin Vaidya
Abstract:
The CAP theorem is a fundamental result that applies to distributed storage systems. In this paper, we first present and prove two CAP-like impossibility theorems. To state these theorems, we present probabilistic models to characterize the three important elements of the CAP theorem: consistency (C), availability or latency (A), and partition tolerance (P). The theorems show the un-achievable env…
▽ More
The CAP theorem is a fundamental result that applies to distributed storage systems. In this paper, we first present and prove two CAP-like impossibility theorems. To state these theorems, we present probabilistic models to characterize the three important elements of the CAP theorem: consistency (C), availability or latency (A), and partition tolerance (P). The theorems show the un-achievable envelope, i.e., which combinations of the parameters of the three models make them impossible to achieve together. Next, we present the design of a class of systems called PCAP that perform close to the envelope described by our theorems. In addition, these systems allow applications running on a single data-center to specify either a latency SLA or a consistency SLA. The PCAP systems automatically adapt, in real-time and under changing network conditions, to meet the SLA while optimizing the other C/A metric. We incorporate PCAP into two popular key-value stores -- Apache Cassandra and Riak. Our experiments with these two deployments, under realistic workloads, reveal that the PCAP system satisfactorily meets SLAs, and performs close to the achievable envelope. We also extend PCAP from a single data-center to multiple geo-distributed data-centers.
△ Less
Submitted 23 January, 2016; v1 submitted 8 September, 2015;
originally announced September 2015.
-
Network connectivity in non-convex domains with reflections
Authors:
Orestis Georgiou,
Mohammud Z. Bocus,
Mohammed R. Rahman,
Carl P. Dettmann,
Justin P. Coon
Abstract:
Recent research has demonstrated the importance of boundary effects on the overall connection probability of wireless networks but has largely focused on convex deployment regions. We consider here a scenario of practical importance to wireless communications, in which one or more nodes are located outside the convex space where the remaining nodes reside. We call these `external nodes', and assum…
▽ More
Recent research has demonstrated the importance of boundary effects on the overall connection probability of wireless networks but has largely focused on convex deployment regions. We consider here a scenario of practical importance to wireless communications, in which one or more nodes are located outside the convex space where the remaining nodes reside. We call these `external nodes', and assume that they play some essential role in the macro network functionality e.g. a gateway to a dense self-contained mesh network cloud. Conventional approaches with the underlying assumption of only line-of-sight (LOS) or direct connections between nodes, fail to provide the correct analysis for such a network setup. To this end we present a novel analytical framework that accommodates for the non-convexity of the domain and explicitly considers the effects of non-LOS nodes through reflections from the domain boundaries. We obtain analytical expressions in 2D and 3D which are confirmed numerically for Rician channel fading statistics and discuss possible extensions and applications.
△ Less
Submitted 16 December, 2014;
originally announced December 2014.
-
Keyhole and Reflection Effects in Network Connectivity Analysis
Authors:
Mohammud Z. Bocus,
Carl P. Dettmann,
Justin P. Coon,
Mohammed R. Rahman
Abstract:
Recent research has demonstrated the importance of boundary effects on the overall connection probability of wireless networks, but has largely focused on convex domains. We consider two generic scenarios of practical importance to wireless communications, in which one or more nodes are located outside the convex space where the remaining nodes reside. Consequently, conventional approaches with th…
▽ More
Recent research has demonstrated the importance of boundary effects on the overall connection probability of wireless networks, but has largely focused on convex domains. We consider two generic scenarios of practical importance to wireless communications, in which one or more nodes are located outside the convex space where the remaining nodes reside. Consequently, conventional approaches with the underlying assumption that only line-of-sight (LOS) or direct connections between nodes are possible, fail to provide the correct analysis for the connectivity. We present an analytical framework that explicitly considers the effects of reflections from the system boundaries on the full connection probability. This study provides a different strategy to ray tracing tools for predicting the wireless propagation environment. A simple two-dimensional geometry is first considered, followed by a more practical three-dimensional system. We investigate the effects of different system parameters on the connectivity of the network though analysis corroborated by numerical simulations, and highlight the potential of our approach for more general non-convex geometries.t system parameters on the connectivity of the network through simulation and analysis.
△ Less
Submitted 5 March, 2014; v1 submitted 27 November, 2012;
originally announced November 2012.
-
Toward a Principled Framework for Benchmarking Consistency
Authors:
Muntasir Raihan Rahman,
Wojciech Golab,
Alvin AuYoung,
Kimberly Keeton,
Jay J. Wylie
Abstract:
Large-scale key-value storage systems sacrifice consistency in the interest of dependability (i.e., partition tolerance and availability), as well as performance (i.e., latency). Such systems provide eventual consistency,which---to this point---has been difficult to quantify in real systems. Given the many implementations and deployments of eventually-consistent systems (e.g., NoSQL systems), atte…
▽ More
Large-scale key-value storage systems sacrifice consistency in the interest of dependability (i.e., partition tolerance and availability), as well as performance (i.e., latency). Such systems provide eventual consistency,which---to this point---has been difficult to quantify in real systems. Given the many implementations and deployments of eventually-consistent systems (e.g., NoSQL systems), attempts have been made to measure this consistency empirically, but they suffer from important drawbacks. For example, state-of-the art consistency benchmarks exercise the system only in restricted ways and disrupt the workload, which limits their accuracy.
In this paper, we take the position that a consistency benchmark should paint a comprehensive picture of the relationship between the storage system under consideration, the workload, the pattern of failures, and the consistency observed by clients. To illustrate our point, we first survey prior efforts to quantify eventual consistency. We then present a benchmarking technique that overcomes the shortcomings of existing techniques to measure the consistency observed by clients as they execute the workload under consideration. This method is versatile and minimally disruptive to the system under test. As a proof of concept, we demonstrate this tool on Cassandra.
△ Less
Submitted 19 November, 2012; v1 submitted 18 November, 2012;
originally announced November 2012.