-
LegalRAG: A Hybrid RAG System for Multilingual Legal Information Retrieval
Authors:
Muhammad Rafsan Kabir,
Rafeed Mohammad Sultan,
Fuad Rahman,
Mohammad Ruhul Amin,
Sifat Momen,
Nabeel Mohammed,
Shafin Rahman
Abstract:
Natural Language Processing (NLP) and computational linguistic techniques are increasingly being applied across various domains, yet their use in legal and regulatory tasks remains limited. To address this gap, we develop an efficient bilingual question-answering framework for regulatory documents, specifically the Bangladesh Police Gazettes, which contain both English and Bangla text. Our approach employs modern Retrieval Augmented Generation (RAG) pipelines to enhance information retrieval and response generation. In addition to conventional RAG pipelines, we propose an advanced RAG-based approach that improves retrieval performance, leading to more precise answers. This system enables efficient searching for specific government legal notices, making legal information more accessible. We evaluate both our proposed and conventional RAG systems on a diverse test set on Bangladesh Police Gazettes, demonstrating that our approach consistently outperforms existing methods across all evaluation metrics.
Submitted 19 April, 2025;
originally announced April 2025.
-
Language modelling techniques for analysing the impact of human genetic variation
Authors:
Megha Hegde,
Jean-Christophe Nebel,
Farzana Rahman
Abstract:
Interpreting the effects of variants within the human genome and proteome is essential for analysing disease risk, predicting medication response, and developing personalised health interventions. Due to the intrinsic similarities between the structure of natural languages and genetic sequences, natural language processing techniques have demonstrated great applicability in computational variant effect prediction. In particular, the advent of the Transformer has led to significant advancements in the field. However, Transformer-based models are not without their limitations, and a number of extensions and alternatives have been developed to improve results and enhance computational efficiency. This review explores the use of language models for computational variant effect prediction over the past decade, analysing the main architectures, and identifying key trends and future directions.
Submitted 7 March, 2025;
originally announced March 2025.
-
On The Concurrence of Layer-wise Preconditioning Methods and Provable Feature Learning
Authors:
Thomas T. Zhang,
Behrad Moniri,
Ansh Nagwekar,
Faraz Rahman,
Anton Xue,
Hamed Hassani,
Nikolai Matni
Abstract:
Layer-wise preconditioning methods are a family of memory-efficient optimization algorithms that introduce preconditioners per axis of each layer's weight tensors. These methods have seen a recent resurgence, demonstrating impressive performance relative to entry-wise ("diagonal") preconditioning methods such as Adam(W) on a wide range of neural network optimization tasks. Complementary to their practical performance, we demonstrate that layer-wise preconditioning methods are provably necessary from a statistical perspective. To showcase this, we consider two prototypical models, linear representation learning and single-index learning, which are widely used to study how typical algorithms efficiently learn useful features to enable generalization. In these problems, we show SGD is a suboptimal feature learner when extending beyond ideal isotropic inputs $\mathbf{x} \sim \mathsf{N}(\mathbf{0}, \mathbf{I})$ and well-conditioned settings typically assumed in prior work. We demonstrate theoretically and numerically that this suboptimality is fundamental, and that layer-wise preconditioning emerges naturally as the solution. We further show that standard tools like Adam preconditioning and batch-norm only mildly mitigate these issues, supporting the unique benefits of layer-wise preconditioning.
Submitted 3 February, 2025;
originally announced February 2025.
-
Steganography and Probabilistic Risk Analysis: A Game Theoretical Framework for Quantifying Adversary Advantage and Impact
Authors:
Obinna Omego,
Farzana Rahman,
Onalo Samuel,
Jean-Christophe Nebel
Abstract:
In high-risk environments where unlawful surveillance is prevalent, securing confidential communications is critical. This study introduces a novel steganographic game-theoretic model to analyze the strategic interactions between a defending company and an adversary. Framing the scenario as a non-cooperative game enables a systematic evaluation of optimal strategies for both parties, incorporating costs and benefits such as implementation expenses, potential data leaks, and operational advantages. The derived equilibrium probabilities enable the assessment of success rates, illustrating conditions under which the company benefits from hiding messages or faces increased risks when not implementing steganography. Sensitivity analysis explores how changes in key parameters impact these strategies, enhancing the understanding of decision-making in secure communications. Furthermore, the introduction of an adversary model that quantifies the adversary's advantage using conditional probabilities derived from success rates allows for a quantitative measure of the adversary's effectiveness based on the defender's strategies. By integrating the adversary's advantage into a novel risk analysis framework and employing Monte Carlo simulations, dynamic interactions are captured across advantage scenarios, considering factors like impact factor, steganography effectiveness, and equilibrium probabilities. This comprehensive framework offers practical insights into optimizing security strategies by quantifying potential risk reductions when the adversary is disadvantaged, providing a clear methodology for assessing and mitigating adversarial threats in complex security environments.
Submitted 3 January, 2025; v1 submitted 23 December, 2024;
originally announced December 2024.
-
BanglaDialecto: An End-to-End AI-Powered Regional Speech Standardization
Authors:
Md. Nazmus Sadat Samin,
Jawad Ibn Ahad,
Tanjila Ahmed Medha,
Fuad Rahman,
Mohammad Ruhul Amin,
Nabeel Mohammed,
Shafin Rahman
Abstract:
This study focuses on recognizing Bangladeshi dialects and converting diverse Bengali accents into standardized formal Bengali speech. Dialects, often referred to as regional languages, are distinctive variations of a language spoken in a particular location and are identified by their phonetics, pronunciations, and lexicon. Subtle changes in pronunciation and intonation are also influenced by geographic location, educational attainment, and socioeconomic status. Dialect standardization is needed to ensure effective communication, educational consistency, access to technology, economic opportunities, and the preservation of linguistic resources while respecting cultural diversity. Because Bangla is the fifth most spoken language, with around 55 distinct dialects spoken by 160 million people, addressing its dialects is crucial for developing inclusive communication tools. However, limited research exists due to a lack of comprehensive datasets and the challenges of handling diverse dialects. Advances in multilingual Large Language Models (mLLMs) have created new possibilities for addressing the challenges of dialectal Automated Speech Recognition (ASR) and Machine Translation (MT). This study presents an end-to-end pipeline for converting dialectal Noakhali speech to standard Bangla speech. The investigation includes constructing a large-scale, diverse dataset of dialectal speech signals, which is used to fine-tune an ASR model for transcribing dialect speech to dialect text and an LLM for translating the dialect text to standard Bangla text. Our experiments demonstrated that fine-tuning the Whisper ASR model achieved a CER of 0.8% and WER of 1.5%, while the BanglaT5 model attained a BLEU score of 41.6% for dialect-to-standard text translation.
Submitted 16 November, 2024;
originally announced November 2024.
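Note on the CER and WER figures reported in the BanglaDialecto abstract above: both metrics are conventionally computed as an edit distance between a reference transcript and the model output, divided by the reference length. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function names are hypothetical, and it assumes whitespace tokenization for words and that spaces count as characters for CER.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, a hypothesis that drops one word from a six-word reference gives a WER of 1/6; published figures such as those above are usually this ratio expressed as a percentage and aggregated over a test set.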
-
Empowering Meta-Analysis: Leveraging Large Language Models for Scientific Synthesis
Authors:
Jawad Ibn Ahad,
Rafeed Mohammad Sultan,
Abraham Kaikobad,
Fuad Rahman,
Mohammad Ruhul Amin,
Nabeel Mohammed,
Shafin Rahman
Abstract:
This study investigates the automation of meta-analysis in scientific documents using large language models (LLMs). Meta-analysis is a robust statistical method that synthesizes the findings of multiple studies (support articles) to provide a comprehensive understanding. A meta-article provides a structured analysis of several articles. However, conducting meta-analysis by hand is labor-intensive, time-consuming, and susceptible to human error, highlighting the need for automated pipelines to streamline the process. Our research introduces a novel approach that fine-tunes the LLM on extensive scientific datasets to address challenges in big data handling and structured data extraction. We automate and optimize the meta-analysis process by integrating Retrieval Augmented Generation (RAG). Tailored through prompt engineering and a new loss metric, Inverse Cosine Distance (ICD), designed for fine-tuning on large contextual datasets, LLMs efficiently generate structured meta-analysis content. Human evaluation then assesses relevance and provides information on model performance on key metrics. This research demonstrates that fine-tuned models outperform non-fine-tuned models, with fine-tuned LLMs generating 87.6% relevant meta-analysis abstracts. The relevance of the context, based on human evaluation, shows a reduction in irrelevancy from 4.56% to 1.9%. These experiments were conducted in a low-resource environment, highlighting the study's contribution to enhancing the efficiency and reliability of meta-analysis automation.
Submitted 16 November, 2024;
originally announced November 2024.
-
BUET Multi-disease Heart Sound Dataset: A Comprehensive Auscultation Dataset for Developing Computer-Aided Diagnostic Systems
Authors:
Shams Nafisa Ali,
Afia Zahin,
Samiul Based Shuvo,
Nusrat Binta Nizam,
Shoyad Ibn Sabur Khan Nuhash,
Sayeed Sajjad Razin,
S. M. Sakeef Sani,
Farihin Rahman,
Nawshad Binta Nizam,
Farhat Binte Azam,
Rakib Hossen,
Sumaiya Ohab,
Nawsabah Noor,
Taufiq Hasan
Abstract:
Cardiac auscultation, an integral tool in diagnosing cardiovascular diseases (CVDs), often relies on the subjective interpretation of clinicians, presenting a limitation in consistency and accuracy. Addressing this, we introduce the BUET Multi-disease Heart Sound (BMD-HS) dataset - a comprehensive and meticulously curated collection of heart sound recordings. This dataset, encompassing 864 recordings across five distinct classes of common heart sounds, represents a broad spectrum of valvular heart diseases, with a focus on diagnostically challenging cases. The standout feature of the BMD-HS dataset is its innovative multi-label annotation system, which captures a diverse range of diseases and unique disease states. This system significantly enhances the dataset's utility for developing advanced machine learning models in automated heart sound classification and diagnosis. By bridging the gap between traditional auscultation practices and contemporary data-driven diagnostic methods, the BMD-HS dataset is poised to revolutionize CVD diagnosis and management, providing an invaluable resource for the advancement of cardiac health research. The dataset is publicly available at this link: https://github.com/mHealthBuet/BMD-HS-Dataset.
Submitted 1 September, 2024;
originally announced September 2024.
-
3D Point Cloud Network Pruning: When Some Weights Do not Matter
Authors:
Amrijit Biswas,
Md. Ismail Hossain,
M M Lutfe Elahi,
Ali Cheraghian,
Fuad Rahman,
Nabeel Mohammed,
Shafin Rahman
Abstract:
A point cloud is a crucial geometric data structure utilized in numerous applications. The adoption of deep neural networks, referred to as Point Cloud Neural Networks (PCNNs), for processing 3D point clouds has significantly advanced fields that rely on 3D geometric data to enhance the efficiency of tasks. Expanding the size of both neural network models and 3D point clouds introduces significant challenges in minimizing computational and memory requirements. This is essential for meeting the demanding requirements of real-world applications, which prioritize minimal energy consumption and low latency. Therefore, investigating redundancy in PCNNs is crucial yet challenging due to their sensitivity to parameters. Additionally, traditional pruning methods face difficulties as these networks rely heavily on weights and points. Nonetheless, our research reveals a promising phenomenon that could refine standard PCNN pruning techniques. Our findings suggest that preserving only the top p% of the highest-magnitude weights is crucial for accuracy preservation. For example, pruning 99% of the weights from the PointNet model still results in accuracy close to the base level. Specifically, on the ModelNet40 dataset, where the base accuracy with the PointNet model was 87.5%, preserving only 1% of the weights still achieves an accuracy of 86.8%. Code is available at: https://github.com/apurba-nsu-rnd-lab/PCNN_Pruning
Submitted 26 August, 2024;
originally announced August 2024.
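Note on the pruning abstract above: keeping only the top p% of weights by magnitude is generic magnitude pruning, which can be sketched on a flat list of weights as below. This is an illustration under stated assumptions, not the authors' implementation (their code is at the repository linked in the abstract). The function name `magnitude_prune` and the `keep_fraction` parameter are hypothetical, and a real network would apply the mask per layer or globally over its tensors.

```python
def magnitude_prune(weights, keep_fraction):
    """Zero out all but the largest-magnitude `keep_fraction` of the weights."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Keep at least one weight so the result is never entirely zero.
    n_keep = max(1, round(len(weights) * keep_fraction))
    # Rank indices by |w|, largest first; sorted() is stable, so ties keep
    # their original order.
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]),
                    reverse=True)
    keep = set(ranked[:n_keep])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


# Toy example: keep the top 20% of ten weights (two survive).
toy_weights = [0.02, -1.5, 0.3, 0.9, -0.01, 0.04, -0.7, 0.1, 0.05, -0.2]
pruned = magnitude_prune(toy_weights, keep_fraction=0.2)
```

Under this sketch, the 1% setting mentioned in the abstract corresponds to `keep_fraction=0.01`.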
-
Beyond Labels: Aligning Large Language Models with Human-like Reasoning
Authors:
Muhammad Rafsan Kabir,
Rafeed Mohammad Sultan,
Ihsanul Haque Asif,
Jawad Ibn Ahad,
Fuad Rahman,
Mohammad Ruhul Amin,
Nabeel Mohammed,
Shafin Rahman
Abstract:
Aligning large language models (LLMs) with a human reasoning approach ensures that LLMs produce morally correct and human-like decisions. Ethical concerns are raised because current models are prone to generating false positives and providing malicious responses. To address this issue, we have curated an ethics dataset named Dataset for Aligning Reasons (DFAR), designed to aid in aligning language models to generate human-like reasons. The dataset comprises statements with ethical-unethical labels and their corresponding reasons. In this study, we employed a novel fine-tuning approach that utilizes ethics labels and their corresponding reasons (L+R), in contrast to the existing fine-tuning approach that only uses labels (L). The original pre-trained versions, the existing fine-tuned versions, and our proposed fine-tuned versions of LLMs were then evaluated on an ethical-unethical classification task and a reason-generation task. Our proposed fine-tuning strategy notably outperforms the others in both tasks, achieving significantly higher accuracy scores in the classification task and lower misalignment rates in the reason-generation task. The increase in classification accuracies and decrease in misalignment rates indicate that the L+R fine-tuned models align more with human ethics. Hence, this study illustrates that injecting reasons has substantially improved the alignment of LLMs, resulting in more human-like responses. We have made the DFAR dataset and corresponding code publicly available at https://github.com/apurba-nsu-rnd-lab/DFAR.
Submitted 20 August, 2024;
originally announced August 2024.
-
EnterpriseEM: Fine-tuned Embeddings for Enterprise Semantic Search
Authors:
Kamalkumar Rathinasamy,
Jayarama Nettar,
Amit Kumar,
Vishal Manchanda,
Arun Vijayakumar,
Ayush Kataria,
Venkateshprasanna Manjunath,
Chidambaram GS,
Jaskirat Singh Sodhi,
Shoeb Shaikh,
Wasim Akhtar Khan,
Prashant Singh,
Tanishq Dattatray Ige,
Vipin Tiwari,
Rajab Ali Mondal,
Harshini K,
S Reka,
Chetana Amancharla,
Faiz ur Rahman,
Harikrishnan P A,
Indraneel Saha,
Bhavya Tiwary,
Navin Shankar Patel,
Pradeep T S,
Balaji A J
, et al. (2 additional authors not shown)
Abstract:
Enterprises grapple with the significant challenge of managing proprietary unstructured data, hindering efficient information retrieval. This has led to the emergence of AI-driven information retrieval solutions, designed to adeptly extract relevant insights to address employee inquiries. These solutions often leverage pre-trained embedding models and generative models as foundational components. While pre-trained embeddings may exhibit proximity or disparity based on their original training objectives, they might not fully align with the unique characteristics of enterprise-specific data, leading to suboptimal alignment with the retrieval goals of enterprise environments. In this paper, we propose a comprehensive methodology for contextualizing pre-trained embedding models to enterprise environments, covering the entire process from data preparation to model fine-tuning and evaluation. By adapting the embeddings to better suit the retrieval tasks prevalent in enterprises, we aim to enhance the performance of information retrieval solutions. We discuss the process of fine-tuning, its effect on retrieval accuracy, and the potential benefits for enterprise information management. Our findings demonstrate the efficacy of fine-tuned embedding models in improving the precision and relevance of search results in enterprise settings.
Submitted 27 September, 2024; v1 submitted 18 May, 2024;
originally announced June 2024.
-
Track Anything Rapter (TAR)
Authors:
Tharun V. Puthanveettil,
Fnu Obaid ur Rahman
Abstract:
Object tracking is a fundamental task in computer vision with broad practical applications across various domains, including traffic monitoring, robotics, and autonomous vehicle tracking. In this project, we aim to develop a sophisticated aerial vehicle system known as Track Anything Rapter (TAR), designed to detect, segment, and track objects of interest based on user-provided multimodal queries, such as text, images, and clicks. TAR utilizes cutting-edge pre-trained models like DINO, CLIP, and SAM to estimate the relative pose of the queried object. The tracking problem is approached as a Visual Servoing task, enabling the UAV to consistently focus on the object through advanced motion planning and control algorithms. We showcase how the integration of these foundational models with a custom high-level control algorithm results in a highly stable and precise tracking system deployed on a custom-built PX4 Autopilot-enabled Voxl2 M500 drone. To validate the tracking algorithm's performance, we compare it against Vicon-based ground truth. Additionally, we evaluate the reliability of the foundational models in aiding tracking in scenarios involving occlusions. Finally, we test and validate the model's ability to work seamlessly with multiple modalities, such as click, bounding box, and image templates.
Submitted 29 May, 2024; v1 submitted 19 May, 2024;
originally announced May 2024.
-
Enhancement of Bengali OCR by Specialized Models and Advanced Techniques for Diverse Document Types
Authors:
AKM Shahariar Azad Rabby,
Hasmot Ali,
Md. Majedul Islam,
Sheikh Abujar,
Fuad Rahman
Abstract:
This research paper presents a Bengali OCR system with several distinctive capabilities. The system excels in reconstructing document layouts while preserving structure, alignment, and images. It incorporates advanced image and signature detection for accurate extraction. Specialized models for word segmentation cater to diverse document types, including computer-composed, letterpress, typewriter, and handwritten documents. The system handles static and dynamic handwritten inputs, recognizing various writing styles. Furthermore, it can recognize compound characters in Bengali. Extensive data collection efforts provide a diverse corpus, while advanced technical components optimize character and word recognition. Additional contributions include image, logo, signature, and table recognition, perspective correction, layout reconstruction, and a queuing module for efficient and scalable processing. The system demonstrates outstanding performance in efficient and accurate text extraction and analysis.
Submitted 7 February, 2024;
originally announced February 2024.
-
Involution Fused ConvNet for Classifying Eye-Tracking Patterns of Children with Autism Spectrum Disorder
Authors:
Md. Farhadul Islam,
Meem Arafat Manab,
Joyanta Jyoti Mondal,
Sarah Zabeen,
Fardin Bin Rahman,
Md. Zahidul Hasan,
Farig Sadeque,
Jannatun Noor
Abstract:
Autism Spectrum Disorder (ASD) is a complicated neurological condition which is challenging to diagnose. Numerous studies demonstrate that children diagnosed with autism struggle with maintaining attention spans and have less focused vision. Eye-tracking technology has drawn special attention in the context of ASD since anomalies in gaze have long been acknowledged as a defining feature of autism in general. Deep Learning (DL) approaches coupled with eye-tracking sensors are exploiting additional capabilities to advance diagnosis and its applications. By learning intricate nonlinear input-output relations, DL can accurately recognize the various gaze and eye-tracking patterns and adjust to the data. Convolutions alone are insufficient to capture the important spatial information in gaze patterns or eye tracking. The dynamic kernel-based process known as involution can improve the efficiency of classifying gaze patterns or eye-tracking data. In this paper, we utilise two different image-processing operations to see how these processes learn eye-tracking patterns. Since these patterns are primarily based on spatial information, we combine involution with convolution to form a hybrid, which adds location-specific capability to a deep learning model. Our proposed model is implemented in a simple yet effective way, which makes it easier to apply in real life. We investigate the reasons why our approach works well for classifying eye-tracking patterns. For comparative analysis, we experiment with two separate datasets as well as a combined version of both. The results show that IC with three involution layers outperforms the previous approaches.
Submitted 7 January, 2024;
originally announced January 2024.
-
A Novel Approach for Defect Detection of Wind Turbine Blade Using Virtual Reality and Deep Learning
Authors:
Md Fazle Rabbi,
Solayman Hossain Emon,
Ehtesham Mahmud Nishat,
Tzu-Liang Tseng,
Atira Ferdoushi,
Chun-Che Huang,
Md Fashiar Rahman
Abstract:
Wind turbines are subjected to continuous rotational stresses and unusual external forces such as storms, lightning, strikes by flying objects, etc., which may cause defects in turbine blades. Hence, they require periodic inspection to ensure proper functionality and avoid catastrophic failure. Inspection is challenging because the turbines are remotely located and difficult for human inspectors to reach. Prior work in the literature used images of cropped defects from wind turbines and neglected possible background biases, which may hinder real-time and autonomous defect detection using aerial vehicles such as drones. To overcome such challenges, in this paper, we examine defect detection accuracy on images that show the defects together with their background, using a two-step deep-learning methodology. In the first step, we develop virtual models of wind turbines to synthesize near-reality images for four types of common defects: cracks, leading edge erosion, bending, and lightning strike damage. The Unity perception package is used to generate wind turbine blade defect images with variations in background, randomness, camera angle, and light effects. In the second step, a customized U-Net architecture is trained to classify and segment the defect in turbine blades. The outcomes of the U-Net architecture have been thoroughly tested and compared with 5-fold validation datasets. The proposed methodology provides reasonable defect detection accuracy, making it suitable for autonomous and remote inspection through aerial vehicles.
Submitted 30 December, 2023;
originally announced January 2024.
-
Addressing Trust Challenges in Blockchain Oracles Using Asymmetric Byzantine Quorums
Authors:
Fahad Rahman,
Chafiq Titouna,
Farid Nait-Abdesselam
Abstract:
Distributed Computing in Blockchain Technology (BCT) hinges on a trust assumption among independent nodes. Without a third-party interface, known as a Blockchain Oracle, it cannot interact with the external world. This Oracle plays a crucial role by feeding extrinsic data into the Blockchain, ensuring that Smart Contracts operate accurately in real time. The Oracle problem arises from the inherent difficulty in verifying the truthfulness of the data sourced by these Oracles. The genuineness of a Blockchain Oracle is paramount, as it directly influences the Blockchain's reliability, credibility, and scalability. To tackle these challenges, a strategy rooted in Byzantine fault tolerance φ is introduced. Furthermore, an autonomous system for sustainability and auditability, built on heuristic detection, is put forth. On two real-world datasets, the proposed strategy outperformed existing methods in effectiveness and precision, with the aim of meeting the authenticity standards for Blockchain Oracles.
Submitted 30 December, 2023;
originally announced January 2024.
-
A Contextualized Real-Time Multimodal Emotion Recognition for Conversational Agents using Graph Convolutional Networks in Reinforcement Learning
Authors:
Fathima Abdul Rahman,
Guang Lu
Abstract:
Owing to the recent developments in Generative Artificial Intelligence (GenAI) and Large Language Models (LLM), conversational agents are becoming increasingly popular and accepted. They provide a human touch by interacting in ways familiar to us and by providing support as virtual companions. Therefore, it is important to understand the user's emotions in order to respond considerately. Compared to the standard problem of emotion recognition, conversational agents face an additional constraint in that recognition must be real-time. Studies on model architectures using audio, visual, and textual modalities have mainly focused on emotion classification using full video sequences that do not provide online features. In this work, we present a novel paradigm for contextualized Emotion Recognition using Graph Convolutional Network with Reinforcement Learning (conER-GRL). Conversations are partitioned into smaller groups of utterances for effective extraction of contextual information. The system uses Gated Recurrent Units (GRU) to extract multimodal features from these groups of utterances. More importantly, Graph Convolutional Networks (GCN) and Reinforcement Learning (RL) agents are cascade trained to capture the complex dependencies of emotion features in interactive scenarios. Comparing the results of the conER-GRL model with other state-of-the-art models on the benchmark dataset IEMOCAP demonstrates the advantageous capabilities of the conER-GRL architecture in recognizing emotions in real-time from multimodal conversational signals.
Submitted 24 October, 2023;
originally announced October 2023.
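The abstract above mentions partitioning conversations into smaller groups of utterances. A minimal sketch of one way this could look, assuming fixed-size, non-overlapping groups; the abstract does not give the actual grouping rule or group size, and the name `partition_utterances` is hypothetical:

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def partition_utterances(utterances: Sequence[T], group_size: int) -> List[List[T]]:
    """Split a conversation into consecutive groups of at most group_size utterances.

    Assumption: fixed-size, non-overlapping groups. The paper may group differently.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [
        list(utterances[start:start + group_size])
        for start in range(0, len(utterances), group_size)
    ]


# Toy conversation with five utterances, grouped two at a time.
conversation = ["u1", "u2", "u3", "u4", "u5"]
groups = partition_utterances(conversation, group_size=2)
```

Each group could then be passed to a feature extractor on its own, which is what would make online (streaming) processing possible without waiting for the full conversation.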
-
LumiNet: The Bright Side of Perceptual Knowledge Distillation
Authors:
Md. Ismail Hossain,
M M Lutfe Elahi,
Sameera Ramasinghe,
Ali Cheraghian,
Fuad Rahman,
Nabeel Mohammed,
Shafin Rahman
Abstract:
In knowledge distillation literature, feature-based methods have dominated due to their ability to effectively tap into extensive teacher models. In contrast, logit-based approaches, which aim to distill 'dark knowledge' from teachers, typically exhibit inferior performance compared to feature-based methods. To bridge this gap, we present LumiNet, a novel knowledge distillation algorithm designed to enhance logit-based distillation. We introduce the concept of 'perception', aiming to calibrate logits based on the model's representation capability. This concept addresses overconfidence issues in logit-based distillation methods while also introducing a novel method to distill knowledge from the teacher. It reconstructs the logits of a sample by considering its relationships with other samples in the batch. LumiNet excels on benchmarks like CIFAR-100, ImageNet, and MSCOCO, outperforming leading feature-based methods; for example, compared to KD with ResNet18 and MobileNetV2 on ImageNet, it shows improvements of 1.5% and 2.05%, respectively.
Submitted 9 March, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
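For context on the entry above, the conventional logit-based distillation baseline (referred to as KD in the abstract) can be sketched as a temperature-softened KL divergence between teacher and student outputs. This is the standard baseline, not LumiNet's perception-based method, which the abstract does not specify in enough detail to reproduce. The function names and the temperature value are illustrative assumptions:

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits: List[float], teacher_logits: List[float],
            temperature: float = 4.0) -> float:
    """Conventional logit-based KD loss: T^2 * KL(teacher || student).

    Assumption: temperature=4.0 is only an illustrative default.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2


# Toy logits for a three-class problem.
teacher = [5.0, 1.0, -2.0]
loss_same = kd_loss(teacher, teacher)          # identical outputs
loss_diff = kd_loss([0.0, 0.0, 0.0], teacher)  # uninformed student
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge.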
-
Weak Supervision for Label Efficient Visual Bug Detection
Authors:
Farrukh Rahman
Abstract:
As video games evolve into expansive, detailed worlds, visual quality becomes essential, yet increasingly challenging. Traditional testing methods, limited by resources, face difficulties in addressing the plethora of potential bugs. Machine learning offers scalable solutions; however, heavy reliance on large labeled datasets remains a constraint. Addressing this challenge, we propose a novel method that uses unlabeled gameplay and domain-specific augmentations to generate datasets and self-supervised objectives used during pre-training or multi-task settings for downstream visual bug detection. Our methodology uses weak supervision to scale datasets for the crafted objectives and facilitates both autonomous and interactive weak supervision, incorporating unsupervised clustering and/or an interactive approach based on text and geometric prompts. We demonstrate on first-person player clipping/collision bugs (FPPC) within the expansive Giantmap game world that our approach is very effective, improving over a strong supervised baseline in a practical, very low-prevalence, low data regime (0.336 → 0.550 F1 score). With just 5 labeled "good" exemplars (i.e., 0 bugs), our self-supervised objective alone captures enough signal to outperform the low-labeled supervised settings. Building on large-pretrained vision models, our approach is adaptable across various visual bugs. Our results suggest applicability in curating datasets for broader image and video tasks within video games beyond visual bugs.
Submitted 20 September, 2023;
originally announced September 2023.
-
Invariant Scattering Transform for Medical Imaging
Authors:
Nafisa Labiba Ishrat Huda,
Angona Biswas,
MD Abdullah Al Nasim,
Md. Fahim Rahman,
Shoaib Ahmed
Abstract:
The invariant scattering transform introduces a new area of research that merges signal processing with deep learning for computer vision. Deep learning algorithms can now solve a variety of problems in the medical sector. Medical images are used to detect diseases such as brain cancer or tumors, Alzheimer's disease, breast cancer, Parkinson's disease, and many others. During the pandemic in 2020, machine learning and deep learning played a critical role in detecting COVID-19, including mutation analysis, prediction, diagnosis, and decision making. Medical images such as X-ray, magnetic resonance imaging (MRI), and CT scans are used for detecting diseases. The scattering transform is another deep learning method for medical imaging. It is a wavelet technique that builds useful signal representations for image classification, which makes it impactful for medical image classification problems. This research article discusses the scattering transform as an efficient system for medical image analysis, in which it is computed by scattering the signal information, implemented as a deep convolutional network. A step-by-step case study is presented in this work.
Submitted 7 July, 2023;
originally announced July 2023.
-
Bangla-Wave: Improving Bangla Automatic Speech Recognition Utilizing N-gram Language Models
Authors:
Mohammed Rakib,
Md. Ismail Hossain,
Nabeel Mohammed,
Fuad Rahman
Abstract:
Although over 300M people around the world speak Bangla, scant work has been done on improving Bangla voice-to-text transcription because Bangla is a low-resource language. However, with the introduction of the Bengali Common Voice 9.0 speech dataset, Automatic Speech Recognition (ASR) models can now be significantly improved. With 399 hours of speech recordings, Bengali Common Voice is the largest and most diversified open-source Bengali speech corpus in the world. In this paper, we outperform the SOTA pretrained Bengali ASR models by finetuning a pretrained wav2vec2 model on the Common Voice dataset. We also demonstrate how to significantly improve the performance of an ASR model by adding an n-gram language model as a post-processor. Finally, we run experiments and hyperparameter tuning to produce a robust Bangla ASR model that is better than the existing ASR models.
Submitted 13 September, 2022;
originally announced September 2022.
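The entry above describes adding an n-gram language model as a post-processor for ASR output. A minimal sketch of the general idea, assuming a bigram model with add-one smoothing that rescores candidate transcriptions; the abstract does not describe the paper's actual n-gram order, decoder integration, or weighting, so `BigramLM`, `rescore`, `lm_weight`, the toy English corpus, and the acoustic scores are all hypothetical:

```python
import math
from collections import Counter
from typing import List, Tuple


class BigramLM:
    """Tiny bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, corpus: List[str]) -> None:
        self.unigrams: Counter = Counter()
        self.bigrams: Counter = Counter()
        for sentence in corpus:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def log_prob(self, sentence: str) -> float:
        """Sum of smoothed log P(word | previous word) over the sentence."""
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigrams[(prev, cur)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def rescore(candidates: List[Tuple[str, float]], lm: BigramLM,
            lm_weight: float = 0.5) -> str:
    """Pick the candidate maximizing acoustic score + lm_weight * LM log-probability."""
    return max(candidates, key=lambda c: c[1] + lm_weight * lm.log_prob(c[0]))[0]


# Toy data: the acoustic model slightly prefers the misrecognized candidate.
lm = BigramLM(["the cat sat", "the cat ran", "a dog sat"])
candidates = [("the cat sat", -4.0), ("the cat sad", -3.9)]
best = rescore(candidates, lm)
```

In this toy case the language model penalizes the unseen word sequence enough to overturn the small acoustic preference, which is the effect a post-processing language model is meant to have.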
-
Deep Lake: a Lakehouse for Deep Learning
Authors:
Sasun Hambardzumyan,
Abhinav Tuli,
Levon Ghukasyan,
Fariz Rahman,
Hrant Topchyan,
David Isayan,
Mark McQuade,
Mikayel Harutyunyan,
Tatevik Hakobyan,
Ivo Stranic,
Davit Buniatyan
Abstract:
Traditional data lakes provide critical data infrastructure for analytical workloads by enabling time travel, running SQL queries, ingesting data with ACID transactions, and visualizing petabyte-scale datasets on cloud storage. They allow organizations to break down data silos, unlock data-driven decision-making, improve operational efficiency, and reduce costs. However, as deep learning usage increases, traditional data lakes are not well-designed for applications such as natural language processing (NLP), audio processing, computer vision, and applications involving non-tabular datasets. This paper presents Deep Lake, an open-source lakehouse for deep learning applications developed at Activeloop. Deep Lake maintains the benefits of a vanilla data lake with one key difference: it stores complex data, such as images, videos, annotations, as well as tabular data, in the form of tensors and rapidly streams the data over the network to (a) Tensor Query Language, (b) in-browser visualization engine, or (c) deep learning frameworks without sacrificing GPU utilization. Datasets stored in Deep Lake can be accessed from PyTorch, TensorFlow, JAX, and integrate with numerous MLOps tools.
Submitted 13 December, 2022; v1 submitted 22 September, 2022;
originally announced September 2022.
-
On the Surprising Effectiveness of Transformers in Low-Labeled Video Recognition
Authors:
Farrukh Rahman,
Ömer Mubarek,
Zsolt Kira
Abstract:
Recently vision transformers have been shown to be competitive with convolution-based methods (CNNs) broadly across multiple vision tasks. The less restrictive inductive bias of transformers endows greater representational capacity in comparison with CNNs. However, in the image classification setting this flexibility comes with a trade-off with respect to sample efficiency, where transformers require ImageNet-scale training. This notion has carried over to video where transformers have not yet been explored for video classification in the low-labeled or semi-supervised settings. Our work empirically explores the low data regime for video classification and discovers that, surprisingly, transformers perform extremely well in the low-labeled video setting compared to CNNs. We specifically evaluate video vision transformers across two contrasting video datasets (Kinetics-400 and SomethingSomething-V2) and perform thorough analysis and ablation studies to explain this observation using the predominant features of video transformer architectures. We even show that using just the labeled data, transformers significantly outperform complex semi-supervised CNN methods that leverage large-scale unlabeled data as well. Our experiments inform our recommendation that semi-supervised learning video work should consider the use of video transformers in the future.
Submitted 25 October, 2022; v1 submitted 15 September, 2022;
originally announced September 2022.
-
LILA-BOTI : Leveraging Isolated Letter Accumulations By Ordering Teacher Insights for Bangla Handwriting Recognition
Authors:
Md. Ismail Hossain,
Mohammed Rakib,
Sabbir Mollah,
Fuad Rahman,
Nabeel Mohammed
Abstract:
Word-level handwritten optical character recognition (OCR) remains a challenge for morphologically rich languages like Bangla. The complexity arises from the existence of a large number of alphabets, the presence of several diacritic forms, and the appearance of complex conjuncts. The difficulty is exacerbated by the fact that some graphemes occur infrequently but remain indispensable, so addressing the class imbalance is required for satisfactory results. This paper addresses this issue by introducing two knowledge distillation methods: Leveraging Isolated Letter Accumulations By Ordering Teacher Insights (LILA-BOTI) and Super Teacher LILA-BOTI. In both cases, a Convolutional Recurrent Neural Network (CRNN) student model is trained with the dark knowledge gained from a printed isolated character recognition teacher model. We conducted inter-dataset testing on BN-HTRd and BanglaWriting as our evaluation protocol, thus setting up a challenging problem where the results would better reflect the performance on unseen data. Our evaluations achieved up to a 3.5% increase in the F1-Macro score for the minor classes and up to 4.5% increase in our overall word recognition rate when compared with the base model (No KD) and conventional KD.
Submitted 23 May, 2022;
originally announced May 2022.
-
Rethinking Task-Incremental Learning Baselines
Authors:
Md Sazzad Hossain,
Pritom Saha,
Townim Faisal Chowdhury,
Shafin Rahman,
Fuad Rahman,
Nabeel Mohammed
Abstract:
It is common to have continuous streams of new data that need to be introduced in the system in real-world applications. The model needs to learn newly added capabilities (future tasks) while retaining the old knowledge (past tasks). Incremental learning has recently become increasingly appealing for this problem. Task-incremental learning is a kind of incremental learning where task identity of newly included task (a set of classes) remains known during inference. A common goal of task-incremental methods is to design a network that can operate on minimal size, maintaining decent performance. To manage the stability-plasticity dilemma, different methods utilize replay memory of past tasks, specialized hardware, regularization monitoring etc. However, these methods are still less memory efficient in terms of architecture growth or input data costs. In this study, we present a simple yet effective adjustment network (SAN) for task incremental learning that achieves near state-of-the-art performance while using minimal architectural size without using memory instances compared to previous state-of-the-art approaches. We investigate this approach on both 3D point cloud object (ModelNet40) and 2D image (CIFAR10, CIFAR100, MiniImageNet, MNIST, PermutedMNIST, notMNIST, SVHN, and FashionMNIST) recognition tasks and establish a strong baseline result for a fair comparison with existing methods. On both 2D and 3D domains, we also observe that SAN is primarily unaffected by different task orders in a task-incremental setting.
Submitted 23 May, 2022;
originally announced May 2022.
-
Digital Twin for Secure Semiconductor Lifecycle Management: Prospects and Applications
Authors:
Hasan Al Shaikh,
Mohammad Bin Monjil,
Shigang Chen,
Navid Asadizanjani,
Farimah Farahmandi,
Mark Tehranipoor,
Fahim Rahman
Abstract:
The expansive globalization of the semiconductor supply chain has introduced numerous untrusted entities into different stages of a device's lifecycle. To make matters worse, the increased design complexity and aggressive time-to-market requirements of the newer generation of integrated circuits can lead designers to unintentionally introduce security vulnerabilities or verification engineers to fail to detect them earlier in the design lifecycle. These overlooked or undetected vulnerabilities can be exploited by malicious entities in subsequent stages of the lifecycle through an ever-widening variety of hardware attacks. The ability to ascertain the provenance of these vulnerabilities, therefore, becomes a pressing issue when security assurance is required across the whole lifecycle. We posit that if there is a malicious or unintentional breach of security policies of a device, it will be reflected in the form of anomalies in the traditional design, verification and testing activities throughout the lifecycle. With that, a digital simulacrum of a device's lifecycle, called a digital twin (DT), can be formed by the data gathered from different stages to secure the lifecycle of the device. In this paper, we put forward a realization of intertwined relationships of security vulnerabilities with data available from the silicon lifecycle and formulate different components of an AI-driven DT framework. The proposed DT framework leverages these relationships and relational learning to achieve Forward and Backward Trust Analysis functionalities enabling security-aware management of the entire lifecycle. Finally, we provide potential future research avenues and challenges for realization of the digital twin framework to enable secure semiconductor lifecycle management.
Submitted 24 May, 2022; v1 submitted 22 May, 2022;
originally announced May 2022.
-
Quantifiable Assurance: From IPs to Platforms
Authors:
Bulbul Ahmed,
Md Kawser Bepary,
Nitin Pundir,
Mike Borza,
Oleg Raikhman,
Amit Garg,
Dale Donchin,
Adam Cron,
Mohamed A Abdel-moneum,
Farimah Farahmandi,
Fahim Rahman,
Mark Tehranipoor
Abstract:
Hardware vulnerabilities are generally considered more difficult to fix than software ones because they are persistent after fabrication. Thus, it is crucial to assess the security and fix the vulnerabilities at earlier design phases, such as Register Transfer Level (RTL) and gate level. The focus of the existing security assessment techniques is mainly twofold. First, they check the security of Intellectual Property (IP) blocks separately. Second, they aim to assess the security against individual threats considering the threats are orthogonal. We argue that IP-level security assessment is not sufficient. Eventually, the IPs are placed in a platform, such as a system-on-chip (SoC), where each IP is surrounded by other IPs connected through glue logic and shared/private buses. Hence, we must develop a methodology to assess the platform-level security by considering both the IP-level security and the impact of the additional parameters introduced during platform integration. Another important factor to consider is that the threats are not always orthogonal. Improving security against one threat may affect the security against other threats. Hence, to build a secure platform, we must first answer the following questions: What additional parameters are introduced during the platform integration? How do we define and characterize the impact of these parameters on security? How do the mitigation techniques of one threat impact others? This paper aims to answer these important questions and proposes techniques for quantifiable assurance by quantitatively estimating and measuring the security of a platform at the pre-silicon stages. We also touch upon the term security optimization and present the challenges for future research directions.
Submitted 16 April, 2022;
originally announced April 2022.
-
A Systematic Review on Interactive Virtual Reality Laboratory
Authors:
Fozlur Rahman,
Marium Sana Mim,
Feekra Baset Baishakhi,
Mahmudul Hasan,
Md. Kishor Morol
Abstract:
Virtual Reality has become a significant element of education throughout the years. To understand the quality and advantages of these techniques, it is important to understand how they were developed and evaluated. Since COVID-19, the education system has changed drastically. It has shifted from being in a classroom with a whiteboard and projectors to being in your own room in front of your laptop in a virtual meeting. In this respect, virtual reality in the laboratory, or Virtual Laboratory, is the main focus of this research, which is intended to comprehend the work done on quality distance education using VR. As per the findings of the study, adopting virtual reality in education can help students learn more effectively and increase their perspective, enthusiasm, and knowledge of complex notions by offering them an interactive experience in which they can engage. This highlights the importance of a significant expansion of VR use in learning. The majority of the reviewed studies employ scientific comparison approaches to compare students who use VR with those who use traditional learning methods.
Submitted 26 March, 2022;
originally announced March 2022.
-
Grasp-and-Lift Detection from EEG Signal Using Convolutional Neural Network
Authors:
Md. Kamrul Hasan,
Sifat Redwan Wahid,
Faria Rahman,
Shanjida Khan Maliha,
Sauda Binte Rahman
Abstract:
People with neuromuscular dysfunctions or amputated limbs require automatic prosthetic appliances. In developing such prostheses, the precise detection of brain motor actions is imperative for the Grasp-and-Lift (GAL) tasks. Because Electroencephalography (EEG) is low-cost and non-invasive, it is widely preferred for detecting motor actions when controlling prosthetic tools. This article automates the detection of hand movement activity, namely GAL, from 32-channel EEG signals. The proposed pipeline essentially combines preprocessing and end-to-end detection steps, eliminating the requirement of hand-crafted feature engineering. Preprocessing consists of raw signal denoising, using Discrete Wavelet Transform (DWT), highpass, or bandpass filtering, and data standardization. The detection step uses a Convolutional Neural Network (CNN)- or Long Short Term Memory (LSTM)-based model. All the investigations utilize the publicly available WAY-EEG-GAL dataset, which has six different GAL events. The best experiment reveals that the proposed framework achieves an average area under the ROC curve of 0.944, employing the DWT-based denoising filter, data standardization, and CNN-based detection model. This result indicates that the introduced method detects GAL events from EEG signals very well, making it applicable to prosthetic appliances, brain-computer interfaces, robotic arms, etc.
Submitted 12 February, 2022;
originally announced February 2022.
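The entry above reports an average area under the ROC curve across GAL events. A minimal sketch of how such a metric can be computed, assuming the average is a plain mean of per-event AUCs (the abstract does not state the averaging scheme); the event names, labels, and scores below are toy values, not data from the paper:

```python
from typing import Dict, List, Tuple


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def mean_auc(per_event: Dict[str, Tuple[List[int], List[float]]]) -> float:
    """Unweighted mean of per-event AUCs (assumed averaging scheme)."""
    aucs = [roc_auc(labels, scores) for labels, scores in per_event.values()]
    return sum(aucs) / len(aucs)


# Toy predictions for two hypothetical events.
toy_events = {
    "event_a": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "event_b": ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.6]),
}
auc_a = roc_auc(*toy_events["event_a"])
average = mean_auc(toy_events)
```

The pairwise formulation is quadratic in the number of samples, so it suits small illustrations; a rank-based implementation would be preferable for full EEG recordings.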
-
TableQuery: Querying tabular data with natural language
Authors:
Abhijith Neil Abraham,
Fariz Rahman,
Damanpreet Kaur
Abstract:
This paper presents TableQuery, a novel tool for querying tabular data using deep learning models pre-trained to answer questions on free text. Existing deep learning methods for question answering on tabular data have various limitations, such as having to feed the entire table as input into a neural network model, making them unsuitable for most real-world applications. Since real-world data might contain millions of rows, it may not entirely fit into the memory. Moreover, data could be stored in live databases, which are updated in real-time, and it is impractical to serialize an entire database to a neural network-friendly format each time it is updated. In TableQuery, we use deep learning models pre-trained for question answering on free text to convert natural language queries to structured queries, which can be run against a database or a spreadsheet. This method eliminates the need for fitting the entire data into memory as well as serializing databases. Furthermore, deep learning models pre-trained for question answering on free text are readily available on platforms such as HuggingFace Model Hub (7). TableQuery does not require re-training; when a newly trained model for question answering with better performance is available, it can replace the existing model in TableQuery.
Submitted 27 January, 2022;
originally announced February 2022.
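The entry above describes converting natural language queries into structured queries that run against a database, so the table never has to be loaded into the model. A minimal sketch of that flow, assuming a slot-filling design; `extract_slots` is a hard-coded, hypothetical stand-in for the pre-trained question-answering model, and the table, columns, and values are toy data rather than anything from the paper:

```python
import sqlite3
from typing import Dict, Set, Tuple


def extract_slots(question: str) -> Dict[str, str]:
    """Hypothetical stand-in for a pre-trained QA model.

    A real system would extract these slots from the question text;
    here they are fixed for the single demo question.
    """
    return {"select": "price", "where_column": "name", "where_value": "pen"}


def build_query(slots: Dict[str, str], table: str,
                allowed_columns: Set[str]) -> Tuple[str, Tuple[str, ...]]:
    """Turn extracted slots into a parameterized SQL query."""
    for column in (slots["select"], slots["where_column"]):
        # Column names cannot be bound as parameters, so validate them instead.
        if column not in allowed_columns:
            raise ValueError(f"unknown column: {column}")
    sql = f"SELECT {slots['select']} FROM {table} WHERE {slots['where_column']} = ?"
    return sql, (slots["where_value"],)


# Toy in-memory database standing in for a live table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, price INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [("pen", 3), ("book", 5)])

sql, params = build_query(
    extract_slots("What is the price of the pen?"),
    table="items",
    allowed_columns={"name", "price"},
)
rows = conn.execute(sql, params).fetchall()
conn.close()
```

Only the question and the column names reach the model side in this design; the filtering itself is done by the database engine, which is why table size and live updates stop being a constraint.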
-
Handover Experiments with UAVs: Software Radio Tools and Experimental Research Platform
Authors:
Keith Powell,
Andrew Yingst,
Talha Faizur Rahman,
Vuk Marojevic
Abstract:
Mobility management is the key feature of cellular networks. When integrating unmanned aerial vehicles (UAVs) into cellular networks, their cell association needs to be carefully managed for coexistence with other cellular users. UAVs move in three dimensions and may traverse several cells on their flight path, and so may be subject to several handovers. In order to enable research on mobility management with UAV users, this paper describes the design, implementation, and testing methodology for handover experiments with aerial users. We leverage software-defined radios (SDRs) and implement a series of tools for preparing the experiment in the laboratory and for taking it outdoors for field testing. We use solely commercial off-the-shelf hardware, open-source software, and an experimental license to enable reproducible and scalable experiments. Our initial outdoor results with two SDR base stations connected to an open-source software core network, implementing the 4G long-term evolution protocol, and one low altitude UAV user equipment demonstrate the handover process.
Submitted 3 January, 2022;
originally announced January 2022.
-
IndoNLI: A Natural Language Inference Dataset for Indonesian
Authors:
Rahmad Mahendra,
Alham Fikri Aji,
Samuel Louvan,
Fahrurrozi Rahman,
Clara Vania
Abstract:
We present IndoNLI, the first human-elicited NLI dataset for Indonesian. We adapt the data collection protocol for MNLI and collect nearly 18K sentence pairs annotated by crowd workers and experts. The expert-annotated data is used exclusively as a test set. It is designed to provide a challenging test-bed for Indonesian NLI by explicitly incorporating various linguistic phenomena such as numerical reasoning, structural changes, idioms, or temporal and spatial reasoning. Experiment results show that XLM-R outperforms other pre-trained models in our data. The best performance on the expert-annotated data is still far below human performance (13.4% accuracy gap), suggesting that this test set is especially challenging. Furthermore, our analysis shows that our expert-annotated data is more diverse and contains fewer annotation artifacts than the crowd-annotated data. We hope this dataset can help accelerate progress in Indonesian NLP research.
Submitted 27 October, 2021;
originally announced October 2021.
-
Network and Physical Layer Attacks and countermeasures to AI-Enabled 6G O-RAN
Authors:
Talha F. Rahman,
Aly Sabri Abdalla,
Keith Powell,
Walaa AlQwider,
Vuk Marojevic
Abstract:
Artificial intelligence (AI) will play an increasing role in cellular network deployment, configuration and management. This paper examines the security implications of AI-driven 6G radio access networks (RANs). While the expected timeline for 6G standardization is still several years out, pre-standardization efforts related to 6G security are already ongoing and will benefit from fundamental and experimental research. The Open RAN (O-RAN) describes an industry-driven open architecture and interfaces for building next generation RANs with AI control. Considering this architecture, we identify the critical threats to data driven network and physical layer elements, the corresponding countermeasures, and the research directions.
Submitted 2 September, 2022; v1 submitted 1 June, 2021;
originally announced June 2021.
-
ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining
Authors:
Alexander R. Fabbri,
Faiaz Rahman,
Imad Rizvi,
Borui Wang,
Haoran Li,
Yashar Mehdad,
Dragomir Radev
Abstract:
While online conversations can cover a vast amount of information in many different formats, abstractive text summarization has primarily focused on modeling solely news articles. This research gap is due, in part, to the lack of standardized datasets for summarizing online discussions. To address this gap, we design annotation protocols motivated by an issues--viewpoints--assertions framework to crowdsource four new datasets on diverse online conversation forms of news comments, discussion forums, community question answering forums, and email threads. We benchmark state-of-the-art models on our datasets and analyze characteristics associated with the data. To create a comprehensive benchmark, we also evaluate these models on widely-used conversation summarization datasets to establish strong baselines in this domain. Furthermore, we incorporate argument mining through graph construction to directly model the issues, viewpoints, and assertions present in a conversation and filter noisy input, showing comparable or improved results according to automatic and human evaluations.
Submitted 1 June, 2021;
originally announced June 2021.
-
UAVs with Reconfigurable Intelligent Surfaces: Applications, Challenges, and Opportunities
Authors:
Aly Sabri Abdalla,
Talha Faizur Rahman,
Vuk Marojevic
Abstract:
A reconfigurable intelligent surface (RIS) is a metamaterial that can be integrated into walls and influence the propagation of electromagnetic waves. This typically passive radio frequency (RF) technology is emerging for indoor and outdoor use, with the potential of making wireless communications more reliable in increasingly challenging radio environments. This paper goes one step further and introduces mobile RIS, specifically RIS carried by unmanned aerial vehicles (UAVs), to support cellular communications networks and services of the future. We elaborate on several use cases, challenges, and future research opportunities for designing and optimizing wireless systems at low cost and with a low energy footprint.
Submitted 8 December, 2020;
originally announced December 2020.
-
RanStop: A Hardware-assisted Runtime Crypto-Ransomware Detection Technique
Authors:
Nitin Pundir,
Mark Tehranipoor,
Fahim Rahman
Abstract:
Among the many prevailing forms of malware, crypto-ransomware poses a significant threat: it denies users access to their documents through unauthorized encryption and holds those documents hostage to extort payment. This results in millions of dollars of annual losses worldwide. Ransomware variants are growing in number, with capabilities for evading many anti-virus and software-only malware detection schemes that rely on static execution signatures. In this paper, we propose a hardware-assisted scheme, called RanStop, for early detection of crypto-ransomware infection in commodity processors. RanStop leverages the hardware performance counters (HPCs) embedded in the performance monitoring unit of modern processors to observe micro-architectural event sets and detect known and unknown crypto-ransomware variants. We train a recurrent neural network-based machine learning architecture using a long short-term memory (LSTM) model to analyze micro-architectural events in the hardware domain when executing multiple variants of ransomware as well as benign programs. We create time series to develop intrinsic statistical features using the information of related HPCs, and we improve the detection accuracy of RanStop and reduce noise via LSTM and global average pooling. As an early detection scheme, RanStop can accurately and quickly identify ransomware within 2 ms from the start of program execution by analyzing HPC information collected for 20 timestamps, each 100 µs apart. This detection time is too early for ransomware to do any significant damage, if any. Moreover, validation against benign programs with behavioral (sub-routine-centric) similarity to crypto-ransomware shows that RanStop can detect ransomware with an average of 97% accuracy over fifty random trials.
Submitted 24 November, 2020;
originally announced November 2020.
-
A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes
Authors:
Samiul Alam,
Tahsin Reasat,
Asif Shahriyar Sushmit,
Sadi Mohammad Siddiquee,
Fuad Rahman,
Mahady Hasan,
Ahmed Imtiaz Humayun
Abstract:
Latin has historically led the state of the art in handwritten optical character recognition (OCR) research. Adapting existing systems from Latin to alpha-syllabary languages is particularly challenging due to a sharp contrast between their orthographies. The segmentation of graphical constituents corresponding to characters becomes significantly harder due to a cursive writing system and frequent use of diacritics in the alpha-syllabary family of languages. We propose a labeling scheme based on graphemes (linguistic segments of word formation) that makes segmentation inside alpha-syllabary words linear, and we present the first dataset of Bengali handwritten graphemes that are commonly used in an everyday context. The dataset contains 411k curated samples of 1295 unique commonly used Bengali graphemes. Additionally, the test set contains 900 uncommon Bengali graphemes for out-of-dictionary performance evaluation. The dataset is open-sourced as part of a public Handwritten Grapheme Classification Challenge on Kaggle to benchmark vision algorithms for multi-target grapheme classification. The unique graphemes present in this dataset are selected based on commonality in the Google Bengali ASR corpus. From the competition proceedings, we see that deep-learning methods can generalize to a large span of out-of-dictionary graphemes that are absent during training. The dataset and starter code are available at www.kaggle.com/c/bengaliai-cv19.
Submitted 13 January, 2021; v1 submitted 30 September, 2020;
originally announced October 2020.
-
DART: Open-Domain Structured Data Record to Text Generation
Authors:
Linyong Nan,
Dragomir Radev,
Rui Zhang,
Amrit Rau,
Abhinand Sivaprasad,
Chiachun Hsieh,
Xiangru Tang,
Aadit Vyas,
Neha Verma,
Pranav Krishna,
Yangxiaokang Liu,
Nadia Irwanto,
Jessica Pan,
Faiaz Rahman,
Ahmad Zaidi,
Mutethia Mutuma,
Yasin Tarabar,
Ankit Gupta,
Tao Yu,
Yi Chern Tan,
Xi Victoria Lin,
Caiming Xiong,
Richard Socher,
Nazneen Fatema Rajani
Abstract:
We present DART, an open domain structured DAta Record to Text generation dataset with over 82k instances (DARTs). Data-to-Text annotations can be a costly process, especially when dealing with tables which are the major source of structured data and contain nontrivial structures. To this end, we propose a procedure of extracting semantic triples from tables that encodes their structures by exploiting the semantic dependencies among table headers and the table title. Our dataset construction framework effectively merged heterogeneous sources from open domain semantic parsing and dialogue-act-based meaning representation tasks by utilizing techniques such as: tree ontology annotation, question-answer pair to declarative sentence conversion, and predicate unification, all with minimum post-editing. We present systematic evaluation on DART as well as new state-of-the-art results on WebNLG 2017 to show that DART (1) poses new challenges to existing data-to-text datasets and (2) facilitates out-of-domain generalization. Our data and code can be found at https://github.com/Yale-LILY/dart.
Submitted 12 April, 2021; v1 submitted 6 July, 2020;
originally announced July 2020.
-
Topological Descriptors for Parkinson's Disease Classification and Regression Analysis
Authors:
Afra Nawar,
Farhan Rahman,
Narayanan Krishnamurthi,
Anirudh Som,
Pavan Turaga
Abstract:
At present, the vast majority of human subjects with neurological disease are still diagnosed through in-person assessments and qualitative analysis of patient data. In this paper, we propose to use Topological Data Analysis (TDA) together with machine learning tools to automate the process of Parkinson's disease classification and severity assessment. An automated, stable, and accurate method to evaluate Parkinson's would be significant in streamlining diagnoses of patients and providing families more time for corrective measures. We propose a methodology which incorporates TDA into analyzing Parkinson's disease postural shifts data through the representation of persistence images. Studying the topology of a system has proven to be invariant to small changes in data and has been shown to perform well in discrimination tasks. The contributions of the paper are twofold. We propose a method to 1) classify healthy patients from those afflicted by disease and 2) diagnose the severity of disease. We explore the use of the proposed method in an application involving a Parkinson's disease dataset comprised of healthy-elderly, healthy-young and Parkinson's disease patients. Our code is available at https://github.com/itsmeafra/Sublevel-Set-TDA.
Submitted 6 May, 2020; v1 submitted 15 April, 2020;
originally announced April 2020.
-
Real-Time Kinodynamic Motion Planning for Omnidirectional Mobile Robot Soccer using Rapidly-Exploring Random Tree in Dynamic Environment with Moving Obstacles
Authors:
Fahri Ali Rahman,
Igi Ardiyanto,
Adha Imam Cahyadi
Abstract:
RoboCup Middle Size League (RoboCup MSL) provides a standardized testbed for research on mobile robot navigation, multi-robot cooperation, communication and integration via a robot soccer competition in which the environment is highly dynamic and adversarial. One important research topic in this area is kinodynamic motion planning, which plans the trajectory of the robot while avoiding obstacles and obeying its dynamics. Kinodynamic motion planning for an omnidirectional robot based on the kinodynamic-RRT* method is presented in this work. Trajectory tracking control to execute the planned trajectory is also considered. Robot motion planning in the translational and rotational directions is decoupled. We implement kinodynamic-RRT* with a double integrator model to plan the translational trajectory. The rotational trajectory is generated using a minimum-time trajectory generator satisfying velocity and acceleration constraints. The planned trajectory is then tracked using PI control. To address the changing environment, we developed concurrent software modules for motion planning and trajectory tracking. The resulting system was applied and tested using a RoboCup simulation system based on the Robot Operating System (ROS). The simulation results show that the motion planning system is able to generate a collision-free trajectory and the trajectory tracking system is able to follow the generated trajectory. It is also shown that in a highly dynamic environment the online scheme is able to re-plan the trajectory.
Submitted 12 May, 2019;
originally announced May 2019.
-
A Comparative Analysis of the Cyber Security Strategy of Bangladesh
Authors:
Kaushik Sarker,
Hasibur Rahman,
Khandaker Farzana Rahman,
Md. Shohel Arman,
Saikat Biswas,
Touhid Bhuiyan
Abstract:
Technology is an endlessly evolving expression of the modern era, which has increased security concerns and pushed us to create a cyber environment. A National Cyber Security Strategy (NCSS) reflects the state of a country's cyber strength and represents the aim and vision of that country's cyber security. Previously, researchers have worked on NCSS by comparing the strategies of different nations for international collaboration and harmonization, and some researchers have worked on policy frameworks for their respective governments. However, very few attempts have been made to assess the strategic strength of the NCSS of Bangladesh through cross comparisons with the NCSS of other nations. Therefore, the motive of this research is to evaluate the robustness of the existing cyber security strategy of Bangladesh in comparison with some of the most technologically advanced countries in the Asian continent and elsewhere, such as the USA, Japan, Singapore, Malaysia and India, in order to keep the NCSS of Bangladesh up to date.
Submitted 1 May, 2019;
originally announced May 2019.
-
Challenges in Partially-Automated Roadway Feature Mapping Using Mobile Laser Scanning and Vehicle Trajectory Data
Authors:
Mohammad Billah,
Farzana Rahman,
Arash Maskooki,
Michael Todd,
Matthew Barth,
Jay A. Farrell
Abstract:
Connected vehicle and driver's assistance applications are greatly facilitated by Enhanced Digital Maps (EDMs) that represent roadway features (e.g., lane edges or centerlines, stop bars). Due to the large number of signalized intersections and miles of roadway, manual development of EDMs on a global basis is not feasible. Mobile Terrestrial Laser Scanning (MTLS) is the preferred data acquisition method to provide data for automated EDM development. Such systems provide an MTLS trajectory and a point cloud for the roadway environment. The challenge is to automatically convert these data into an EDM. This article presents a new processing and feature extraction method, experimental demonstration providing SAE-J2735 map messages for eleven example intersections, and a discussion of the results that points out remaining challenges and suggests directions for future research.
Submitted 8 February, 2019;
originally announced February 2019.
-
AI Learns to Recognize Bengali Handwritten Digits: Bengali.AI Computer Vision Challenge 2018
Authors:
Sharif Amit Kamran,
Ahmed Imtiaz Humayun,
Samiul Alam,
Rashed Mohammad Doha,
Manash Kumar Mandal,
Tahsin Reasat,
Fuad Rahman
Abstract:
Solving problems with artificial intelligence in a competitive manner has long been absent in Bangladesh and the Bengali-speaking community. Moreover, there has not been a well-structured database of Bengali handwritten digits for mass public use. To bring out the best minds working in machine learning and use their expertise to create a model that can easily recognize Bengali handwritten digits, we organized the Bengali.AI Computer Vision Challenge. The challenge saw both local and international teams participating with unprecedented efforts.
Submitted 10 October, 2018;
originally announced October 2018.
-
Gene Shaving using influence function of a kernel method
Authors:
Md. Ashad Alam,
Mohammad Shahjama,
Md. Ferdush Rahman
Abstract:
Identifying significant subsets of genes, known as gene shaving, is an essential and challenging issue for biomedical research because of the huge number of genes and the complex nature of biological networks. Since positive definite kernel-based methods on genomic information can improve the prediction of diseases, in this paper we propose a new method, "kernel gene shaving" (kernel canonical correlation analysis (kernel CCA) based gene shaving). The problem is addressed using the influence function of the kernel CCA. To investigate the performance of the proposed method in comparison with three popular gene selection methods (T-test, SAM and LIMMA), we used extensive simulated and real microarray gene expression datasets. The performance measure AUC was computed for each of the methods. The proposed method performed better than the three well-known gene selection methods. In real data analysis, the proposed method identified a subset of $210$ genes out of $2000$ genes. The network of these genes has significantly more interactions than expected, which indicates that they may function in a concerted effort on colon cancer.
Submitted 5 September, 2018;
originally announced September 2018.
-
Efficient Computation of Subspace Skyline over Categorical Domains
Authors:
Md Farhadur Rahman,
Abolfazl Asudeh,
Nick Koudas,
Gautam Das
Abstract:
Platforms such as AirBnB, Zillow, Yelp, and related sites have transformed the way we search for accommodation, restaurants, etc. The underlying datasets in such applications have numerous attributes that are mostly Boolean or categorical. Discovering the skyline of such datasets over a subset of attributes would identify entries that stand out while enabling numerous applications. Only a few algorithms are designed to compute the skyline over categorical attributes, and they are applicable only when the number of attributes is small.
In this paper, we place the problem of skyline discovery over categorical attributes into perspective and design efficient algorithms for two cases. (i) In the absence of indices, we propose two algorithms, ST-S and ST-P, that exploit the categorical characteristics of the datasets, organizing tuples in a tree data structure that supports efficient dominance tests over the candidate set. (ii) We then consider the existence of widely used precomputed sorted lists. After discussing several approaches and studying their limitations, we propose TA-SKY, a novel threshold-style algorithm that utilizes sorted lists. Moreover, we further optimize TA-SKY and explore its progressive nature, making it suitable for applications with strict interactive requirements. In addition to an extensive theoretical analysis of the proposed algorithms, we conduct a comprehensive experimental evaluation over a combination of real (including the entire AirBnB data collection) and synthetic datasets to study the practicality of the proposed algorithms. The results showcase the superior performance of our techniques, outperforming applicable approaches by orders of magnitude.
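For readers unfamiliar with the terms, the dominance test and the subspace skyline mentioned in the abstract can be illustrated with a minimal Python sketch. This is the quadratic textbook baseline, not ST-S, ST-P, or TA-SKY; the names `dominates`, `subspace_skyline`, and `listings`, the Boolean attributes, and the "larger is better" convention are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Row = Dict[str, int]


def dominates(a: Row, b: Row, attrs: Sequence[str]) -> bool:
    """Return True if a is at least as good as b on every attribute in
    attrs and strictly better on at least one (assumed: larger is better)."""
    at_least_as_good = all(a[x] >= b[x] for x in attrs)
    strictly_better = any(a[x] > b[x] for x in attrs)
    return at_least_as_good and strictly_better


def subspace_skyline(rows: List[Row], attrs: Sequence[str]) -> List[Row]:
    """Naive O(n^2) skyline: keep every row that no other row dominates
    on the chosen subset of attributes."""
    return [r for r in rows if not any(dominates(o, r, attrs) for o in rows)]


# Hypothetical listings with Boolean amenities (1 = offered, 0 = not offered).
listings = [
    {"id": 1, "wifi": 1, "parking": 0, "kitchen": 1},
    {"id": 2, "wifi": 1, "parking": 1, "kitchen": 0},
    {"id": 3, "wifi": 0, "parking": 0, "kitchen": 1},
    {"id": 4, "wifi": 1, "parking": 1, "kitchen": 1},
]

# Skyline over the subspace {wifi, parking}; the "id" field is never compared.
skyline_ids = [r["id"] for r in subspace_skyline(listings, ["wifi", "parking"])]
print(skyline_ids)
```

Because every pair of tuples is compared, this baseline scales poorly with dataset size, which is the kind of cost that tree-organized dominance tests and sorted-list access are meant to avoid.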
Submitted 30 May, 2017; v1 submitted 28 February, 2017;
originally announced March 2017.
-
HDBSCAN: Density based Clustering over Location Based Services
Authors:
Md Farhadur Rahman,
Weimo Liu,
Saad Bin Suhaim,
Saravanan Thirumuruganathan,
Nan Zhang,
Gautam Das
Abstract:
Location Based Services (LBS) have become extremely popular and are used by millions of users. Popular LBS run the entire gamut from mapping services (such as Google Maps) to restaurants (such as Yelp) and real estate (such as Redfin). The public query interfaces of LBS can be abstractly modeled as a kNN interface over a database of two-dimensional points: given an arbitrary query point, the system returns the k points in the database that are nearest to the query point. Often, k is set to a small value such as 20 or 50. In this paper, we consider the novel problem of enabling density-based clustering over an LBS with only a limited kNN query interface. Due to the query rate limits imposed by LBS, even retrieving every tuple once is infeasible. Hence, we seek to construct a cluster assignment function f(.) by issuing a small number of kNN queries, such that for any given tuple t in the database, which may or may not have been accessed, f(.) outputs the cluster assignment of t with high accuracy. We conduct a comprehensive set of experiments over benchmark datasets and popular real-world LBS such as Yahoo! Flickr, Zillow, Redfin and Google Maps.
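The restricted access model described in the abstract can be sketched in a few lines of Python. This is only an illustration of a kNN interface with a query budget, not the paper's clustering method; the class `KnnInterface`, its `query_budget` parameter, and the sample points are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


class KnnInterface:
    """Toy model of an LBS public interface: the caller cannot scan the
    hidden database and only receives the k nearest points per query,
    up to a fixed number of queries (a stand-in for a rate limit)."""

    def __init__(self, points: List[Point], k: int, query_budget: int) -> None:
        self._points = list(points)  # hidden from the caller
        self.k = k
        self.query_budget = query_budget
        self.queries_used = 0

    def query(self, q: Point) -> List[Point]:
        """Return the k stored points nearest to q (Euclidean distance)."""
        if self.queries_used >= self.query_budget:
            raise RuntimeError("query budget exhausted")
        self.queries_used += 1
        return sorted(self._points, key=lambda p: math.dist(p, q))[: self.k]


# Hypothetical database: two small groups of points on a plane.
lbs = KnnInterface(
    points=[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)],
    k=2,
    query_budget=2,
)

nearest = lbs.query((0.1, 0.0))
print(nearest)
```

Any algorithm built on top of such an interface has to infer global structure from a small number of top-k answers, which is why the query count matters as much as accuracy.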
Submitted 16 February, 2016; v1 submitted 11 February, 2016;
originally announced February 2016.
-
Aggregate Estimations over Location Based Services
Authors:
Weimo Liu,
Md Farhadur Rahman,
Saravanan Thirumuruganathan,
Nan Zhang,
Gautam Das
Abstract:
Location based services (LBS) have become very popular in recent years. They range from map services (e.g., Google Maps) that store geographic locations of points of interest, to online social networks (e.g., WeChat, Sina Weibo, FourSquare) that leverage user geographic locations to enable various recommendation functions. The public query interfaces of these services may be abstractly modeled as a kNN interface over a database of two-dimensional points on a plane: given an arbitrary query point, the system returns the k points in the database that are nearest to the query point. In this paper we consider the problem of obtaining approximate estimates of SUM and COUNT aggregates by querying such databases only via their restrictive public interfaces. We distinguish between interfaces that return location information of the returned tuples (e.g., Google Maps), and interfaces that do not return location information (e.g., Sina Weibo). For both types of interfaces, we develop aggregate estimation algorithms that are based on novel techniques for precisely computing or approximately estimating the Voronoi cell of tuples. We discuss a comprehensive set of real-world experiments for testing our algorithms, including experiments on Google Maps, WeChat, and Sina Weibo.
Submitted 13 May, 2015; v1 submitted 10 May, 2015;
originally announced May 2015.
-
Rank-Based Inference over Web Databases
Authors:
Md Farhadur Rahman,
Weimo Liu,
Saravanan Thirumuruganathan,
Nan Zhang,
Gautam Das
Abstract:
In recent years, there has been much research on the Ranked Retrieval model in structured databases, especially web databases. With this model, a search query returns top-k tuples according to not just exact matches of selection conditions, but a suitable ranking function. This paper studies a novel problem on the privacy implications of database ranking. The motivation is a novel yet serious privacy leakage we found on real-world web databases, which is caused by the ranking function design. Many such databases feature private attributes: for example, a social network allows users to mark certain attributes as visible only to themselves and not to others. While these websites generally respect the privacy settings by not directly displaying private attribute values in search query answers, many of them nevertheless take such private attributes into account in the ranking function design. The conventional belief might be that tuple ranks alone are not enough to reveal the private attribute values. Our investigation, however, shows that this is not the case in reality.
To address the problem, we introduce a taxonomy of the problem space with two dimensions: (1) the type of query interface and (2) the capability of adversaries. For each subspace, we develop a novel technique which either guarantees the successful inference of private attributes, or does so for a significant portion of real-world tuples. We demonstrate the effectiveness and efficiency of our techniques through theoretical analysis, extensive experiments over real-world datasets, as well as successful online attacks over websites with tens to hundreds of millions of users, e.g., Amazon Goodreads and Renren.com.
Submitted 5 April, 2015; v1 submitted 5 November, 2014;
originally announced November 2014.
-
Analyzing an Analytical Solution Model for Simultaneous Mobility
Authors:
Md. Ibrahim Chowdhury,
Mohammad Iqbal,
Naznin Sultana,
Faisal Rahman
Abstract:
Current models of simultaneous mobility are convoluted in their design of simultaneous movement, in which mobile nodes (MNs) travel randomly from two adjacent cells at the same time, and complex in how they measure the occurrences of simultaneous handover. The simultaneous mobility problem occurs when two MNs start handover at approximately the same time. Because simultaneous mobility differs from other mobility patterns and generally occurs less often in practice, we argue that a simplified simultaneous mobility model can be obtained by considering only symmetric positions of MNs with random steps. In addition, we simulate the model using mSCTP and compare the simulation results in different scenarios with customized cell ranges. The analytical results show that the bigger the cell size, the less often simultaneous handover with random steps occurs, and that for sequential mobility (where the initial positions of MNs are predetermined) with random steps, simultaneous handover is more frequent.
Submitted 9 January, 2014;
originally announced January 2014.
-
Performance Analysis of Estimation of Distribution Algorithm and Genetic Algorithm in Zone Routing Protocol
Authors:
Mst. Farhana Rahman,
S. M. Masud Karim,
Kazi Shah Nawaz Ripon,
Md. Iqbal Hossain Suvo
Abstract:
In this paper, an Estimation of Distribution Algorithm (EDA) is used for the Zone Routing Protocol (ZRP) in Mobile Ad-hoc Networks (MANETs) instead of a Genetic Algorithm (GA). It is an evolutionary approach, used when the network size grows and the search space increases. When the destination is outside the zone, EDA is applied to find the route with minimum cost and time. The implementation of the proposed method is compared with Genetic ZRP (GZRP), and the results demonstrate better performance for the proposed method. Since the method provides a set of paths to the destination, it results in load balancing across the network. As both EDA and GA use a random search method to reach the optimal point, the searching cost is reduced significantly, especially when the amount of data is large.
Submitted 22 September, 2010;
originally announced September 2010.
-
Improvement of Text Dependent Speaker Identification System Using Neuro-Genetic Hybrid Algorithm in Office Environmental Conditions
Authors:
Md. Rabiul Islam,
Md. Fayzur Rahman
Abstract:
In this paper, an improved strategy for an automated text-dependent speaker identification system in a noisy environment is proposed. The identification process incorporates the Neuro-Genetic hybrid algorithm with cepstral-based features. To remove the background noise from the source utterance, a Wiener filter is used. Different speech pre-processing techniques, such as a start-end point detection algorithm, pre-emphasis filtering, frame blocking and windowing, are used to process the speech utterances. RCC, MFCC, ΔMFCC, ΔΔMFCC, LPC and LPCC are used to extract the features. After feature extraction, the Neuro-Genetic hybrid algorithm is used for learning and identification. Features are extracted using different techniques to optimize the identification performance. On the VALID speech database, the highest speaker identification rates achieved by the closed-set text-dependent speaker identification system are 100.000 percent for the studio environment and 82.33 percent for office environmental conditions.
Submitted 12 September, 2009;
originally announced September 2009.