-
SurgTPGS: Semantic 3D Surgical Scene Understanding with Text Promptable Gaussian Splatting
Authors:
Yiming Huang,
Long Bai,
Beilei Cui,
Kun Yuan,
Guankun Wang,
Mobarak I. Hoque,
Nicolas Padoy,
Nassir Navab,
Hongliang Ren
Abstract:
In contemporary surgical research and practice, accurately comprehending 3D surgical scenes with text-promptable capabilities is particularly crucial for surgical planning and real-time intra-operative guidance, where precisely identifying and interacting with surgical tools and anatomical structures is paramount. However, existing works focus on surgical vision-language models (VLMs), 3D reconstruction, and segmentation separately, lacking support for real-time text-promptable 3D queries. In this paper, we present SurgTPGS, a novel text-promptable Gaussian Splatting method to fill this gap. We introduce a 3D semantic feature learning strategy incorporating the Segment Anything model and state-of-the-art vision-language models. We extract the segmented language features for 3D surgical scene reconstruction, enabling a more in-depth understanding of the complex surgical environment. We also propose semantic-aware deformation tracking to capture the seamless deformation of semantic features, providing a more precise reconstruction for both texture and semantic features. Furthermore, we present semantic region-aware optimization, which utilizes region-based semantic information to supervise the training, particularly promoting the reconstruction quality and semantic smoothness. We conduct comprehensive experiments on two real-world surgical datasets to demonstrate the superiority of SurgTPGS over state-of-the-art methods, highlighting its potential to revolutionize surgical practices. SurgTPGS paves the way for developing next-generation intelligent surgical systems by enhancing surgical precision and safety. Our code is available at: https://github.com/lastbasket/SurgTPGS.
Submitted 1 July, 2025; v1 submitted 29 June, 2025;
originally announced June 2025.
-
Quantum-Resistant Domain Name System: A Comprehensive System-Level Study
Authors:
Juyoul Lee,
Sanzida Hoque,
Abdullah Aydeger,
Engin Zeydan
Abstract:
The Domain Name System (DNS) plays a foundational role in Internet infrastructure, yet its core protocols remain vulnerable to compromise by quantum adversaries. As cryptographically relevant quantum computers become a realistic threat, ensuring DNS confidentiality, authenticity, and integrity in the post-quantum era is imperative. In this paper, we present a comprehensive system-level study of post-quantum DNS security across three widely deployed mechanisms: DNSSEC, DNS-over-TLS (DoT), and DNS-over-HTTPS (DoH). We propose Post-Quantum Cryptographic (PQC)-DNS, a unified framework for benchmarking DNS security under legacy, post-quantum, and hybrid cryptographic configurations. Our implementation leverages the Open Quantum Safe (OQS) libraries and integrates lattice- and hash-based primitives into BIND9 and TLS 1.3 stacks. We formalize performance and threat models and analyze the impact of post-quantum key encapsulation and digital signatures on end-to-end DNS resolution. Experimental results on a containerized testbed reveal that lattice-based primitives such as Module-Lattice-Based Key-Encapsulation Mechanism (MLKEM) and Falcon offer practical latency and resource profiles, while hash-based schemes like SPHINCS+ significantly increase message sizes and processing overhead. We also examine security implications including downgrade risks, fragmentation vulnerabilities, and susceptibility to denial-of-service amplification. Our findings inform practical guidance for deploying quantum-resilient DNS and contribute to the broader effort of securing core Internet protocols for the post-quantum future.
Submitted 24 June, 2025;
originally announced June 2025.
-
IGraSS: Learning to Identify Infrastructure Networks from Satellite Imagery by Iterative Graph-constrained Semantic Segmentation
Authors:
Oishee Bintey Hoque,
Abhijin Adiga,
Aniruddha Adiga,
Siddharth Chaudhary,
Madhav V. Marathe,
S. S. Ravi,
Kirti Rajagopalan,
Amanda Wilson,
Samarth Swarup
Abstract:
Accurate canal network mapping is essential for water management, including irrigation planning and infrastructure maintenance. State-of-the-art semantic segmentation models for infrastructure mapping, such as roads, rely on large, well-annotated remote sensing datasets. However, incomplete or inadequate ground truth can hinder these learning approaches. Many infrastructure networks have graph-level properties such as reachability to a source (like canals) or connectivity (roads) that can be leveraged to improve the existing ground truth. This paper develops a novel iterative framework, IGraSS, combining a semantic segmentation module, which incorporates RGB and additional modalities (NDWI, DEM), with a graph-based ground-truth refinement module. The segmentation module processes satellite imagery patches, while the refinement module operates on the entire data, viewing the infrastructure network as a graph. Experiments show that IGraSS reduces unreachable canal segments from around 18% to 3%, and training with refined ground truth significantly improves canal identification. IGraSS serves as a robust framework for both refining noisy ground truth and mapping canal networks from remote sensing imagery. We also demonstrate the effectiveness and generalizability of IGraSS using road networks as an example, applying a different graph-theoretic constraint to complete road networks.
Submitted 10 June, 2025; v1 submitted 9 June, 2025;
originally announced June 2025.
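Editorial note on the IGraSS entry: the abstract's "unreachable canal segments" figure rests on a graph reachability check. The sketch below is not the authors' code; it only illustrates, under the assumption that canal segments are modeled as nodes of an undirected graph with known water sources, how such a fraction could be computed with breadth-first search. The function name `unreachable_fraction` and the toy graph are hypothetical.

```python
from collections import deque


def unreachable_fraction(nodes, edges, sources):
    """Return the fraction of nodes not reachable from any source.

    Assumption for illustration: the network is an undirected graph and
    every source is itself one of the nodes.
    """
    if not nodes:
        return 0.0

    # Build an adjacency map for the undirected graph.
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Multi-source breadth-first search starting from all sources at once.
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)

    return (len(nodes) - len(seen)) / len(nodes)


# Toy example: segments "c" and "d" are disconnected from source "s".
toy_nodes = ["s", "a", "b", "c", "d"]
toy_edges = [("s", "a"), ("a", "b"), ("c", "d")]
toy_fraction = unreachable_fraction(toy_nodes, toy_edges, ["s"])
```

A refinement step in this spirit would try to add edges that lower this fraction; how IGraSS actually does so is described in the paper, not here.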
-
Personalized Large Language Models Can Increase the Belief Accuracy of Social Networks
Authors:
Adiba Mahbub Proma,
Neeley Pate,
Sean Kelty,
Gourab Ghoshal,
James N. Druckman,
Ehsan Hoque
Abstract:
Large language models (LLMs) are increasingly involved in shaping public understanding on contested issues. This has led to substantial discussion about the potential of LLMs to reinforce or correct misperceptions. While existing literature documents the impact of LLMs on individuals' beliefs, limited work explores how LLMs affect social networks. We address this gap with a pre-registered experiment (N = 1265) around the 2024 US presidential election, where we empirically explore the impact of personalized LLMs on belief accuracy in the context of social networks. The LLMs are constructed to be personalized, offering messages tailored to individuals' profiles, and to have guardrails for accurate information retrieval. We find that the presence of a personalized LLM leads individuals to update their beliefs towards the truth. More importantly, individuals with a personalized LLM in their social network not only choose to follow it, indicating they would like to obtain information from it in subsequent interactions, but also construct subsequent social networks to include other individuals with beliefs similar to the LLM -- in this case, more accurate beliefs. Therefore, our results show that LLMs have the capacity to influence individual beliefs and the social networks in which people exist, and highlight the potential of LLMs to act as corrective agents in online environments. Our findings can inform future strategies for responsible AI-mediated communication.
Submitted 6 June, 2025;
originally announced June 2025.
-
Improving Automatic Evaluation of Large Language Models (LLMs) in Biomedical Relation Extraction via LLMs-as-the-Judge
Authors:
Md Tahmid Rahman Laskar,
Israt Jahan,
Elham Dolatabadi,
Chun Peng,
Enamul Hoque,
Jimmy Huang
Abstract:
Large Language Models (LLMs) have demonstrated impressive performance in biomedical relation extraction, even in zero-shot scenarios. However, evaluating LLMs in this task remains challenging due to their ability to generate human-like text, often producing synonyms or abbreviations of gold-standard answers, making traditional automatic evaluation metrics unreliable. On the other hand, while human evaluation is more reliable, it is costly and time-consuming, making it impractical for real-world applications. This paper investigates the use of LLMs-as-the-Judge as an alternative evaluation method for biomedical relation extraction. We benchmark 8 LLMs as judges to evaluate the responses generated by 5 other LLMs across 3 biomedical relation extraction datasets. Unlike other text-generation tasks, we observe that LLM-based judges perform quite poorly (usually below 50% accuracy) in the biomedical relation extraction task. Our findings reveal that it happens mainly because relations extracted by LLMs do not adhere to any standard format. To address this, we propose structured output formatting for LLM-generated responses that helps LLM-Judges to improve their performance by about 15% (on average). We also introduce a domain adaptation technique to further enhance LLM-Judge performance by effectively transferring knowledge between datasets. We release both our human-annotated and LLM-annotated judgment data (36k samples in total) for public use here: https://github.com/tahmedge/llm_judge_biomedical_re.
Submitted 31 May, 2025;
originally announced June 2025.
-
A White Paper on The Multi-Messenger Science Landscape in India
Authors:
Samsuzzaman Afroz,
Sanjib Kumar Agarwalla,
Dipankar Bhattacharya,
Soumya Bhattacharya,
Subir Bhattacharyya,
Varun Bhalerao,
Debanjan Bose,
Chinmay Borwanker,
Ishwara Chandra C. H.,
Aniruddha Chakraborty,
Indranil Chakraborty,
Sovan Chakraborty,
Debarati Chatterjee,
Varsha Chitnis,
Moon Moon Devi,
Sanjeev Dhurandhar,
Amol Dighe,
Bitan Ghosal,
Sourendu Gupta,
Arpan Hait,
Md Emanuel Hoque,
Pratik Majumdar,
Nilmani Mathur,
Harsh Mehta,
Subhendra Mohanty
, et al. (13 additional authors not shown)
Abstract:
Multi-messenger science, which uses different observational windows to the Universe such as Gravitational Waves (GWs), Electromagnetic Waves (EMs), Cosmic Rays (CRs), and Neutrinos, offers an opportunity to study scales ranging from that of a neutron star to cosmological scales over a large span of cosmic time. At the smallest scales, we can explore the structure of the neutron star and the different energetics involved in the transition of a pre-merger neutron star to a post-merger neutron star. This will open up a window to study the properties of matter in extreme conditions and a guaranteed discovery space. On the other hand, at the largest cosmological scales, multi-messenger observations allow us to study the long-standing problems in physical cosmology related to the Hubble constant, dark matter, and dark energy by mapping the expansion history of the Universe using GW sources. Moreover, the multi-messenger studies of astrophysical systems such as white dwarfs, neutron stars, and black holes of different masses, all the way up to the high-redshift Universe, will bring insightful understanding into the physical processes associated with them that are inaccessible otherwise. This white paper discusses the key cases in the domain of multi-messenger astronomy and the role of observatories in India which can explore uncharted territories and open discovery spaces in different branches of physics ranging from nuclear physics to astrophysics.
Submitted 30 May, 2025;
originally announced May 2025.
-
An Exploratory Approach Towards Investigating and Explaining Vision Transformer and Transfer Learning for Brain Disease Detection
Authors:
Shuvashis Sarker,
Shamim Rahim Refat,
Faika Fairuj Preotee,
Shifat Islam,
Tashreef Muhammad,
Mohammad Ashraful Hoque
Abstract:
The brain is a highly complex organ that manages many important tasks, including movement, memory and thinking. Brain-related conditions, like tumors and degenerative disorders, can be hard to diagnose and treat. Magnetic Resonance Imaging (MRI) serves as a key tool for identifying these conditions, offering high-resolution images of brain structures. Despite this, interpreting MRI scans can be complicated. This study tackles this challenge by conducting a comparative analysis of Vision Transformer (ViT) and Transfer Learning (TL) models such as VGG16, VGG19, ResNet50V2, and MobileNetV2 for classifying brain diseases using MRI data from a Bangladesh-based dataset. ViTs, known for their ability to capture global relationships in images, are particularly effective for medical imaging tasks. Transfer learning helps to mitigate data constraints by fine-tuning pre-trained models. Furthermore, Explainable AI (XAI) methods such as GradCAM, GradCAM++, LayerCAM, ScoreCAM, and Faster-ScoreCAM are employed to interpret model predictions. The results demonstrate that ViT surpasses transfer learning models, achieving a classification accuracy of 94.39%. The integration of XAI methods enhances model transparency, offering crucial insights to aid medical professionals in diagnosing brain diseases with greater precision.
Submitted 22 June, 2025; v1 submitted 21 May, 2025;
originally announced May 2025.
-
Reading.help: Supporting EFL Readers with Proactive and On-Demand Explanation of English Grammar and Semantics
Authors:
Sunghyo Chung,
Hyeon Jeon,
Sungbok Shin,
Md Naimul Hoque
Abstract:
A large portion of texts in the world is written in English, but readers who see English as a Foreign Language (EFL) often struggle to read texts written in English accurately and swiftly. In many countries, EFL readers seek help from professional teachers and mentors, which is limited and costly. In this paper, we explore how an intelligent reading tool can assist EFL readers. To support our research agenda, we conducted a case study with EFL readers in South Korea. We first developed an LLM-based reading tool based on prior literature. We then revised the tool based on the feedback from a study with 15 South Korean EFL readers. The final tool, named Reading.help, helps EFL readers comprehend complex sentences and paragraphs with on-demand and proactive explanations. We finally evaluated the tool with 5 EFL readers and 2 EFL education professionals. Our findings suggest Reading.help could potentially help EFL readers self-learn English when they do not have access to any external support.
Submitted 20 May, 2025;
originally announced May 2025.
-
EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video
Authors:
Ryan Hoque,
Peide Huang,
David J. Yoon,
Mouli Sivapurapu,
Jian Zhang
Abstract:
Imitation learning for manipulation has a well-known data scarcity problem. Unlike natural language and 2D computer vision, there is no Internet-scale corpus of data for dexterous manipulation. One appealing option is egocentric human video, a passively scalable data source. However, existing large-scale datasets such as Ego4D do not have native hand pose annotations and do not focus on object manipulation. To this end, we use Apple Vision Pro to collect EgoDex: the largest and most diverse dataset of dexterous human manipulation to date. EgoDex has 829 hours of egocentric video with paired 3D hand and finger tracking data collected at the time of recording, where multiple calibrated cameras and on-device SLAM can be used to precisely track the pose of every joint of each hand. The dataset covers a wide range of diverse manipulation behaviors with everyday household objects in 194 different tabletop tasks ranging from tying shoelaces to folding laundry. Furthermore, we train and systematically evaluate imitation learning policies for hand trajectory prediction on the dataset, introducing metrics and benchmarks for measuring progress in this increasingly important area. By releasing this large-scale dataset, we hope to push the frontier of robotics, computer vision, and foundation models.
Submitted 16 May, 2025;
originally announced May 2025.
-
Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reasoning?
Authors:
Md Tahmid Rahman Laskar,
Mohammed Saidul Islam,
Ridwan Mahbub,
Ahmed Masry,
Mizanur Rahman,
Amran Bhuiyan,
Mir Tafseer Nayeem,
Shafiq Joty,
Enamul Hoque,
Jimmy Huang
Abstract:
Charts are ubiquitous as they help people understand and reason with data. Recently, various downstream tasks, such as chart question answering, chart2text, and fact-checking, have emerged. Large Vision-Language Models (LVLMs) show promise in tackling these tasks, but their evaluation is costly and time-consuming, limiting real-world deployment. While using LVLMs as judges to assess the chart comprehension capabilities of other LVLMs could streamline evaluation processes, challenges like proprietary datasets, restricted access to powerful models, and evaluation costs hinder their adoption in industrial settings. To this end, we present a comprehensive evaluation of 13 open-source LVLMs as judges for diverse chart comprehension and reasoning tasks. We design both pairwise and pointwise evaluation tasks covering criteria like factual correctness, informativeness, and relevancy. Additionally, we analyze LVLM judges based on format adherence, positional consistency, length bias, and instruction-following. We focus on cost-effective LVLMs (<10B parameters) suitable for both research and commercial use, following a standardized evaluation protocol and rubric to measure the LVLM judge's accuracy. Experimental results reveal notable variability: while some open LVLM judges achieve GPT-4-level evaluation performance (about 80% agreement with GPT-4 judgments), others struggle (below ~10% agreement). Our findings highlight that state-of-the-art open-source LVLMs can serve as cost-effective automatic evaluators for chart-related tasks, though biases such as positional preference and length bias persist.
Submitted 13 May, 2025;
originally announced May 2025.
-
Knowledge-Informed Deep Learning for Irrigation Type Mapping from Remote Sensing
Authors:
Oishee Bintey Hoque,
Nibir Chandra Mandal,
Abhijin Adiga,
Samarth Swarup,
Sayjro Kossi Nouwakpo,
Amanda Wilson,
Madhav Marathe
Abstract:
Accurate mapping of irrigation methods is crucial for sustainable agricultural practices and food systems. However, existing models that rely solely on spectral features from satellite imagery are ineffective due to the complexity of agricultural landscapes and limited training data, making this a challenging problem. We present Knowledge-Informed Irrigation Mapping (KIIM), a novel Swin-Transformer based approach that uses (i) a specialized projection matrix to encode crop to irrigation probability, (ii) a spatial attention map to identify agricultural lands from non-agricultural lands, (iii) bi-directional cross-attention to focus complementary information from different modalities, and (iv) a weighted ensemble for combining predictions from images and crop information. Our experimentation on five states in the US shows up to 22.9% (IoU) improvement over baseline with a 71.4% (IoU) improvement for hard-to-classify drip irrigation. In addition, we propose a two-phase transfer learning approach to enhance cross-state irrigation mapping, achieving a 51% IoU boost in a state with limited labeled data. The ability to achieve baseline performance with only 40% of the training data highlights its efficiency, reducing the dependency on extensive manual labeling efforts and making large-scale, automated irrigation mapping more feasible and cost-effective.
Submitted 5 June, 2025; v1 submitted 13 May, 2025;
originally announced May 2025.
-
IrrMap: A Large-Scale Comprehensive Dataset for Irrigation Method Mapping
Authors:
Nibir Chandra Mandal,
Oishee Bintey Hoque,
Abhijin Adiga,
Samarth Swarup,
Mandy Wilson,
Lu Feng,
Yangfeng Ji,
Miaomiao Zhang,
Geoffrey Fox,
Madhav Marathe
Abstract:
We introduce IrrMap, the first large-scale dataset (1.1 million patches) for irrigation method mapping across regions. IrrMap consists of multi-resolution satellite imagery from LandSat and Sentinel, along with key auxiliary data such as crop type, land use, and vegetation indices. The dataset spans 1,687,899 farms and 14,117,330 acres across multiple western U.S. states from 2013 to 2023, providing a rich and diverse foundation for irrigation analysis and ensuring geospatial alignment and quality control. The dataset is ML-ready, with standardized 224x224 GeoTIFF patches, multiple input modalities, carefully chosen train-test-split data, and accompanying dataloaders for seamless deep learning model training and benchmarking in irrigation mapping. The dataset is also accompanied by a complete pipeline for dataset generation, enabling researchers to extend IrrMap to new regions for irrigation data collection or adapt it with minimal effort for other similar applications in agricultural and geospatial analysis. We also analyze the irrigation method distribution across crop groups, spatial irrigation patterns (using Shannon diversity indices), and irrigated area variations for both LandSat and Sentinel, providing insights into regional and resolution-based differences. To promote further exploration, we openly release IrrMap, along with the derived datasets, benchmark models, and pipeline code, through a GitHub repository: https://github.com/Nibir088/IrrMap and Data repository: https://huggingface.co/Nibir/IrrMap, providing comprehensive documentation and implementation details.
Submitted 31 May, 2025; v1 submitted 13 May, 2025;
originally announced May 2025.
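Editorial note on the IrrMap entry: the abstract mentions Shannon diversity indices for spatial irrigation patterns. As a reminder of the standard definition, H = -Σ p_i ln p_i over the class proportions p_i, here is a minimal sketch. It is not taken from the IrrMap pipeline; the function name `shannon_diversity` and the example labels are hypothetical, and the natural logarithm is an assumption (other bases only rescale the value).

```python
import math
from collections import Counter


def shannon_diversity(labels):
    """Shannon diversity index H = -sum(p_i * ln(p_i)) over label proportions."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Only observed classes appear in `counts`, so every proportion is > 0
    # and math.log is always defined.
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical irrigation-method labels for the fields in one region.
uniform_region = ["drip", "drip", "drip", "drip"]
mixed_region = ["drip", "flood", "sprinkler", "flood"]

h_uniform = shannon_diversity(uniform_region)
h_mixed = shannon_diversity(mixed_region)
```

A region with a single irrigation method scores zero, and the index grows as methods become more numerous and more evenly mixed.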
-
The ALMA-ATOMS Survey: Exploring Protostellar Outflows in HC$_3$N
Authors:
Ariful Hoque,
Tapas Baug,
Lokesh K. Dewangan,
Mika Juvela,
Anandmayee Tej,
Paul F. Goldsmith,
Pablo García,
Amelia M. Stutz,
Tie Liu,
Chang Won Lee,
Fengwei Xu,
Patricio Sanhueza,
N. K. Bhadari,
K. Tatematsu,
Xunchuan Liu,
Hong-Li Liu,
Yong Zhang,
Xindi Tang,
Guido Garay,
Ke Wang,
Siju Zhang,
L. Viktor Tóth,
Hafiz Nazeer,
Jihye Hwang,
Prasanta Gorai
, et al. (3 additional authors not shown)
Abstract:
We present the first systematic study of bipolar outflows using HC$_3$N as a tracer in a sample of 146 massive star-forming regions from the ALMA-ATOMS survey. Protostellar outflows arise at the initial stage of star formation as a consequence of active accretion. In general, these outflows play a pivotal role in regulating the star formation processes by injecting energetic material into the parent molecular clouds. In this process, lower-velocity components of outflows contain a significant portion of the energy. However, extraction of those components is difficult as the corresponding gas is often mixed with that of the ambient cloud. In our sample, we identified 44 bipolar outflows and one explosive outflow in HC$_3$N (J=11--10). The host clumps of these outflows are found to be at different evolutionary stages, suggesting that outflows in HC$_3$N are detectable in different stages of star formation. Also, the non-correlation of HC$_3$N outflows with clump evolutionary stages suggests that HC$_3$N is an unbiased tracer of outflows. Analyses revealed that HC$_3$N performs slightly better in detecting low-velocity components of outflows than traditionally employed tracers like SiO. The derived outflow parameters (i.e., outflow mass, momentum, and energy) show moderate correlations with clump mass and luminosity. Our analysis of outflow opening angles and position-velocity diagrams across the outflow lobes shows that HC$_3$N is not only a good tracer of low-velocity outflows, but can also detect high-velocity collimated outflows. Overall, this study indicates that HC$_3$N can be used as a complementary outflow tracer along with the traditionally known outflow tracers, particularly in the detection of the low-velocity components of outflows.
Submitted 7 May, 2025;
originally announced May 2025.
-
AI Standardized Patient Improves Human Conversations in Advanced Cancer Care
Authors:
Kurtis Haut,
Masum Hasan,
Thomas Carroll,
Ronald Epstein,
Taylan Sen,
Ehsan Hoque
Abstract:
Serious illness communication (SIC) in end-of-life care faces challenges such as emotional stress, cultural barriers, and balancing hope with honesty. Despite its importance, one of the few available ways for clinicians to practice SIC is with standardized patients, which is expensive, time-consuming, and inflexible. In this paper, we present SOPHIE, an AI-powered standardized patient simulation and automated feedback system. SOPHIE combines large language models (LLMs), a lifelike virtual avatar, and automated, personalized feedback based on clinical literature to provide remote, on-demand SIC training. In a randomized control study with healthcare students and professionals, SOPHIE users demonstrated significant improvement across three critical SIC domains: Empathize, Be Explicit, and Empower. These results suggest that AI-driven tools can enhance complex interpersonal communication skills, offering scalable, accessible solutions to address a critical gap in clinician education.
Submitted 5 May, 2025;
originally announced May 2025.
-
CBM-RAG: Demonstrating Enhanced Interpretability in Radiology Report Generation with Multi-Agent RAG and Concept Bottleneck Models
Authors:
Hasan Md Tusfiqur Alam,
Devansh Srivastav,
Abdulrahman Mohamed Selim,
Md Abdul Kadir,
Md Moktadirul Hoque Shuvo,
Daniel Sonntag
Abstract:
Advancements in generative Artificial Intelligence (AI) hold great promise for automating radiology workflows, yet challenges in interpretability and reliability hinder clinical adoption. This paper presents an automated radiology report generation framework that combines Concept Bottleneck Models (CBMs) with a Multi-Agent Retrieval-Augmented Generation (RAG) system to bridge AI performance with clinical explainability. CBMs map chest X-ray features to human-understandable clinical concepts, enabling transparent disease classification. Meanwhile, the RAG system integrates multi-agent collaboration and external knowledge to produce contextually rich, evidence-based reports. Our demonstration showcases the system's ability to deliver interpretable predictions, mitigate hallucinations, and generate high-quality, tailored reports with an interactive interface addressing accuracy, trust, and usability challenges. This framework provides a pathway to improving diagnostic consistency and empowering radiologists with actionable insights.
Submitted 4 May, 2025; v1 submitted 29 April, 2025;
originally announced April 2025.
-
Conformal Einstein equation and symplectic flux with a positive cosmological constant
Authors:
SK Jahanur Hoque,
Pavel Krtous,
Carlos Peón-Nieto
Abstract:
We analyze the conformal Einstein equation with a positive cosmological constant to extract fall-off conditions of the gravitational fields. The fall-off conditions are consistent with a finite, non-trivial presymplectic current on the future boundary of de Sitter. Hence our result allows a non-zero gravitational flux across the boundary of de Sitter. We present an explicit gauge-free computation to show that the Gibbons-Hawking boundary term, the counterterm in the action, and the fall-off condition of the gravitational field in the conformal Einstein equation are crucial to reproduce the finite symplectic flux.
Submitted 29 April, 2025;
originally announced April 2025.
-
EPSILON: Adaptive Fault Mitigation in Approximate Deep Neural Network using Statistical Signatures
Authors:
Khurram Khalil,
Khaza Anuarul Hoque
Abstract:
The increasing adoption of approximate computing in deep neural network accelerators (AxDNNs) promises significant energy efficiency gains. However, permanent faults in AxDNNs can severely degrade their performance compared to their accurate counterparts (AccDNNs). Traditional fault detection and mitigation approaches, while effective for AccDNNs, introduce substantial overhead and latency, making them impractical for energy-constrained real-time deployment. To address this, we introduce EPSILON, a lightweight framework that leverages pre-computed statistical signatures and layer-wise importance metrics for efficient fault detection and mitigation in AxDNNs. Our framework introduces a novel non-parametric pattern-matching algorithm that enables constant-time fault detection without interrupting normal execution while dynamically adapting to different network architectures and fault patterns. EPSILON maintains model accuracy by intelligently adjusting mitigation strategies based on a statistical analysis of weight distribution and layer criticality while preserving the energy benefits of approximate computing. Extensive evaluations across various approximate multipliers, AxDNN architectures, popular datasets (MNIST, CIFAR-10, CIFAR-100, ImageNet-1k), and fault scenarios demonstrate that EPSILON maintains 80.05% accuracy while offering 22% improvement in inference time and 28% improvement in energy efficiency, establishing EPSILON as a practical solution for deploying reliable AxDNNs in safety-critical edge applications.
Submitted 24 April, 2025;
originally announced April 2025.
-
ApproXAI: Energy-Efficient Hardware Acceleration of Explainable AI using Approximate Computing
Authors:
Ayesha Siddique,
Khurram Khalil,
Khaza Anuarul Hoque
Abstract:
Explainable artificial intelligence (XAI) enhances AI system transparency by framing interpretability as an optimization problem. However, this approach often necessitates numerous iterations of computationally intensive operations, limiting its applicability in real-time scenarios. While recent research has focused on XAI hardware acceleration on FPGAs and TPUs, these methods do not fully address energy efficiency in real-time settings. To address this limitation, we propose XAIedge, a novel framework that integrates approximate computing techniques into XAI algorithms, including integrated gradients, model distillation, and Shapley analysis. XAIedge translates these algorithms into approximate matrix computations and exploits the synergy between convolution, Fourier transform, and approximate computing paradigms. This approach enables efficient hardware acceleration on TPU-based edge devices, facilitating faster real-time outcome interpretations. Our comprehensive evaluation demonstrates that XAIedge achieves a $2\times$ improvement in energy efficiency compared to existing accurate XAI hardware acceleration techniques while maintaining comparable accuracy. These results highlight the potential of XAIedge to significantly advance the deployment of explainable AI in energy-constrained real-time applications.
Submitted 12 May, 2025; v1 submitted 24 April, 2025;
originally announced April 2025.
-
DashGuide: Authoring Interactive Dashboard Tours for Guiding Dashboard Users
Authors:
Naimul Hoque,
Nicole Sultanum
Abstract:
Dashboard guidance helps dashboard users better navigate interactive features, understand the underlying data, and assess insights they can potentially extract from dashboards. However, authoring dashboard guidance is a time-consuming task, and embedding guidance into dashboards for effective delivery is difficult to realize. In this work, we contribute DashGuide, a framework and system to support the creation of interactive dashboard guidance with minimal authoring input. Given a dashboard and a communication goal, DashGuide captures a sequence of author-performed interactions to generate guidance materials delivered as playable step-by-step overlays, a.k.a. dashboard tours. Authors can further edit and refine individual tour steps while receiving generative assistance. We also contribute findings from a formative assessment with 9 dashboard creators, which helped inform the design of DashGuide, and findings from an evaluation of DashGuide with 12 dashboard creators, suggesting it provides an improved authoring experience that balances efficiency, expressiveness, and creative freedom.
Submitted 23 April, 2025;
originally announced April 2025.
-
Prognosis Of Lithium-Ion Battery Health with Hybrid EKF-CNN+LSTM Model Using Differential Capacity
Authors:
Md Azizul Hoque,
Babul Salam,
Mohd Khair Hassan,
Abdulkabir Aliyu,
Abedalmuhdi Almomany,
Muhammed Sutcu
Abstract:
Battery degradation is a major challenge in electric vehicles (EV) and energy storage systems (ESS). However, most degradation investigations focus mainly on estimating the state of charge (SOC), which fails to accurately interpret the cells' internal degradation mechanisms. Differential capacity analysis (DCA) focuses on the rate of change of cell voltage with respect to the change in cell capacity, under various charge/discharge rates. This paper developed a battery cell degradation testing model that used two types of lithium-ion (Li-ion) battery cells, namely lithium nickel cobalt aluminium oxide (LiNiCoAlO2) and lithium iron phosphate (LiFePO4), to evaluate internal degradation during loading conditions. The proposed battery degradation model contains distinct charge rates (DCR) of 0.2C, 0.5C, 1C, and 1.5C, as well as discharge rates (DDR) of 0.5C, 0.9C, 1.3C, and 1.6C to analyze the internal health and performance of battery cells during slow, moderate, and fast loading conditions. In addition, this research proposed a model that incorporates the Extended Kalman Filter (EKF), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM) networks to validate experimental data. The proposed model yields excellent modelling results based on mean squared error (MSE) and root mean squared error (RMSE), with errors of less than 0.001% at DCR and DDR. The peak identification technique (PIM) has been utilized to investigate battery health based on the number of peaks, peak position, peak height, peak area, and peak width. Finally, the PIM method revealed that the cells aged gradually under normal loading rates but deteriorated rapidly under fast loading conditions. Overall, LiFePO4 batteries perform more robustly and consistently than LiNiCoAlO2 cells under varying loading conditions.
Submitted 16 April, 2025;
originally announced April 2025.
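The differential analysis and peak counting mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a simple finite-difference estimate of the rate of change of voltage with respect to capacity, as the abstract words it, and a naive local-maximum peak count; the function names `differential_voltage` and `count_peaks` and the sample readings are hypothetical and are not taken from the paper.

```python
def differential_voltage(capacity_ah, voltage_v):
    """Finite-difference dV/dQ between consecutive samples.

    Returns (mid-point capacity, dV/dQ) pairs; steps with no change
    in capacity are skipped to avoid division by zero.
    """
    points = []
    for i in range(1, len(capacity_ah)):
        dq = capacity_ah[i] - capacity_ah[i - 1]
        if dq == 0:
            continue
        dv = voltage_v[i] - voltage_v[i - 1]
        mid_q = (capacity_ah[i] + capacity_ah[i - 1]) / 2
        points.append((mid_q, dv / dq))
    return points


def count_peaks(values):
    """Count strict local maxima (a naive stand-in for peak identification)."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


# Hypothetical charge-curve samples (capacity in Ah, voltage in V).
capacity = [0.0, 0.5, 1.0, 1.5, 2.0]
voltage = [3.0, 3.25, 3.75, 4.0, 4.125]

curve = differential_voltage(capacity, voltage)
slopes = [slope for _, slope in curve]
n_peaks = count_peaks(slopes)
```

Real measurements are noisy, so a practical pipeline would likely smooth the curve before differencing; that step is omitted here.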
-
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering
Authors:
Ahmed Masry,
Mohammed Saidul Islam,
Mahir Ahmed,
Aayush Bajaj,
Firoz Kabir,
Aaryaman Kartha,
Md Tahmid Rahman Laskar,
Mizanur Rahman,
Shadikur Rahman,
Mehrad Shahmohammadi,
Megh Thakkar,
Md Rizwan Parvez,
Enamul Hoque,
Shafiq Joty
Abstract:
Charts are ubiquitous, as people often use them to analyze data, answer questions, and discover critical insights. However, performing complex analytical tasks with charts requires significant perceptual and cognitive effort. Chart Question Answering (CQA) systems automate this process by enabling models to interpret and reason with visual representations of data. However, existing benchmarks like ChartQA lack real-world diversity and have recently shown performance saturation with modern large vision-language models (LVLMs). To address these limitations, we introduce ChartQAPro, a new benchmark that includes 1,341 charts from 157 diverse sources, spanning various chart types, including infographics and dashboards, and featuring 1,948 questions of various types, such as multiple-choice, conversational, hypothetical, and unanswerable questions, to better reflect real-world challenges. Our evaluations with 21 models show a substantial performance drop for LVLMs on ChartQAPro; e.g., Claude Sonnet 3.5 scores 90.5% on ChartQA but only 55.81% on ChartQAPro, underscoring the complexity of chart reasoning. We complement our findings with detailed error analyses and ablation studies, identifying key challenges and opportunities for advancing LVLMs in chart understanding and reasoning. We release ChartQAPro at https://github.com/vis-nlp/ChartQAPro.
Submitted 10 April, 2025; v1 submitted 7 April, 2025;
originally announced April 2025.
-
A Theoretical Framework for Graph-based Digital Twins for Supply Chain Management and Optimization
Authors:
Azmine Toushik Wasi,
Mahfuz Ahmed Anik,
Abdur Rahman,
Md. Iqramul Hoque,
MD Shafikul Islam,
Md Manjurul Ahsan
Abstract:
Supply chain management is growing increasingly complex due to globalization, evolving market demands, and sustainability pressures, yet traditional systems struggle with fragmented data and limited analytical capabilities. Graph-based modeling offers a powerful way to capture the intricate relationships within supply chains, while Digital Twins (DTs) enable real-time monitoring and dynamic simulations. However, current implementations often face challenges related to scalability, data integration, and the lack of sustainability-focused metrics. To address these gaps, we propose a Graph-Based Digital Twin Framework for Supply Chain Optimization, which combines graph modeling with DT architecture to create a dynamic, real-time representation of supply networks. Our framework integrates a Data Integration Layer to harmonize disparate sources, a Graph Construction Module to model complex dependencies, and a Simulation and Analysis Engine for scalable optimization. Importantly, we embed sustainability metrics - such as carbon footprints and resource utilization - into operational dashboards to drive eco-efficiency. By leveraging the synergy between graph-based modeling and DTs, our approach enhances scalability, improves decision-making, and enables organizations to proactively manage disruptions, cut costs, and transition toward greener, more resilient supply chains.
Submitted 23 March, 2025;
originally announced April 2025.
-
Fields with small class group in the family $\mathbb{Q}(\sqrt{9m^2+2m})$
Authors:
Kalyan Chakraborty,
Azizul Hoque
Abstract:
Very recently, Issa and Darrag [Arch. Math. (Basel) 123 (2024), no. 4, 379-383] determined partial Dedekind zeta values for certain ideal classes in the real quadratic fields of the form $\mathbb{Q}(\sqrt{9m^2+2m})$, where $9m^2+2m$ is square-free and $m\equiv 2\pmod 3$ is an odd positive integer. We use these partial Dedekind zeta values to investigate the small class numbers of such fields. More precisely, we prove that the class numbers of the fields in the above-mentioned family are at least $4$. Further, we provide a sufficient condition that allows one to specify the structure of the class groups of order $4$ in this family of fields.
Submitted 3 April, 2025;
originally announced April 2025.
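To make the family in the abstract above concrete, the following sketch lists the first few values of $m$ that satisfy the stated conditions ($m$ odd, $m\equiv 2\pmod 3$, $9m^2+2m$ square-free) together with the radicand $9m^2+2m$. It only enumerates the family; it does not compute class numbers. The function names are illustrative and do not come from the paper.

```python
def is_squarefree(n):
    """Return True if no prime square divides n (trial division, small n only)."""
    if n < 1:
        return False
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        if n % d == 0:
            # d divides n exactly once here, so strip it and continue.
            n //= d
        d += 1
    return True


def family_radicands(limit):
    """List (m, 9m^2 + 2m) for odd m <= limit with m = 2 (mod 3) and square-free radicand."""
    found = []
    for m in range(1, limit + 1):
        radicand = 9 * m * m + 2 * m
        if m % 2 == 1 and m % 3 == 2 and is_squarefree(radicand):
            found.append((m, radicand))
    return found


first_members = family_radicands(30)
```

For `limit = 30` the qualifying values are $m = 5, 11, 17, 23, 29$, with radicands $235, 1111, 2635, 4807, 7627$. A value such as $m = 125$ meets the congruence conditions but is excluded because $9m^2+2m = m(9m+2)$ is then divisible by $25$.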
-
Characterizing Creativity in Visualization Design
Authors:
Naimul Hoque,
Zinat Ara,
Safwat Ali Khan,
Fanny Chevalier,
Niklas Elmqvist
Abstract:
Understanding the role of creativity in visualization design becomes increasingly important as the field matures, particularly with the emergence of various visualization authoring and recommendation systems. In this paper, we examine how creativity manifests in visualization design processes and how academic research has conceptualized it over time. Through a systematic review of 58 visualization papers that use the terms "creativity" or "creative," we analyze the evolution of creative practices in visualization design. Our findings show that prior literature predominantly used atypical designs through free-form drawings, infographics, pictorials, and data comics to define creative representations. However, creativity in visualization design extends beyond visual representations to encompass early needfinding design activities such as sketching, storyboarding, discussion, and card sorting. Data visualization can also support a wide variety of creative tasks (e.g., fiction writing). We discuss the implications of these findings for fostering innovation within established design paradigms and for developing more sophisticated visualization authoring systems. The full list of coded papers is available here: https://vizcreativity.notion.site/coded-papers.
Submitted 2 April, 2025;
originally announced April 2025.
-
Explainable AI-Guided Efficient Approximate DNN Generation for Multi-Pod Systolic Arrays
Authors:
Ayesha Siddique,
Khurram Khalil,
Khaza Anuarul Hoque
Abstract:
Approximate deep neural networks (AxDNNs) are promising for enhancing energy efficiency in real-world devices. One of the key contributors behind this enhanced energy efficiency in AxDNNs is the use of approximate multipliers. Unfortunately, the simulation of approximate multipliers does not usually scale well on CPUs and GPUs. As a consequence, this slows down the overall simulation of AxDNNs aimed at identifying the appropriate approximate multipliers to achieve high energy efficiency with a minimum accuracy loss. To address this problem, we present a novel XAI-Gen methodology, which leverages the analytical model of the emerging hardware accelerator (e.g., Google TPU v4) and explainable artificial intelligence (XAI) to precisely identify the non-critical layers for approximation and quickly discover the appropriate approximate multipliers for AxDNN layers. Our results show that XAI-Gen achieves up to 7x lower energy consumption with only 1-2% accuracy loss. We also showcase the effectiveness of the XAI-Gen approach through a neural architecture search (XAI-NAS) case study. Interestingly, XAI-NAS achieves 40% higher energy efficiency with up to 5x less execution time when compared to the state-of-the-art NAS methods for generating AxDNNs.
Submitted 20 March, 2025;
originally announced March 2025.
-
Active management of battery degradation in wireless sensor network using deep reinforcement learning for group battery replacement
Authors:
Jong-Hyun Jeong,
Hongki Jo,
Qiang Zhou,
Tahsin Afroz Hoque Nishat,
Lang Wu
Abstract:
Wireless sensor networks (WSNs) have become a promising solution for structural health monitoring (SHM), especially in hard-to-reach or remote locations. Battery-powered WSNs offer various advantages over wired systems; however, limited battery life has always been one of the biggest obstacles to the practical use of WSNs, regardless of energy harvesting methods. While various methods have been studied for battery health management, existing methods exclusively aim to extend the lifetime of individual batteries, lacking a system-level view. A consequence of applying such methods is that batteries in a WSN tend to fail at different times, making it significantly harder to plan and schedule battery replacement trips. This study investigates a deep reinforcement learning (DRL) method for active battery degradation management by optimizing the duty cycle of WSNs at the system level. This active management strategy effectively reduces early failures of individual batteries, which enables group replacement without sacrificing WSN performance. A simulated environment based on a real-world WSN setup was developed to train a DRL agent and learn optimal duty cycle strategies. The performance of the strategy was validated in a long-term setup with various network sizes, demonstrating its efficiency and scalability.
Submitted 22 March, 2025; v1 submitted 20 March, 2025;
originally announced March 2025.
-
Humanoid Policy ~ Human Policy
Authors:
Ri-Zhao Qiu,
Shiqi Yang,
Xuxin Cheng,
Chaitanya Chawla,
Jialong Li,
Tairan He,
Ge Yan,
David J. Yoon,
Ryan Hoque,
Lars Paulsen,
Ge Yang,
Jian Zhang,
Sha Yi,
Guanya Shi,
Xiaolong Wang
Abstract:
Training manipulation policies for humanoid robots with diverse data enhances their robustness and generalization across tasks and platforms. However, learning solely from robot demonstrations is labor-intensive, requiring expensive tele-operated data collection which is difficult to scale. This paper investigates a more scalable data source, egocentric human demonstrations, to serve as cross-embodiment training data for robot learning. We mitigate the embodiment gap between humanoids and humans from both the data and modeling perspectives. We collect an egocentric task-oriented dataset (PH2D) that is directly aligned with humanoid manipulation demonstrations. We then train a human-humanoid behavior policy, which we term Human Action Transformer (HAT). The state-action space of HAT is unified for both humans and humanoid robots and can be differentiably retargeted to robot actions. Co-trained with smaller-scale robot data, HAT directly models humanoid robots and humans as different embodiments without additional supervision. We show that human data improves both generalization and robustness of HAT with significantly better data collection efficiency. Code and data: https://human-as-robot.github.io/
Submitted 24 March, 2025; v1 submitted 17 March, 2025;
originally announced March 2025.
-
Securing Virtual Reality Experiences: Unveiling and Tackling Cybersickness Attacks with Explainable AI
Authors:
Ripan Kumar Kundu,
Matthew Denton,
Genova Mongalo,
Prasad Calyam,
Khaza Anuarul Hoque
Abstract:
The synergy between virtual reality (VR) and artificial intelligence (AI), specifically deep learning (DL)-based cybersickness detection models, has ushered in unprecedented advancements in immersive experiences by automatically detecting cybersickness severity and adaptively applying various mitigation techniques, offering a smooth and comfortable VR experience. While this DL-enabled cybersickness detection method provides promising solutions for enhancing user experiences, it also introduces new risks since these models are vulnerable to adversarial attacks; a small perturbation of the input data that is visually undetectable to human observers can fool the cybersickness detection model and trigger unexpected mitigation, thus disrupting user immersive experiences (UIX) and even posing safety risks. In this paper, we present a new type of VR attack, i.e., a cybersickness attack, which successfully stops the triggering of cybersickness mitigation by fooling DL-based cybersickness detection models and dramatically hinders the UIX. Next, we propose a novel explainable artificial intelligence (XAI)-guided cybersickness attack detection framework to detect such attacks in VR to ensure UIX and a comfortable VR experience. We evaluate the proposed attack and the detection framework using two state-of-the-art open-source VR cybersickness datasets: Simulation 2021 and Gameplay dataset. Finally, to verify the effectiveness of our proposed method, we implement the attack and the XAI-based detection using a testbed with a custom-built VR roller coaster simulation with an HTC Vive Pro Eye headset and perform a user study. Our study shows that such an attack can dramatically hinder the UIX. However, our proposed XAI-guided cybersickness attack detection can successfully detect cybersickness attacks and trigger the proper mitigation, effectively reducing VR cybersickness.
Submitted 17 March, 2025;
originally announced March 2025.
-
Mechanical resonant sensing of spin texture dynamics in a two-dimensional antiferromagnet
Authors:
S M Enamul Hoque Yousuf,
Yunong Wang,
Shreyas Ramachandran,
John Koptur-Palenchar,
Chiara Tarantini,
Li Xiang,
Stephen McGill,
Dmitry Smirnov,
Elton J. G. Santos,
Philip X. -L. Feng,
Xiao-Xiao Zhang
Abstract:
The coupling between the spin degrees of freedom and macroscopic mechanical motions, including striction, shearing, and rotation, has attracted wide interest with applications in actuation, transduction, and information processing. Experiments so far have established the mechanical responses to the long-range ordered or isolated single spin states. However, it remains elusive whether mechanical motions can couple to a different type of magnetic structure, the non-collinear spin textures, which exhibit nanoscale spatial variations of spin (domain walls, skyrmions, etc.) and are promising candidates to realize high-speed computing devices. Here, we report the detection of collective spin texture dynamics with nanoelectromechanical resonators made of two-dimensional antiferromagnetic (AFM) MnPS3 with $10^{-9}$ strain sensitivity. By examining radio frequency mechanical oscillations under magnetic fields, new magnetic transitions were identified with sharp dips in resonant frequency. They are attributed to the collective AFM domain wall motions as supported by the analytical modeling of magnetostriction and large-scale spin-dynamics simulations. Additionally, an abnormally large modulation in the mechanical nonlinearity at the transition field implies a fluid-like response due to the ultrafast domain motion. Our work establishes a strong coupling between spin texture and mechanical dynamics, laying the foundation for electromechanical manipulation of spin texture and developing quantum hybrid devices.
Submitted 14 March, 2025;
originally announced March 2025.
-
An Empirical Analysis of LLMs for Countering Misinformation
Authors:
Adiba Mahbub Proma,
Neeley Pate,
James Druckman,
Gourab Ghoshal,
Hangfeng He,
Ehsan Hoque
Abstract:
While Large Language Models (LLMs) can amplify online misinformation, they also show promise in tackling misinformation. In this paper, we empirically study the capabilities of three LLMs -- ChatGPT, Gemini, and Claude -- in countering political misinformation. We implement a two-step, chain-of-thought prompting approach, where models first identify credible sources for a given claim and then generate persuasive responses. Our findings suggest that models struggle to ground their responses in real news sources, and tend to prefer citing left-leaning sources. We also observe varying degrees of response diversity among models. Our findings highlight concerns about using LLMs for fact-checking through only prompt-engineering, emphasizing the need for more robust guardrails. Our results have implications for both researchers and non-technical users.
Submitted 28 February, 2025;
originally announced March 2025.
-
SparseTransX: Efficient Training of Translation-Based Knowledge Graph Embeddings Using Sparse Matrix Operations
Authors:
Md Saidul Hoque Anik,
Ariful Azad
Abstract:
Knowledge graph (KG) learning offers a powerful framework for generating new knowledge and making inferences. Training KG embeddings can take a very long time, especially for larger datasets. Our analysis shows that the gradient computation of embeddings is one of the dominant functions in the translation-based KG embedding training loop. We address this issue by replacing the core embedding computation with SpMM (Sparse-Dense Matrix Multiplication) kernels. This allows us to unify multiple scatter (and gather) operations as a single operation, reducing training time and memory usage. We create a general framework for training KG models using sparse kernels and implement four models, namely TransE, TransR, TransH, and TorusE. Our sparse implementations exhibit up to 5.3x speedup on the CPU and up to 4.2x speedup on the GPU with a significantly lower GPU memory footprint. The speedups are consistent across large and small datasets for a given model. Our proposed sparse approach can be extended to accelerate other translation-based (such as TransC, TransM, etc.) and non-translational (such as DistMult, ComplEx, RotatE, etc.) models as well. An implementation of the SpTransX framework is publicly available as a Python package at https://github.com/HipGraph/SpTransX.
Submitted 30 April, 2025; v1 submitted 24 February, 2025;
originally announced February 2025.
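The idea of folding several gather operations into one sparse-dense product, as described in the abstract above, can be sketched for the TransE residual $h + r - t$. This is a pure-Python illustration under stated assumptions: the incidence-matrix layout (+1 for the head, -1 for the tail, +1 for the relation), the stacking of entity and relation embeddings into one dense matrix, and the names `spmm` and `transe_residuals` are hypothetical and are not the SpTransX API.

```python
def spmm(sparse_rows, dense):
    """Multiply a sparse matrix by a dense matrix.

    sparse_rows: list of {column_index: value} dicts, one per row.
    dense: list of equal-length rows (lists of floats).
    """
    width = len(dense[0])
    out = []
    for row in sparse_rows:
        acc = [0.0] * width
        for col, val in row.items():
            for k in range(width):
                acc[k] += val * dense[col][k]
        out.append(acc)
    return out


def transe_residuals(triples, entity_emb, relation_emb):
    """Compute h + r - t for every (head, relation, tail) triple with one SpMM."""
    n_ent = len(entity_emb)
    stacked = entity_emb + relation_emb  # entities first, then relations
    rows = []
    for h, r, t in triples:
        row = {}
        row[h] = row.get(h, 0.0) + 1.0   # gather head embedding
        row[t] = row.get(t, 0.0) - 1.0   # subtract tail embedding
        row[n_ent + r] = 1.0             # add relation embedding
        rows.append(row)
    return spmm(rows, stacked)


# Tiny hypothetical example: 3 entities, 1 relation, 2-dimensional embeddings.
entity_emb = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
relation_emb = [[0.5, 0.5]]
triples = [(0, 0, 1), (2, 0, 0)]

residuals = transe_residuals(triples, entity_emb, relation_emb)
```

Each row of the incidence matrix has at most three non-zeros, so the product touches only the embeddings a triple actually uses. In a real training loop the dictionary-based `spmm` would be replaced by an optimized sparse kernel.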
-
Design and Implementation of a Scalable Clinical Data Warehouse for Resource-Constrained Healthcare Systems
Authors:
Shovito Barua Soumma,
Fahim Shahriar,
Umme Niraj Mahi,
Md Hasin Abrar,
Md Abdur Rahman Fahad,
Abu Sayed Md. Latiful Hoque
Abstract:
Centralized electronic health record repositories are critical for advancing disease surveillance, public health research, and evidence-based policymaking. However, developing countries face persistent challenges in achieving this due to fragmented healthcare data sources, inconsistent record-keeping practices, and the absence of standardized patient identifiers, which limit reliable record linkage, compromise data interoperability, and restrict scalability; these obstacles are exacerbated by infrastructural constraints and privacy concerns. To address these barriers, this study proposes a scalable, privacy-preserving clinical data warehouse, NCDW, designed for heterogeneous EHR integration in resource-limited settings and tested with 1.16 million clinical records. The framework incorporates a wrapper-based data acquisition layer for secure, automated ingestion of multisource health data and introduces a Soundex algorithm to resolve patient identity mismatches in the absence of unique IDs. A modular data mart is designed for disease-specific analytics, demonstrated through a dengue fever case study in Bangladesh, integrating clinical, demographic, and environmental data for outbreak prediction and resource planning. Quantitative assessment of the data mart underscores its utility in strengthening national decision-support systems, highlighting the model's adaptability for infectious disease management. Comparative evaluation of database technologies reveals that NoSQL outperforms relational SQL by 40-69% in complex query processing, while system load estimates validate the architecture's capacity to manage 19 million daily records (34TB over 5 years). The framework can be adapted to various healthcare settings across developing nations by modifying the ingestion layer to accommodate standards like ICD-11 and HL7 FHIR, facilitating interoperability for managing infectious diseases (e.g., COVID, tuberculosis).
Submitted 2 May, 2025; v1 submitted 23 February, 2025;
originally announced February 2025.
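The abstract above mentions a Soundex algorithm for matching patient identities without unique IDs. The listing does not say which Soundex variant the authors use, so the following is only a minimal sketch of classic American Soundex; the function name and the example names are illustrative assumptions.

```python
def soundex(name: str) -> str:
    """Return the 4-character American Soundex code of a name ('' if it has no letters)."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    chars = [c for c in name.upper() if c.isalpha()]
    if not chars:
        return ""

    out = [chars[0]]                 # the first letter is kept as-is
    prev = codes.get(chars[0], "")   # its code still suppresses an identical neighbour
    for ch in chars[1:]:
        if ch in "HW":
            continue                 # H and W do not separate same-coded consonants
        code = codes.get(ch, "")     # vowels (and Y) have no code
        if code and code != prev:
            out.append(code)
        prev = code                  # a vowel resets prev, so a repeated code counts again
    return ("".join(out) + "000")[:4]
```

Two spellings of the same name, such as the hypothetical pair "Rahman" and "Rehman", map to the same code and can be treated as linkage candidates. A production system would presumably combine this with other fields such as date of birth or address.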
-
ANCHOLIK-NER: A Benchmark Dataset for Bangla Regional Named Entity Recognition
Authors:
Bidyarthi Paul,
Faika Fairuj Preotee,
Shuvashis Sarker,
Shamim Rahim Refat,
Shifat Islam,
Tashreef Muhammad,
Mohammad Ashraful Hoque,
Shahriar Manzoor
Abstract:
Named Entity Recognition (NER) in regional dialects is a critical yet underexplored area in Natural Language Processing (NLP), especially for low-resource languages like Bangla. While NER systems for Standard Bangla have made progress, no existing resources or models specifically address the challenge of regional dialects such as Barishal, Chittagong, Mymensingh, Noakhali, and Sylhet, which exhibit unique linguistic features that existing models fail to handle effectively. To fill this gap, we introduce ANCHOLIK-NER, the first benchmark dataset for NER in Bangla regional dialects, comprising 17,405 sentences distributed across five regions. The dataset was sourced from publicly available resources and supplemented with manual translations, ensuring alignment of named entities across dialects. We evaluate three transformer-based models - Bangla BERT, Bangla BERT Base, and BERT Base Multilingual Cased - on this dataset. Our findings demonstrate that BERT Base Multilingual Cased performs best in recognizing named entities across regions, with notable performance in Mymensingh (F1-score of 82.611%). Despite strong overall performance, challenges remain in regions like Chittagong, where the models show lower precision and recall. Since no previous NER systems for Bangla regional dialects exist, our work represents a foundational step in addressing this gap. Future work will focus on improving model performance in underperforming regions and expanding the dataset to include more dialects, enhancing the development of dialect-aware NER systems.
Submitted 27 May, 2025; v1 submitted 16 February, 2025;
originally announced February 2025.
-
Analysis of Robust and Secure DNS Protocols for IoT Devices
Authors:
Abdullah Aydeger,
Sanzida Hoque,
Engin Zeydan,
Kapal Dev
Abstract:
The DNS (Domain Name System) protocol has been in use since the early days of the Internet. Although DNS as a de facto networking protocol had no security considerations in its early years, there have been many security enhancements, such as DNSSec (Domain Name System Security Extensions), DoT (DNS over Transport Layer Security), DoH (DNS over HTTPS) and DoQ (DNS over QUIC). Despite all these security improvements, it is not yet clear which of them resource-constrained Internet-of-Things (IoT) devices should use for robustness. In this paper, we investigate different DNS security approaches using an edge DNS resolver implemented as a Virtual Network Function (VNF) to replicate the impact of the protocol from an IoT perspective and compare their performance under different conditions. We present our results for cached and non-cached responses and evaluate the corresponding security benefits. Our results and framework can greatly help consumers, manufacturers, and the research community decide on and implement their DNS protocols depending on the given dynamic network conditions and enable robust Internet access via DNS for different devices.
Submitted 13 February, 2025;
originally announced February 2025.
-
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding
Authors:
Ahmed Masry,
Juan A. Rodriguez,
Tianyu Zhang,
Suyuchen Wang,
Chao Wang,
Aarash Feizi,
Akshay Kalkunte Suresh,
Abhay Puri,
Xiangru Jian,
Pierre-André Noël,
Sathwik Tejaswi Madhusudhan,
Marco Pedersoli,
Bang Liu,
Nicolas Chapados,
Yoshua Bengio,
Enamul Hoque,
Christopher Pal,
Issam H. Laradji,
David Vazquez,
Perouz Taslakian,
Spandana Gella,
Sai Rajeswar
Abstract:
Aligning visual features with language embeddings is a key challenge in vision-language models (VLMs). The performance of such models hinges on having a good connector that maps visual features generated by a vision encoder to a shared embedding space with the LLM while preserving semantic similarity. Existing connectors, such as multilayer perceptrons (MLPs), often produce out-of-distribution or noisy inputs, leading to misalignment between the modalities. In this work, we propose a novel vision-text alignment method, AlignVLM, that maps visual features to a weighted average of LLM text embeddings. Our approach leverages the linguistic priors encoded by the LLM to ensure that visual features are mapped to regions of the space that the LLM can effectively interpret. AlignVLM is particularly effective for document understanding tasks, where scanned document images must be accurately mapped to their textual content. Our extensive experiments show that AlignVLM achieves state-of-the-art performance compared to prior alignment methods. We provide further analysis demonstrating improved vision-text feature alignment and robustness to noise.
Submitted 3 February, 2025;
originally announced February 2025.
-
DFCon: Attention-Driven Supervised Contrastive Learning for Robust Deepfake Detection
Authors:
MD Sadik Hossain Shanto,
Mahir Labib Dihan,
Souvik Ghosh,
Riad Ahmed Anonto,
Hafijul Hoque Chowdhury,
Abir Muhtasim,
Rakib Ahsan,
MD Tanvir Hassan,
MD Roqunuzzaman Sojib,
Sheikh Azizul Hakim,
M. Saifur Rahman
Abstract:
This report presents our approach for the IEEE SP Cup 2025: Deepfake Face Detection in the Wild (DFWild-Cup), focusing on detecting deepfakes across diverse datasets. Our methodology employs advanced backbone models, including MaxViT, CoAtNet, and EVA-02, fine-tuned using supervised contrastive loss to enhance feature separation. These models were specifically chosen for their complementary strengths. Integration of convolution layers and strided attention in MaxViT is well-suited for detecting local features. In contrast, hybrid use of convolution and attention mechanisms in CoAtNet effectively captures multi-scale features. Robust pretraining with masked image modeling of EVA-02 excels at capturing global features. After training, we freeze the parameters of these models and train the classification heads. Finally, a majority voting ensemble is employed to combine the predictions from these models, improving robustness and generalization to unseen scenarios. The proposed system addresses the challenges of detecting deepfakes in real-world conditions and achieves a commendable accuracy of 95.83% on the validation dataset.
Submitted 27 January, 2025;
originally announced January 2025.
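The abstract above combines the per-model predictions with a majority voting ensemble. A minimal sketch of hard majority voting follows; the label encoding (1 = fake, 0 = real) and the three prediction lists are made-up illustrations, not outputs of the models named in the abstract.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list by per-sample majority vote."""
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every model must predict the same number of samples")
    # zip(*predictions) groups the votes that all models cast for one sample.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Hypothetical predictions from three models on four images (1 = fake, 0 = real).
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]
ensemble = majority_vote([model_a, model_b, model_c])
```

With an odd number of models and binary labels, no ties can occur; other configurations would need an explicit tie-breaking rule.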
-
A multi-purpose reciprocating probe drive system for studying the effect of gas-puffs on edge plasma dynamics in the ADITYA-U tokamak
Authors:
Kaushlender Singh,
Bharat Hegde,
Ashok K. Kumawat,
Ankit Kumar,
M. S. Khan,
Suman Dolui,
Injamul Hoque,
Tanmay Macwan,
Sharvil Patel,
Abha Kanik,
Komal Yadav,
Soumitra Banerjee,
Harshita Raj,
Devilal Kumawat,
Pramila Gautam,
Rohit Kumar,
Suman Aich,
Laxmikanta Pradhan,
Ankit Patel,
Kalpesh Galodiya,
Abhijeet Kumar,
Shwetang Pandya,
K. M. Patel,
K. A. Jadeja,
D. C. Raval
, et al. (2 additional authors not shown)
Abstract:
This article reports the development of a versatile high-speed reciprocating drive system (HRDS) with interchangeable probe heads to characterize the edge plasma region of the ADITYA-U tokamak. This reciprocating probe drive system, consisting of Langmuir and magnetic probe heads, is designed, fabricated, installed, and operated for studying the extent of fuel/impurity gas propagation and its influence on plasma dynamics in the far-edge region inside the last closed magnetic flux surface (LCFS). The HRDS is driven by a highly accurate, easy-to-control, dynamic, brushless, permanently excited synchronous servo motor operated by a PXI-commanded controller. The system is remotely operated and allows for precise control of the speed, acceleration, and distance traveled of the probe head on a shot-to-shot basis, facilitating seamless control of operations according to experimental requirements. Using this system with a linear array of Langmuir probes, measurements of plasma density, temperature, potential, and their fluctuations revealed that the fuel gas puff impacts these mean and fluctuating parameters up to three to four cm inside the LCFS. Attaching an array of magnetic probes to this system led to measurements of magnetic fluctuations inside the LCFS. The HRDS is fully operational and serves as an important diagnostic tool for the ADITYA-U tokamak.
Submitted 8 January, 2025;
originally announced January 2025.
-
Stabilization of sawteeth instability by short gas pulse injection in ADITYA-U tokamak
Authors:
Suman Dolui,
Kaushlender Singh,
Bharat Hegde,
T. Macwan,
SK Injamul Hoque,
Umesh Nagora,
Jaya Kumar A.,
S. Purohit,
A. N. Adhiya,
K. A. Jadeja,
Harshita Raj,
Ankit Kumar,
Ashok K. Kumawat,
Suman Aich,
Rohit Kumar,
K. M. Patel,
P. Gautam,
Sharvil Patel,
N. Yadava,
N. Ramaiya,
M. K. Gupta,
S. K. Pathak,
M. B. Chowdhuri,
S. Sharma,
A. Kuley
, et al. (6 additional authors not shown)
Abstract:
Experiments on ADITYA-U tokamak show a marked enhancement in the sawtooth period by application of short gas puffs of fuel that cause a modification of the radial density profile. A consequent suppression of the trapped electron modes (TEMs) then leads to an increase in the core electron temperature. This slows down the heat propagation following a sawtooth crash, causing a delay in achieving the critical temperature gradient inside the q = 1 surface required for the next sawtooth crash to happen. The overall scenario has strong similarities with the behavior of sawtooth under electron cyclotron resonance heating (ECRH). Our findings suggest an alternate, simpler technique for sawtooth control that may be usefully employed in small/medium-sized tokamaks that do not have an ECRH or any other auxiliary heating facility.
Submitted 3 January, 2025;
originally announced January 2025.
-
Securing Wi-Fi 6 Connection Establishment Against Relay and Spoofing Threats
Authors:
Naureen Hoque,
Hanif Rahbari
Abstract:
Wireless local area networks remain vulnerable to attacks initiated during the connection establishment (CE) phase. Current Wi-Fi security protocols fail to fully mitigate attacks like man-in-the-middle, preamble spoofing, and relaying. To fortify the CE phase, in this paper we design a backward-compatible scheme using a digital signature interwoven into the preambles at the physical (PHY) layer with time constraints to effectively counter those attacks. This approach slices a MAC-layer signature and embeds the slices within CE frame preambles without extending frame size, allowing one or multiple stations to concurrently verify their respective APs' transmissions. The concurrent CEs are supported by enabling the stations to analyze the consistent patterns of PHY-layer headers and identify whether the received frames are the anticipated ones from the expected APs, achieving 100% accuracy without needing to examine their MAC-layer headers. Additionally, we design and implement a fast relay attack to challenge our proposed defense and determine its effectiveness. We extend existing open-source tools to support IEEE 802.11ax to evaluate the effectiveness and practicality of our proposed scheme in a testbed consisting of USRPs, commercial APs, and Wi-Fi devices, and we show that our relay attack detection achieves 96-100% true positive rates. Finally, end-to-end formal security analyses confirm the security and correctness of the proposed solution.
Submitted 2 January, 2025;
originally announced January 2025.
-
JWST-ALMA Study of a Hub-Filament System in the Nascent Phase
Authors:
N. K. Bhadari,
L. K. Dewangan,
O. R. Jadhav,
Ariful Hoque,
L. E. Pirogov,
Paul F. Goldsmith,
A. K. Maity,
Saurabh Sharma,
A. Haj Ismail,
Tapas Baug
Abstract:
Star clusters, including high-mass stars, form within hub-filament systems (HFSs). Observations of HFSs that remain unaffected by feedback from embedded stars are rare yet crucial for understanding the mass inflow process in high-mass star formation. Using JWST NIRCam images, Dewangan et al. (2024) reported that the high-mass protostar G11P1 is embedded in a candidate HFS (G11P1-HFS; $<0.6$ pc). Utilizing ALMA N$_{2}$H$^{+}$(1-0) data, we confirm the presence of G11P1-HFS and study the dense gas kinematics. We analyzed the position-position-velocity (PPV) map and estimated on-sky velocity gradient ($V_g$) and gravity ($\mathcal{F}_{g}$) vectors. The spatial distribution of gas velocity and H$_2$ column density was examined. The steep $V_g$ of 5 km s$^{-1}$ pc$^{-1}$ and $-$7 km s$^{-1}$ pc$^{-1}$ toward either side of G11P1-hub, and the decreasing $V_g$ toward the hub, identify G11P1-HFS as a small-scale HFS in its nascent phase. $V_g$ and $\mathcal{F}_{g}$ align along the filaments, indicating gravity-driven flows. This work highlights the wiggled, funnel-shaped morphology of an HFS in PPV space, suggesting the importance of subfilaments or transverse gas flows in mass transportation to the hub.
Submitted 31 December, 2024;
originally announced January 2025.
-
Parameterized families of quadratic fields with $n$-rank at least 2
Authors:
Azizul Hoque,
Srinivas Kotyada
Abstract:
We construct parameterized families of imaginary (resp. real) quadratic fields whose class groups have $n$-rank at least $2$.
Submitted 28 December, 2024;
originally announced December 2024.
-
ARMADA: Augmented Reality for Robot Manipulation and Robot-Free Data Acquisition
Authors:
Nataliya Nechyporenko,
Ryan Hoque,
Christopher Webb,
Mouli Sivapurapu,
Jian Zhang
Abstract:
Teleoperation for robot imitation learning is bottlenecked by hardware availability. Can high-quality robot data be collected without a physical robot? We present a system for augmenting Apple Vision Pro with real-time virtual robot feedback. By providing users with an intuitive understanding of how their actions translate to robot motions, we enable the collection of natural barehanded human data that is compatible with the limitations of physical robot hardware. We conducted a user study with 15 participants demonstrating 3 different tasks each under 3 different feedback conditions and directly replayed the collected trajectories on physical robot hardware. Results suggest live robot feedback dramatically improves the quality of the collected data, suggesting a new avenue for scalable human data collection without access to robot hardware. Videos and more are available at https://nataliya.dev/armada.
Submitted 13 December, 2024;
originally announced December 2024.
-
HadaCore: Tensor Core Accelerated Hadamard Transform Kernel
Authors:
Krish Agarwal,
Rishi Astra,
Adnan Hoque,
Mudhakar Srivatsa,
Raghu Ganti,
Less Wright,
Sijia Chen
Abstract:
We present HadaCore, a modified Fast Walsh-Hadamard Transform (FWHT) algorithm optimized for the Tensor Cores present in modern GPU hardware. HadaCore follows the recursive structure of the original FWHT algorithm, achieving the same asymptotic runtime complexity but leveraging a hardware-aware work decomposition that benefits from Tensor Core acceleration. This reduces bottlenecks from compute and data exchange. On Nvidia A100 and H100 GPUs, HadaCore achieves speedups of 1.1-1.4x and 1.0-1.3x, with a peak gain of 3.5x and 3.6x respectively, when compared to the existing state-of-the-art implementation of the original algorithm. We also show that when using FP16 or BF16, our implementation is numerically accurate, enabling comparable accuracy on MMLU benchmarks when used in an end-to-end Llama3 inference run with quantized (FP8) attention.
Submitted 11 December, 2024;
originally announced December 2024.
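For readers unfamiliar with the transform named in the abstract above, the sketch below shows the classic unnormalized Fast Walsh-Hadamard Transform using iterative butterflies in natural (Hadamard) order. It illustrates only the baseline O(n log n) algorithm; it does not attempt to reproduce HadaCore's Tensor Core work decomposition, and the function name is an assumption.

```python
def fwht(values):
    """Return the unnormalized Walsh-Hadamard transform of a power-of-two-length sequence."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    out = list(values)
    h = 1
    while h < n:
        # Each pass combines elements h apart with a (sum, difference) butterfly.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out
```

Applying the function twice returns the input scaled by its length, which is a convenient sanity check for any alternative implementation.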
-
Accurate Water Level Monitoring in AWD Rice Cultivation Using Convolutional Neural Networks
Authors:
Ahmed Rafi Hasan,
Niloy Kumar Kundu,
Saad Hasan,
Mohammad Rashedul Hoque,
Swakkhar Shatabda
Abstract:
The Alternate Wetting and Drying (AWD) method is a rice-growing water management technique promoted as a sustainable alternative to Continuous Flooding (CF). Climate change has placed the agricultural sector in a challenging position, particularly as global water resources become increasingly scarce, affecting rice production on irrigated lowlands. Rice, a staple food for over half of the world's population, demands significantly more water than other major crops. In Bangladesh, Boro rice, in particular, requires considerable water inputs during its cultivation. Traditionally, farmers manually measure water levels, a process that is both time-consuming and prone to errors. While ultrasonic sensors offer improvements in water height measurement, they still face limitations, such as susceptibility to weather conditions and environmental factors. To address these issues, we propose a novel approach that automates water height measurement using computer vision, specifically through a convolutional neural network (CNN). Our attention-based architecture achieved an $R^2$ score of 0.9885 and a Mean Squared Error (MSE) of 0.2766, providing a more accurate and efficient solution for managing AWD systems.
Submitted 12 December, 2024; v1 submitted 11 December, 2024;
originally announced December 2024.
-
Scintillations in Southern Europe during the geomagnetic storm of June 2015: analysis of a plasma bubbles spill-over using ground-based data
Authors:
Anna Morozova,
Luca Spogli,
Teresa Barata,
Rayan Imam,
Emanuele Pica,
Juan Andrés Cahuasquí,
Mohammed Mainul Hoque,
Norbert Jakowski,
Daniela Estaço
Abstract:
The sensitivity of Global Navigation Satellite Systems (GNSS) receivers to ionospheric disturbances and their constant growth are resulting in increased concern among GNSS users about the impacts of ionospheric disturbances at mid-latitudes. The geomagnetic storm of June 2015 is an example of a rare phenomenon: a spill-over of equatorial plasma bubbles well north of their habitual latitudes. We study the occurrence of small- and medium-scale irregularities in the North Atlantic Eastern-Mediterranean mid- and low-latitudinal zone by analysing the behaviour of the amplitude scintillation index S4 and of the rate of total electron content index (ROTI) during this storm. In addition, large-scale perturbations of the ionospheric electron density were studied using ground and space-borne instruments, thus characterizing a complex perturbation behaviour over the region mentioned above. The multi-source data allow us to characterize the impact of irregularities of different scales to better understand the ionospheric dynamics and stress the importance of proper monitoring of the ionosphere in the studied region.
Submitted 9 December, 2024;
originally announced December 2024.
-
Zephyr quantum-assisted hierarchical Calo4pQVAE for particle-calorimeter interactions
Authors:
Ian Lu,
Hao Jia,
Sebastian Gonzalez,
Deniz Sogutlu,
J. Quetzalcoatl Toledo-Marin,
Sehmimul Hoque,
Abhishek Abhishek,
Colin Gay,
Roger Melko,
Eric Paquet,
Geoffrey Fox,
Maximilian Swiatlowski,
Wojciech Fedorko
Abstract:
With the High Luminosity Large Hadron Collider (HL-LHC) era set to begin particle collisions by the end of this decade, it is evident that the computational demands of traditional collision simulation methods are becoming increasingly unsustainable. Existing approaches, which rely heavily on first-principles Monte Carlo simulations for modeling event showers in calorimeters, are projected to require millions of CPU-years annually -- far exceeding current computational capacities. This bottleneck presents an exciting opportunity for advancements in computational physics by integrating deep generative models with quantum simulations. We propose a quantum-assisted hierarchical deep generative surrogate founded on a variational autoencoder (VAE) in combination with an energy-conditioned restricted Boltzmann machine (RBM) embedded in the model's latent space as a prior. By mapping the topology of D-Wave's Zephyr quantum annealer (QA) into the nodes and couplings of a 4-partite RBM, we leverage quantum simulation to accelerate our shower generation times significantly. To evaluate our framework, we use Dataset 2 of the CaloChallenge 2022. Through the integration of classical computation and quantum simulation, this hybrid framework paves the way for utilizing large-scale quantum simulations as priors in deep generative models.
Submitted 5 December, 2024;
originally announced December 2024.
-
Sato-Tate Groups and Distributions of $y^\ell=x(x^\ell-1)$
Authors:
Heidi Goodson,
Rezwan Hoque
Abstract:
Let $C_\ell/\mathbb Q$ denote the curve with affine model $y^\ell=x(x^\ell-1)$, where $\ell\geq 3$ is prime. In this paper we study the limiting distributions of the normalized $L$-polynomials of the curves by computing their Sato-Tate groups and distributions. We also provide results for the number of points on the curves over finite fields, including a formula in terms of Jacobi sums when the field $\mathbb F_q$ satisfies $q\equiv 1 \pmod{\ell^2}$.
Submitted 3 December, 2024;
originally announced December 2024.
-
The $SO(1,4)$ flux-balance laws of de Sitter at quadrupolar order
Authors:
Geoffrey Compère,
Sk Jahanur Hoque,
Emine Şeyma Kutluk
Abstract:
The linear solution for quadrupolar perturbations around de Sitter spacetime was recently constructed. In this paper, we provide the flux-balance laws for each background symmetry (dilatations, rotations, spatial translations and cosmological boosts) in terms of source moments at quadrupolar order. We write the dilatation flux-balance law in two distinct ways, which allows us to contrast two distinct proposals for the negative definite energy flux. The standard Poincaré flux-balance laws at future null infinity are recovered in the flat limit of the $SO(1,4)$ flux-balance laws.
Submitted 24 February, 2025; v1 submitted 25 November, 2024;
originally announced November 2024.
-
Machine-agnostic Automated Lumbar MRI Segmentation using a Cascaded Model Based on Generative Neurons
Authors:
Promit Basak,
Rusab Sarmun,
Saidul Kabir,
Israa Al-Hashimi,
Enamul Hoque Bhuiyan,
Anwarul Hasan,
Muhammad Salman Khan,
Muhammad E. H. Chowdhury
Abstract:
Automated lumbar spine segmentation is crucial for modern diagnosis systems. In this study, we introduce a novel machine-agnostic approach for segmenting lumbar vertebrae and intervertebral discs (IVDs) from MRI images, employing a cascaded model that combines an ROI detection stage with a Self-organized Operational Neural Network (Self-ONN)-based encoder-decoder network for segmentation. Addressing the challenge of diverse MRI modalities, our methodology capitalizes on a unique dataset comprising images from 12 scanners and 34 subjects, enhanced through strategic preprocessing and data augmentation techniques. The YOLOv8 medium model excels in ROI extraction, achieving a 0.916 mAP score. Our Self-ONN-based model, combined with a DenseNet121 encoder, demonstrates excellent performance in lumbar vertebrae and IVD segmentation with a mean Intersection over Union (IoU) of 83.66%, a sensitivity of 91.44%, and a Dice Similarity Coefficient (DSC) of 91.03%, as validated through rigorous 10-fold cross-validation. This study not only showcases an effective approach to MRI segmentation in spine-related disorders but also sets the stage for future advancements in automated diagnostic tools, emphasizing the need for further dataset expansion and model refinement for broader clinical applicability.
Submitted 23 November, 2024;
originally announced November 2024.
-
Hardware Accelerators for Artificial Intelligence
Authors:
S M Mojahidul Ahsan,
Anurag Dhungel,
Mrittika Chowdhury,
Md Sakib Hasan,
Tamzidul Hoque
Abstract:
In this chapter, we provide an in-depth exploration of the specialized hardware accelerators designed to enhance Artificial Intelligence (AI) applications, focusing on their necessity, development, and impact on the field of AI. It covers the transition from traditional computing systems to advanced AI-specific hardware, addressing the growing demands of AI algorithms and the inefficiencies of conventional architectures. The discussion extends to various types of accelerators, including GPUs, FPGAs, and ASICs, and their roles in optimizing AI workloads. Additionally, it touches on the challenges and considerations in designing and implementing these accelerators, along with future prospects in the evolution of AI hardware. This comprehensive overview aims to equip readers with a clear understanding of the current landscape and future directions in AI hardware development, making it accessible to both experts and newcomers to the field.
Submitted 18 December, 2024; v1 submitted 20 November, 2024;
originally announced November 2024.