Showing 1–50 of 911 results for author: Islam, M

  1. arXiv:2507.02092  [pdf, ps, other]

    cs.LG cs.AI cs.CL cs.CV

    Energy-Based Transformers are Scalable Learners and Thinkers

    Authors: Alexi Gladstone, Ganesh Nanduru, Md Mofijul Islam, Peixuan Han, Hyeonjeong Ha, Aman Chadha, Yilun Du, Heng Ji, Jundong Li, Tariq Iqbal

    Abstract: Inference-time computation techniques, analogous to human System 2 Thinking, have recently become popular for improving model performances. However, most existing approaches suffer from several limitations: they are modality-specific (e.g., working only in text), problem-specific (e.g., verifiable domains like math and coding), or require additional supervision/training on top of unsupervised pret…

    Submitted 2 July, 2025; originally announced July 2025.

  2. arXiv:2507.01788  [pdf, ps, other]

    cs.CV cs.AI

    Are Vision Transformer Representations Semantically Meaningful? A Case Study in Medical Imaging

    Authors: Montasir Shams, Chashi Mahiul Islam, Shaeke Salman, Phat Tran, Xiuwen Liu

    Abstract: Vision transformers (ViTs) have rapidly gained prominence in medical imaging tasks such as disease classification, segmentation, and detection due to their superior accuracy compared to conventional deep learning models. However, due to their size and complex interactions via the self-attention mechanism, they are not well understood. In particular, it is unclear whether the representations produc…

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: 9 pages

  3. arXiv:2506.23714  [pdf, ps, other]

    cs.CV cs.CL

    Towards an Automated Multimodal Approach for Video Summarization: Building a Bridge Between Text, Audio and Facial Cue-Based Summarization

    Authors: Md Moinul Islam, Sofoklis Kakouros, Janne Heikkilä, Mourad Oussalah

    Abstract: The increasing volume of video content in educational, professional, and social domains necessitates effective summarization techniques that go beyond traditional unimodal approaches. This paper proposes a behaviour-aware multimodal video summarization framework that integrates textual, audio, and visual cues to generate timestamp-aligned summaries. By extracting prosodic features, textual cues an…

    Submitted 30 June, 2025; originally announced June 2025.

    Comments: Accepted to HHAI WS 2025: Workshops at the Fourth International Conference on Hybrid Human-Artificial Intelligence (HHAI)

  4. arXiv:2506.22742  [pdf, ps, other]

    cs.SE cs.AI

    RAILS: Retrieval-Augmented Intelligence for Learning Software Development

    Authors: Wali Mohammad Abdullah, Md. Morshedul Islam, Devraj Parmar, Happy Hasmukhbhai Patel, Sindhuja Prabhakaran, Baidya Saha

    Abstract: Large Language Models (LLMs) like GPT-3.5-Turbo are increasingly used to assist software development, yet they often produce incomplete code or incorrect imports, especially when lacking access to external or project-specific documentation. We introduce RAILS (Retrieval-Augmented Intelligence for Learning Software Development), a framework that augments LLM prompts with semantically retrieved cont…

    Submitted 27 June, 2025; originally announced June 2025.

  5. arXiv:2506.19870  [pdf]

    cs.CR cs.AI cs.LG

    Secure Energy Transactions Using Blockchain Leveraging AI for Fraud Detection and Energy Market Stability

    Authors: Md Asif Ul Hoq Khan, MD Zahedul Islam, Istiaq Ahmed, Md Masud Karim Rabbi, Farhana Rahman Anonna, MD Abdul Fahim Zeeshan, Mehedi Hasan Ridoy, Bivash Ranjan Chowdhury, Md Nazmul Shakir Rabbi, GM Alamin Sadnan

    Abstract: Peer-to-peer trading and the move to decentralized grids have reshaped the energy markets in the United States. Notwithstanding, such developments lead to new challenges, mainly regarding the safety and authenticity of energy trade. This study aimed to develop and build a secure, intelligent, and efficient energy transaction system for the decentralized US energy market. This research interlinks t…

    Submitted 21 June, 2025; originally announced June 2025.

  6. arXiv:2506.18927  [pdf, ps, other]

    cs.LG

    From Tiny Machine Learning to Tiny Deep Learning: A Survey

    Authors: Shriyank Somvanshi, Md Monzurul Islam, Gaurab Chhetri, Rohit Chakraborty, Mahmuda Sultana Mimi, Sawgat Ahmed Shuvo, Kazi Sifatul Islam, Syed Aaqib Javed, Sharif Ahmed Rafat, Anandi Dutta, Subasish Das

    Abstract: The rapid growth of edge devices has driven the demand for deploying artificial intelligence (AI) at the edge, giving rise to Tiny Machine Learning (TinyML) and its evolving counterpart, Tiny Deep Learning (TinyDL). While TinyML initially focused on enabling simple inference tasks on microcontrollers, the emergence of TinyDL marks a paradigm shift toward deploying deep learning models on severely…

    Submitted 25 June, 2025; v1 submitted 21 June, 2025; originally announced June 2025.

  7. arXiv:2506.16009  [pdf]

    cs.LG

    Bridging Brain with Foundation Models through Self-Supervised Learning

    Authors: Hamdi Altaheri, Fakhri Karray, Md. Milon Islam, S M Taslim Uddin Raju, Amir-Hossein Karimi

    Abstract: Foundation models (FMs), powered by self-supervised learning (SSL), have redefined the capabilities of artificial intelligence, demonstrating exceptional performance in domains like natural language processing and computer vision. These advances present a transformative opportunity for brain signal analysis. Unlike traditional supervised learning, which is limited by the scarcity of labeled neural…

    Submitted 19 June, 2025; originally announced June 2025.

  8. arXiv:2506.14629  [pdf, ps, other]

    cs.CV cs.CL

    VisText-Mosquito: A Multimodal Dataset and Benchmark for AI-Based Mosquito Breeding Site Detection and Reasoning

    Authors: Md. Adnanul Islam, Md. Faiyaz Abdullah Sayeedi, Md. Asaduzzaman Shuvo, Muhammad Ziaur Rahman, Shahanur Rahman Bappy, Raiyan Rahman, Swakkhar Shatabda

    Abstract: Mosquito-borne diseases pose a major global health risk, requiring early detection and proactive control of breeding sites to prevent outbreaks. In this paper, we present VisText-Mosquito, a multimodal dataset that integrates visual and textual data to support automated detection, segmentation, and reasoning for mosquito breeding site analysis. The dataset includes 1,828 annotated images for objec…

    Submitted 17 June, 2025; originally announced June 2025.

  9. arXiv:2506.11451  [pdf, ps, other]

    cs.SE

    Understanding the Issue Types in Open Source Blockchain-based Software Projects with the Transformer-based BERTopic

    Authors: Md Nahidul Islam Opu, Md Shahidul Islam, Sara Rouhani, Shaiful Chowdhury

    Abstract: Blockchain-based software systems are increasingly deployed across diverse domains, yet a systematic understanding of their development challenges remains limited. This paper presents a large-scale empirical study of 497,742 issues mined from 1,209 open-source blockchain projects hosted on GitHub. Employing BERTopic, a transformer-based topic modeling technique, we identify 49 distinct issue topic…

    Submitted 13 June, 2025; originally announced June 2025.

  10. arXiv:2506.08051  [pdf, ps, other]

    cs.LG

    ST-GraphNet: A Spatio-Temporal Graph Neural Network for Understanding and Predicting Automated Vehicle Crash Severity

    Authors: Mahmuda Sultana Mimi, Md Monzurul Islam, Anannya Ghosh Tusti, Shriyank Somvanshi, Subasish Das

    Abstract: Understanding the spatial and temporal dynamics of automated vehicle (AV) crash severity is critical for advancing urban mobility safety and infrastructure planning. In this work, we introduce ST-GraphNet, a spatio-temporal graph neural network framework designed to model and predict AV crash severity by using both fine-grained and region-aggregated spatial graphs. Using a balanced dataset of 2,35…

    Submitted 8 June, 2025; originally announced June 2025.

  11. arXiv:2506.06989  [pdf, ps, other]

    cs.IR cs.LG

    Correcting for Position Bias in Learning to Rank: A Control Function Approach

    Authors: Md Aminul Islam, Kathryn Vasilaky, Elena Zheleva

    Abstract: Implicit feedback data, such as user clicks, is commonly used in learning-to-rank (LTR) systems because it is easy to collect and it often reflects user preferences. However, this data is prone to various biases, and training an LTR system directly on biased data can result in suboptimal ranking performance. One of the most prominent and well-studied biases in implicit feedback data is position bi…

    Submitted 8 June, 2025; originally announced June 2025.

  12. arXiv:2506.04696  [pdf, ps, other]

    cs.LG

    Enhanced Drought Analysis in Bangladesh: A Machine Learning Approach for Severity Classification Using Satellite Data

    Authors: Tonmoy Paul, Mrittika Devi Mati, Md. Mahmudul Islam

    Abstract: Drought poses a pervasive environmental challenge in Bangladesh, impacting agriculture, socio-economic stability, and food security due to its unique geographic and anthropogenic vulnerabilities. Traditional drought indices, such as the Standardized Precipitation Index (SPI) and Palmer Drought Severity Index (PDSI), often overlook crucial factors like soil moisture and temperature, limiting their…

    Submitted 5 June, 2025; originally announced June 2025.

  13. arXiv:2506.04238  [pdf, other]

    cs.NE cs.LG

    A Comprehensive Survey on Bio-Inspired Algorithms: Taxonomy, Applications, and Future Directions

    Authors: Shriyank Somvanshi, Md Monzurul Islam, Syed Aaqib Javed, Gaurab Chhetri, Kazi Sifatul Islam, Tausif Islam Chowdhury, Sazzad Bin Bashar Polock, Anandi Dutta, Subasish Das

    Abstract: Bio-inspired algorithms (BIAs) utilize natural processes such as evolution, swarm behavior, foraging, and plant growth to solve complex, nonlinear, high-dimensional optimization problems. This survey categorizes BIAs into eight groups: evolutionary, swarm intelligence, physics-inspired, ecosystem and plant-based, predator-prey, neural-inspired, human-inspired, and hybrid approaches, and reviews th…

    Submitted 25 May, 2025; originally announced June 2025.

  14. arXiv:2506.03191  [pdf, ps, other]

    cs.CV cs.AI

    Multimodal Generative AI with Autoregressive LLMs for Human Motion Understanding and Generation: A Way Forward

    Authors: Muhammad Islam, Tao Huang, Euijoon Ahn, Usman Naseem

    Abstract: This paper presents an in-depth survey on the use of multimodal Generative Artificial Intelligence (GenAI) and autoregressive Large Language Models (LLMs) for human motion understanding and generation, offering insights into emerging methods, architectures, and their potential to advance realistic and versatile motion synthesis. Focusing exclusively on text and motion modalities, this research inv…

    Submitted 31 May, 2025; originally announced June 2025.

  15. arXiv:2506.03184  [pdf]

    cs.CV cs.AI cs.LG

    Impact of Tuning Parameters in Deep Convolutional Neural Network Using a Crack Image Dataset

    Authors: Mahe Zabin, Ho-Jin Choi, Md. Monirul Islam, Jia Uddin

    Abstract: The performance of a classifier depends on the tuning of its parameters. In this paper, we have experimented the impact of various tuning parameters on the performance of a deep convolutional neural network (DCNN). In the experimental evaluation, we have considered a DCNN classifier that consists of 2 convolutional layers (CL), 2 pooling layers (PL), 1 dropout, and a dense layer. To observe the…

    Submitted 30 May, 2025; originally announced June 2025.

    Comments: 8 pages, 2 figures, published at Proceedings of the 15th KIPS International Conference on Ubiquitous Information Technologies and Applications (CUTE 2021), Jeju, Republic of Korea

  16. arXiv:2506.03160  [pdf]

    cs.LG

    Applying MambaAttention, TabPFN, and TabTransformers to Classify SAE Automation Levels in Crashes

    Authors: Shriyank Somvanshi, Anannya Ghosh Tusti, Mahmuda Sultana Mimi, Md Monzurul Islam, Sazzad Bin Bashar Polock, Anandi Dutta, Subasish Das

    Abstract: The increasing presence of automated vehicles (AVs) presents new challenges for crash classification and safety analysis. Accurately identifying the SAE automation level involved in each crash is essential to understanding crash dynamics and system accountability. However, existing approaches often overlook automation-specific factors and lack model sophistication to capture distinctions between d…

    Submitted 22 May, 2025; originally announced June 2025.

  17. Dual encoding feature filtering generalized attention UNET for retinal vessel segmentation

    Authors: Md Tauhidul Islam, Wu Da-Wen, Tang Qing-Qing, Zhao Kai-Yang, Yin Teng, Li Yan-Fei, Shang Wen-Yi, Liu Jing-Yu, Zhang Hai-Xian

    Abstract: Retinal blood vessel segmentation is crucial for diagnosing ocular and cardiovascular diseases. Although the introduction of U-Net in 2015 by Olaf Ronneberger significantly advanced this field, issues like limited training data, imbalanced data distribution, and inadequate feature extraction persist, hindering both the segmentation performance and optimal model generalization. Addressing these…

    Submitted 2 June, 2025; originally announced June 2025.

    ACM Class: I.4; I.5

    Journal ref: J Sichuan Univ: Nat Sci Ed, 2025, 62: 79-95.

  18. arXiv:2506.01587  [pdf, other]

    cs.CL

    Unified Large Language Models for Misinformation Detection in Low-Resource Linguistic Settings

    Authors: Muhammad Islam, Javed Ali Khan, Mohammed Abaker, Ali Daud, Azeem Irshad

    Abstract: The rapid expansion of social media platforms has significantly increased the dissemination of forged content and misinformation, making the detection of fake news a critical area of research. Although fact-checking efforts predominantly focus on English-language news, there is a noticeable gap in resources and strategies to detect news in regional languages, such as Urdu. Advanced Fake News Detec…

    Submitted 2 June, 2025; originally announced June 2025.

  19. arXiv:2506.00735  [pdf, ps, other]

    cs.CV

    Involution-Infused DenseNet with Two-Step Compression for Resource-Efficient Plant Disease Classification

    Authors: T. Ahmed, S. Jannat, Md. F. Islam, J. Noor

    Abstract: Agriculture is vital for global food security, but crops are vulnerable to diseases that impact yield and quality. While Convolutional Neural Networks (CNNs) accurately classify plant diseases using leaf images, their high computational demands hinder their deployment in resource-constrained settings such as smartphones, edge devices, and real-time monitoring systems. This study proposes a two-ste…

    Submitted 31 May, 2025; originally announced June 2025.

  20. arXiv:2505.24002  [pdf, ps, other]

    cs.CV

    DGIQA: Depth-guided Feature Attention and Refinement for Generalizable Image Quality Assessment

    Authors: Vaishnav Ramesh, Junliang Liu, Haining Wang, Md Jahidul Islam

    Abstract: A long-held challenge in no-reference image quality assessment (NR-IQA) learning from human subjective perception is the lack of objective generalization to unseen natural distortions. To address this, we integrate a novel Depth-Guided cross-attention and refinement (Depth-CAR) mechanism, which distills scene depth and spatial features into a structure-aware representation for improved NR-IQA. Thi…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 18 pages

  21. IKIWISI: An Interactive Visual Pattern Generator for Evaluating the Reliability of Vision-Language Models Without Ground Truth

    Authors: Md Touhidul Islam, Imran Kabir, Md Alimoor Reza, Syed Masum Billah

    Abstract: We present IKIWISI ("I Know It When I See It"), an interactive visual pattern generator for assessing vision-language models in video object recognition when ground truth is unavailable. IKIWISI transforms model outputs into a binary heatmap where green cells indicate object presence and red cells indicate object absence. This visualization leverages humans' innate pattern recognition abilities to…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: Accepted at DIS'25 (Funchal, Portugal)

  22. arXiv:2505.21715  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    Privacy-Preserving Chest X-ray Report Generation via Multimodal Federated Learning with ViT and GPT-2

    Authors: Md. Zahid Hossain, Mustofa Ahmed, Most. Sharmin Sultana Samu, Md. Rakibul Islam

    Abstract: The automated generation of radiology reports from chest X-ray images holds significant promise in enhancing diagnostic workflows while preserving patient privacy. Traditional centralized approaches often require sensitive data transfer, posing privacy concerns. To address this, the study proposes a Multimodal Federated Learning framework for chest X-ray report generation using the IU-Xray dataset…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: Preprint; manuscript under review

  23. arXiv:2505.20324  [pdf, ps, other]

    cs.SE cs.AI

    Evaluating the Energy-Efficiency of the Code Generated by LLMs

    Authors: Md Arman Islam, Devi Varaprasad Jonnala, Ritika Rekhi, Pratik Pokharel, Siddharth Cilamkoti, Asif Imran, Tevfik Kosar, Bekir Turkkan

    Abstract: As the quality of code generated by Large Language Models (LLMs) improves, their adoption in the software industry for automated code generation continues to grow. Researchers primarily focus on enhancing the functional correctness of the generated code while commonly overlooking its energy efficiency and environmental impact. This paper investigates the energy efficiency of the code generated by…

    Submitted 23 May, 2025; originally announced May 2025.

  24. Improving Bangla Linguistics: Advanced LSTM, Bi-LSTM, and Seq2Seq Models for Translating Sylheti to Modern Bangla

    Authors: Sourav Kumar Das, Md. Julkar Naeen, MD. Jahidul Islam, Md. Anisul Haque Sajeeb, Narayan Ranjan Chakraborty, Mayen Uddin Mojumdar

    Abstract: Bangla, or Bengali, is the national language of Bangladesh, but people from different regions do not speak proper Bangla. Every division of Bangladesh has its own local language, such as Sylheti, Chittagong, etc. In recent years, some papers have been published on Bangla language tasks such as sentiment analysis and fake news detection and classification, but only a few of them were on Bangla languages. This research is for the…

    Submitted 24 May, 2025; originally announced May 2025.

    Comments: 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT)

    Journal ref: 2024 15th Int. Conf. on Computing Communication and Networking Technologies (ICCCNT), Kamand, India, pp. 1-7, 2024

  25. arXiv:2505.17236  [pdf]

    cs.CR cs.DC

    LogStamping: A blockchain-based log auditing approach for large-scale systems

    Authors: Md Shariful Islam, M. Sohel Rahman

    Abstract: Log management is crucial for ensuring the security, integrity, and compliance of modern information systems. Traditional log management solutions face challenges in achieving tamper-proofing, scalability, and real-time processing in distributed environments. This paper presents a blockchain-based log management framework that addresses these limitations by leveraging blockchain's decentralized, i…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: 7 Figures, 2 tables

  26. arXiv:2505.16290  [pdf]

    cs.SE cs.AI

    Multimodal Generative AI for Story Point Estimation in Software Development

    Authors: Mohammad Rubyet Islam, Peter Sandborn

    Abstract: This research explores the application of Multimodal Generative AI to enhance story point estimation in Agile software development. By integrating text, image, and categorical data using advanced models like BERT, CNN, and XGBoost, our approach surpasses the limitations of traditional single-modal estimation methods. The results demonstrate strong accuracy for simpler story points, while also high…

    Submitted 22 May, 2025; originally announced May 2025.

    MSC Class: 68T07; 68T45

    ACM Class: I.2.6; I.2.10; D.2.9; H.2.8

    Journal ref: A revised version of this work is published in the proceedings of the IEEE Conference on Artificial Intelligence 2025

  27. arXiv:2505.14727  [pdf]

    cs.LG q-fin.CP

    The Evolution of Alpha in Finance Harnessing Human Insight and LLM Agents

    Authors: Mohammad Rubyet Islam

    Abstract: The pursuit of alpha returns that exceed market benchmarks has undergone a profound transformation, evolving from intuition-driven investing to autonomous, AI powered systems. This paper introduces a comprehensive five stage taxonomy that traces this progression across manual strategies, statistical models, classical machine learning, deep learning, and agentic architectures powered by large langu…

    Submitted 19 May, 2025; originally announced May 2025.

    MSC Class: 91G70 Statistical methods; risk measures 91B84 Economic models (financial models; industrial models; growth models)

    ACM Class: I.2.6; I.5.1; I.2.7

  28. arXiv:2505.12333  [pdf]

    cs.CE

    Predicting Gas Well Performance with Decline Curve Analysis: A Case Study on Semutang Gas Field

    Authors: Md. Shakil Rahaman, Ahmed Sakib, Ataharuse Samad, Md. Ashraful Islam

    Abstract: Decline-curve analysis (DCA) is a widely utilized method for production forecasting and estimating remaining reserves in gas reservoirs. It is based on the assumption that past production trends can be mathematically characterized and used to predict future performance. It relies on historical production data and assumes that production methods remain unchanged throughout the analysis. This method is par…

    Submitted 18 May, 2025; originally announced May 2025.

    Comments: 8th International Conference on Mechanical, Industrial and Energy Engineering 2024

  29. arXiv:2505.11845  [pdf, ps, other]

    cs.CV

    ElderFallGuard: Real-Time IoT and Computer Vision-Based Fall Detection System for Elderly Safety

    Authors: Tasrifur Riahi, Md. Azizul Hakim Bappy, Md. Mehedi Islam

    Abstract: For the elderly population, falls pose a serious and increasing risk of serious injury and loss of independence. In order to overcome this difficulty, we present ElderFallGuard: A Computer Vision Based IoT Solution for Elderly Fall Detection and Notification, a cutting-edge, non-invasive system intended for quick caregiver alerts and real-time fall detection. Our approach leverages the power of co…

    Submitted 17 May, 2025; originally announced May 2025.

    Comments: 9 pages, 1 table, 5 figures

  30. arXiv:2505.08704  [pdf, ps, other]

    cs.AI cs.CL

    LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs

    Authors: K M Sajjadul Islam, Ayesha Siddika Nipu, Jiawei Wu, Praveen Madiraju

    Abstract: Electronic Health Records (EHRs) are digital records of patient information, often containing unstructured clinical text. Named Entity Recognition (NER) is essential in EHRs for extracting key medical entities like problems, tests, and treatments to support downstream clinical applications. This paper explores prompt-based medical entity recognition using large language models (LLMs), specifically…

    Submitted 25 May, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

    Comments: IEEE 26th International Conference on Information Reuse and Integration for Data Science (IRI 2025), San Jose, CA, USA

  31. arXiv:2505.08468  [pdf, ps, other]

    cs.CL cs.CV

    Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reasoning?

    Authors: Md Tahmid Rahman Laskar, Mohammed Saidul Islam, Ridwan Mahbub, Ahmed Masry, Mizanur Rahman, Amran Bhuiyan, Mir Tafseer Nayeem, Shafiq Joty, Enamul Hoque, Jimmy Huang

    Abstract: Charts are ubiquitous as they help people understand and reason with data. Recently, various downstream tasks, such as chart question answering, chart2text, and fact-checking, have emerged. Large Vision-Language Models (LVLMs) show promise in tackling these tasks, but their evaluation is costly and time-consuming, limiting real-world deployment. While using LVLMs as judges to assess the chart comp…

    Submitted 7 July, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

    Comments: Accepted at ACL 2025 Industry Track

  32. arXiv:2505.05538  [pdf, ps, other]

    cs.LG cs.AI eess.SP

    Cardioformer: Advancing AI in ECG Analysis with Multi-Granularity Patching and ResNet

    Authors: Md Kamrujjaman Mobin, Md Saiful Islam, Sadik Al Barid, Md Masum

    Abstract: Electrocardiogram (ECG) classification is crucial for automated cardiac disease diagnosis, yet existing methods often struggle to capture local morphological details and long-range temporal dependencies simultaneously. To address these challenges, we propose Cardioformer, a novel multi-granularity hybrid model that integrates cross-channel patching, hierarchical residual learning, and a two-stage…

    Submitted 8 May, 2025; originally announced May 2025.

  33. arXiv:2505.04948   

    cs.IR cs.CL

    Prompt-Based LLMs for Position Bias-Aware Reranking in Personalized Recommendations

    Authors: Md Aminul Islam, Ahmed Sayeed Faruk

    Abstract: Recommender systems are essential for delivering personalized content across digital platforms by modeling user preferences and behaviors. Recently, large language models (LLMs) have been adopted for prompt-based recommendation due to their ability to generate personalized outputs without task-specific training. However, LLM-based methods face limitations such as limited context window size, ineff…

    Submitted 26 May, 2025; v1 submitted 8 May, 2025; originally announced May 2025.

    Comments: We have decided to withdraw the manuscript as it requires substantial revisions that go beyond what is appropriate for a versioned update on arXiv. We plan to resubmit once the necessary improvements are made

  34. arXiv:2505.04308  [pdf]

    cs.CR cs.AI

    Guardians of the Web: The Evolution and Future of Website Information Security

    Authors: Md Saiful Islam, Li Xiangdong

    Abstract: Website information security has become a critical concern in the digital age. This article explores the evolution of website information security, examining its historical development, current practices, and future directions. The early beginnings from the 1960s to the 1980s laid the groundwork for modern cybersecurity, with the development of ARPANET, TCP/IP, public-key cryptography, and the fir…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: 22 pages

    ACM Class: F.2.2, I.2.7

  35. arXiv:2505.03799  [pdf, other]

    cs.LG cs.AI cs.CL

    Scalability Matters: Overcoming Challenges in InstructGLM with Similarity-Degree-Based Sampling

    Authors: Hyun Lee, Chris Yi, Maminur Islam, B. D. S. Aritra

    Abstract: Large Language Models (LLMs) have demonstrated strong capabilities in various natural language processing tasks; however, their application to graph-related problems remains limited, primarily due to scalability constraints and the absence of dedicated mechanisms for processing graph structures. Existing approaches predominantly integrate LLMs with Graph Neural Networks (GNNs), using GNNs as featu…

    Submitted 2 May, 2025; originally announced May 2025.

    Comments: To be published in International Joint Conference on Neural Networks (IJCNN), 2025

  36. arXiv:2505.03770  [pdf, other]

    cs.AI

    Proceedings of 1st Workshop on Advancing Artificial Intelligence through Theory of Mind

    Authors: Mouad Abrini, Omri Abend, Dina Acklin, Henny Admoni, Gregor Aichinger, Nitay Alon, Zahra Ashktorab, Ashish Atreja, Moises Auron, Alexander Aufreiter, Raghav Awasthi, Soumya Banerjee, Joe M. Barnby, Rhea Basappa, Severin Bergsmann, Djallel Bouneffouf, Patrick Callaghan, Marc Cavazza, Thierry Chaminade, Sonia Chernova, Mohamed Chetouan, Moumita Choudhury, Axel Cleeremans, Jacek B. Cywinski, Fabio Cuzzolin, et al. (83 additional authors not shown)

    Abstract: This volume includes a selection of papers presented at the Workshop on Advancing Artificial Intelligence through Theory of Mind held at AAAI 2025 in Philadelphia US on 3rd March 2025. The purpose of this volume is to provide an open access and curated anthology for the ToM and AI research community.

    Submitted 28 April, 2025; originally announced May 2025.

    Comments: workshop proceedings

  37. arXiv:2505.01766  [pdf, other]

    cs.CV cs.RO

    Multimodal Graph Representation Learning for Robust Surgical Workflow Recognition with Adversarial Feature Disentanglement

    Authors: Long Bai, Boyi Ma, Ruohan Wang, Guankun Wang, Beilei Cui, Zhongliang Jiang, Mobarakol Islam, Zhe Min, Jiewen Lai, Nassir Navab, Hongliang Ren

    Abstract: Surgical workflow recognition is vital for automating tasks, supporting decision-making, and training novice surgeons, ultimately improving patient safety and standardizing procedures. However, data corruption can lead to performance degradation due to issues like occlusion from bleeding or smoke in surgical scenes and problems with data storage and transmission. In this case, we explore a robust…

    Submitted 3 May, 2025; originally announced May 2025.

    Comments: Accepted by Information Fusion

  38. arXiv:2505.01635  [pdf]

    cs.ET cs.AI

    Dendritic Computing with Multi-Gate Ferroelectric Field-Effect Transistors

    Authors: A N M Nafiul Islam, Xuezhong Niu, Jiahui Duan, Shubham Kumar, Kai Ni, Abhronil Sengupta

    Abstract: Although inspired by neuronal systems in the brain, artificial neural networks generally employ point-neurons, which offer far less computational complexity than their biological counterparts. Neurons have dendritic arbors that connect to different sets of synapses and offer local non-linear accumulation - playing a pivotal role in processing and learning. Inspired by this, we propose a novel neur…

    Submitted 2 May, 2025; originally announced May 2025.

  39. arXiv:2505.01429  [pdf, other]

    cs.CV

    Explainable AI-Driven Detection of Human Monkeypox Using Deep Learning and Vision Transformers: A Comprehensive Analysis

    Authors: Md. Zahid Hossain, Md. Rakibul Islam, Most. Sharmin Sultana Samu

    Abstract: Since mpox can spread from person to person, it is a zoonotic viral illness that poses a significant public health concern. It is difficult to make an early clinical diagnosis because of how closely its symptoms match those of measles and chickenpox. Medical imaging combined with deep learning (DL) techniques has shown promise in improving disease detection by analyzing affected skin areas. Our st…

    Submitted 3 April, 2025; originally announced May 2025.

  40. Durghotona GPT: A Web Scraping and Large Language Model Based Framework to Generate Road Accident Dataset Automatically in Bangladesh

    Authors: MD Thamed Bin Zaman Chowdhury, Moazzem Hossain, Md. Ridwanul Islam

    Abstract: Road accidents pose significant concerns globally. They lead to large financial losses, injuries, disabilities, and societal challenges. Accurate and timely accident data is essential for predicting and mitigating these events. This paper presents a novel framework named 'Durghotona GPT' that integrates web scraping and Large Language Models (LLMs) to automate the generation of comprehensive accid…

    Submitted 23 April, 2025; originally announced April 2025.

    Comments: Accepted at the IEEE 27th International Conference on Computer and Information Technology (ICCIT); awaiting publication in IEEE Xplore

  41. arXiv:2504.20123  [pdf, other]

    cond-mat.mtrl-sci cs.CE

    Multiscale modelling of thermally stressed superelastic polyimide

    Authors: Jerome Samuel S, Puneet Kumar Patra, Md Rushdie Ibne Islam

    Abstract: Many thermo-mechanical processes, such as thermal expansion and stress relaxation, originate at the atomistic scale. We develop a sequential multiscale approach to study thermally stressed superelastic polyimide to explore these effects. The continuum-scale smoothed particle hydrodynamics (SPH) model is coupled with atomistic molecular dynamics (MD) through constitutive modelling, where thermo-mec…

    Submitted 28 April, 2025; originally announced April 2025.

    Comments: 25 pages, 17 figures

  42. arXiv:2504.17725  [pdf, other]

    cs.NI

    STGen: A Novel Lightweight IoT Testbed for Generating Sensor Traffic for the Experimentation of IoT Protocol and its Application in Hybrid Network

    Authors: Hasan MA Islam, S. Nath, M. Rahman, N. Shahriar, M. K. M. Khan, R. Islam

    Abstract: A Wireless Sensor Network (WSN) is a network that does not rely on a fixed infrastructure and consists of numerous sensors, such as temperature, humidity, GPS, and cameras, equipped with onboard processors that manage and monitor the environment in a specific area. As a result, building a real sensor network testbed for verifying, validating, or experimenting with a newly designed protocol present…

    Submitted 24 April, 2025; originally announced April 2025.

    Comments: 23 pages, 12 figures, submitted to ACM Transactions on Sensor Networks

  43. arXiv:2504.16741  [pdf, other]

    cs.HC cs.IR

    Search Timelines: Visualizing Search History to Enable Cross-Session Exploratory Search

    Authors: Orland Hoeber, Md Nazmul Islam, Miriam Boon, Dale Storie, Veronica Ramshaw

    Abstract: Purpose: The timespan over which exploratory searching can occur, as well as the scope and volume of the search activities undertaken, can make it difficult for searchers to remember key details about their search activities. These difficulties are present both in the midst of searching as well as when resuming a search that spans multiple sessions. In this paper, we present a search interface des…

    Submitted 23 April, 2025; originally announced April 2025.

  44. arXiv:2504.14466  [pdf, other]

    cs.ET

    A Bio-inspired Asymmetric Double-Gate Ferroelectric FET for Emulating Astrocyte and Dendrite Dynamics in Neuromorphic Systems

    Authors: Zhouhang Jiang, A N M Nafiul Islam, Zhuangyu Han, Zijian Zhao, Franz Müller, Jiahui Duan, Halid Mulaosmanovic, Stefan Dünkel, Sven Beyer, Sourav Dutta, Vijaykrishnan Narayanan, Thomas Kämpfe, Suma George Cardwell, Frances Chance, Abhronil Sengupta, Kai Ni

    Abstract: Neuromorphic systems seek to replicate the functionalities of biological neural networks to attain significant improvements in performance and efficiency of AI computing platforms. However, these systems have generally remained limited to emulation of simple neurons and synapses; and ignored higher order functionalities enabled by other components of the brain like astrocytes and dendrites. In thi…

    Submitted 19 April, 2025; originally announced April 2025.

    Comments: 37 pages, 6 figures, 2 tables

  45. arXiv:2504.14068  [pdf, other]

    cs.LG cs.HC

    Contextual Embedding-based Clustering to Identify Topics for Healthcare Service Improvement

    Authors: K M Sajjadul Islam, Ravi Teja Karri, Srujan Vegesna, Jiawei Wu, Praveen Madiraju

    Abstract: Understanding patient feedback is crucial for improving healthcare services, yet analyzing unlabeled short-text feedback presents significant challenges due to limited data and domain-specific nuances. Traditional supervised learning approaches require extensive labeled datasets, making unsupervised methods more viable for uncovering meaningful insights from patient feedback. This study explores u…

    Submitted 18 April, 2025; originally announced April 2025.

    Comments: Full version of the paper accepted at the 2025 IEEE COMPSAC, Toronto, Canada

  46. arXiv:2504.13990  [pdf, other]

    cs.LG cs.AI eess.SY

    PC-DeepNet: A GNSS Positioning Error Minimization Framework Using Permutation-Invariant Deep Neural Network

    Authors: M. Humayun Kabir, Md. Ali Hasan, Md. Shafiqul Islam, Kyeongjun Ko, Wonjae Shin

    Abstract: Global navigation satellite systems (GNSS) face significant challenges in urban and sub-urban areas due to non-line-of-sight (NLOS) propagation, multipath effects, and low received power levels, resulting in highly non-linear and non-Gaussian measurement error distributions. In light of this, conventional model-based positioning approaches, which rely on Gaussian error approximations, struggle to…

    Submitted 18 April, 2025; originally announced April 2025.

    Comments: 31 pages, 14 figures, 6 tables

  47. arXiv:2504.13272  [pdf, other]

    cs.SE

    Using LLMs for Library Migration

    Authors: Md Mohayeminul Islam, Ajay Kumar Jha, May Mahmoud, Ildar Akhmetov, Sarah Nadi

    Abstract: Library migration is the process of replacing a used software library with another library that provides similar functionality. Manual library migration is time-consuming and error prone, as it requires developers to understand the APIs of both libraries, map them, and perform the necessary code transformations. Due to its difficulty, most of the existing automated techniques and tooling stop at t…

    Submitted 17 April, 2025; originally announced April 2025.

  48. Saliency-Aware Diffusion Reconstruction for Effective Invisible Watermark Removal

    Authors: Inzamamul Alam, Md Tanvir Islam, Simon S. Woo

    Abstract: As digital content becomes increasingly ubiquitous, the need for robust watermark removal techniques has grown due to the inadequacy of existing embedding techniques, which lack robustness. This paper introduces a novel Saliency-Aware Diffusion Reconstruction (SADRE) framework for watermark elimination on the web, combining adaptive noise injection, region-specific perturbations, and advanced diff…

    Submitted 17 April, 2025; originally announced April 2025.

    Comments: Accepted at The Web Conference 2025

    ACM Class: I.4.5; I.5.4

  49. arXiv:2504.12711  [pdf, other]

    cs.CV cs.AI eess.IV

    NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results

    Authors: Xin Li, Yeying Jin, Xin Jin, Zongwei Wu, Bingchen Li, Yufei Wang, Wenhan Yang, Yu Li, Zhibo Chen, Bihan Wen, Robby T. Tan, Radu Timofte, Qiyu Rong, Hongyuan Jing, Mengmeng Zhang, Jinglong Li, Xiangyu Lu, Yi Ren, Yuting Liu, Meng Zhang, Xiang Chen, Qiyuan Guan, Jiangxin Dong, Jinshan Pan, Conglin Gou, et al. (112 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images. This challenge received a wide range of impressive solutions, which are developed and evaluated using our collected real-world Raindrop Clarity dataset. Unlike existing deraining datasets, our Raindrop Clarity dataset is more diverse and challenging in degradation types and contents, which includ…

    Submitted 19 April, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

    Comments: Challenge Report of CVPR NTIRE 2025; 26 pages; Methods from 32 teams

  50. arXiv:2504.09896  [pdf, other]

    cs.CL

    TWSSenti: A Novel Hybrid Framework for Topic-Wise Sentiment Analysis on Social Media Using Transformer Models

    Authors: Aish Albladi, Md Kaosar Uddin, Minarul Islam, Cheryl Seals

    Abstract: Sentiment analysis is a crucial task in natural language processing (NLP) that enables the extraction of meaningful insights from textual data, particularly from dynamic platforms like Twitter and IMDB. This study explores a hybrid framework combining transformer-based models, specifically BERT, GPT-2, RoBERTa, XLNet, and DistilBERT, to improve sentiment classification accuracy and robustness. The…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: 41 pages, 12 figures, includes algorithm and comparative tables