Showing 1–50 of 476 results for author: Hasan, M

Searching in archive cs.
  1. arXiv:2505.08985  [pdf, other]

    cs.GR

    Position-Normal Manifold for Efficient Glint Rendering on High-Resolution Normal Maps

    Authors: Liwen Wu, Fujun Luan, Miloš Hašan, Ravi Ramamoorthi

    Abstract: Detailed microstructures on specular objects often exhibit intriguing glinty patterns under high-frequency lighting, which is challenging to render using a conventional normal-mapped BRDF. In this paper, we present a manifold-based formulation of the glint normal distribution functions (NDF) that precisely captures the surface normal distributions over queried footprints. The manifold-based formul…

    Submitted 13 May, 2025; originally announced May 2025.

  2. arXiv:2505.08889  [pdf, other]

    cs.GR cs.CV

    IntrinsicEdit: Precise generative image manipulation in intrinsic space

    Authors: Linjie Lyu, Valentin Deschaintre, Yannick Hold-Geoffroy, Miloš Hašan, Jae Shin Yoon, Thomas Leimkühler, Christian Theobalt, Iliyan Georgiev

    Abstract: Generative diffusion models have advanced image editing with high-quality results and intuitive interfaces such as prompts and semantic drawing. However, these interfaces lack precise control, and the associated methods typically specialize on a single editing task. We introduce a versatile, generative workflow that operates in an intrinsic-image latent space, enabling semantic, local manipulation…

    Submitted 15 May, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

    Comments: SIGGRAPH 2025 Journal track

  3. arXiv:2505.07036  [pdf]

    cs.LG cs.AI

    Predicting Diabetes Using Machine Learning: A Comparative Study of Classifiers

    Authors: Mahade Hasan, Farhana Yasmin

    Abstract: Diabetes remains a significant health challenge globally, contributing to severe complications like kidney disease, vision loss, and heart issues. The application of machine learning (ML) in healthcare enables efficient and accurate disease prediction, offering avenues for early intervention and patient support. Our study introduces an innovative diabetes prediction framework, leveraging both trad…

    Submitted 11 May, 2025; originally announced May 2025.

  4. arXiv:2505.06528  [pdf, other]

    cs.CV

    Unmasking Deep Fakes: Leveraging Deep Learning for Video Authenticity Detection

    Authors: Mahmudul Hasan

    Abstract: Deepfake videos, now produced through advanced artificial intelligence methods, pose a new challenge to the truthfulness of digital media. As deepfakes become more convincing day by day, detecting them requires advanced methods capable of identifying subtle inconsistencies. The primary motivation of this paper is to recognize deepfake videos using deep learning techniques, specifically…

    Submitted 10 May, 2025; originally announced May 2025.

  5. arXiv:2505.06502  [pdf, ps, other]

    eess.IV cs.CE cs.CV cs.LG

    PC-SRGAN: Physically Consistent Super-Resolution Generative Adversarial Network for General Transient Simulations

    Authors: Md Rakibul Hasan, Pouria Behnoudfar, Dan MacKinlay, Thomas Poulet

    Abstract: Machine Learning, particularly Generative Adversarial Networks (GANs), has revolutionised Super Resolution (SR). However, generated images often lack physical meaningfulness, which is essential for scientific applications. Our approach, PC-SRGAN, enhances image resolution while ensuring physical consistency for interpretable simulations. PC-SRGAN significantly improves both the Peak Signal-to-Nois…

    Submitted 10 May, 2025; originally announced May 2025.

  6. arXiv:2505.06454  [pdf, ps, other]

    cs.LG cs.CR

    Sponge Attacks on Sensing AI: Energy-Latency Vulnerabilities and Defense via Model Pruning

    Authors: Syed Mhamudul Hasan, Hussein Zangoti, Iraklis Anagnostopoulos, Abdur R. Shahid

    Abstract: Recent studies have shown that sponge attacks can significantly increase the energy consumption and inference latency of deep neural networks (DNNs). However, prior work has focused primarily on computer vision and natural language processing tasks, overlooking the growing use of lightweight AI models in sensing-based applications on resource-constrained devices, such as those in Internet of Thing…

    Submitted 9 May, 2025; originally announced May 2025.

  7. arXiv:2505.02694  [pdf, other]

    cs.HC cs.AI

    AI Standardized Patient Improves Human Conversations in Advanced Cancer Care

    Authors: Kurtis Haut, Masum Hasan, Thomas Carroll, Ronald Epstein, Taylan Sen, Ehsan Hoque

    Abstract: Serious illness communication (SIC) in end-of-life care faces challenges such as emotional stress, cultural barriers, and balancing hope with honesty. Despite its importance, one of the few available ways for clinicians to practice SIC is with standardized patients, which is expensive, time-consuming, and inflexible. In this paper, we present SOPHIE, an AI-powered standardized patient simulation a…

    Submitted 5 May, 2025; originally announced May 2025.

    Comments: 20 pages, 6 figures, 4 tables, submitting to New England Journal of Medicine (NEJM)

  8. arXiv:2504.14764  [pdf, other]

    cs.HC cs.DB

    Steering Semantic Data Processing With DocWrangler

    Authors: Shreya Shankar, Bhavya Chopra, Mawil Hasan, Stephen Lee, Björn Hartmann, Joseph M. Hellerstein, Aditya G. Parameswaran, Eugene Wu

    Abstract: Unstructured text has long been difficult to automatically analyze at scale. Large language models (LLMs) now offer a way forward by enabling semantic data processing, where familiar data processing operators (e.g., map, reduce, filter) are powered by LLMs instead of code. However, building effective semantic data processing pipelines presents a departure from traditional data pipelines: use…

    Submitted 20 April, 2025; originally announced April 2025.

    Comments: 18 pages; 11 figures; 3 tables

  9. arXiv:2504.13990  [pdf, other]

    cs.LG cs.AI eess.SY

    PC-DeepNet: A GNSS Positioning Error Minimization Framework Using Permutation-Invariant Deep Neural Network

    Authors: M. Humayun Kabir, Md. Ali Hasan, Md. Shafiqul Islam, Kyeongjun Ko, Wonjae Shin

    Abstract: Global navigation satellite systems (GNSS) face significant challenges in urban and sub-urban areas due to non-line-of-sight (NLOS) propagation, multipath effects, and low received power levels, resulting in highly non-linear and non-Gaussian measurement error distributions. In light of this, conventional model-based positioning approaches, which rely on Gaussian error approximations, struggle to…

    Submitted 18 April, 2025; originally announced April 2025.

    Comments: 31 pages, 14 figures, 6 tables

  10. arXiv:2504.10808  [pdf, other]

    cs.CV cs.HC cs.LG

    Tabular foundation model to detect empathy from visual cues

    Authors: Md Rakibul Hasan, Shafin Rahman, Md Zakir Hossain, Aneesh Krishna, Tom Gedeon

    Abstract: Detecting empathy from video interactions is an emerging area of research. Video datasets, however, are often released as extracted features (i.e., tabular data) rather than raw footage due to privacy and ethical concerns. Prior research on such tabular datasets established tree-based classical machine learning approaches as the best-performing models. Motivated by the recent success of textual fo…

    Submitted 14 April, 2025; originally announced April 2025.

  11. arXiv:2504.06598  [pdf, other]

    cs.GR

    Stochastic Ray Tracing of 3D Transparent Gaussians

    Authors: Xin Sun, Iliyan Georgiev, Yun Fei, Miloš Hašan

    Abstract: 3D Gaussian splatting has recently been widely adopted as a 3D representation for novel-view synthesis, relighting, and text-to-3D generation tasks, offering realistic and detailed results through a collection of explicit 3D Gaussians carrying opacities and view-dependent colors. However, efficient rendering of many transparent primitives remains a significant challenge. Existing approaches either…

    Submitted 10 April, 2025; v1 submitted 9 April, 2025; originally announced April 2025.

    Comments: 10 pages, 6 figures, 5 tables

  12. arXiv:2504.05995  [pdf, other]

    cs.CL cs.AI

    NativQA Framework: Enabling LLMs with Native, Local, and Everyday Knowledge

    Authors: Firoj Alam, Md Arid Hasan, Sahinur Rahman Laskar, Mucahid Kutlu, Shammur Absar Chowdhury

    Abstract: The rapid advancement of large language models (LLMs) has raised concerns about cultural bias, fairness, and their applicability in diverse linguistic and underrepresented regional contexts. To enhance and benchmark the capabilities of LLMs, there is a need to develop large-scale resources focused on multilingual, local, and cultural contexts. In this study, we propose a framework, NativQA, that c…

    Submitted 8 April, 2025; originally announced April 2025.

    Comments: LLMs, Native, Multilingual, Language Diversity, Contextual Understanding, Minority Languages, Culturally Informed, Foundation Models, Large Language Models

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  13. arXiv:2504.04556  [pdf, other]

    cs.DS

    Online Facility Assignments on Polygons

    Authors: Sumaiya Malik, Reyan Ahmed, Md. Manzurul Hasan

    Abstract: We study the online facility assignment problem on regular polygons, where all sides are of equal length. The influence of specific geometric settings has remained mostly unexplored, even though classical online facility assignment problems have mainly dealt with linear and general metric spaces. We fill this gap by considering the following four basic geometric settings: equilateral triangles, re…

    Submitted 6 April, 2025; originally announced April 2025.

  14. arXiv:2504.02045  [pdf, other]

    cs.GR cs.CV

    WorldPrompter: Traversable Text-to-Scene Generation

    Authors: Zhaoyang Zhang, Yannick Hold-Geoffroy, Miloš Hašan, Chen Ziwen, Fujun Luan, Julie Dorsey, Yiwei Hu

    Abstract: Scene-level 3D generation is a challenging research topic, with most existing methods generating only partial scenes and offering limited navigational freedom. We introduce WorldPrompter, a novel generative pipeline for synthesizing traversable 3D scenes from text prompts. We leverage panoramic videos as an intermediate representation to model the 360° details of a scene. WorldPrompter incorporate…

    Submitted 2 April, 2025; originally announced April 2025.

  15. arXiv:2504.00485  [pdf]

    cs.LG cs.AI

    Enhancing stroke disease classification through machine learning models by feature selection techniques

    Authors: Mahade Hasan, Farhana Yasmin, Xue Yu

    Abstract: Heart disease remains a leading cause of mortality and morbidity worldwide, necessitating the development of accurate and reliable predictive models to facilitate early detection and intervention. While state-of-the-art work has focused on various machine learning approaches for predicting heart disease, these approaches have not achieved remarkable accuracy. In response to this need, we applied n…

    Submitted 11 May, 2025; v1 submitted 1 April, 2025; originally announced April 2025.

  16. arXiv:2503.24349  [pdf, other]

    cs.SE

    Faster Releases, Fewer Risks: A Study on Maven Artifact Vulnerabilities and Lifecycle Management

    Authors: Md Shafiullah Shafin, Md Fazle Rabbi, S. M. Mahedy Hasan, Minhaz F. Zibran

    Abstract: In modern software ecosystems, dependency management plays a critical role in ensuring secure and maintainable applications. However, understanding the relationship between release practices and their impact on vulnerabilities and update cycles remains a challenge. In this study, we analyze the release histories of 10,000 Maven artifacts, covering over 203,000 releases and 1.7 million dependencies…

    Submitted 31 March, 2025; originally announced March 2025.

  17. arXiv:2503.23764  [pdf, other]

    cs.CV cs.AI

    WaveFormer: A 3D Transformer with Wavelet-Driven Feature Representation for Efficient Medical Image Segmentation

    Authors: Md Mahfuz Al Hasan, Mahdi Zaman, Abdul Jawad, Alberto Santamaria-Pang, Ho Hin Lee, Ivan Tarapov, Kyle See, Md Shah Imran, Antika Roy, Yaser Pourmohammadi Fallah, Navid Asadizanjani, Reza Forghani

    Abstract: Transformer-based architectures have advanced medical image analysis by effectively modeling long-range dependencies, yet they often struggle in 3D settings due to substantial memory overhead and insufficient capture of fine-grained local features. We address these limitations with WaveFormer, a novel 3D-transformer that: i) leverages the fundamental frequency-domain properties of features for con…

    Submitted 31 March, 2025; v1 submitted 31 March, 2025; originally announced March 2025.

  18. arXiv:2503.22902  [pdf, other]

    cs.SE

    Insights into Dependency Maintenance Trends in the Maven Ecosystem

    Authors: Barisha Chowdhury, Md Fazle Rabbi, S. M. Mahedy Hasan, Minhaz F. Zibran

    Abstract: As modern software development increasingly relies on reusable libraries and components, managing dependencies has become critical for ensuring software stability and security. However, challenges such as outdated dependencies, missed releases, and the complexity of interdependent libraries can significantly impact project maintenance. In this paper, we present a quantitative analysis of the Neo4j…

    Submitted 28 March, 2025; originally announced March 2025.

  19. arXiv:2503.22710  [pdf]

    cs.CR cs.CY

    Assessing the influence of cybersecurity threats and risks on the adoption and growth of digital banking: a systematic literature review

    Authors: Md. Waliullah, Md Zahin Hossain George, Md Tarek Hasan, Md Khorshed Alam, Mosa Sumaiya Khatun Munira, Noor Alam Siddiqui

    Abstract: The rapid digitalization of banking services has significantly transformed financial transactions, offering enhanced convenience and efficiency for consumers. However, the increasing reliance on digital banking has also exposed financial institutions and users to a wide range of cybersecurity threats, including phishing, malware, ransomware, data breaches, and unauthorized access. This study syste…

    Submitted 22 March, 2025; originally announced March 2025.

    Comments: 32 pages, 13 figures

  20. Leveraging Language Models for Analyzing Longitudinal Experiential Data in Education

    Authors: Ahatsham Hayat, Bilal Khan, Mohammad Rashedul Hasan

    Abstract: We propose a novel approach to leveraging pre-trained language models (LMs) for early forecasting of academic trajectories in STEM students using high-dimensional longitudinal experiential data. This data, which captures students' study-related activities, behaviors, and psychological states, offers valuable insights for forecasting-based interventions. Key challenges in handling such data include…

    Submitted 27 March, 2025; originally announced March 2025.

  21. arXiv:2503.15442  [pdf, other]

    cs.DS

    A Space-Efficient Algorithm for Longest Common Almost Increasing Subsequence of Two Sequences

    Authors: Md Tanzeem Rahat, Md. Manzurul Hasan, Debajyoti Mondal

    Abstract: Let $A$ and $B$ be two number sequences of length $n$ and $m$, respectively, where $m\le n$. Given a positive number $δ$, a common almost increasing sequence $s_1\ldots s_k$ is a common subsequence for both $A$ and $B$ such that for all $2\le i\le k$, $s_i+δ> \max_{1\le j < i} s_j$. The LCaIS problem seeks to find the longest common almost increasing subsequence (LCaIS) of $A$ and $B$. An LCaIS ca…

    Submitted 7 May, 2025; v1 submitted 19 March, 2025; originally announced March 2025.

    MSC Class: 68W01; 68R01 ACM Class: G.2.1

  22. Designing and Deploying AI Models for Sustainable Logistics Optimization: A Case Study on Eco-Efficient Supply Chains in the USA

    Authors: Reza E Rabbi Shawon, MD Rokibul Hasan, Md Anisur Rahman, Mohamed Ghandri, Iman Ahmed Lamari, Mohammed Kawsar, Rubi Akter

    Abstract: The rapid evolution of Artificial Intelligence (AI) and Machine Learning (ML) has significantly transformed logistics and supply chain management, particularly in the pursuit of sustainability and eco-efficiency. This study explores AI-based methodologies for optimizing logistics operations in the USA, focusing on reducing environmental impact, improving fuel efficiency, and minimizing costs. Key…

    Submitted 17 March, 2025; originally announced March 2025.

  23. arXiv:2503.13399  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG q-bio.CB

    MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

    Authors: James Burgess, Jeffrey J Nirschl, Laura Bravo-Sánchez, Alejandro Lozano, Sanket Rajan Gupte, Jesus G. Galaz-Montoya, Yuhui Zhang, Yuchang Su, Disha Bhowmik, Zachary Coman, Sarina M. Hasan, Alexandra Johannesson, William D. Leineweber, Malvika G Nair, Ridhi Yarlagadda, Connor Zuraski, Wah Chiu, Sarah Cohen, Jan N. Hansen, Manuel D Leonetti, Chad Liu, Emma Lundberg, Serena Yeung-Levy

    Abstract: Scientific research demands sophisticated reasoning over multimodal data, a challenge especially prevalent in biology. Despite recent advances in multimodal large language models (MLLMs) for AI-assisted research, existing multimodal reasoning benchmarks only target up to college-level difficulty, while research-level benchmarks emphasize lower-level perception, falling short of the complex multimo…

    Submitted 17 March, 2025; originally announced March 2025.

    Comments: CVPR 2025 (Conference on Computer Vision and Pattern Recognition) Project page at https://jmhb0.github.io/microvqa Benchmark at https://huggingface.co/datasets/jmhb/microvqa

  24. FDCT: Frequency-Aware Decomposition and Cross-Modal Token-Alignment for Multi-Sensor Target Classification

    Authors: Shoaib Meraj Sami, Md Mahedi Hasan, Nasser M. Nasrabadi, Raghuveer Rao

    Abstract: In automatic target recognition (ATR) systems, sensors may fail to capture discriminative, fine-grained detail features due to environmental conditions, noise created by CMOS chips, occlusion, parallaxes, and sensor misalignment. Therefore, multi-sensor image fusion is an effective choice to overcome these constraints. However, multi-modal image sensors are heterogeneous and have domain and granul…

    Submitted 12 March, 2025; originally announced March 2025.

    Comments: 12 pages. Accepted in the IEEE Transactions on Aerospace and Electronic Systems

  25. arXiv:2503.09746  [pdf, other]

    cs.LG cs.AI stat.ML

    Solving Bayesian inverse problems with diffusion priors and off-policy RL

    Authors: Luca Scimeca, Siddarth Venkatraman, Moksh Jain, Minsu Kim, Marcin Sendera, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yashar Hezaveh, Laurence Perreault-Levasseur, Yoshua Bengio, Glen Berseth, Nikolay Malkin

    Abstract: This paper presents a practical application of Relative Trajectory Balance (RTB), a recently introduced off-policy reinforcement learning (RL) objective that can asymptotically solve Bayesian inverse problems optimally. We extend the original work by using RTB to train conditional diffusion model posteriors from pretrained unconditional priors for challenging linear and non-linear inverse problems…

    Submitted 12 March, 2025; originally announced March 2025.

    Comments: Accepted as workshop paper at DeLTa workshop, ICLR 2025. arXiv admin note: substantial text overlap with arXiv:2405.20971

  26. arXiv:2503.05750  [pdf, other]

    cs.CL cs.AI cs.LG

    CSTRL: Context-Driven Sequential Transfer Learning for Abstractive Radiology Report Summarization

    Authors: Mst. Fahmida Sultana Naznin, Adnan Ibney Faruq, Mostafa Rifat Tazwar, Md Jobayer, Md. Mehedi Hasan Shawon, Md Rakibul Hasan

    Abstract: A radiology report comprises several sections, including the Findings and Impression of the diagnosis. Automatically generating the Impression from the Findings is crucial for reducing radiologists' workload and improving diagnostic accuracy. Pretrained models that excel in common abstractive summarization problems encounter challenges when applied to specialized medical domains largely due to the…

    Submitted 21 February, 2025; originally announced March 2025.

    Comments: 11-pages main paper with 2-pages appendices

    MSC Class: 68T50 ACM Class: I.2.7

  27. arXiv:2503.05511  [pdf, other]

    cs.GR

    Free Your Hands: Lightweight Relightable Turntable Capture Pipeline

    Authors: Jiahui Fan, Fujun Luan, Jian Yang, Miloš Hašan, Beibei Wang

    Abstract: Novel view synthesis (NVS) from multiple captured photos of an object is a widely studied problem. Achieving high quality typically requires dense sampling of input views, which can lead to frustrating and tedious manual labor. Manually positioning cameras to maintain an optimal desired distribution can be difficult for humans, and if a good distribution is found, it is not easy to replicate. Addi…

    Submitted 14 April, 2025; v1 submitted 7 March, 2025; originally announced March 2025.

  28. arXiv:2503.04204  [pdf, other]

    cs.CV cs.LG

    FUSE: First-Order and Second-Order Unified SynthEsis in Stochastic Optimization

    Authors: Zhanhong Jiang, Md Zahid Hasan, Aditya Balu, Joshua R. Waite, Genyi Huang, Soumik Sarkar

    Abstract: Stochastic optimization methods have actively been playing a critical role in modern machine learning algorithms to deliver decent performance. While numerous works have proposed and developed diverse approaches, first-order and second-order methods are in entirely different situations. The former is significantly pivotal and dominating in emerging deep learning but only leads convergence to a sta…

    Submitted 6 March, 2025; originally announced March 2025.

    Comments: 6 pages, 7 figures

  29. arXiv:2503.02993  [pdf]

    cs.CL cs.IR

    Zero-Shot Multi-Label Classification of Bangla Documents: Large Decoders Vs. Classic Encoders

    Authors: Souvika Sarkar, Md. Najib Hasan, Santu Karmaker

    Abstract: Bangla, a language spoken by over 300 million native speakers and ranked as the sixth most spoken language worldwide, presents unique challenges in natural language processing (NLP) due to its complex morphological characteristics and limited resources. While recent Large Decoder Based models (LLMs), such as GPT, LLaMA, and DeepSeek, have demonstrated excellent performance across many NLP tasks, t…

    Submitted 4 March, 2025; originally announced March 2025.

  30. arXiv:2503.02360  [pdf, other]

    cs.CV cs.AI cs.HC cs.LG

    BdSLW401: Transformer-Based Word-Level Bangla Sign Language Recognition Using Relative Quantization Encoding (RQE)

    Authors: Husne Ara Rubaiyeat, Njayou Youssouf, Md Kamrul Hasan, Hasan Mahmud

    Abstract: Sign language recognition (SLR) for low-resource languages like Bangla suffers from signer variability, viewpoint variations, and limited annotated datasets. In this paper, we present BdSLW401, a large-scale, multi-view, word-level Bangla Sign Language (BdSL) dataset with 401 signs and 102,176 video samples from 18 signers in front and lateral views. To improve transformer-based SLR, we introduce…

    Submitted 4 March, 2025; originally announced March 2025.

  31. arXiv:2503.02345  [pdf, other]

    quant-ph cs.AI cs.CV cs.LG

    CQ CNN: A Hybrid Classical Quantum Convolutional Neural Network for Alzheimer's Disease Detection Using Diffusion Generated and U Net Segmented 3D MRI

    Authors: Mominul Islam, Mohammad Junayed Hasan, M. R. C. Mahdy

    Abstract: The detection of Alzheimer disease (AD) from clinical MRI data is an active area of research in medical imaging. Recent advances in quantum computing, particularly the integration of parameterized quantum circuits (PQCs) with classical machine learning architectures, offer new opportunities to develop models that may outperform traditional methods. However, quantum machine learning (QML) remains i…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: Application of hybrid quantum-classical machine learning for (early stage) disease detection

  32. arXiv:2502.17843  [pdf, other]

    cs.CV

    Automatic Vehicle Detection using DETR: A Transformer-Based Approach for Navigating Treacherous Roads

    Authors: Istiaq Ahmed Fahad, Abdullah Ibne Hanif Arean, Nazmus Sakib Ahmed, Mahmudul Hasan

    Abstract: Automatic Vehicle Detection (AVD) in diverse driving environments presents unique challenges due to varying lighting conditions, road types, and vehicle types. Traditional methods, such as YOLO and Faster R-CNN, often struggle to cope with these complexities. As computer vision evolves, combining Convolutional Neural Networks (CNNs) with Transformer-based approaches offers promising opportunities…

    Submitted 24 February, 2025; originally announced February 2025.

  33. arXiv:2502.16612  [pdf, other]

    cs.CL cs.AI

    MemeIntel: Explainable Detection of Propagandistic and Hateful Memes

    Authors: Mohamed Bayan Kmainasi, Abul Hasnat, Md Arid Hasan, Ali Ezzat Shahroor, Firoj Alam

    Abstract: The proliferation of multimodal content on social media presents significant challenges in understanding and moderating complex, context-dependent issues such as misinformation, hate speech, and propaganda. While efforts have been made to develop resources and propose new methods for automatic detection, limited attention has been given to label detection and the generation of explanation-based ra…

    Submitted 23 February, 2025; originally announced February 2025.

    Comments: disinformation, misinformation, factuality, harmfulness, fake news, propaganda, hateful meme, multimodality, text, images

    MSC Class: 68T50 ACM Class: I.2.7

  34. arXiv:2502.16550  [pdf, other]

    cs.CL

    Reasoning About Persuasion: Can LLMs Enable Explainable Propaganda Detection?

    Authors: Maram Hasanain, Md Arid Hasan, Mohamed Bayan Kmainasi, Elisa Sartori, Ali Ezzat Shahroor, Giovanni Da San Martino, Firoj Alam

    Abstract: There has been significant research on propagandistic content detection across different modalities and languages. However, most studies have primarily focused on detection, with little attention given to explanations justifying the predicted label. This is largely due to the lack of resources that provide explanations alongside annotated labels. To address this issue, we propose a multilingual (i…

    Submitted 23 February, 2025; originally announced February 2025.

  35. arXiv:2502.09731  [pdf, other]

    cs.CV cs.AI

    A CNN Approach to Automated Detection and Classification of Brain Tumors

    Authors: Md. Zahid Hasan, Abdullah Tamim, D. M. Asadujjaman, Md. Mahfujur Rahman, Md. Abu Ahnaf Mollick, Nosin Anjum Dristi, Abdullah-Al-Noman

    Abstract: Brain tumors require an assessment to ensure timely diagnosis and effective patient treatment. Morphological factors such as size, location, texture, and variable appearance complicate tumor inspection. Medical imaging presents challenges, including noise and incomplete images. This research article presents a methodology for processing Magnetic Resonance Imaging (MRI) data, encompassing technique…

    Submitted 13 February, 2025; originally announced February 2025.

    MSC Class: 68T07 ACM Class: J.3

  36. arXiv:2502.07928  [pdf, other]

    cs.SE

    Distributed Approach to Haskell Based Applications Refactoring with LLMs Based Multi-Agent Systems

    Authors: Shahbaz Siddeeq, Zeeshan Rasheed, Malik Abdul Sami, Mahade Hasan, Muhammad Waseem, Jussi Rasku, Mika Saari, Kai-Kristian Kemell, Pekka Abrahamsson

    Abstract: We present a large language model (LLM)-based multi-agent system to automate the refactoring of Haskell codebases. The multi-agent system consists of specialized agents performing tasks such as context analysis, refactoring, validation, and testing. Refactoring improvements are assessed using metrics such as cyclomatic complexity, run-time, and memory allocation. Experimental evaluations conducted on Has…

    Submitted 11 February, 2025; originally announced February 2025.

  37. arXiv:2502.06999  [pdf, other]

    cs.LG

    Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models

    Authors: Siddarth Venkatraman, Mohsin Hasan, Minsu Kim, Luca Scimeca, Marcin Sendera, Yoshua Bengio, Glen Berseth, Nikolay Malkin

    Abstract: Any well-behaved generative model over a variable $\mathbf{x}$ can be expressed as a deterministic transformation of an exogenous ('outsourced') Gaussian noise variable $\mathbf{z}$: $\mathbf{x}=f_θ(\mathbf{z})$. In such a model (e.g., a VAE, GAN, or continuous-time flow-based model), sampling of the target variable $\mathbf{x} \sim p_θ(\mathbf{x})$ is straightforward, but sampling from a posterio…

    Submitted 10 February, 2025; originally announced February 2025.

  38. arXiv:2502.04566  [pdf, other]

    cs.CV

    An Optimized YOLOv5 Based Approach For Real-time Vehicle Detection At Road Intersections Using Fisheye Cameras

    Authors: Md. Jahin Alam, Muhammad Zubair Hasan, Md Maisoon Rahman, Md Awsafur Rahman, Najibul Haque Sarker, Shariar Azad, Tasnim Nishat Islam, Bishmoy Paul, Tanvir Anjum, Barproda Halder, Shaikh Anowarul Fattah

    Abstract: Real-time vehicle detection is a challenging task for urban traffic surveillance. Increasing urbanization leads to more accidents and traffic congestion in junction areas, resulting in delayed travel times. To solve these problems, an intelligent system for automatic detection and tracking is important. But this becomes a challenging task at road intersection areas wh…

    Submitted 6 February, 2025; originally announced February 2025.

  39. arXiv:2502.00032  [pdf, other]

    cs.DB cs.AI cs.IR

    Querying Databases with Function Calling

    Authors: Connor Shorten, Charles Pierse, Thomas Benjamin Smith, Karel D'Oosterlinck, Tuana Celik, Erika Cardenas, Leonie Monigatti, Mohd Shukri Hasan, Edward Schmuhl, Daniel Williams, Aravind Kesiraju, Bob van Luijt

    Abstract: The capabilities of Large Language Models (LLMs) are rapidly accelerating largely thanks to their integration with external tools. Querying databases is among the most effective of these integrations, enabling LLMs to access private or continually updating data. While Function Calling is the most common method for interfacing external tools to LLMs, its application to database querying as a tool h…

    Submitted 23 January, 2025; originally announced February 2025.

    Comments: Preprint. 23 pages, 7 figures

  40. arXiv:2501.18880  [pdf, other]

    cs.CV cs.LG

    RLS3: RL-Based Synthetic Sample Selection to Enhance Spatial Reasoning in Vision-Language Models for Indoor Autonomous Perception

    Authors: Joshua R. Waite, Md. Zahid Hasan, Qisai Liu, Zhanhong Jiang, Chinmay Hegde, Soumik Sarkar

    Abstract: Vision-language model (VLM) fine-tuning for application-specific visual grounding based on natural language instructions has become one of the most popular approaches for learning-enabled autonomous systems. However, such fine-tuning relies heavily on high-quality datasets to achieve successful performance in various downstream tasks. Additionally, VLMs often encounter limitations due to insuffici…

    Submitted 30 January, 2025; originally announced January 2025.

    Comments: ICCPS 2025 accepted paper, 10 pages, 9 figures

  41. arXiv:2501.14929  [pdf, other]

    cs.CV cs.AI

    Motion-enhancement to Echocardiography Segmentation via Inserting a Temporal Attention Module: An Efficient, Adaptable, and Scalable Approach

    Authors: Md. Kamrul Hasan, Guang Yang, Choon Hwai Yap

    Abstract: Cardiac anatomy segmentation is essential for clinical assessment of cardiac function and disease diagnosis to inform treatment and intervention. In performing segmentation, deep learning (DL) algorithms improved accuracy significantly compared to traditional image processing approaches. More recently, studies showed that enhancing DL segmentation with motion information can further improve it. A…

    Submitted 24 January, 2025; originally announced January 2025.

  42. arXiv:2501.10364  [pdf]

    cs.CY

    AI-Enhanced Decision-Making for Sustainable Supply Chains: Reducing Carbon Footprints in the USA

    Authors: MD Rokibul Hasan

    Abstract: Organizations increasingly need to reassess their supply chain strategies in the rapidly modernizing world towards sustainability. This is particularly true in the United States, where supply chains are very extensive and consume a large number of resources. This research paper discusses how AI can support decision-making for sustainable supply chains with a special focus on carbon footprints. The…

    Submitted 8 December, 2024; originally announced January 2025.

  43. arXiv:2501.09506  [pdf]

    cs.LG cs.SD eess.AS eess.IV

    Multimodal Marvels of Deep Learning in Medical Diagnosis: A Comprehensive Review of COVID-19 Detection

    Authors: Md Shofiqul Islam, Khondokar Fida Hasan, Hasibul Hossain Shajeeb, Humayan Kabir Rana, Md Saifur Rahmand, Md Munirul Hasan, AKM Azad, Ibrahim Abdullah, Mohammad Ali Moni

    Abstract: This study presents a comprehensive review of the potential of multimodal deep learning (DL) in medical diagnosis, using COVID-19 as a case example. Motivated by the success of artificial intelligence applications during the COVID-19 pandemic, this research aims to uncover the capabilities of DL in disease screening, prediction, and classification, and to derive insights that enhance the resilienc…

    Submitted 21 January, 2025; v1 submitted 16 January, 2025; originally announced January 2025.

    Comments: 43 pages

  44. arXiv:2501.06659  [pdf, other]

    cs.DB cs.CV

    TWIX: Automatically Reconstructing Structured Data from Templatized Documents

    Authors: Yiming Lin, Mawil Hasan, Rohan Kosalge, Alvin Cheung, Aditya G. Parameswaran

    Abstract: Many documents, that we call templatized documents, are programmatically generated by populating fields in a visual template. Effective data extraction from these documents is crucial to supporting downstream analytical tasks. Current data extraction tools often struggle with complex document layouts, incur high latency and/or cost on large datasets, and often require significant human effort, whe…

    Submitted 11 January, 2025; originally announced January 2025.

  45. arXiv:2501.05147  [pdf]

    cs.CV cs.AI cs.RO

    A Systematic Literature Review on Deep Learning-based Depth Estimation in Computer Vision

    Authors: Ali Rohan, Md Junayed Hasan, Andrei Petrovski

    Abstract: Depth estimation (DE) provides spatial information about a scene and enables tasks such as 3D reconstruction, object detection, and scene understanding. Recently, there has been an increasing interest in using deep learning (DL)-based methods for DE. Traditional techniques rely on handcrafted features that often struggle to generalise to diverse scenes and require extensive manual tuning. However,…

    Submitted 9 January, 2025; originally announced January 2025.

  46. arXiv:2501.04627  [pdf]

    cs.AR

    Design of a 6-bit Threshold Inverter Quantization (TIQ) Flash Analog to Digital Converter (ADC)

    Authors: Noyon Kumar Sarkar, Moumita Roy, Md. Tariq Hasan

    Abstract: An ADC is used to convert analog signals into binary signals. Compared with many other types of ADCs, flash converters are incredibly quick. A typical Flash ADC consists of 2^n resistors, 2^n-1 op-amp comparators, and an encoder which requires more area. The resistors and comparators can be eliminated by using threshold inverter quantization (TIQ) comparators. As a voltage comparator, TIQ technique…

    Submitted 8 January, 2025; originally announced January 2025.

  47. arXiv:2501.00691  [pdf, other]

    cs.CL cs.LG

    Labels Generated by Large Language Model Helps Measuring People's Empathy in Vitro

    Authors: Md Rakibul Hasan, Yue Yao, Md Zakir Hossain, Aneesh Krishna, Imre Rudas, Shafin Rahman, Tom Gedeon

    Abstract: Large language models (LLMs) have revolutionised numerous fields, with LLM-as-a-service (LLMSaaS) having a strong generalisation ability that offers accessible solutions directly without the need for costly training. In contrast to the widely studied prompt engineering for task solving directly (in vivo), this paper explores its potential in in-vitro applications. These involve using LLM to genera…

    Submitted 31 December, 2024; originally announced January 2025.

    Comments: This work has been submitted to the IEEE for possible publication

  48. Adaptive Tabu Dropout for Regularization of Deep Neural Network

    Authors: Md. Tarek Hasan, Arifa Akter, Mohammad Nazmush Shamael, Md Al Emran Hossain, H. M. Mutasim Billah, Sumayra Islam, Swakkhar Shatabda

    Abstract: Dropout is an effective strategy for the regularization of deep neural networks. Applying tabu to the units that have been dropped in the recent epoch and retaining them for training ensures diversification in dropout. In this paper, we improve the Tabu Dropout mechanism for training deep neural networks in two ways. Firstly, we propose to use tabu tenure, or the number of epochs a particular unit…

    Submitted 31 December, 2024; originally announced January 2025.

    Journal ref: Neural Information Processing, ICONIP 2022, Lecture Notes in Computer Science 13623, Springer Cham, 2023, 334-345

  49. arXiv:2501.00316  [pdf, other]

    cs.CL

    MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models

    Authors: Mahir Labib Dihan, Md Tanvir Hassan, Md Tanvir Parvez, Md Hasebul Hasan, Md Almash Alam, Muhammad Aamir Cheema, Mohammed Eunus Ali, Md Rizwan Parvez

    Abstract: Recent advancements in foundation models have enhanced AI systems' capabilities in autonomous tool usage and reasoning. However, their ability in location or map-based reasoning - which improves daily life by optimizing navigation, facilitating resource discovery, and streamlining logistics - has not been systematically studied. To bridge this gap, we introduce MapEval, a benchmark designed to ass…

    Submitted 31 December, 2024; originally announced January 2025.

    Comments: 40 pages, 21 figures

  50. arXiv:2412.17765  [pdf, ps, other]

    cs.LG

    HyperQ-Opt: Q-learning for Hyperparameter Optimization

    Authors: Md. Tarek Hasan

    Abstract: Hyperparameter optimization (HPO) is critical for enhancing the performance of machine learning models, yet it often involves a computationally intensive search across a large parameter space. Traditional approaches such as Grid Search and Random Search suffer from inefficiency and limited scalability, while surrogate models like Sequential Model-based Bayesian Optimization (SMBO) rely heavily on…

    Submitted 23 December, 2024; originally announced December 2024.