
Showing 1–39 of 39 results for author: Hossain, Z

Searching in archive cs.
  1. arXiv:2506.10154  [pdf, ps, other]

    cs.CL cs.LG

    Analyzing Emotions in Bangla Social Media Comments Using Machine Learning and LIME

    Authors: Bidyarthi Paul, SM Musfiqur Rahman, Dipta Biswas, Md. Ziaul Hasan, Md. Zahid Hossain

    Abstract: Research on understanding emotions in written language continues to expand, especially for understudied languages with distinctive regional expressions and cultural features, such as Bangla. This study examines emotion analysis using 22,698 social media comments from the EmoNoBa dataset. For language analysis, we employ machine learning models: Linear SVM, KNN, and Random Forest with n-gram data f…

    Submitted 11 June, 2025; originally announced June 2025.

  2. arXiv:2505.21715  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    Privacy-Preserving Chest X-ray Report Generation via Multimodal Federated Learning with ViT and GPT-2

    Authors: Md. Zahid Hossain, Mustofa Ahmed, Most. Sharmin Sultana Samu, Md. Rakibul Islam

    Abstract: The automated generation of radiology reports from chest X-ray images holds significant promise in enhancing diagnostic workflows while preserving patient privacy. Traditional centralized approaches often require sensitive data transfer, posing privacy concerns. To address this, the study proposes a Multimodal Federated Learning framework for chest X-ray report generation using the IU-Xray dataset…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: Preprint, manuscript under-review

  3. arXiv:2505.12552  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    FreqSelect: Frequency-Aware fMRI-to-Image Reconstruction

    Authors: Junliang Ye, Lei Wang, Md Zakir Hossain

    Abstract: Reconstructing natural images from functional magnetic resonance imaging (fMRI) data remains a core challenge in natural decoding due to the mismatch between the richness of visual stimuli and the noisy, low resolution nature of fMRI signals. While recent two-stage models, combining deep variational autoencoders (VAEs) with diffusion models, have advanced this task, they treat all spatial-frequenc…

    Submitted 18 May, 2025; originally announced May 2025.

    Comments: Research report

  4. arXiv:2505.12433  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    SRLoRA: Subspace Recomposition in Low-Rank Adaptation via Importance-Based Fusion and Reinitialization

    Authors: Haodong Yang, Lei Wang, Md Zakir Hossain

    Abstract: Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning (PEFT) method that injects two trainable low-rank matrices (A and B) into frozen pretrained models. While efficient, LoRA constrains updates to a fixed low-rank subspace (Delta W = BA), which can limit representational capacity and hinder downstream performance. We introduce Subspace Recomposition in Low-Rank Adaptation…

    Submitted 18 May, 2025; originally announced May 2025.

    Comments: Research report

  5. arXiv:2505.01429  [pdf, other]

    cs.CV

    Explainable AI-Driven Detection of Human Monkeypox Using Deep Learning and Vision Transformers: A Comprehensive Analysis

    Authors: Md. Zahid Hossain, Md. Rakibul Islam, Most. Sharmin Sultana Samu

    Abstract: Since mpox can spread from person to person, it is a zoonotic viral illness that poses a significant public health concern. It is difficult to make an early clinical diagnosis because of how closely its symptoms match those of measles and chickenpox. Medical imaging combined with deep learning (DL) techniques has shown promise in improving disease detection by analyzing affected skin areas. Our st…

    Submitted 3 April, 2025; originally announced May 2025.

  6. arXiv:2504.10808  [pdf, other]

    cs.CV cs.HC cs.LG

    Tabular foundation model to detect empathy from visual cues

    Authors: Md Rakibul Hasan, Shafin Rahman, Md Zakir Hossain, Aneesh Krishna, Tom Gedeon

    Abstract: Detecting empathy from video interactions is an emerging area of research. Video datasets, however, are often released as extracted features (i.e., tabular data) rather than raw footage due to privacy and ethical concerns. Prior research on such tabular datasets established tree-based classical machine learning approaches as the best-performing models. Motivated by the recent success of textual fo…

    Submitted 14 April, 2025; originally announced April 2025.

  7. arXiv:2503.16585  [pdf, other]

    cs.CL cs.CV cs.DC cs.LG

    Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions

    Authors: Hadi Amini, Md Jueal Mia, Yasaman Saadati, Ahmed Imteaj, Seyedsina Nabavirazavi, Urmish Thakker, Md Zarif Hossain, Awal Ahmed Fime, S. S. Iyengar

    Abstract: Language models (LMs) are machine learning models designed to predict linguistic patterns by estimating the probability of word sequences based on large-scale datasets, such as text. LMs have a wide range of applications in natural language processing (NLP) tasks, including autocomplete and machine translation. Although larger datasets typically enhance LM performance, scalability remains a challe…

    Submitted 20 March, 2025; originally announced March 2025.

  8. arXiv:2503.07883  [pdf, other]

    cs.LG

    Cross-platform Prediction of Depression Treatment Outcome Using Location Sensory Data on Smartphones

    Authors: Soumyashree Sahoo, Chinmaey Shende, Md. Zakir Hossain, Parit Patel, Yushuo Niu, Xinyu Wang, Shweta Ware, Jinbo Bi, Jayesh Kamath, Alexander Russel, Dongjin Song, Qian Yang, Bing Wang

    Abstract: Currently, depression treatment relies on closely monitoring patients' response to treatment and adjusting the treatment as needed. Using self-reported or physician-administered questionnaires to monitor treatment response is, however, burdensome, costly and suffers from recall bias. In this paper, we explore using location sensory data collected passively on smartphones to predict treatment outco…

    Submitted 10 March, 2025; originally announced March 2025.

  9. arXiv:2501.14249  [pdf, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  10. arXiv:2501.12356  [pdf, other]

    cs.CV

    Vision-Language Models for Automated Chest X-ray Interpretation: Leveraging ViT and GPT-2

    Authors: Md. Rakibul Islam, Md. Zahid Hossain, Mustofa Ahmed, Most. Sharmin Sultana Samu

    Abstract: Radiology plays a pivotal role in modern medicine due to its non-invasive diagnostic capabilities. However, the manual generation of unstructured medical reports is time consuming and prone to errors. It creates a significant bottleneck in clinical workflows. Despite advancements in AI-generated radiology reports, challenges remain in achieving detailed and accurate report generation. In this stud…

    Submitted 21 January, 2025; originally announced January 2025.

    Comments: Preprint, manuscript under-review

  11. arXiv:2501.02442  [pdf, other]

    cs.CV

    Unsupervised Search for Ethnic Minorities' Medical Segmentation Training Set

    Authors: Yixiao Chen, Yue Yao, Ruining Yang, Md Zakir Hossain, Ashu Gupta, Tom Gedeon

    Abstract: This article investigates the critical issue of dataset bias in medical imaging, with a particular emphasis on racial disparities caused by uneven population distribution in dataset collection. Our analysis reveals that medical segmentation datasets are significantly biased, primarily influenced by the demographic composition of their collection sites. For instance, Scanning Laser Ophthalmoscopy (…

    Submitted 5 January, 2025; originally announced January 2025.

  12. arXiv:2501.00691  [pdf, other]

    cs.CL cs.LG

    Labels Generated by Large Language Model Helps Measuring People's Empathy in Vitro

    Authors: Md Rakibul Hasan, Yue Yao, Md Zakir Hossain, Aneesh Krishna, Imre Rudas, Shafin Rahman, Tom Gedeon

    Abstract: Large language models (LLMs) have revolutionised numerous fields, with LLM-as-a-service (LLMSaaS) having a strong generalisation ability that offers accessible solutions directly without the need for costly training. In contrast to the widely studied prompt engineering for task solving directly (in vivo), this paper explores its potential in in-vitro applications. These involve using LLM to genera…

    Submitted 31 December, 2024; originally announced January 2025.

    Comments: This work has been submitted to the IEEE for possible publication

  13. arXiv:2412.20674  [pdf, other]

    cs.DC cs.CR cs.LG

    Blockchain-Empowered Cyber-Secure Federated Learning for Trustworthy Edge Computing

    Authors: Ervin Moore, Ahmed Imteaj, Md Zarif Hossain, Shabnam Rezapour, M. Hadi Amini

    Abstract: Federated Learning (FL) is a privacy-preserving distributed machine learning scheme, where each participant's data remains on the participating devices and only the local model generated utilizing the local computational power is transmitted throughout the database. However, the distributed computational nature of FL creates the necessity to develop a mechanism that can remotely trigger any network…

    Submitted 29 December, 2024; originally announced December 2024.

  14. arXiv:2410.17783  [pdf, other]

    cs.CL cs.HC

    Leveraging the Domain Adaptation of Retrieval Augmented Generation Models for Question Answering and Reducing Hallucination

    Authors: Salman Rakin, Md. A. R. Shibly, Zahin M. Hossain, Zeeshan Khan, Md. Mostofa Akbar

    Abstract: While ongoing advancements in Large Language Models have demonstrated remarkable success across various NLP tasks, Retrieval Augmented Generation Model stands out to be highly effective on downstream applications like Question Answering. Recently, RAG-end2end model further optimized the architecture and achieved notable performance improvements on domain adaptation. However, the effectiveness of t…

    Submitted 23 October, 2024; originally announced October 2024.

    Comments: Initial Version fine-tuned on HotelConvQA

  15. arXiv:2410.00028  [pdf, other]

    eess.SP cs.LG

    Machine Learning to Detect Anxiety Disorders from Error-Related Negativity and EEG Signals

    Authors: Ramya Chandrasekar, Md Rakibul Hasan, Shreya Ghosh, Tom Gedeon, Md Zakir Hossain

    Abstract: Anxiety is a common mental health condition characterised by excessive worry, fear and apprehension about everyday situations. Even with significant progress over the past few years, predicting anxiety from electroencephalographic (EEG) signals, specifically using error-related negativity (ERN), still remains challenging. Following the PRISMA protocol, this paper systematically reviews 54 research…

    Submitted 16 September, 2024; originally announced October 2024.

  16. arXiv:2409.07353  [pdf, other]

    cs.CV cs.AI

    Securing Vision-Language Models with a Robust Encoder Against Jailbreak and Adversarial Attacks

    Authors: Md Zarif Hossain, Ahmed Imteaj

    Abstract: Large Vision-Language Models (LVLMs), trained on multimodal big datasets, have significantly advanced AI by excelling in vision-language tasks. However, these models remain vulnerable to adversarial attacks, particularly jailbreak attacks, which bypass safety protocols and cause the model to generate misleading or harmful responses. This vulnerability stems from both the inherent susceptibilities…

    Submitted 11 September, 2024; originally announced September 2024.

  17. arXiv:2409.05347  [pdf, other]

    cs.LG cs.AI

    TriplePlay: Enhancing Federated Learning with CLIP for Non-IID Data and Resource Efficiency

    Authors: Ahmed Imteaj, Md Zarif Hossain, Saika Zaman, Abdur R. Shahid

    Abstract: The rapid advancement and increasing complexity of pretrained models, exemplified by CLIP, offer significant opportunities as well as challenges for Federated Learning (FL), a critical component of privacy-preserving artificial intelligence. This research delves into the intricacies of integrating large foundation models like CLIP within FL frameworks to enhance privacy, efficiency, and adaptabili…

    Submitted 8 October, 2024; v1 submitted 9 September, 2024; originally announced September 2024.

  18. arXiv:2407.14971  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Sim-CLIP: Unsupervised Siamese Adversarial Fine-Tuning for Robust and Semantically-Rich Vision-Language Models

    Authors: Md Zarif Hossain, Ahmed Imteaj

    Abstract: Vision-language models (VLMs) have achieved significant strides in recent times, especially in multimodal tasks, yet they remain susceptible to adversarial attacks on their vision components. To address this, we propose Sim-CLIP, an unsupervised adversarial fine-tuning method that enhances the robustness of the widely-used CLIP vision encoder against such attacks while maintaining semantic richness…

    Submitted 15 November, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

  19. MADE-for-ASD: A Multi-Atlas Deep Ensemble Network for Diagnosing Autism Spectrum Disorder

    Authors: Xuehan Liu, Md Rakibul Hasan, Tom Gedeon, Md Zakir Hossain

    Abstract: In response to the global need for efficient early diagnosis of Autism Spectrum Disorder (ASD), this paper bridges the gap between traditional, time-consuming diagnostic methods and potential automated solutions. We propose a multi-atlas deep ensemble network, MADE-for-ASD, that integrates multiple atlases of the brain's functional magnetic resonance imaging (fMRI) data through a weighted deep ens…

    Submitted 3 September, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: Xuehan Liu and Md Rakibul Hasan contributed equally to this work

    Journal ref: Computers in Biology and Medicine, Volume 182, November 2024

  20. arXiv:2405.09570  [pdf, other]

    eess.SP cs.LG cs.SD eess.AS

    FunnelNet: An End-to-End Deep Learning Framework to Monitor Digital Heart Murmur in Real-Time

    Authors: Md Jobayer, Md. Mehedi Hasan Shawon, Md Rakibul Hasan, Shreya Ghosh, Tom Gedeon, Md Zakir Hossain

    Abstract: Objective: Heart murmurs are abnormal sounds caused by turbulent blood flow within the heart. Several diagnostic methods are available to detect heart murmurs and their severity, such as cardiac auscultation, echocardiography, phonocardiogram (PCG), etc. However, these methods have limitations, including extensive training and experience among healthcare providers, cost and accessibility of echoca…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 8-page main paper and 4-page supplementary material

  21. arXiv:2403.07483  [pdf, other]

    cs.LG cs.AI

    DiabetesNet: A Deep Learning Approach to Diabetes Diagnosis

    Authors: Zeyu Zhang, Khandaker Asif Ahmed, Md Rakibul Hasan, Tom Gedeon, Md Zakir Hossain

    Abstract: Diabetes, resulting from inadequate insulin production or utilization, causes extensive harm to the body. Existing diagnostic methods are often invasive and come with drawbacks, such as cost constraints. Although there are machine learning models like Classwise k Nearest Neighbor (CkNN) and General Regression Neural Network (GRNN), they struggle with imbalanced data and result in under-performance…

    Submitted 21 September, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted to ACIIDS 2024

  22. arXiv:2401.14772  [pdf, other]

    cs.CV

    Spatial Transcriptomics Analysis of Zero-shot Gene Expression Prediction

    Authors: Yan Yang, Md Zakir Hossain, Xuesong Li, Shafin Rahman, Eric Stone

    Abstract: Spatial transcriptomics (ST) captures gene expression within distinct regions (i.e., windows) of a tissue slide. Traditional supervised learning frameworks applied to model ST are constrained to predicting expression from slide image windows for gene types seen during training, failing to generalize to unseen gene types. To overcome this limitation, we propose a semantic guided network (SGN), a pi…

    Submitted 26 January, 2024; originally announced January 2024.

  23. arXiv:2311.00721  [pdf, other]

    cs.HC cs.LG cs.SI

    Empathy Detection from Text, Audiovisual, Audio or Physiological Signals: A Systematic Review of Task Formulations and Machine Learning Methods

    Authors: Md Rakibul Hasan, Md Zakir Hossain, Shreya Ghosh, Aneesh Krishna, Tom Gedeon

    Abstract: Empathy indicates an individual's ability to understand others. Over the past few years, empathy has drawn attention from various disciplines, including but not limited to Affective Computing, Cognitive Science, and Psychology. Detecting empathy has potential applications in society, healthcare and education. Despite being a broad and overlapping topic, the avenue of empathy detection leveraging M…

    Submitted 20 May, 2025; v1 submitted 30 October, 2023; originally announced November 2023.

    Comments: This work has been submitted to the IEEE for possible publication

  24. arXiv:2305.01154  [pdf, other]

    cs.LG cs.DC

    FedAVO: Improving Communication Efficiency in Federated Learning with African Vultures Optimizer

    Authors: Md Zarif Hossain, Ahmed Imteaj

    Abstract: Federated Learning (FL), a distributed machine learning technique, has recently experienced tremendous growth in popularity due to its emphasis on user data privacy. However, the distributed computations of FL can result in constrained communication and drawn-out learning processes, necessitating the client-server communication cost optimization. The ratio of chosen clients and the quantity of loca…

    Submitted 8 December, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

    Comments: 8 pages

  25. arXiv:2303.12772  [pdf, other]

    cs.CL cs.AI

    Interpretable Bangla Sarcasm Detection using BERT and Explainable AI

    Authors: Ramisa Anan, Tasnim Sakib Apon, Zeba Tahsin Hossain, Elizabeth Antora Modhu, Sudipta Mondal, MD. Golam Rabiul Alam

    Abstract: A positive phrase or a sentence with an underlying negative motive is usually defined as sarcasm that is widely used in today's social media platforms such as Facebook, Twitter, Reddit, etc. In recent times active users in social media platforms are increasing dramatically which raises the need for an automated NLP-based system that can be utilized in various tasks such as determining market deman…

    Submitted 22 March, 2023; originally announced March 2023.

  26. arXiv:2212.11211  [pdf, other]

    cs.CV

    Land Cover and Land Use Detection using Semi-Supervised Learning

    Authors: Fahmida Tasnim Lisa, Md. Zarif Hossain, Sharmin Naj Mou, Shahriar Ivan, Md. Hasanul Kabir

    Abstract: Semi-supervised learning (SSL) has made significant strides in the field of remote sensing. Finding a large number of labeled datasets for SSL methods is uncommon, and manually labeling datasets is expensive and time-consuming. Furthermore, accurately identifying remote sensing satellite images is more complicated than it is for conventional images. Class-imbalanced datasets are another prevalent…

    Submitted 21 December, 2022; originally announced December 2022.

  27. arXiv:2211.06366  [pdf, other]

    cs.CL stat.AP

    Analysis of Male and Female Speakers' Word Choices in Public Speeches

    Authors: Md Zobaer Hossain, Ahnaf Mozib Samin

    Abstract: The extent to which men and women use language differently has been questioned previously. Finding clear and consistent gender differences in language is not conclusive in general, and the research is heavily influenced by the context and method employed to identify the difference. In addition, the majority of the research was conducted in written form, and the sample was collected in writing. The…

    Submitted 11 November, 2022; originally announced November 2022.

  28. arXiv:2210.16721  [pdf, other]

    cs.CV

    Exemplar Guided Deep Neural Network for Spatial Transcriptomics Analysis of Gene Expression Prediction

    Authors: Yan Yang, Md Zakir Hossain, Eric A Stone, Shafin Rahman

    Abstract: Spatial transcriptomics (ST) is essential for understanding diseases and developing novel treatments. It measures gene expression of each fine-grained area (i.e., different windows) in the tissue slide with low throughput. This paper proposes an Exemplar Guided Network (EGN) to accurately and efficiently predict gene expression directly from each window of a tissue slide image. We apply exemplar l…

    Submitted 29 October, 2022; originally announced October 2022.

  29. arXiv:2210.04240  [pdf, other]

    cs.CV

    Less is More: Facial Landmarks can Recognize a Spontaneous Smile

    Authors: Md. Tahrim Faroque, Yan Yang, Md Zakir Hossain, Sheikh Motahar Naim, Nabeel Mohammed, Shafin Rahman

    Abstract: Smile veracity classification is a task of interpreting social interactions. Broadly, it distinguishes between spontaneous and posed smiles. Previous approaches used hand-engineered features from facial landmarks or considered raw smile videos in an end-to-end manner to perform smile classification tasks. Feature-based methods require intervention from human experts on feature engineering and heav…

    Submitted 9 October, 2022; originally announced October 2022.

  30. arXiv:2203.13132  [pdf, other]

    q-bio.QM cs.LG q-bio.BM

    DPST: De Novo Peptide Sequencing with Amino-Acid-Aware Transformers

    Authors: Yan Yang, Zakir Hossain, Khandaker Asif, Liyuan Pan, Shafin Rahman, Eric Stone

    Abstract: De novo peptide sequencing aims to recover amino acid sequences of a peptide from tandem mass spectrometry (MS) data. Existing approaches for de novo analysis enumerate MS evidence for all amino acid classes during inference. It leads to over-trimming on receptive fields of MS data and restricts MS evidence associated with following undecoded amino acids. Our approach, DPST, circumvents these limi…

    Submitted 23 March, 2022; originally announced March 2022.

  31. arXiv:2102.03994  [pdf]

    cs.HC

    Observers Pupillary Responses in Recognising Real and Posed Smiles: A Preliminary Study

    Authors: Ruiqi Chen, Atiqul Islam, Tom Gedeon, Md Zakir Hossain

    Abstract: Pupillary responses (PR) change differently for different types of stimuli. This study aims to check whether observers' PR can recognise real and posed smiles from a set of smile images and videos. We showed the smile images and smile videos stimuli to observers, and recorded their pupillary responses considering four different situations, namely paired videos, paired images, single videos, and sin…

    Submitted 7 February, 2021; originally announced February 2021.

    Comments: 7 pages, 5 figures

  32. arXiv:2011.14785  [pdf, other]

    cs.CV

    S2FGAN: Semantically Aware Interactive Sketch-to-Face Translation

    Authors: Yan Yang, Md Zakir Hossain, Tom Gedeon, Shafin Rahman

    Abstract: Interactive facial image manipulation attempts to edit single and multiple face attributes using a photo-realistic face and/or semantic mask as input. In the absence of the photo-realistic image (only sketch/mask available), previous methods only retrieve the original face but ignore the potential of aiding model controllability and diversity in the translation process. This paper proposes a sketc…

    Submitted 11 October, 2021; v1 submitted 30 November, 2020; originally announced November 2020.

  33. Prediction of Temperature and Rainfall in Bangladesh using Long Short Term Memory Recurrent Neural Networks

    Authors: Mohammad Mahmudur Rahman Khan, Md. Abu Bakr Siddique, Shadman Sakib, Anas Aziz, Ihtyaz Kader Tasawar, Ziad Hossain

    Abstract: Temperature and rainfall have a significant impact on economic growth as well as the outbreak of seasonal diseases in a region. In spite of that, inadequate studies have been carried out for analyzing the weather pattern of Bangladesh implementing the artificial neural network. Therefore, in this study, we are implementing a Long Short-term Memory (LSTM) model to forecast the month-wise temperature…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: 4th International Symposium on Multidisciplinary Studies and Innovative Technologies, IEEE, 22-24 October, 2020, TURKEY

    Journal ref: 2020 4th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT)

  34. arXiv:2010.03203  [pdf, other]

    cs.CV

    RealSmileNet: A Deep End-To-End Network for Spontaneous and Posed Smile Recognition

    Authors: Yan Yang, Md Zakir Hossain, Tom Gedeon, Shafin Rahman

    Abstract: Smiles play a vital role in the understanding of social interactions within different communities, and reveal the physical state of mind of people in both real and deceptive ways. Several methods have been proposed to recognize spontaneous and posed smiles. All follow a feature-engineering based pipeline requiring costly pre-processing steps such as manual annotation of face landmarks, tracking, s…

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: Accepted by ACCV

  35. arXiv:2004.08789  [pdf, other]

    cs.CL

    BanFakeNews: A Dataset for Detecting Fake News in Bangla

    Authors: Md Zobaer Hossain, Md Ashraful Rahman, Md Saiful Islam, Sudipta Kar

    Abstract: Observing the damages that can be done by the rapid propagation of fake news in various sectors like politics and finance, automatic identification of fake news using linguistic analysis has drawn the attention of the research community. However, such methods are largely being developed for English where low resource languages remain out of the focus. But the risks spawned by fake and manipulative…

    Submitted 19 April, 2020; originally announced April 2020.

    Comments: LREC 2020

  36. arXiv:1810.04020  [pdf, other]

    cs.CV cs.LG stat.ML

    A Comprehensive Survey of Deep Learning for Image Captioning

    Authors: Md. Zakir Hossain, Ferdous Sohel, Mohd Fairuz Shiratuddin, Hamid Laga

    Abstract: Generating a description of an image is called image captioning. Image captioning requires recognizing the important objects, their attributes and their relationships in an image. It also needs to generate syntactically and semantically correct sentences. Deep learning-based techniques are capable of handling the complexities and challenges of image captioning. In this survey paper, we aim to pre…

    Submitted 14 October, 2018; v1 submitted 6 October, 2018; originally announced October 2018.

    Comments: 36 Pages, Accepted as a Journal Paper in ACM Computing Surveys (October 2018)

  37. arXiv:1206.0238  [pdf, other]

    cs.CV

    Rapid Feature Extraction for Optical Character Recognition

    Authors: M. Zahid Hossain, M. Ashraful Amin, Hong Yan

    Abstract: Feature extraction is one of the fundamental problems of character recognition. The performance of a character recognition system depends on proper feature extraction and correct classifier selection. In this article, a rapid feature extraction method is proposed and named as Celled Projection (CP) that computes the projection of each section formed through partitioning an image. The recognition p…

    Submitted 1 June, 2012; originally announced June 2012.

    Comments: 5 pages, 1 figure

    ACM Class: I.5.2; I.7.5

  38. arXiv:1004.4580  [pdf]

    cs.CY

    In Quest of the Better Mobile Broadband Solution for South Asia Taking WiMAX and LTE into Consideration

    Authors: Nafiz Imtiaz Bin Hamid, Md. R. H. Khandokar, Taskin Jamal, Md. A. Shoeb, Md. Zakir Hossain

    Abstract: Internet generation is growing accustomed to having broadband access wherever they go and not just at home or in the office, which turns mobile broadband into a reality. This paper aims to look for a suitable mobile broadband solution in the South Asian region through comparative analysis in various perspectives. Both WiMAX and LTE are 4G technologies designed to move data rather than voice having…

    Submitted 26 April, 2010; originally announced April 2010.

    Comments: N. I. B. Hamid, Md. R. H. Khandokar, T. Jamal, Md. A. Shoeb and Md. Z. Hossain, "In Quest of the Better Mobile Broadband Solution for South Asia Taking WiMAX and LTE into Consideration", Journal of Telecommunications, Volume 2, Issue 1, p86-94, April 2010

    Journal ref: Journal of Telecommunications, Volume 2, Issue 1, p86-94, April 2010

  39. arXiv:1004.1788  [pdf]

    cs.NI

    Mobile Broadband Possibilities considering the Arrival of IEEE 802.16m & LTE with an Emphasis on South Asia

    Authors: Nafiz Imtiaz Bin Hamid, Md. Zakir Hossain, Md. R. H. Khandokar, Taskin Jamal, Md. A. Shoeb

    Abstract: This paper intends to look deeper into finding an ideal mobile broadband solution. Special stress has been put in the South Asian region through some comparative analysis. Proving their competency in numerous aspects, WiMAX and LTE have already made a strong position in the telecommunication industry. Both WiMAX and LTE are 4G technologies designed to move data rather than voice having IP netw…

    Submitted 11 April, 2010; originally announced April 2010.

    Comments: IEEE Publication format, ISSN 1947 5500, http://sites.google.com/site/ijcsis/

    Journal ref: IJCSIS, Vol. 7 No. 3, March 2010, 267-275