Showing 1–19 of 19 results for author: Bhatia, G

Searching in archive cs.
  1. arXiv:2506.13458  [pdf, ps, other]

    cs.CV cs.CL

    Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images

    Authors: Cristina Mahanta, Gagan Bhatia

    Abstract: Recognising human activity in a single photo enables indexing, safety and assistive applications, yet lacks motion cues. Using 285 MSCOCO images labelled as walking, running, sitting, and standing, scratch CNNs scored 41% accuracy. Fine-tuning multimodal CLIP raised this to 76%, demonstrating that contrastive vision-language pre-training decisively improves still-image action recognition in real-w…

    Submitted 16 June, 2025; originally announced June 2025.

  2. arXiv:2505.16088  [pdf, ps, other]

    cs.CL cs.AI

    Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning

    Authors: Gagan Bhatia, Maxime Peyrard, Wei Zhao

    Abstract: Modern BPE tokenizers often split calendar dates into meaningless fragments, e.g., 20250312 $\rightarrow$ 202, 503, 12, inflating token counts and obscuring the inherent structure needed for robust temporal reasoning. In this work, we (1) introduce a simple yet interpretable metric, termed date fragmentation ratio, that measures how faithfully a tokenizer preserves multi-digit date components; (2)…

    Submitted 25 May, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

  3. arXiv:2412.13377  [pdf, other]

    cs.CL cs.AI

    DateLogicQA: Benchmarking Temporal Biases in Large Language Models

    Authors: Gagan Bhatia, MingZe Tang, Cristina Mahanta, Madiha Kazi

    Abstract: This paper introduces DateLogicQA, a benchmark with 190 questions covering diverse date formats, temporal contexts, and reasoning types. We propose the Semantic Integrity Metric to assess tokenization quality and analyse two biases: Representation-Level Bias, affecting embeddings, and Logical-Level Bias, influencing reasoning outputs. Our findings provide a comprehensive evaluation of LLMs' capabi…

    Submitted 19 May, 2025; v1 submitted 17 December, 2024; originally announced December 2024.

  4. arXiv:2411.01192  [pdf, other]

    cs.CL

    Swan and ArabicMTEB: Dialect-Aware, Arabic-Centric, Cross-Lingual, and Cross-Cultural Embedding Models and Benchmarks

    Authors: Gagan Bhatia, El Moatez Billah Nagoudi, Abdellah El Mekki, Fakhraddin Alwajih, Muhammad Abdul-Mageed

    Abstract: We introduce Swan, a family of embedding models centred around the Arabic language, addressing both small-scale and large-scale use cases. Swan includes two variants: Swan-Small, based on ARBERTv2, and Swan-Large, built on ArMistral, a pretrained Arabic large language model. To evaluate these models, we propose ArabicMTEB, a comprehensive benchmark suite that assesses cross-lingual, multi-di…

    Submitted 11 February, 2025; v1 submitted 2 November, 2024; originally announced November 2024.

  5. arXiv:2407.18129  [pdf, other]

    cs.CL cs.AI

    Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic

    Authors: Fakhraddin Alwajih, Gagan Bhatia, Muhammad Abdul-Mageed

    Abstract: Recent advancements have significantly enhanced the capabilities of Multimodal Large Language Models (MLLMs) in generating and understanding image-to-text content. Despite these successes, progress is predominantly limited to English due to the scarcity of high quality multimodal resources in other languages. This limitation impedes the development of competitive models in languages such as Arabic…

    Submitted 26 July, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

  6. arXiv:2407.16528  [pdf, other]

    eess.SP cs.ET

    Analysis of 3GPP and Ray-Tracing Based Channel Model for 5G Industrial Network Planning

    Authors: Gurjot Singh Bhatia, Yoann Corre, Linus Thrybom, M. Di Renzo

    Abstract: Appropriate channel models tailored to the specific needs of industrial environments are crucial for the 5G private industrial network design and guiding deployment strategies. This paper scrutinizes the applicability of 3GPP's channel model for industrial scenarios. The challenges in accurately modeling industrial channels are addressed, and a refinement strategy is proposed employing a ray-traci…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: copyright 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  7. arXiv:2407.13559  [pdf, other]

    cs.CV cs.AI cs.CL

    Qalam: A Multimodal LLM for Arabic Optical Character and Handwriting Recognition

    Authors: Gagan Bhatia, El Moatez Billah Nagoudi, Fakhraddin Alwajih, Muhammad Abdul-Mageed

    Abstract: Arabic Optical Character Recognition (OCR) and Handwriting Recognition (HWR) pose unique challenges due to the cursive and context-sensitive nature of the Arabic script. This study introduces Qalam, a novel foundation model designed for Arabic OCR and HWR, built on a SwinV2 encoder and RoBERTa decoder architecture. Our model significantly outperforms existing methods, achieving a Word Error Rate (…

    Submitted 18 July, 2024; originally announced July 2024.

  8. arXiv:2403.01031  [pdf, other]

    cs.CL cs.AI

    Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks

    Authors: Fakhraddin Alwajih, El Moatez Billah Nagoudi, Gagan Bhatia, Abdelrahman Mohamed, Muhammad Abdul-Mageed

    Abstract: Multimodal large language models (MLLMs) have proven effective in a wide range of tasks requiring complex reasoning and linguistic comprehension. However, due to a lack of high-quality multimodal resources in languages other than English, success of MLLMs remains relatively limited to English-based settings. This poses significant challenges in developing comparable models for other languages, inc…

    Submitted 24 May, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

  9. arXiv:2402.10986  [pdf, other]

    cs.CL cs.AI

    FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models

    Authors: Gagan Bhatia, El Moatez Billah Nagoudi, Hasan Cavusoglu, Muhammad Abdul-Mageed

    Abstract: We introduce FinTral, a suite of state-of-the-art multimodal large language models (LLMs) built upon the Mistral-7b model and tailored for financial analysis. FinTral integrates textual, numerical, tabular, and image data. We enhance FinTral with domain-specific pretraining, instruction fine-tuning, and RLAIF training by exploiting a large collection of textual and visual datasets we curate for th…

    Submitted 14 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  10. arXiv:2312.08400  [pdf, other]

    cs.CL cs.AI

    Beyond English: Evaluating LLMs for Arabic Grammatical Error Correction

    Authors: Sang Yun Kwon, Gagan Bhatia, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed

    Abstract: Large language models (LLMs) finetuned to follow human instruction have recently exhibited significant capabilities in various English NLP tasks. However, their performance in grammatical error correction (GEC), especially on languages other than English, remains significantly unexplored. In this work, we evaluate the abilities of instruction finetuned LLMs in Arabic GEC, a complex task due to Ara…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:2308.04492

  11. arXiv:2309.06101  [pdf, other]

    eess.SP cs.NI

    Tuning of Ray-Based Channel Model for 5G Indoor Industrial Scenarios

    Authors: Gurjot Singh Bhatia, Yoann Corre, Marco Di Renzo

    Abstract: This paper presents an innovative method that can be used to produce deterministic channel models for 5G industrial internet-of-things (IIoT) scenarios. Ray-tracing (RT) channel emulation can capture many of the specific properties of a propagation scenario, which is incredibly beneficial when facing various industrial environments and deployment setups. But the environment's complexity, composed…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: copyright 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  12. arXiv:2308.04492  [pdf, other]

    cs.AI

    ChatGPT for Arabic Grammatical Error Correction

    Authors: Sang Yun Kwon, Gagan Bhatia, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed

    Abstract: Recently, large language models (LLMs) fine-tuned to follow human instruction have exhibited significant capabilities in various English NLP tasks. However, their performance in grammatical error correction (GEC) tasks, particularly in non-English languages, remains significantly unexplored. In this paper, we delve into abilities of instruction fine-tuned LLMs in Arabic GEC, a task made complex du…

    Submitted 8 August, 2023; originally announced August 2023.

  13. arXiv:2306.01408  [pdf, other]

    eess.SP cs.NI

    Efficient Ray-Tracing Channel Emulation in Industrial Environments: An Analysis of Propagation Model Impact

    Authors: Gurjot Singh Bhatia, Yoann Corre, M. Di Renzo

    Abstract: Industrial environments are considered to be severe from the point of view of electromagnetic (EM) wave propagation. When dealing with a wide range of industrial environments and deployment setups, ray-tracing channel emulation can capture many distinctive characteristics of a propagation scenario. Ray-tracing tools often require a detailed and accurate description of the propagation scenario. Con…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: copyright 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  14. arXiv:2304.13292  [pdf, other]

    cs.CL

    Zero-Shot Slot and Intent Detection in Low-Resource Languages

    Authors: Sang Yun Kwon, Gagan Bhatia, El Moatez Billah Nagoudi, Alcides Alcoba Inciarte, Muhammad Abdul-Mageed

    Abstract: Intent detection and slot filling are critical tasks in spoken and natural language understanding for task-oriented dialog systems. In this work we describe our participation in the slot and intent detection for low-resource language varieties (SID4LR; Aepli et al. (2023)). We investigate the slot and intent detection (SID) tasks using a wide range of models and settings. Given the recent success…

    Submitted 26 April, 2023; originally announced April 2023.

    Comments: VarDial @ EACL

  15. arXiv:2304.11256  [pdf, other]

    cs.CL

    UBC-DLNLP at SemEval-2023 Task 12: Impact of Transfer Learning on African Sentiment Analysis

    Authors: Gagan Bhatia, Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed

    Abstract: We describe our contribution to the SemEval 2023 AfriSenti-SemEval shared task, where we tackle the task of sentiment analysis in 14 different African languages. We develop both monolingual and multilingual models under a full supervised setting (subtasks A and B). We also develop models for the zero-shot setting (subtask C). Our approach involves experimenting with transfer learning using six lan…

    Submitted 25 April, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: AfriSenti 2023 @ ACL 2023

  16. arXiv:2007.08003  [pdf]

    cs.CY cs.CV cs.LG cs.SD eess.AS

    Stutter Diagnosis and Therapy System Based on Deep Learning

    Authors: Gresha Bhatia, Binoy Saha, Mansi Khamkar, Ashish Chandwani, Reshma Khot

    Abstract: Stuttering, also called stammering, is a communication disorder that breaks the continuity of the speech. This program of work is an attempt to develop automatic recognition procedures to assess stuttered dysfluencies and use these assessments to filter out speech therapies for an individual. Stuttering may be in the form of repetitions, prolongations or abnormal stoppages of sounds and syllables.…

    Submitted 13 July, 2020; originally announced July 2020.

    Comments: About stutter classification, severity diagnosis and therapy recommendation

  17. arXiv:2006.14782  [pdf, other]

    cs.CR cs.HC

    WorkerRep: Immutable Reputation System For Crowdsourcing Platform Based on Blockchain

    Authors: Gurpriya Kaur Bhatia, Shubham Gupta, Alpana Dubey, Ponnurangam Kumaraguru

    Abstract: Crowdsourcing is a process wherein an individual or an organisation utilizes the talent pool present over the Internet to accomplish their task. The existing crowdsourcing platforms and their reputation computation are centralised and hence prone to various attacks or malicious manipulation of the data by the central entity. A few distributed crowdsourcing platforms have been proposed but they lac…

    Submitted 25 June, 2020; originally announced June 2020.

  18. arXiv:1111.1086  [pdf]

    cs.AR cs.DS cs.MS

    Design and Simulation of an 8-bit Dedicated Processor for calculating the Sine and Cosine of an Angle using the CORDIC Algorithm

    Authors: Aman Chadha, Divya Jyoti, M. G. Bhatia

    Abstract: This paper describes the design and simulation of an 8-bit dedicated processor for calculating the Sine and Cosine of an Angle using CORDIC Algorithm (COordinate Rotation DIgital Computer), a simple and efficient algorithm to calculate hyperbolic and trigonometric functions. We have proposed a dedicated processor system, modeled by writing appropriate programs in VHDL, for calculating the Sine and…

    Submitted 4 November, 2011; originally announced November 2011.

    Comments: CORDIC, VHDL, dedicated processor, datapath, finite state machine

    Journal ref: Proceedings of the 2011 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC); IEEE Xplore: CFB1120J-ART; ISBN: 978-1-61284-694-1; Print Version: CFB1120J-PRT; ISBN: 978-1-61284-766-5

  19. arXiv:1011.6180  [pdf]

    cs.NI

    Adapting MAC 802.11 for Performance Optimization of MANET using Cross Layer Interaction

    Authors: Gaurav Bhatia, Vivek Kumar

    Abstract: In this research, we study the optimization challenges of MANET and cross-layer technique to improve its performance. We propose an adaptive retransmission limits algorithm for IEEE 802.11 MAC to reduce the false link failures and predict the node mobility. We implemented cross layer interaction between physical and MAC layers. The MAC layer utilizes the physical layer information for differentiat…

    Submitted 29 November, 2010; originally announced November 2010.

    Journal ref: International Journal of Wireless & Mobile Networks (IJWMN) Vol.2, No.4, November 2010