-
How Students (Really) Use ChatGPT: Uncovering Experiences Among Undergraduate Students
Authors:
Tawfiq Ammari,
Meilun Chen,
S M Mehedi Zaman,
Kiran Garimella
Abstract:
This study investigates how undergraduate students engage with ChatGPT in self-directed learning contexts. Analyzing naturalistic interaction logs, we identify five dominant use categories of ChatGPT: information seeking, content generation, language refinement, metacognitive engagement, and conversational repair. Behavioral modeling reveals that structured, goal-driven tasks like coding, multiple-choice solving, and job application writing are strong predictors of continued use. Drawing on Self-Directed Learning (SDL) and the Uses and Gratifications Theory (UGT), we show how students actively manage ChatGPT's affordances and limitations through prompt adaptation, follow-ups, and emotional regulation. Rather than disengaging after breakdowns, students often persist through clarification and repair, treating the assistant as both tool and learning partner. We also offer design and policy recommendations to support transparent, responsive, and pedagogically grounded integration of generative AI in higher education.
Submitted 8 September, 2025; v1 submitted 29 May, 2025;
originally announced May 2025.
-
"Sorry, Come Again?" Prompting -- Enhancing Comprehension and Diminishing Hallucination with [PAUSE]-injected Optimal Paraphrasing
Authors:
Vipula Rawte,
S. M Towhidul Islam Tonmoy,
S M Mehedi Zaman,
Prachi Priya,
Aman Chadha,
Amit P. Sheth,
Amitava Das
Abstract:
Hallucination has emerged as the most vulnerable aspect of contemporary Large Language Models (LLMs). In this paper, we introduce the Sorry, Come Again (SCA) prompting, aimed to avoid LLM hallucinations by enhancing comprehension through: (i) optimal paraphrasing and (ii) injecting [PAUSE] tokens to delay LLM generation. First, we provide an in-depth analysis of linguistic nuances: formality, readability, and concreteness of prompts for 21 LLMs, and elucidate how these nuances contribute to hallucinated generation. Prompts with lower readability, formality, or concreteness pose comprehension challenges for LLMs, similar to those faced by humans. In such scenarios, an LLM tends to speculate and generate content based on its imagination (associative memory) to fill these information gaps. Although these speculations may occasionally align with factual information, their accuracy is not assured, often resulting in hallucination. Recent studies reveal that an LLM often neglects the middle sections of extended prompts, a phenomenon termed as lost in the middle. While a specific paraphrase may suit one LLM, the same paraphrased version may elicit a different response from another LLM. Therefore, we propose an optimal paraphrasing technique to identify the most comprehensible paraphrase of a given prompt, evaluated using Integrated Gradient (and its variations) to guarantee that the LLM accurately processes all words. While reading lengthy sentences, humans often pause at various points to better comprehend the meaning read thus far. We have fine-tuned an LLM with injected [PAUSE] tokens, allowing the LLM to pause while reading lengthier prompts. This has brought several key contributions: (i) determining the optimal position to inject [PAUSE], (ii) determining the number of [PAUSE] tokens to be inserted, and (iii) introducing reverse proxy tuning to fine-tune the LLM for [PAUSE] insertion.
Submitted 27 March, 2024;
originally announced March 2024.
-
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
Authors:
Saurav Pawar,
S. M Towhidul Islam Tonmoy,
S M Mehedi Zaman,
Vinija Jain,
Aman Chadha,
Amitava Das
Abstract:
The advent of Large Language Models (LLMs) represents a notable breakthrough in Natural Language Processing (NLP), contributing to substantial progress in both text comprehension and generation. However, amidst these advancements, it is noteworthy that LLMs often face a limitation in terms of context length extrapolation. Understanding and extending the context length for LLMs is crucial in enhancing their performance across various NLP applications. In this survey paper, we delve into the multifaceted aspects of exploring why it is essential, and the potential transformations that superior techniques could bring to NLP applications. We study the inherent challenges associated with extending context length and present an organized overview of the existing strategies employed by researchers. Additionally, we discuss the intricacies of evaluating context extension techniques and highlight the open challenges that researchers face in this domain. Furthermore, we explore whether there is a consensus within the research community regarding evaluation standards and identify areas where further agreement is needed. This comprehensive survey aims to serve as a valuable resource for researchers, guiding them through the nuances of context length extension techniques and fostering discussions on future advancements in this evolving field.
Submitted 15 January, 2024;
originally announced January 2024.
-
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
Authors:
S. M Towhidul Islam Tonmoy,
S M Mehedi Zaman,
Vinija Jain,
Anku Rani,
Vipula Rawte,
Aman Chadha,
Amitava Das
Abstract:
As Large Language Models (LLMs) continue to advance in their ability to write human-like text, a key challenge remains around their tendency to hallucinate, generating content that appears factual but is ungrounded. This issue of hallucination is arguably the biggest hindrance to safely deploying these powerful LLMs into real-world production systems that impact people's lives. The journey toward widespread adoption of LLMs in practical settings heavily relies on addressing and mitigating hallucinations. Unlike traditional AI systems focused on limited tasks, LLMs have been exposed to vast amounts of online text data during training. While this allows them to display impressive language fluency, it also means they are capable of extrapolating information from the biases in training data, misinterpreting ambiguous prompts, or modifying the information to align superficially with the input. This becomes hugely alarming when we rely on language generation capabilities for sensitive applications, such as summarizing medical records, financial analysis reports, etc. This paper presents a comprehensive survey of over 32 techniques developed to mitigate hallucination in LLMs. Notable among these are Retrieval Augmented Generation (Lewis et al., 2021), Knowledge Retrieval (Varshney et al., 2023), CoNLI (Lei et al., 2023), and CoVe (Dhuliawala et al., 2023). Furthermore, we introduce a detailed taxonomy categorizing these methods based on various parameters, such as dataset utilization, common tasks, feedback mechanisms, and retriever types. This classification helps distinguish the diverse approaches specifically designed to tackle hallucination issues in LLMs. Additionally, we analyze the challenges and limitations inherent in these techniques, providing a solid foundation for future research in addressing hallucinations and related phenomena within the realm of LLMs.
Submitted 8 January, 2024; v1 submitted 2 January, 2024;
originally announced January 2024.
-
Counter Turing Test CT^2: AI-Generated Text Detection is Not as Easy as You May Think -- Introducing AI Detectability Index
Authors:
Megha Chakraborty,
S. M Towhidul Islam Tonmoy,
S M Mehedi Zaman,
Krish Sharma,
Niyar R Barman,
Chandan Gupta,
Shreya Gautam,
Tanay Kumar,
Vinija Jain,
Aman Chadha,
Amit P. Sheth,
Amitava Das
Abstract:
With the rise of prolific ChatGPT, the risk and consequences of AI-generated text have increased alarmingly. To address the inevitable question of ownership attribution for AI-generated artifacts, the US Copyright Office released a statement stating that 'If a work's traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it'. Furthermore, both the US and the EU governments have recently drafted their initial proposals regarding the regulatory framework for AI. Given this cynosural spotlight on generative AI, AI-generated text detection (AGTD) has emerged as a topic that has already received immediate attention in research, with some initial methods having been proposed, soon followed by the emergence of techniques to bypass detection. This paper introduces the Counter Turing Test (CT^2), a benchmark consisting of techniques aiming to offer a comprehensive evaluation of the robustness of existing AGTD techniques. Our empirical findings unequivocally highlight the fragility of the proposed AGTD methods under scrutiny. Amidst the extensive deliberations on policy-making for regulating AI development, it is of utmost importance to assess the detectability of content generated by LLMs. Thus, to establish a quantifiable spectrum facilitating the evaluation and ranking of LLMs according to their detectability levels, we propose the AI Detectability Index (ADI). We conduct a thorough examination of 15 contemporary LLMs, empirically demonstrating that larger LLMs tend to have a higher ADI, indicating they are less detectable compared to smaller LLMs. We firmly believe that ADI holds significant value as a tool for the wider NLP community, with the potential to serve as a rubric in AI-related policy-making.
Submitted 23 October, 2023; v1 submitted 8 October, 2023;
originally announced October 2023.
-
Exploring the Relationship between LLM Hallucinations and Prompt Linguistic Nuances: Readability, Formality, and Concreteness
Authors:
Vipula Rawte,
Prachi Priya,
S. M Towhidul Islam Tonmoy,
S M Mehedi Zaman,
Amit Sheth,
Amitava Das
Abstract:
As Large Language Models (LLMs) have advanced, they have brought forth new challenges, with one of the prominent issues being LLM hallucination. While various mitigation techniques are emerging to address hallucination, it is equally crucial to delve into its underlying causes. Consequently, in this preliminary exploratory investigation, we examine how linguistic factors in prompts, specifically readability, formality, and concreteness, influence the occurrence of hallucinations. Our experimental results suggest that prompts characterized by greater formality and concreteness tend to result in reduced hallucination. However, the outcomes pertaining to readability are somewhat inconclusive, showing a mixed pattern.
Submitted 20 September, 2023;
originally announced September 2023.
-
CompTLL-UNet: Compressed Domain Text-Line Localization in Challenging Handwritten Documents using Deep Feature Learning from JPEG Coefficients
Authors:
Bulla Rajesh,
Sk Mahafuz Zaman,
Mohammed Javed,
P. Nagabhushan
Abstract:
Automatic localization of text-lines in handwritten documents is still an open and challenging research problem. Various writing issues such as uneven spacing between the lines, oscillating and touching text, and the presence of skew become much more challenging when complex handwritten document images are considered for segmentation directly in their respective compressed representation. This is because the conventional way of processing compressed documents is through decompression. In this paper, however, we propose an idea that employs deep feature learning directly from the JPEG compressed coefficients without full decompression to accomplish text-line localization in the JPEG compressed domain. A modified U-Net architecture known as Compressed Text-Line Localization Network (CompTLL-UNet) is designed to accomplish it. The model is trained and tested with JPEG compressed versions of benchmark datasets including ICDAR2017 (cBAD) and ICDAR2019 (cBAD), reporting state-of-the-art performance with reduced storage and computational costs in the JPEG compressed domain.
Submitted 11 August, 2023;
originally announced August 2023.
-
OOG- Optuna Optimized GAN Sampling Technique for Tabular Imbalanced Malware Data
Authors:
S. M Towhidul Islam Tonmoy,
S. M Mehedi Zaman
Abstract:
Cyberspace occupies a large portion of people's lives in the age of modern technology, and while there are those who utilize it for good, there are also those who do not. Malware is an application whose construction was not motivated by a benign goal, and it can harm, steal, or even alter personal information and secure applications and software. Thus, there are numerous techniques to avoid malware, one of which is to develop samples of malware so that the system can be updated with the growing number of malware variants, allowing it to recognize when malware attempts to enter. The Generative Adversarial Network (GAN) sampling technique has been used in this study to generate new malware samples. GANs have multiple variants, and in order to determine which variant is optimal for a given dataset sample, their parameters must be modified. This study employs Optuna, an autonomous hyperparameter tuning algorithm, to determine the optimal settings for the dataset under consideration. In this study, the architecture of the Optuna Optimized GAN (OOG) method is shown, along with scores of 98.06%, 99.00%, 97.23%, and 98.04% for accuracy, precision, recall, and F1 score, respectively. After tweaking the hyperparameters of five supervised boosting algorithms, XGBoost, LightGBM, CatBoost, Extra Trees Classifier, and Gradient Boosting Classifier, the methodology of this paper additionally employs the weighted ensemble technique to acquire this result. In addition to comparing existing efforts in this domain, the study demonstrates how promising GAN is in comparison to other sampling techniques such as SMOTE.
Submitted 25 November, 2022;
originally announced December 2022.
-
Survival Prediction of Heart Failure Patients using Stacked Ensemble Machine Learning Algorithm
Authors:
S. M Mehedi Zaman,
Wasay Mahmood Qureshi,
Md. Mohsin Sarker Raihan,
Ocean Monjur,
Abdullah Bin Shams
Abstract:
Cardiovascular disease, especially heart failure, is one of the major health hazards of our time and is a leading cause of death worldwide. Advancement in data mining techniques using machine learning (ML) models is paving the way for promising prediction approaches. Data mining is the process of converting massive volumes of raw data created by healthcare institutions into meaningful information that can aid in making predictions and crucial decisions. Collecting various follow-up data from patients who have had heart failure, analyzing those data, and utilizing several ML models to predict the survival possibility of cardiovascular patients is the key aim of this study. Due to the imbalance of the classes in the dataset, the Synthetic Minority Oversampling Technique (SMOTE) has been implemented. Two unsupervised models (K-Means and Fuzzy C-Means clustering) and three supervised classifiers (Random Forest, XGBoost and Decision Tree) have been used in our study. After thorough investigation, our results demonstrate a superior performance of the supervised ML algorithms over unsupervised models. Moreover, we design and propose a supervised stacked ensemble learning model that can achieve an accuracy, precision, recall and F1 score of 99.98%. Our study shows that only certain attributes collected from the patients are imperative to successfully predict the possibility of survival after heart failure, using supervised ML algorithms.
Submitted 30 August, 2021;
originally announced August 2021.
-
A Case Study to Identify the Hindrances to Widespread Adoption of Electric Vehicles in Qatar
Authors:
Amith Khandakar,
Annaufal Rizqullah,
Anas Ashraf Abdou Berbar,
Mohammad Rafi Ahmed,
Atif Iqbal,
Muhammad E. H. Chowdhury,
S. M. Ashfaq Uz Zaman
Abstract:
The adoption of electric vehicles (EVs) has proven to be a crucial factor in decreasing the emission of greenhouse gases (GHG) into the atmosphere. However, there are various hurdles that impede people from purchasing EVs, for example, long charging time, short driving range, cost, and insufficient charging infrastructure. This article reports the public perception of EV-adoption using statistical analyses and proposes some recommendations for improving EV-adoption in Qatar. User perspectives on EV-adoption barriers in Qatar were investigated based on survey questionnaires. The survey questionnaires were based on similar studies done in other regions of the world. The study attempted to look at different perspectives on the adoption of EVs by posing the questions both to people who are aware of EVs and to people who may or may not be aware of EVs. Cumulative survey responses from the two groups were compared and analyzed using a two-sample t-test. Detailed analyses showed that, among the various major hindrances, raising public awareness of such greener modes of transportation, making charging options available in more places, and providing policy incentives towards EVs would play a major role in EV-adoption. The authors provide recommendations that, along with government incentives, could help make a gradual shift to a greater number of EVs convenient for the people of Qatar. The proposed systematic approach for such a study and analysis may help in streamlining research on policies, infrastructures and technologies for efficient penetration of EVs in Qatar.
Submitted 27 June, 2020;
originally announced June 2020.