-
UniToMBench: Integrating Perspective-Taking to Improve Theory of Mind in LLMs
Authors:
Prameshwar Thiyagarajan,
Vaishnavi Parimi,
Shamant Sai,
Soumil Garg,
Zhangir Meirbek,
Nitin Yarlagadda,
Kevin Zhu,
Chris Kim
Abstract:
Theory of Mind (ToM), the ability to understand the mental states of oneself and others, remains a challenging area for large language models (LLMs), which often fail to predict human mental states accurately. In this paper, we introduce UniToMBench, a unified benchmark that integrates the strengths of SimToM and TOMBENCH to systematically improve and assess ToM capabilities in LLMs by integrating multi-interaction task designs and evolving story scenarios. Supported by a custom dataset of over 1,000 hand-written scenarios, UniToMBench combines perspective-taking techniques with diverse evaluation metrics to better stimulate social cognition in LLMs. Through evaluation, we observe that while models like GPT-4o and GPT-4o Mini show consistently high accuracy in tasks involving emotional and belief-related scenarios, with results usually above 80%, there is significant variability in their performance across knowledge-based tasks. These results highlight both the strengths and limitations of current LLMs in ToM-related tasks, underscoring the value of UniToMBench as a comprehensive tool for future development. Our code is publicly available here: https://github.com/Shamant/unifiedtombenchmark.
Submitted 11 June, 2025;
originally announced June 2025.
-
QGAPHEnsemble : Combining Hybrid QLSTM Network Ensemble via Adaptive Weighting for Short Term Weather Forecasting
Authors:
Anuvab Sen,
Udayon Sen,
Mayukhi Paul,
Apurba Prasad Padhy,
Sujith Sai,
Aakash Mallik,
Chhandak Mallick
Abstract:
Accurate weather forecasting holds significant importance, serving as a crucial tool for decision-making in various industrial sectors. The limitations of statistical models, which assume independence among data points, highlight the need for advanced methodologies. The correlation between meteorological variables necessitates models capable of capturing complex dependencies. This research highlights the practical efficacy of advanced machine learning techniques, proposing the GenHybQLSTM and BO-QEnsemble architectures based on an adaptive weight adjustment strategy. Through comprehensive hyper-parameter optimization using a hybrid quantum genetic particle swarm optimisation algorithm and Bayesian Optimization, our model demonstrates a substantial improvement in the accuracy and reliability of meteorological predictions, as assessed by performance metrics such as MSE (Mean Squared Error) and MAPE (Mean Absolute Percentage Prediction Error). The paper highlights the importance of optimized ensemble techniques for improving performance on the given weather forecasting task.
Submitted 18 January, 2025;
originally announced January 2025.
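As a hedged illustration of the two metrics named in the abstract above, the sketch below computes MSE and MAPE for a pair of short series. The function names and the sample values are assumptions made for illustration only; they are not taken from the paper.

```python
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    """Reject empty or mismatched inputs before computing a metric."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error: the average of the squared differences."""
    _check(actual, predicted)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, expressed in percent."""
    _check(actual, predicted)
    if any(a == 0 for a in actual):
        # MAPE divides by the observed value, so zeros are undefined.
        raise ValueError("MAPE is undefined when an observed value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed and forecast temperatures (illustrative values only).
observed = [20.0, 25.0]
forecast = [18.0, 30.0]
print("MSE:", mse(observed, forecast))
print("MAPE (%):", mape(observed, forecast))
```

MAPE is scale-free, which makes it convenient for comparing variables with different units, but it cannot be used when an observed value is zero; MSE has no such restriction and penalizes large errors more heavily.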
-
Resource-Efficient Transformer Architecture: Optimizing Memory and Execution Time for Real-Time Applications
Authors:
Krisvarish V,
Priyadarshini T,
K P Abhishek Sri Saai,
Vaidehi Vijayakumar
Abstract:
This paper describes a memory-efficient transformer model designed to substantially reduce memory usage and execution time while keeping performance close to that of the original model. Recently, new transformer architectures have been presented that focus on parameter efficiency and computational optimization; however, such models usually require considerable hardware resources when deployed in real-world applications on edge devices. This approach addresses this concern by halving the embedding size and applying targeted techniques such as parameter pruning and quantization to optimize the memory footprint with minimal sacrifice in accuracy. Experimental results include a 52% reduction in memory usage and a 33% decrease in execution time, resulting in better efficiency than state-of-the-art models. This work compares our model with existing architectures, such as MobileBERT and DistilBERT, and demonstrates its feasibility in the domain of resource-friendly deep learning architectures, mainly for real-time and resource-constrained applications.
Submitted 25 December, 2024;
originally announced January 2025.
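The abstract above names quantization as one of its memory-saving techniques but does not say which scheme is used. The following is a minimal sketch of one common variant, symmetric 8-bit quantization of a weight list, offered only as an assumption-laden illustration; the function names and sample weights are hypothetical and this is not the paper's implementation.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using a single shared scale."""
    if not weights:
        raise ValueError("weights must be non-empty")
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        # All-zero input: any scale works, so use 1.0.
        return [0] * len(weights), 1.0
    scale = max_abs / 127.0
    # Round to the nearest integer step and clamp to the int8 range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical weights (illustrative values only).
weights = [0.5, -1.0, 0.25]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
print(codes, scale)
print(restored)
```

Each weight is stored as one byte instead of four (for 32-bit floats), at the cost of a rounding error of at most half a quantization step per weight.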
-
Passage-specific Prompt Tuning for Passage Reranking in Question Answering with Large Language Models
Authors:
Xuyang Wu,
Zhiyuan Peng,
Krishna Sravanthi Rajanala Sai,
Hsin-Tai Wu,
Yi Fang
Abstract:
Effective passage retrieval and reranking methods have been widely utilized to identify suitable candidates in open-domain question answering tasks. Recent studies have resorted to LLMs for reranking the retrieved passages by the log-likelihood of the question conditioned on each passage. Although these methods have demonstrated promising results, the performance is notably sensitive to the human-written prompt (or hard prompt), and fine-tuning LLMs can be computationally intensive and time-consuming. Furthermore, this approach limits the leverage of question-passage relevance pairs and passage-specific knowledge to enhance the ranking capabilities of LLMs. In this paper, we propose passage-specific prompt tuning for reranking in open-domain question answering (PSPT): a parameter-efficient method that fine-tunes learnable passage-specific soft prompts, incorporating passage-specific knowledge from a limited set of question-passage relevance pairs. The method involves ranking retrieved passages based on the log-likelihood of the model generating the question conditioned on each passage and the learned soft prompt. We conducted extensive experiments utilizing the Llama-2-chat-7B model across three publicly available open-domain question answering datasets, and the results demonstrate the effectiveness of the proposed approach.
Submitted 20 June, 2024; v1 submitted 31 May, 2024;
originally announced May 2024.
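The ranking rule described in the abstract above (order passages by the log-likelihood of the question given each passage) can be sketched as follows. This is an illustration under stated assumptions: the paper scores with Llama-2-chat-7B and a learned soft prompt, whereas this sketch swaps in a toy add-one-smoothed unigram model so that it runs without any model. The names `toy_log_likelihood` and `rerank` and the sample passages are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def toy_log_likelihood(question: str, passage: str) -> float:
    """Stand-in scorer (NOT the paper's LLM): log P(question | passage)
    under an add-one-smoothed unigram model built from the passage."""
    passage_tokens = passage.lower().split()
    question_tokens = question.lower().split()
    counts = Counter(passage_tokens)
    vocab = set(passage_tokens) | set(question_tokens)
    total = len(passage_tokens) + len(vocab)
    return sum(math.log((counts[t] + 1) / total) for t in question_tokens)


def rerank(
    question: str,
    passages: List[str],
    scorer: Callable[[str, str], float] = toy_log_likelihood,
) -> List[Tuple[str, float]]:
    """Return (passage, score) pairs sorted from most to least likely."""
    scored = [(p, scorer(question, p)) for p in passages]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical retrieved passages (illustrative only).
passages = [
    "bananas are yellow fruit",
    "paris is the capital of france",
]
for passage, score in rerank("capital of france", passages):
    print(f"{score:.3f}  {passage}")
```

Because the scorer is passed in as a callable, the toy model could be replaced by any function that returns a question log-likelihood for a given passage without changing the ranking logic.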
-
Design Of Rubble Analyzer Probe Using ML For Earthquake
Authors:
Abhishek Sebastian,
R Pragna,
K Vishal Vythianathan,
Dasaraju Sohan Sai,
U Shiva Sri Hari Al,
R Anirudh,
Apurv Choudhary
Abstract:
The earthquake rubble analyzer uses machine learning to detect human presence via ambient sounds, achieving 97.45% accuracy. It also provides real-time environmental data, aiding in assessing survival prospects for trapped individuals, which is crucial for post-earthquake rescue efforts.
Submitted 24 October, 2023;
originally announced November 2023.
-
A transformer-based deep learning approach for classifying brain metastases into primary organ sites using clinical whole brain MRI
Authors:
Qing Lyu,
Sanjeev V. Namjoshi,
Emory McTyre,
Umit Topaloglu,
Richard Barcus,
Michael D. Chan,
Christina K. Cramer,
Waldemar Debinski,
Metin N. Gurcan,
Glenn J. Lesser,
Hui-Kuan Lin,
Reginald F. Munden,
Boris C. Pasche,
Kiran Kumar Solingapuram Sai,
Roy E. Strowd,
Stephen B. Tatter,
Kounosuke Watabe,
Wei Zhang,
Ge Wang,
Christopher T. Whitlow
Abstract:
Treatment decisions for brain metastatic disease rely on knowledge of the primary organ site and are currently made with biopsy and histology. Here we develop a novel deep learning approach for accurate non-invasive digital histology with whole-brain MRI data. Our IRB-approved single-site retrospective study comprised patients (n=1,399) referred for MRI treatment-planning and gamma knife radiosurgery over 21 years. Contrast-enhanced T1-weighted and T2-weighted Fluid-Attenuated Inversion Recovery brain MRI exams (n=1,582) were preprocessed and input to the proposed deep learning workflow for tumor segmentation, modality transfer, and primary site classification into one of five classes. Ten-fold cross-validation generated an overall AUC of 0.878 (95% CI: 0.873, 0.883), lung class AUC of 0.889 (95% CI: 0.883, 0.895), breast class AUC of 0.873 (95% CI: 0.860, 0.886), melanoma class AUC of 0.852 (95% CI: 0.842, 0.862), renal class AUC of 0.830 (95% CI: 0.809, 0.851), and other class AUC of 0.822 (95% CI: 0.805, 0.839). These data establish that whole-brain imaging features are discriminative enough to allow accurate diagnosis of the primary organ site of malignancy. Our end-to-end deep radiomic approach has great potential for classifying metastatic tumor types from whole-brain MRI images. Further refinement may offer an invaluable clinical tool to expedite primary cancer site identification for precision treatment and improved outcomes.
Submitted 20 April, 2022; v1 submitted 7 October, 2021;
originally announced October 2021.
-
High Accurate Unhealthy Leaf Detection
Authors:
S. Mohan Sai,
G. Gopichand,
C. Vikas Reddy,
K. Mona Teja
Abstract:
India is an agriculture-dependent country. Because farming is the backbone of the country, it is our responsibility to preserve crops. Although we cannot stop the destruction of crops by natural calamities, we can at least try to protect them from diseases. Detecting a plant disease requires a fast, automatic method. This paper therefore presents a model that identifies the particular disease of plant leaves at an early stage, so that a remedy can be applied to prevent or stop the spread of the disease. The proposed model is divided into five stages. Image preprocessing includes enhancement of low-light images using inception modules in a CNN. Low-resolution image enhancement is done using an adversarial neural network. Preprocessing also includes conversion of the RGB image to the YCrCb color space. Next, the paper presents a methodology for image segmentation, which is an important step in identifying disease symptoms. Segmentation is done using a genetic algorithm. Segmenting the leaf image helps in detecting and classifying the leaf automatically. Texture extraction is done using the statistical model called GLCM, and finally the diseases are classified with high accuracy using an SVM with different kernels.
Submitted 14 August, 2019;
originally announced August 2019.
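The abstract above uses GLCM (gray-level co-occurrence matrix) for texture extraction. The sketch below shows how such a matrix and one standard texture feature, contrast, can be computed for a tiny image. It is a minimal illustration, not the paper's pipeline: the function names, the horizontal one-pixel offset, and the sample image are assumptions.

```python
from typing import List


def glcm(image: List[List[int]], levels: int, dx: int = 1, dy: int = 0) -> List[List[int]]:
    """Count how often gray level i occurs next to gray level j
    at the pixel offset (dy, dx). Pixel values must be in [0, levels)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose neighbor falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix: List[List[int]]) -> float:
    """Contrast feature: sum of (i - j)^2 weighted by normalized counts."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    return sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    ) / total


# Hypothetical 2-level image (illustrative values only).
image = [
    [0, 0, 1],
    [1, 1, 0],
]
matrix = glcm(image, levels=2)
print(matrix)
print(contrast(matrix))
```

Features such as contrast, computed from one or more offsets, form a texture vector that a classifier like an SVM can consume.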