-
Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch
Authors:
Aneeshan Sain,
Subhajit Maity,
Pinaki Nath Chowdhury,
Subhadeep Koley,
Ayan Kumar Bhunia,
Yi-Zhe Song
Abstract:
As sketch research has collectively matured over time, its adaptation for mass commercialisation emerges on the immediate horizon. Despite an already mature research endeavour for photos, there is no research on efficient inference specifically designed for sketch data. In this paper, we first demonstrate that existing state-of-the-art efficient light-weight models designed for photos do not work on sketches. We then propose two sketch-specific components which work in a plug-n-play manner on any efficient photo network to adapt it to sketch data. We specifically chose fine-grained sketch-based image retrieval (FG-SBIR) as a demonstrator, as it is the most recognised sketch problem with immediate commercial value. Technically speaking, we first propose a cross-modal knowledge distillation network to transfer existing efficient photo networks to be compatible with sketch, which brings down the number of FLOPs and model parameters by 97.96% and 84.89% respectively. We then exploit the abstract trait of sketch to introduce an RL-based canvas selector that dynamically adjusts to the abstraction level, which further cuts down the number of FLOPs by two thirds. The end result is an overall reduction of 99.37% of FLOPs (from 40.18G to 0.254G) when compared with a full network, while retaining the accuracy (33.03% vs 32.77%), finally making an efficient network for sparse sketch data that exhibits even fewer FLOPs than the best photo counterpart.
Submitted 29 May, 2025;
originally announced May 2025.
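The FLOPs figures quoted in the abstract above can be sanity-checked with a few lines of arithmetic. This is only a worked check of the stated numbers, not code from the paper; the variable names are illustrative, and it assumes the 97.96% distillation reduction is measured against the same 40.18G full network.

```python
# Figures quoted in the abstract (GFLOPs); variable names are illustrative.
full_gflops = 40.18
efficient_gflops = 0.254

# Overall reduction relative to the full network, in percent.
overall_reduction_pct = (1 - efficient_gflops / full_gflops) * 100
print(f"overall reduction: {overall_reduction_pct:.2f}%")

# Assumption: the 97.96% distillation reduction applies to the 40.18G baseline.
after_distillation = full_gflops * (1 - 0.9796)

# Share of the remaining FLOPs that the canvas selector would remove.
selector_cut = 1 - efficient_gflops / after_distillation
print(f"after distillation: {after_distillation:.3f} GFLOPs")
print(f"further cut by selector: {selector_cut:.2f}")
```

Under that assumption, the overall reduction rounds to the quoted 99.37%, and the selector's share of the remaining FLOPs comes out close to the "two thirds" stated in the abstract.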
-
STRIVE: A Think & Improve Approach with Iterative Refinement for Enhancing Question Quality Estimation
Authors:
Aniket Deroy,
Subhankar Maity
Abstract:
Automatically assessing question quality is crucial for educators as it saves time, ensures consistency, and provides immediate feedback for refining teaching materials. We propose a novel methodology called STRIVE (Structured Thinking and Refinement with multiLLMs for Improving Verified Question Estimation) using a series of Large Language Models (LLMs) for automatic question evaluation. This approach aims to improve the accuracy and depth of question quality assessment, ultimately supporting diverse learners and enhancing educational practices. The method estimates question quality in an automated manner by generating multiple evaluations based on the strengths and weaknesses of the provided question and then choosing the best solution generated by the LLM. Then the process is improved by iterative review and response with another LLM until the evaluation metric values converge. This sophisticated method of evaluating question quality improves the estimation of question quality by automating the task of question quality evaluation. Correlation scores show that using this proposed method helps to improve correlation with human judgments compared to the baseline method. Error analysis shows that metrics like relevance and appropriateness improve significantly relative to human judgments by using STRIVE.
Submitted 8 April, 2025;
originally announced April 2025.
-
Towards Smarter Hiring: Are Zero-Shot and Few-Shot Pre-trained LLMs Ready for HR Spoken Interview Transcript Analysis?
Authors:
Subhankar Maity,
Aniket Deroy,
Sudeshna Sarkar
Abstract:
This research paper presents a comprehensive analysis of the performance of prominent pre-trained large language models (LLMs), including GPT-4 Turbo, GPT-3.5 Turbo, text-davinci-003, text-babbage-001, text-curie-001, text-ada-001, llama-2-7b-chat, llama-2-13b-chat, and llama-2-70b-chat, in comparison to expert human evaluators in providing scores, identifying errors, and offering feedback and improvement suggestions to candidates during mock HR (Human Resources) interviews. We introduce a dataset called HURIT (Human Resource Interview Transcripts), which comprises 3,890 HR interview transcripts sourced from real-world HR interview scenarios. Our findings reveal that pre-trained LLMs, particularly GPT-4 Turbo and GPT-3.5 Turbo, exhibit commendable performance and are capable of producing evaluations comparable to those of expert human evaluators. Although these LLMs demonstrate proficiency in providing scores comparable to human experts in terms of human evaluation metrics, they frequently fail to identify errors and offer specific actionable advice for candidate performance improvement in HR interviews. Our research suggests that the current state-of-the-art pre-trained LLMs are not fully conducive for automatic deployment in an HR interview assessment. Instead, our findings advocate for a human-in-the-loop approach, to incorporate manual checks for inconsistencies and provisions for improving feedback quality as a more suitable strategy.
Submitted 8 April, 2025;
originally announced April 2025.
-
Leveraging Prompt-Tuning for Bengali Grammatical Error Explanation Using Large Language Models
Authors:
Subhankar Maity,
Aniket Deroy
Abstract:
We propose a novel three-step prompt-tuning method for Bengali Grammatical Error Explanation (BGEE) using state-of-the-art large language models (LLMs) such as GPT-4, GPT-3.5 Turbo, and Llama-2-70b. Our approach involves identifying and categorizing grammatical errors in Bengali sentences, generating corrected versions of the sentences, and providing natural language explanations for each identified error. We evaluate the performance of our BGEE system using both automated evaluation metrics and human evaluation conducted by experienced Bengali language experts. Our proposed prompt-tuning approach shows that GPT-4, the best performing LLM, surpasses the baseline model in automated evaluation metrics, with a 5.26% improvement in F1 score and a 6.95% improvement in exact match. Furthermore, compared to the previous baseline, GPT-4 demonstrates a decrease of 25.51% in wrong error type and a decrease of 26.27% in wrong error explanation. However, the results still lag behind the human baseline.
Submitted 7 April, 2025;
originally announced April 2025.
-
Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers?
Authors:
Subhajit Maity,
Killian Hitsman,
Xin Li,
Aritra Dutta
Abstract:
Kolmogorov-Arnold networks (KANs) are a remarkable innovation consisting of learnable activation functions with the potential to capture more complex relationships from data. Presently, KANs are deployed by replacing multilayer perceptrons (MLPs) in deep networks, including advanced architectures such as vision Transformers (ViTs). This work asks whether a similar replacement in the attention can bring benefits. In this paper, we design the first learnable attention called Kolmogorov-Arnold Attention (KArAt) for ViTs that can operate on any basis, ranging from Fourier, Wavelets, Splines, to Rational Functions. However, learnable activations in attention cause a memory explosion. To remedy this, we propose a modular version of KArAt that uses a low-rank approximation. By adopting the Fourier basis, Fourier-KArAt and its variants, in some cases, outperform their traditional softmax counterparts, or show comparable performance on CIFAR-10, CIFAR-100, and ImageNet-1K datasets. We also deploy Fourier KArAt to ConViT and Swin-Transformer, and use it in detection and segmentation with ViT-Det. We dissect these architectures' performance by analyzing their loss landscapes, weight distributions, optimizer path, attention visualization, and transferability to other datasets. KArAt's learnable activation shows a better attention score across all ViTs, indicating better token-to-token interactions, contributing to better inference. Still, its generalizability does not scale with larger ViTs. However, many factors, including the present computing interface, affect the performance of parameter- and memory-heavy KArAts. We note that the goal of this paper is not to produce efficient attention or challenge the traditional activations; by designing KArAt, we are the first to show that attention can be learned and encourage researchers to explore KArAt in conjunction with more advanced architectures.
Submitted 28 May, 2025; v1 submitted 13 March, 2025;
originally announced March 2025.
-
MaxMin Separation Problems: FPT Algorithms for $st$-Separator and Odd Cycle Transversal
Authors:
Ajinkya Gaikwad,
Hitendra Kumar,
Soumen Maity,
Saket Saurabh,
Roohani Sharma
Abstract:
In this paper, we study the parameterized complexity of the MaxMin versions of two fundamental separation problems: Maximum Minimal $st$-Separator and Maximum Minimal Odd Cycle Transversal (OCT), both parameterized by the solution size. In the Maximum Minimal $st$-Separator problem, given a graph $G$, two distinct vertices $s$ and $t$ and a positive integer $k$, the goal is to determine whether there exists a minimal $st$-separator in $G$ of size at least $k$. Similarly, the Maximum Minimal OCT problem seeks to determine if there exists a minimal set of vertices whose deletion results in a bipartite graph, and whose size is at least $k$. We demonstrate that both problems are fixed-parameter tractable parameterized by $k$. Our FPT algorithm for Maximum Minimal $st$-Separator answers the open question by Hanaka, Bodlaender, van der Zanden and Ono (TCS 2019).
One unique insight from this work is the following. We use the meta-result of Lokshtanov, Ramanujan, Saurabh and Zehavi (ICALP 2018) that enables us to reduce our problems to highly unbreakable graphs. This is interesting, as an explicit use of the recursive understanding and randomized contractions framework of Chitnis, Cygan, Hajiaghayi, Pilipczuk and Pilipczuk (SICOMP 2016) to reduce to the highly unbreakable graphs setting (which is the result that Lokshtanov et al. try to abstract out in their meta-theorem) does not seem obvious because certain ``extension'' variants of our problems are W[1]-hard.
Submitted 11 February, 2025;
originally announced February 2025.
-
Adversarially-Robust TD Learning with Markovian Data: Finite-Time Rates and Fundamental Limits
Authors:
Sreejeet Maity,
Aritra Mitra
Abstract:
One of the most basic problems in reinforcement learning (RL) is policy evaluation: estimating the long-term return, i.e., value function, corresponding to a given fixed policy. The celebrated Temporal Difference (TD) learning algorithm addresses this problem, and recent work has investigated finite-time convergence guarantees for this algorithm and variants thereof. However, these guarantees hinge on the reward observations always being generated from a well-behaved (e.g., sub-Gaussian) true reward distribution. Motivated by harsh, real-world environments where such an idealistic assumption may no longer hold, we revisit the policy evaluation problem from the perspective of adversarial robustness. In particular, we consider a Huber-contaminated reward model where an adversary can arbitrarily corrupt each reward sample with a small probability $ε$. Under this observation model, we first show that the adversary can cause the vanilla TD algorithm to converge to any arbitrary value function. We then develop a novel algorithm called Robust-TD and prove that its finite-time guarantees match those of vanilla TD with linear function approximation up to a small $O(ε)$ term that captures the effect of corruption. We complement this result with a minimax lower bound, revealing that such an additive corruption-induced term is unavoidable. To our knowledge, these results are the first of their kind in the context of adversarial robustness of stochastic approximation schemes driven by Markov noise. The key new technical tool that enables our results is an analysis of the Median-of-Means estimator with corrupted, time-correlated data that might be of independent interest to the literature on robust statistics.
Submitted 7 February, 2025;
originally announced February 2025.
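The abstract above names the Median-of-Means estimator as its key technical tool. For readers unfamiliar with it, the following is a minimal sketch of the generic textbook estimator on a list of scalar rewards. It is not the paper's Robust-TD algorithm; the function name, block count, and sample values are illustrative assumptions.

```python
import statistics


def median_of_means(samples, num_blocks):
    """Textbook Median-of-Means: split the samples into equal-sized
    contiguous blocks, average each block, and return the median of
    the block averages. Trailing samples that do not fill a block
    are dropped."""
    if num_blocks < 1 or num_blocks > len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    block_size = len(samples) // num_blocks
    block_means = [
        statistics.fmean(samples[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# Hypothetical reward samples with one grossly corrupted value (1000.0).
rewards = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 1000.0, 8.0, 9.0]
estimate = median_of_means(rewards, num_blocks=3)
plain_mean = statistics.fmean(rewards)
print(estimate, plain_mean)
```

In this toy example the single corrupted sample only affects one block average, so the median of the block averages stays near the uncorrupted values, while the plain mean is pulled far away.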
-
CARROT: A Cost Aware Rate Optimal Router
Authors:
Seamus Somerstep,
Felipe Maia Polo,
Allysson Flavio Melo de Oliveira,
Prattyush Mangal,
Mírian Silva,
Onkar Bhardwaj,
Mikhail Yurochkin,
Subha Maity
Abstract:
With the rapid growth in the number of Large Language Models (LLMs), there has been a recent interest in LLM routing, or directing queries to the cheapest LLM that can deliver a suitable response. We conduct a minimax analysis of the routing problem, providing a lower bound and finding that a simple router that predicts both cost and accuracy for each question can be minimax optimal. Inspired by this, we introduce CARROT, a Cost AwaRe Rate Optimal rouTer that selects a model based on estimates of the models' cost and performance. Alongside CARROT, we also introduce the Smart Price-aware ROUTing (SPROUT) dataset to facilitate routing on a wide spectrum of queries with the latest state-of-the-art LLMs. Using SPROUT and prior benchmarks such as Routerbench and open-LLM-leaderboard-v2 we empirically validate CARROT's performance against several alternative routers.
Submitted 19 May, 2025; v1 submitted 5 February, 2025;
originally announced February 2025.
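The abstract above describes a router that selects a model from estimates of each model's cost and performance. One simple way such a rule could look is sketched below. This is an illustrative assumption, not the CARROT implementation: the linear trade-off, the `cost_weight` parameter, the model names, and the numbers are all hypothetical.

```python
def route(predicted_accuracy, predicted_cost, cost_weight):
    """Pick the model with the best estimated accuracy minus weighted cost.

    predicted_accuracy / predicted_cost: dicts keyed by model name, holding
    per-query estimates (in a real system these would come from learned
    predictors). cost_weight: hypothetical knob trading accuracy for cost.
    """
    scores = {
        name: predicted_accuracy[name] - cost_weight * predicted_cost[name]
        for name in predicted_accuracy
    }
    return max(scores, key=scores.get)


# Hypothetical per-query estimates for two models.
accuracy = {"small": 0.70, "large": 0.90}
cost = {"small": 0.1, "large": 1.0}

print(route(accuracy, cost, cost_weight=0.0))  # cost ignored
print(route(accuracy, cost, cost_weight=0.5))  # cost matters
```

With a zero cost weight the rule reduces to picking the most accurate model; raising the weight shifts queries toward cheaper models.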
-
Leveraging In-Context Learning and Retrieval-Augmented Generation for Automatic Question Generation in Educational Domains
Authors:
Subhankar Maity,
Aniket Deroy,
Sudeshna Sarkar
Abstract:
Question generation in education is a time-consuming and cognitively demanding task, as it requires creating questions that are both contextually relevant and pedagogically sound. Current automated question generation methods often generate questions that are out of context. In this work, we explore advanced techniques for automated question generation in educational contexts, focusing on In-Context Learning (ICL), Retrieval-Augmented Generation (RAG), and a novel Hybrid Model that merges both methods. We implement GPT-4 for ICL using few-shot examples and BART with a retrieval module for RAG. The Hybrid Model combines RAG and ICL to address these issues and improve question quality. Evaluation is conducted using automated metrics, followed by human evaluation metrics. Our results show that both the ICL approach and the Hybrid Model consistently outperform other methods, including baseline models, by generating more contextually accurate and relevant questions.
Submitted 28 January, 2025;
originally announced January 2025.
-
HateGPT: Unleashing GPT-3.5 Turbo to Combat Hate Speech on X
Authors:
Aniket Deroy,
Subhankar Maity
Abstract:
The widespread use of social media platforms like Twitter and Facebook has enabled people of all ages to share their thoughts and experiences, leading to an immense accumulation of user-generated content. However, alongside the benefits, these platforms also face the challenge of managing hate speech and offensive content, which can undermine rational discourse and threaten democratic values. As a result, there is a growing need for automated methods to detect and mitigate such content, especially given the complexity of conversations that may require contextual analysis across multiple languages, including code-mixed languages like Hinglish, German-English, and Bangla. We participated in the English task, which requires classifying English tweets into two categories: Hate and Offensive, and Non Hate-Offensive. In this work, we experiment with state-of-the-art large language models like GPT-3.5 Turbo via prompting to classify tweets into Hate and Offensive or Non Hate-Offensive. We evaluate the performance of the classification model using Macro-F1 scores across three distinct runs. The Macro-F1 score, which balances precision and recall across all classes, is used as the primary metric for model evaluation. The scores obtained are 0.756 for run 1, 0.751 for run 2, and 0.754 for run 3, indicating a high level of performance with minimal variance among the runs. The results suggest that the model consistently performs well in terms of precision and recall, with run 1 showing the highest performance. These findings highlight the robustness and reliability of the model across different runs.
Submitted 25 March, 2025; v1 submitted 14 November, 2024;
originally announced November 2024.
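The abstract above reports Macro-F1 over three runs. As a reminder of how that metric is computed, here is a minimal sketch: per-class F1 from precision and recall, averaged with equal weight per class. The label strings and the toy predictions are hypothetical; only the three run scores are taken from the abstract.

```python
import statistics


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical labels: "HOF" = Hate and Offensive, "NOT" = Non Hate-Offensive.
y_true = ["HOF", "HOF", "NOT", "NOT"]
y_pred = ["HOF", "NOT", "NOT", "NOT"]
toy_score = macro_f1(y_true, y_pred)
print(round(toy_score, 4))

# Mean and spread of the three run scores quoted in the abstract.
run_scores = [0.756, 0.751, 0.754]
print(round(statistics.fmean(run_scores), 4), max(run_scores) - min(run_scores))
```

Because every class contributes equally to the average, Macro-F1 does not let a large majority class mask poor performance on a smaller one.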
-
Microfoundation Inference for Strategic Prediction
Authors:
Daniele Bracale,
Subha Maity,
Felipe Maia Polo,
Seamus Somerstep,
Moulinath Banerjee,
Yuekai Sun
Abstract:
Often in prediction tasks, the predictive model itself can influence the distribution of the target variable, a phenomenon termed performative prediction. Generally, this influence stems from strategic actions taken by stakeholders with a vested interest in predictive models. A key challenge that hinders the widespread adaptation of performative prediction in machine learning is that practitioners are generally unaware of the social impacts of their predictions. To address this gap, we propose a methodology for learning the distribution map that encapsulates the long-term impacts of predictive models on the population. Specifically, we model agents' responses as a cost-adjusted utility maximization problem and propose estimates for said cost. Our approach leverages optimal transport to align pre-model exposure (ex ante) and post-model exposure (ex post) distributions. We provide a rate of convergence for this proposed estimate and assess its quality through empirical demonstrations on a credit-scoring dataset.
Submitted 10 April, 2025; v1 submitted 13 November, 2024;
originally announced November 2024.
-
CryptoLLM: Unleashing the Power of Prompted LLMs for SmartQnA and Classification of Crypto Posts
Authors:
Aniket Deroy,
Subhankar Maity
Abstract:
The rapid growth of social media has resulted in a large volume of user-generated content, particularly in niche domains such as cryptocurrency. This task focuses on developing robust classification models to accurately categorize cryptocurrency-related social media posts into predefined classes, including but not limited to objective, positive, negative, etc. Additionally, the task requires participants to identify the most relevant answers from a set of posts in response to specific questions. By leveraging advanced LLMs, this research aims to enhance the understanding and filtering of cryptocurrency discourse, thereby facilitating more informed decision-making in this volatile sector. We have used a prompt-based technique to solve the classification task for Reddit and Twitter posts. We have also used a 64-shot technique along with prompts on the GPT-4-Turbo model to determine whether an answer is relevant to a question or not.
Submitted 18 March, 2025; v1 submitted 12 November, 2024;
originally announced November 2024.
-
Cancer-Answer: Empowering Cancer Care with Advanced Large Language Models
Authors:
Aniket Deroy,
Subhankar Maity
Abstract:
Gastrointestinal (GI) tract cancers account for a substantial portion of the global cancer burden, where early diagnosis is critical for improved management and patient outcomes. The complex aetiologies and overlapping symptoms across GI cancers often delay diagnosis, leading to suboptimal treatment strategies. Cancer-related queries are crucial for timely diagnosis, treatment, and patient education, as access to accurate, comprehensive information can significantly influence outcomes. However, the complexity of cancer as a disease, combined with the vast amount of available data, makes it difficult for clinicians and patients to quickly find precise answers. To address these challenges, we leverage large language models (LLMs) such as GPT-3.5 Turbo to generate accurate, contextually relevant responses to cancer-related queries. Pre-trained with medical data, these models provide timely, actionable insights that support informed decision-making in cancer diagnosis and care, ultimately improving patient outcomes. We calculate two metrics: A1 (which represents the fraction of entities present in the model-generated answer compared to the gold standard) and A2 (which represents the linguistic correctness and meaningfulness of the model-generated answer with respect to the gold standard), achieving maximum values of 0.546 and 0.881, respectively.
Submitted 18 March, 2025; v1 submitted 11 November, 2024;
originally announced November 2024.
-
YouTube Comments Decoded: Leveraging LLMs for Low Resource Language Classification
Authors:
Aniket Deroy,
Subhankar Maity
Abstract:
Sarcasm detection is a significant challenge in sentiment analysis, particularly due to its nature of conveying opinions where the intended meaning deviates from the literal expression. This challenge is heightened in social media contexts where code-mixing, especially in Dravidian languages, is prevalent. Code-mixing involves the blending of multiple languages within a single utterance, often with non-native scripts, complicating the task for systems trained on monolingual data. This shared task introduces a novel gold standard corpus designed for sarcasm and sentiment detection within code-mixed texts, specifically in Tamil-English and Malayalam-English. The primary objective of this task is to identify sarcasm and sentiment polarity within a code-mixed dataset of Tamil-English and Malayalam-English comments and posts collected from social media platforms. Each comment or post is annotated at the message level for sentiment polarity, with particular attention to the challenges posed by class imbalance, reflecting real-world scenarios. In this work, we experiment with state-of-the-art large language models like GPT-3.5 Turbo via prompting to classify comments into sarcastic or non-sarcastic categories. We obtained a macro-F1 score of 0.61 for Tamil and 0.50 for Malayalam.
Submitted 13 March, 2025; v1 submitted 6 November, 2024;
originally announced November 2024.
-
RetrieveGPT: Merging Prompts and Mathematical Models for Enhanced Code-Mixed Information Retrieval
Authors:
Aniket Deroy,
Subhankar Maity
Abstract:
Code-mixing, the integration of lexical and grammatical elements from multiple languages within a single sentence, is a widespread linguistic phenomenon, particularly prevalent in multilingual societies. In India, social media users frequently engage in code-mixed conversations using the Roman script, especially among migrant communities who form online groups to share relevant local information. This paper focuses on the challenges of extracting relevant information from code-mixed conversations, specifically within Roman transliterated Bengali mixed with English. This study presents a novel approach to address these challenges by developing a mechanism to automatically identify the most relevant answers from code-mixed conversations. We have experimented with a dataset comprising queries and documents from Facebook, and Query Relevance files (QRels) to aid in this task. Our results demonstrate the effectiveness of our approach in extracting pertinent information from complex, code-mixed digital conversations, contributing to the broader field of natural language processing in multilingual and informal text environments. We use GPT-3.5 Turbo via prompting, along with the sequential nature of relevant documents, to frame a mathematical model which helps to detect relevant documents corresponding to a query.
Submitted 26 March, 2025; v1 submitted 7 November, 2024;
originally announced November 2024.
-
Prompt Engineering Using GPT for Word-Level Code-Mixed Language Identification in Low-Resource Dravidian Languages
Authors:
Aniket Deroy,
Subhankar Maity
Abstract:
Language Identification (LI) is crucial for various natural language processing tasks, serving as a foundational step in applications such as sentiment analysis, machine translation, and information retrieval. In multilingual societies like India, particularly among the youth engaging on social media, text often exhibits code-mixing, blending local languages with English at different linguistic levels. This phenomenon presents formidable challenges for LI systems, especially when languages intermingle within single words. Dravidian languages, prevalent in southern India, possess rich morphological structures yet suffer from under-representation in digital platforms, leading to the adoption of Roman or hybrid scripts for communication. This paper introduces a prompt-based method for a shared task aimed at addressing word-level LI challenges in Dravidian languages. In this work, we leveraged GPT-3.5 Turbo to understand whether the large language model is able to correctly classify words into the correct categories. Our findings show that the Kannada model consistently outperformed the Tamil model across most metrics, indicating a higher accuracy and reliability in identifying and categorizing Kannada language instances. In contrast, the Tamil model showed moderate performance, particularly needing improvement in precision and recall.
Submitted 12 March, 2025; v1 submitted 6 November, 2024;
originally announced November 2024.
-
Human-Centric eXplainable AI in Education
Authors:
Subhankar Maity,
Aniket Deroy
Abstract:
As artificial intelligence (AI) becomes more integrated into educational environments, how can we ensure that these systems are both understandable and trustworthy? The growing demand for explainability in AI systems is a critical area of focus. This paper explores Human-Centric eXplainable AI (HCXAI) in the educational landscape, emphasizing its role in enhancing learning outcomes, fostering trust among users, and ensuring transparency in AI-driven tools, particularly through the innovative use of large language models (LLMs). What challenges arise in the implementation of explainable AI in educational contexts? This paper analyzes these challenges, addressing the complexities of AI models and the diverse needs of users. It outlines comprehensive frameworks for developing HCXAI systems that prioritize user understanding and engagement, ensuring that educators and students can effectively interact with these technologies. Furthermore, what steps can educators, developers, and policymakers take to create more effective, inclusive, and ethically responsible AI solutions in education? The paper provides targeted recommendations to address this question, highlighting the necessity of prioritizing explainability. By doing so, how can we leverage AI's transformative potential to foster equitable and engaging educational experiences that support diverse learners?
Submitted 18 October, 2024;
originally announced October 2024.
-
MIRROR: A Novel Approach for the Automated Evaluation of Open-Ended Question Generation
Authors:
Aniket Deroy,
Subhankar Maity,
Sudeshna Sarkar
Abstract:
Automatic question generation is a critical task that involves evaluating question quality by considering factors such as engagement, pedagogical value, and the ability to stimulate critical thinking. These aspects require human-like understanding and judgment, which automated systems currently lack. However, human evaluations are costly and impractical for large-scale samples of generated questions. Therefore, we propose a novel system, MIRROR (Multi-LLM Iterative Review and Response for Optimized Rating), which leverages large language models (LLMs) to automate the evaluation process for questions generated by automated question generation systems. We experimented with several state-of-the-art LLMs, such as GPT-4, Gemini, and Llama2-70b. We observed that the scores of human evaluation metrics, namely relevance, appropriateness, novelty, complexity, and grammaticality, improved when using the feedback-based approach called MIRROR, tending to be closer to the human baseline scores. Furthermore, we observed that Pearson's correlation coefficient between GPT-4 and human experts improved when using our proposed feedback-based approach, MIRROR, compared to direct prompting for evaluation. Error analysis shows that our proposed approach, MIRROR, significantly helps to improve relevance and appropriateness.
Submitted 25 March, 2025; v1 submitted 16 October, 2024;
originally announced October 2024.
-
Generative AI and Its Impact on Personalized Intelligent Tutoring Systems
Authors:
Subhankar Maity,
Aniket Deroy
Abstract:
Generative Artificial Intelligence (AI) is revolutionizing educational technology by enabling highly personalized and adaptive learning environments within Intelligent Tutoring Systems (ITS). This report delves into the integration of Generative AI, particularly large language models (LLMs) like GPT-4, into ITS to enhance personalized education through dynamic content generation, real-time feedback, and adaptive learning pathways. We explore key applications such as automated question generation, customized feedback mechanisms, and interactive dialogue systems that respond to individual learner needs. The report also addresses significant challenges, including ensuring pedagogical accuracy, mitigating inherent biases in AI models, and maintaining learner engagement. Future directions highlight the potential advancements in multimodal AI integration, emotional intelligence in tutoring systems, and the ethical implications of AI-driven education. By synthesizing current research and practical implementations, this report underscores the transformative potential of Generative AI in creating more effective, equitable, and engaging educational experiences.
Submitted 14 October, 2024;
originally announced October 2024.
-
Rethinking Legal Judgement Prediction in a Realistic Scenario in the Era of Large Language Models
Authors:
Shubham Kumar Nigam,
Aniket Deroy,
Subhankar Maity,
Arnab Bhattacharya
Abstract:
This study investigates judgment prediction in a realistic scenario within the context of Indian judgments, utilizing a range of transformer-based models, including InLegalBERT, BERT, and XLNet, alongside LLMs such as Llama-2 and GPT-3.5 Turbo. In this realistic scenario, we simulate how judgments are predicted at the point when a case is presented for a decision in court, using only the information available at that time, such as the facts of the case, statutes, precedents, and arguments. This approach mimics real-world conditions, where decisions must be made without the benefit of hindsight, unlike retrospective analyses often found in previous studies. For transformer models, we experiment with hierarchical transformers and the summarization of judgment facts to optimize input for these models. Our experiments with LLMs reveal that GPT-3.5 Turbo excels in realistic scenarios, demonstrating robust performance in judgment prediction. Furthermore, incorporating additional legal information, such as statutes and precedents, significantly improves the outcome of the prediction task. The LLMs also provide explanations for their predictions. To evaluate the quality of these predictions and explanations, we introduce two human evaluation metrics: Clarity and Linking. Our findings from both automatic and human evaluations indicate that, despite advancements in LLMs, they are yet to achieve expert-level performance in judgment prediction and explanation tasks.
Submitted 14 October, 2024;
originally announced October 2024.
-
The Future of Learning in the Age of Generative AI: Automated Question Generation and Assessment with Large Language Models
Authors:
Subhankar Maity,
Aniket Deroy
Abstract:
In recent years, large language models (LLMs) and generative AI have revolutionized natural language processing (NLP), offering unprecedented capabilities in education. This chapter explores the transformative potential of LLMs in automated question generation and answer assessment. It begins by examining the mechanisms behind LLMs, emphasizing their ability to comprehend and generate human-like text. The chapter then discusses methodologies for creating diverse, contextually relevant questions, enhancing learning through tailored, adaptive strategies. Key prompting techniques, such as zero-shot and chain-of-thought prompting, are evaluated for their effectiveness in generating high-quality questions, including open-ended and multiple-choice formats in various languages. Advanced NLP methods like fine-tuning and prompt-tuning are explored for their role in generating task-specific questions, despite associated costs. The chapter also covers the human evaluation of generated questions, highlighting quality variations across different methods and areas for improvement. Furthermore, it delves into automated answer assessment, demonstrating how LLMs can accurately evaluate responses, provide constructive feedback, and identify nuanced understanding or misconceptions. Examples illustrate both successful assessments and areas needing improvement. The discussion underscores the potential of LLMs to replace costly, time-consuming human assessments when appropriately guided, showcasing their advanced understanding and reasoning capabilities in streamlining educational processes.
Submitted 12 October, 2024;
originally announced October 2024.
-
Code Generation and Algorithmic Problem Solving Using Llama 3.1 405B
Authors:
Aniket Deroy,
Subhankar Maity
Abstract:
Code generation by Llama 3.1 models, such as Meta's Llama 3.1 405B, represents a significant advancement in the field of artificial intelligence, particularly in natural language processing and programming automation. This paper explores the capabilities and applications of Llama-driven code generation, highlighting its ability to translate natural language prompts into executable code across multiple programming languages. Key features include contextual awareness, multi-language support, and enhanced debugging and optimization functionalities. By examining these aspects, we illustrate how Llama can serve as a versatile tool for developers of all skill levels, improving productivity and efficiency in software development. The potential implications for education, industry, and the future of coding practices are also discussed, underscoring the transformative impact of AI in programming. Experimentation shows that while Llama 3.1 405B performs well with simple algorithmic and data structure based problems, it still struggles with problems on Quantum Computing, Bioinformatics, and Artificial Intelligence.
Submitted 2 April, 2025; v1 submitted 26 September, 2024;
originally announced September 2024.
-
Robust Q-Learning under Corrupted Rewards
Authors:
Sreejeet Maity,
Aritra Mitra
Abstract:
Recently, there has been a surge of interest in analyzing the non-asymptotic behavior of model-free reinforcement learning algorithms. However, the performance of such algorithms in non-ideal environments, such as in the presence of corrupted rewards, is poorly understood. Motivated by this gap, we investigate the robustness of the celebrated Q-learning algorithm to a strong-contamination attack model, where an adversary can arbitrarily perturb a small fraction of the observed rewards. We start by proving that such an attack can cause the vanilla Q-learning algorithm to incur arbitrarily large errors. We then develop a novel robust synchronous Q-learning algorithm that uses historical reward data to construct robust empirical Bellman operators at each time step. Finally, we prove a finite-time convergence rate for our algorithm that matches known state-of-the-art bounds (in the absence of attacks) up to a small inevitable $O(\varepsilon)$ error term that scales with the adversarial corruption fraction $\varepsilon$. Notably, our results continue to hold even when the true reward distributions have infinite support, provided they admit bounded second moments.
Submitted 5 September, 2024;
originally announced September 2024.
-
Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads
Authors:
Ali Khaleghi Rahimian,
Manish Kumar Govind,
Subhajit Maity,
Dominick Reilly,
Christian Kümmerle,
Srijan Das,
Aritra Dutta
Abstract:
Transformer architectures such as Vision Transformers (ViT) have proven effective for solving visual perception tasks. However, they suffer from two major limitations; first, the quadratic complexity of self-attention limits the number of tokens that can be processed, and second, Transformers often require large amounts of training data to attain state-of-the-art performance. In this paper, we propose a new multi-head self-attention (MHSA) variant named Fibottention, which can replace MHSA in Transformer architectures. Fibottention is data-efficient and computationally more suitable for processing large numbers of tokens than the standard MHSA. It employs structured sparse attention based on dilated Fibonacci sequences, which, uniquely, differ across attention heads, resulting in inception-like diverse features across heads. The spacing of the Fibonacci sequences follows the Wythoff array, which minimizes the redundancy of token interactions aggregated across different attention heads, while still capturing sufficient complementary information through token pair interactions. These sparse attention patterns are unique among the existing sparse attention and lead to an $O(N \log N)$ complexity, where $N$ is the number of tokens. Leveraging only 2-6% of the elements in the self-attention heads, Fibottention embedded into popular, state-of-the-art Transformer architectures can achieve significantly improved predictive performance for domains with limited data such as image classification, video understanding, and robot learning tasks, and render reduced computational complexity. We further validated the improved diversity of feature representations resulting from different self-attention heads, and our model design against other sparse attention mechanisms.
Submitted 19 December, 2024; v1 submitted 27 June, 2024;
originally announced June 2024.
-
How Effective is GPT-4 Turbo in Generating School-Level Questions from Textbooks Based on Bloom's Revised Taxonomy?
Authors:
Subhankar Maity,
Aniket Deroy,
Sudeshna Sarkar
Abstract:
We evaluate the effectiveness of GPT-4 Turbo in generating educational questions from NCERT textbooks in zero-shot mode. Our study highlights GPT-4 Turbo's ability to generate questions that require higher-order thinking skills, especially at the "understanding" level according to Bloom's Revised Taxonomy. While we find a notable consistency between questions generated by GPT-4 Turbo and those assessed by humans in terms of complexity, there are occasional differences. Our evaluation also uncovers variations in how humans and machines evaluate question quality, with a trend inversely related to Bloom's Revised Taxonomy levels. These findings suggest that while GPT-4 Turbo is a promising tool for educational question generation, its efficacy varies across different cognitive levels, indicating a need for further refinement to fully meet educational standards.
Submitted 21 June, 2024;
originally announced June 2024.
-
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications
Authors:
Jordy Van Landeghem,
Subhajit Maity,
Ayan Banerjee,
Matthew Blaschko,
Marie-Francine Moens,
Josep Lladós,
Sanket Biswas
Abstract:
This work explores knowledge distillation (KD) for visually-rich document (VRD) applications such as document layout analysis (DLA) and document image classification (DIC). While VRD research is dependent on increasingly sophisticated and cumbersome models, the field has neglected to study efficiency via model compression. Here, we design a KD experimentation methodology for more lean, performant models on document understanding (DU) tasks that are integral within larger task pipelines. We carefully selected KD strategies (response-based, feature-based) for distilling knowledge to and from backbones with different architectures (ResNet, ViT, DiT) and capacities (base, small, tiny). We study what affects the teacher-student knowledge gap and find that some methods (tuned vanilla KD, MSE, SimKD with an apt projector) can consistently outperform supervised student training. Furthermore, we design downstream task setups to evaluate covariate shift and the robustness of distilled DLA models on zero-shot layout-aware document visual question answering (DocVQA). DLA-KD experiments result in a large mAP knowledge gap, which unpredictably translates to downstream robustness, accentuating the need to further explore how to efficiently obtain more semantic document layout awareness.
Submitted 12 March, 2025; v1 submitted 12 June, 2024;
originally announced June 2024.
-
How Ready Are Generative Pre-trained Large Language Models for Explaining Bengali Grammatical Errors?
Authors:
Subhankar Maity,
Aniket Deroy,
Sudeshna Sarkar
Abstract:
Grammatical error correction (GEC) tools, powered by advanced generative artificial intelligence (AI), competently correct linguistic inaccuracies in user input. However, they often fall short in providing essential natural language explanations, which are crucial for learning languages and gaining a deeper understanding of the grammatical rules. There is limited exploration of these tools in low-resource languages such as Bengali. In such languages, grammatical error explanation (GEE) systems should not only correct sentences but also provide explanations for errors. This comprehensive approach can help language learners in their quest for proficiency. Our work introduces a real-world, multi-domain dataset sourced from Bengali speakers of varying proficiency levels and linguistic complexities. This dataset serves as an evaluation benchmark for GEE systems, allowing them to use context information to generate meaningful explanations and high-quality corrections. Various generative pre-trained large language models (LLMs), including GPT-4 Turbo, GPT-3.5 Turbo, Text-davinci-003, Text-babbage-001, Text-curie-001, Text-ada-001, Llama-2-7b, Llama-2-13b, and Llama-2-70b, are assessed against human experts for performance comparison. Our research underscores the limitations in the automatic deployment of current state-of-the-art generative pre-trained LLMs for Bengali GEE. Advocating for human intervention, our findings propose incorporating manual checks to address grammatical errors and improve feedback quality. This approach presents a more suitable strategy to refine the GEC tools in Bengali, emphasizing the educational aspect of language learning.
Submitted 27 May, 2024;
originally announced June 2024.
-
Learning the Distribution Map in Reverse Causal Performative Prediction
Authors:
Daniele Bracale,
Subha Maity,
Moulinath Banerjee,
Yuekai Sun
Abstract:
In numerous predictive scenarios, the predictive model affects the sampling distribution; for example, job applicants often meticulously craft their resumes to navigate through screening systems. Such shifts in distribution are particularly prevalent in the realm of social computing, yet the strategies to learn these shifts from data remain remarkably limited. Inspired by a microeconomic model that adeptly characterizes agents' behavior within labor markets, we introduce a novel approach to learn the distribution shift. Our method is predicated on a reverse causal model, wherein the predictive model instigates a distribution shift exclusively through a finite set of agents' actions. Within this framework, we employ a microfoundation model for the agents' actions and develop a statistically justified methodology to learn the distribution shift map, which we demonstrate to be effective in minimizing the performative prediction risk.
Submitted 10 April, 2025; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Exploring the Capabilities of Prompted Large Language Models in Educational and Assessment Applications
Authors:
Subhankar Maity,
Aniket Deroy,
Sudeshna Sarkar
Abstract:
In the era of generative artificial intelligence (AI), the fusion of large language models (LLMs) offers unprecedented opportunities for innovation in the field of modern education. We embark on an exploration of prompted LLMs within the context of educational and assessment applications to uncover their potential. Through a series of carefully crafted research questions, we investigate the effectiveness of prompt-based techniques in generating open-ended questions from school-level textbooks, assess their efficiency in generating open-ended questions from undergraduate-level technical textbooks, and explore the feasibility of employing a chain-of-thought inspired multi-stage prompting approach for language-agnostic multiple-choice question (MCQ) generation. Additionally, we evaluate the ability of prompted LLMs for language learning, exemplified through a case study in the low-resource Indian language Bengali, to explain Bengali grammatical errors. We also evaluate the potential of prompted LLMs to assess human resource (HR) spoken interview transcripts. By juxtaposing the capabilities of LLMs with those of human experts across various educational tasks and domains, our aim is to shed light on the potential and limitations of LLMs in reshaping educational practices.
Submitted 19 May, 2024;
originally announced May 2024.
-
Parameterized Algorithms for Editing to Uniform Cluster Graph
Authors:
Ajinkya Gaikwad,
Hitendra Kumar,
Soumen Maity
Abstract:
We study the parameterized complexity of transforming graphs into Uniform Cluster graphs, where each component is an equal-sized clique. We consider Uniform Cluster Vertex Deletion (UCVD), Uniform Cluster Edge Deletion (UCED), Uniform Cluster Edge Addition (UCEA), Uniform Cluster Edge Editing (UCEE), Uniform Cluster Exclusive Vertex Splitting (UCEVS), and Uniform Cluster Inclusive Vertex Splitting (UCIVS). For UCVD, we provide a vertex kernel of size $\mathcal{O}(k^{3})$ and an FPT algorithm with running time $2^{k} \cdot n^{\mathcal{O}(1)}$, improving the known $3^{k} \cdot n^{\mathcal{O}(1)}$ algorithm. For edge-based variants, we obtain a $\mathcal{O}(k^{2})$ vertex kernel for UCEE and linear vertex kernels for UCED and UCEA, improving the best-known results. Additionally, we present a $1.47^{k} \cdot n^{\mathcal{O}(1)}$ algorithm for UCED, improving upon the previous $2^{k} \cdot n^{\mathcal{O}(1)}$ bound. We develop a sub-exponential algorithm for UCED on everywhere dense graphs by reducing it to $d$-Way Cut. Lastly, we study vertex splitting operations and provide vertex kernels of size $4k$ for both UCIVS and UCEVS.
Submitted 5 February, 2025; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Aligners: Decoupling LLMs and Alignment
Authors:
Lilian Ngweta,
Mayank Agarwal,
Subha Maity,
Alex Gittens,
Yuekai Sun,
Mikhail Yurochkin
Abstract:
Large Language Models (LLMs) need to be aligned with human expectations to ensure their safety and utility in most applications. Alignment is challenging, costly, and needs to be repeated for every LLM and alignment criterion. We propose to decouple LLMs and alignment by training aligner models that can be used to align any LLM for a given criterion on an as-needed basis, thus also reducing the potential negative impacts of alignment on performance. Our recipe for training the aligner models solely relies on synthetic data generated with a (prompted) LLM and can be easily adjusted for a variety of alignment criteria. We use the same synthetic data to train inspectors, binary misalignment classification models to guide a "squad" of multiple aligners. Our empirical results demonstrate consistent improvements when applying aligner squad to various LLMs, including chat-aligned models, across several instruction-following and red-teaming datasets.
Submitted 4 October, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
-
A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation using GPT
Authors:
Subhankar Maity,
Aniket Deroy,
Sudeshna Sarkar
Abstract:
We introduce a multi-stage prompting approach (MSP) for the generation of multiple choice questions (MCQs), harnessing the capabilities of GPT models such as text-davinci-003 and GPT-4, renowned for their excellence across various NLP tasks. Our approach incorporates the innovative concept of chain-of-thought prompting, a progressive technique in which the GPT model is provided with a series of interconnected cues to guide the MCQ generation process. Automated evaluations consistently demonstrate the superiority of our proposed MSP method over the traditional single-stage prompting (SSP) baseline, resulting in the production of high-quality distractors. Furthermore, the one-shot MSP technique enhances automatic evaluation results, contributing to improved distractor generation in multiple languages, including English, German, Bengali, and Hindi. In human evaluations, questions generated using our approach exhibit superior levels of grammaticality, answerability, and difficulty, highlighting its efficacy in various languages.
Submitted 13 January, 2024;
originally announced January 2024.
-
Jointly Optimal RIS Placement and Power Allocation for Underlay D2D Communications: An Outage Probability Minimization Approach
Authors:
Sarbani Ghose,
Deepak Mishra,
Santi P. Maity,
George C. Alexandropoulos
Abstract:
In this paper, we study underlay device-to-device (D2D) communication systems empowered by a reconfigurable intelligent surface (RIS) for cognitive cellular networks. Considering Rayleigh fading channels and the general case where there exist both the direct and RIS-enabled D2D channels, the outage probability (OP) of the D2D communication link is presented in closed-form. Next, for the considered RIS-empowered underlaid D2D system, we frame an OP minimization problem. We target the joint optimization of the transmit power at the D2D source and the RIS placement, under constraints on the transmit power at the D2D source and on the limited interference imposed on the cellular user for two RIS deployment topologies. Due to the coupled optimization variables, the formulated optimization problem is extremely intractable. We propose an equivalent transformation which we are able to solve analytically. In the transformed problem, an expression for the average value of the signal-to-interference-noise ratio (SINR) at the D2D receiver is derived in closed-form. Our theoretical derivations are corroborated through simulation results, and various system design insights are deduced. It is indicatively showcased that the proposed RIS-empowered underlaid D2D system design outperforms the benchmark semi-adaptive optimal power and optimal distance schemes, offering $44\%$ and $20\%$ performance improvement, respectively.
Submitted 7 January, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Multi-Label Classification of COVID-Tweets Using Large Language Models
Authors:
Aniket Deroy,
Subhankar Maity
Abstract:
Vaccination is important to minimize the risk and spread of various diseases. In recent years, vaccination has been a key step in countering the COVID-19 pandemic. However, many people are skeptical about the use of vaccines for various reasons, including the politics involved, the potential side effects of vaccines, etc. The goal in this task is to build an effective multi-label classifier to label a social media post (particularly, a tweet) according to the specific concern(s) towards vaccines as expressed by the author of the post. We tried three different models: (a) Supervised BERT-large-uncased, (b) Supervised HateXplain model, and (c) Zero-Shot GPT-3.5 Turbo model. The Supervised BERT-large-uncased model performed best in our case. We achieved a macro-F1 score of 0.66 and a Jaccard similarity score of 0.66, and ranked sixth among the submissions. Code is available at: https://github.com/anonmous1981/AISOME
Submitted 17 December, 2023;
originally announced December 2023.
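The two metrics reported in this abstract can be illustrated with a minimal sketch. The label names and the toy predictions below are hypothetical, and the abstract does not state which Jaccard variant was used; the sketch assumes the per-post (sample-averaged) form.

```python
from typing import List, Set


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]) -> float:
    """Average the per-label F1 scores so that every label has equal weight."""
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
        denom = 2 * tp + fp + fn
        # A label that never occurs and is never predicted scores 0 here (a convention).
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mean_jaccard(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Average |T & P| / |T | P| over posts; two empty label sets count as a match."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        union = t | p
        total += len(t & p) / len(union) if union else 1.0
    return total / len(y_true)


# Hypothetical concern labels and toy annotations for three tweets.
labels = ["side-effect", "political", "ineffective"]
y_true = [{"side-effect"}, {"political", "ineffective"}, {"ineffective"}]
y_pred = [{"side-effect"}, {"political"}, {"side-effect"}]

print(round(macro_f1(y_true, y_pred, labels), 4))
print(round(mean_jaccard(y_true, y_pred), 4))
```

Macro-averaging weights rare concerns as heavily as frequent ones, which is why it is a common choice when label frequencies are imbalanced.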
-
Weak Supervision Performance Evaluation via Partial Identification
Authors:
Felipe Maia Polo,
Subha Maity,
Mikhail Yurochkin,
Moulinath Banerjee,
Yuekai Sun
Abstract:
Programmatic Weak Supervision (PWS) enables supervised model training without direct access to ground truth labels, utilizing weak labels from heuristics, crowdsourcing, or pre-trained models. However, the absence of ground truth complicates model evaluation, as traditional metrics such as accuracy, precision, and recall cannot be directly calculated. In this work, we present a novel method to address this challenge by framing model evaluation as a partial identification problem and estimating performance bounds using Fréchet bounds. Our approach derives reliable bounds on key metrics without requiring labeled data, overcoming core limitations in current weak supervision evaluation techniques. Through scalable convex optimization, we obtain accurate and computationally efficient bounds for metrics including accuracy, precision, recall, and F1-score, even in high-dimensional settings. This framework offers a robust approach to assessing model quality without ground truth labels, enhancing the practicality of weakly supervised learning for real-world applications.
Submitted 31 October, 2024; v1 submitted 7 December, 2023;
originally announced December 2023.
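As a rough illustration of the partial-identification idea, the classical two-event Fréchet inequalities already bound the accuracy of a binary classifier when only the marginals are known. This is a simplified sketch, not the paper's method: it assumes the class prior `p_y1` and the prediction rate `p_pred1` are given, and it ignores the weak labels and the convex optimization described in the abstract.

```python
from typing import Tuple


def frechet_bounds(p_a: float, p_b: float) -> Tuple[float, float]:
    """Bounds on P(A and B) given only the marginals P(A) and P(B)."""
    lower = max(0.0, p_a + p_b - 1.0)
    upper = min(p_a, p_b)
    return lower, upper


def accuracy_bounds(p_y1: float, p_pred1: float) -> Tuple[float, float]:
    """Bounds on binary accuracy when the joint of (label, prediction) is unknown.

    accuracy = P(Y=1, pred=1) + P(Y=0, pred=0)
             = 1 - P(Y=1) - P(pred=1) + 2 * P(Y=1, pred=1),
    which is increasing in P(Y=1, pred=1), so its Frechet bounds carry over.
    """
    low_joint, high_joint = frechet_bounds(p_y1, p_pred1)
    base = 1.0 - p_y1 - p_pred1
    return base + 2.0 * low_joint, base + 2.0 * high_joint


# Hypothetical marginals: 30% positives, classifier predicts positive 40% of the time.
low, high = accuracy_bounds(0.3, 0.4)
print(round(low, 3), round(high, 3))
```

The interval is wide because the marginals alone say little about the joint distribution; additional information such as weak labels can only narrow it.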
-
Prompted Zero-Shot Multi-label Classification of Factual Incorrectness in Machine-Generated Summaries
Authors:
Aniket Deroy,
Subhankar Maity,
Saptarshi Ghosh
Abstract:
This study addresses the critical issue of factual inaccuracies in machine-generated text summaries, an increasingly prevalent issue in information dissemination. Recognizing the potential of such errors to compromise information reliability, we investigate the nature of factual inconsistencies across machine-summarized content. We introduce a prompt-based classification system that categorizes errors into four distinct types: misrepresentation, inaccurate quantities or measurements, false attribution, and fabrication. The participants are tasked with evaluating a corpus of machine-generated summaries against their original articles. Our methodology employs qualitative judgements to identify the occurrence of factual distortions. The results show that our prompt-based approaches are able to detect the type of errors in the summaries to some extent, although there is scope for improvement in our classification systems.
Submitted 2 December, 2023;
originally announced December 2023.
-
Harnessing the Power of Prompt-based Techniques for Generating School-Level Questions using Large Language Models
Authors:
Subhankar Maity,
Aniket Deroy,
Sudeshna Sarkar
Abstract:
Designing high-quality educational questions is a challenging and time-consuming task. In this work, we propose a novel approach that utilizes prompt-based techniques to generate descriptive and reasoning-based questions. However, current question-answering (QA) datasets are inadequate for conducting our experiments on prompt-based question generation (QG) in an educational setting. Therefore, we curate a new QG dataset called EduProbe for school-level subjects, by leveraging the rich content of NCERT textbooks. We carefully annotate this dataset as quadruples of 1) Context: a segment upon which the question is formed; 2) Long Prompt: a long textual cue for the question (i.e., a longer sequence of words or phrases, covering the main theme of the context); 3) Short Prompt: a short textual cue for the question (i.e., a condensed representation of the key information or focus of the context); 4) Question: a deep question that aligns with the context and is coherent with the prompts. We investigate several prompt-based QG methods by fine-tuning pre-trained transformer-based large language models (LLMs), namely PEGASUS, T5, MBART, and BART. Moreover, we explore the performance of two general-purpose pre-trained LLMs, namely Text-Davinci-003 and GPT-3.5-Turbo, without any further training. By performing automatic evaluation, we show that T5 (with long prompt) outperforms all other models, but still falls short of the human baseline. Under human evaluation criteria, Text-Davinci-003 usually shows better results than other models under various prompt settings. Even under human evaluation criteria, QG models mostly fall short of the human baseline. Our code and dataset are available at: https://github.com/my625/PromptQG
Submitted 2 December, 2023;
originally announced December 2023.
-
Questioning Biases in Case Judgment Summaries: Legal Datasets or Large Language Models?
Authors:
Aniket Deroy,
Subhankar Maity
Abstract:
The evolution of legal datasets and the advent of large language models (LLMs) have significantly transformed the legal field, particularly in the generation of case judgment summaries. However, a critical concern arises regarding the potential biases embedded within these summaries. This study scrutinizes the biases present in case judgment summaries produced by legal datasets and large language models. The research aims to analyze the impact of biases on legal decision making. By interrogating the accuracy, fairness, and implications of biases in these summaries, this study contributes to a better understanding of the role of technology in legal contexts and the implications for justice systems worldwide. In this study, we investigate biases with respect to gender-related keywords, race-related keywords, keywords related to crime against women, country names, and religious keywords. The study shows evidence of biases in the outputs generated by the large language models and pre-trained abstractive summarization models. The reasoning behind these biases needs further study.
Submitted 1 December, 2023;
originally announced December 2023.
-
An Investigation of Representation and Allocation Harms in Contrastive Learning
Authors:
Subha Maity,
Mayank Agarwal,
Mikhail Yurochkin,
Yuekai Sun
Abstract:
The effect of underrepresentation on the performance of minority groups is known to be a serious problem in supervised learning settings; however, it has been underexplored so far in the context of self-supervised learning (SSL). In this paper, we demonstrate that contrastive learning (CL), a popular variant of SSL, tends to collapse representations of minority groups with certain majority groups. We refer to this phenomenon as representation harm and demonstrate it on image and text datasets using the corresponding popular CL methods. Furthermore, our causal mediation analysis of allocation harm on a downstream classification task reveals that representation harm is partly responsible for it, thus emphasizing the importance of studying and mitigating representation harm. Finally, we provide a theoretical explanation for representation harm using a stochastic block model that leads to a representational neural collapse in a contrastive learning setting.
Submitted 2 October, 2023;
originally announced October 2023.
-
Detection and Classification of Novel Attacks and Anomaly in IoT Network using Rule based Deep Learning Model
Authors:
Sanjay Chakraborty,
Saroj Kumar Pandey,
Saikat Maity,
Lopamudra Dey
Abstract:
Attackers are now using sophisticated techniques, like polymorphism, to change the attack pattern for each new attack. Thus, the detection of novel attacks has become the biggest challenge for cyber experts and researchers. Recently, anomaly and hybrid approaches have been used for the detection of network attacks. Detecting novel attacks, on the other hand, is a key enabler for a wide range of IoT applications. Novel attacks can easily evade existing signature-based detection methods and are extremely difficult to detect, even going undetected for years. Existing machine learning models have also failed to detect such attacks and have a high rate of false positives. In this paper, a rule-based deep neural network technique is proposed as a framework for addressing the problem of detecting novel attacks. The designed framework significantly improves on benchmark results, including those on the CICIDS 2017 dataset. The experimental results show that the proposed model keeps a good balance between attack detection, false positive rates, and false negative rates. For novel attacks, the model has an accuracy of more than 99%. During the automatic interaction between network devices (IoT), security and privacy are the primary obstacles. Our proposed method can handle these obstacles efficiently and finally identify and classify the different levels of threats.
Submitted 29 July, 2023;
originally announced August 2023.
-
Image Hash Minimization for Tamper Detection
Authors:
Subhajit Maity,
Ram Kumar Karsh
Abstract:
Tamper detection using image hashes is a common problem today, and considerable research has already been done to address it. However, most of the existing methods lack accuracy when the tampered area is small, and they also require long image hashes. In this paper, we propose a novel method that aims to minimize the hash length while enhancing performance for small tampered areas.
Submitted 28 May, 2023;
originally announced May 2023.
-
Multi-Fidelity Machine Learning for Excited State Energies of Molecules
Authors:
Vivin Vinod,
Sayan Maity,
Peter Zaspel,
Ulrich Kleinekathöfer
Abstract:
The accurate but fast calculation of molecular excited states is still a very challenging topic. For many applications, detailed knowledge of the energy funnel in larger molecular aggregates is of key importance, requiring highly accurate excited state energies. To this end, machine learning techniques can be an extremely useful tool, though the cost of generating highly accurate training datasets remains a severe challenge. To overcome this hurdle, this work proposes the use of multi-fidelity machine learning, where a small amount of high-accuracy training data is combined with cheaper and less accurate data to achieve the accuracy of the costlier level. In the present study, the approach is employed to predict the first excited state energies for three molecules of increasing size, namely, benzene, naphthalene, and anthracene. The energies are trained and tested for conformations stemming from classical molecular dynamics simulations and from real-time density functional tight-binding calculations. It can be shown that the multi-fidelity machine learning model can achieve the same accuracy as a machine learning model built only on high-cost training data while requiring a much lower computational effort to generate the data. The numerical gain observed in these benchmark test calculations was over a factor of 30 but certainly can be much higher for high-accuracy data.
Submitted 18 May, 2023;
originally announced May 2023.
-
SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation
Authors:
Subhajit Maity,
Sanket Biswas,
Siladittya Manna,
Ayan Banerjee,
Josep Lladós,
Saumik Bhattacharya,
Umapada Pal
Abstract:
Document layout analysis is a well-known problem in the document research community and has been extensively explored, yielding a multitude of solutions ranging from text mining and recognition to graph-based representation, visual feature extraction, etc. However, most existing works have ignored the scarcity of labeled data. With growing internet connectivity in personal life, an enormous number of documents has become available in the public domain, making data annotation a tedious task. We address this challenge using self-supervision and, unlike the few existing self-supervised document segmentation approaches which use text mining and textual labels, we use a completely vision-based approach in pre-training without any ground-truth label or its derivative. Instead, we generate pseudo-layouts from the document images to pre-train an image encoder to learn the document object representation and localization in a self-supervised framework before fine-tuning it with an object detection model. We show that our pipeline sets a new benchmark in this context and performs on par with, if not better than, the existing methods and the supervised counterparts. The code is made publicly available at: https://github.com/MaitySubhajit/SelfDocSeg
Submitted 20 August, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Bayes classifier cannot be learned from noisy responses with unknown noise rates
Authors:
Soham Bakshi,
Subha Maity
Abstract:
Training a classifier with noisy labels typically requires the learner to specify the distribution of label noise, which is often unknown in practice. Although there have been some recent attempts to relax that requirement, we show that the Bayes decision rule is unidentified in most classification problems with noisy labels. This suggests it is generally not possible to bypass or relax the requirement. In the special cases in which the Bayes decision rule is identified, we develop a simple algorithm to learn the Bayes decision rule that does not require knowledge of the noise distribution.
Submitted 13 April, 2023;
originally announced April 2023.
-
Interdisciplinary Papers Supported by Disciplinary Grants Garner Deep and Broad Scientific Impact
Authors:
Minsu Park,
Suman Kalyan Maity,
Stefan Wuchty,
Dashun Wang
Abstract:
Interdisciplinary research has emerged as a hotbed for innovation and a key approach to addressing complex societal challenges. The increasing dominance of grant-supported research in shaping scientific advances, coupled with growing interest in funding interdisciplinary work, raises fundamental questions about the effectiveness of interdisciplinary grants in fostering high-impact interdisciplinary research outcomes. Here, we quantify the interdisciplinarity of both research grants and publications, capturing 350,000 grants from 164 funding agencies across 26 countries and 1.3 million papers that acknowledged their support from 1985 to 2009. Our analysis uncovers two seemingly contradictory patterns: Interdisciplinary grants tend to produce interdisciplinary papers, which are generally associated with high impact. However, compared to disciplinary grants, interdisciplinary grants on average yield fewer papers and interdisciplinary papers they support tend to have substantially reduced impact. We demonstrate that the key to explaining this paradox lies in the power of disciplinary grants in propelling high-impact interdisciplinary research. Specifically, our results show that highly interdisciplinary papers supported by deeply disciplinary grants garner disproportionately more citations, both within their core disciplines and from broader fields. Moreover, disciplinary grants, particularly when combined with other similar grants, are more effective in producing high-impact interdisciplinary research. Amidst the rapid rise of support for interdisciplinary work across the sciences, these results highlight the hitherto unknown role of disciplinary grants in driving crucial interdisciplinary advances, suggesting that interdisciplinary research requires deep disciplinary expertise and investments.
Submitted 14 March, 2025; v1 submitted 26 March, 2023;
originally announced March 2023.
-
An Improved Exact Algorithm for Knot-Free Vertex Deletion
Authors:
Ajaykrishnan E S,
Soumen Maity,
Abhishek Sahu,
Saket Saurabh
Abstract:
A knot $K$ in a directed graph $D$ is a strongly connected component of size at least two such that there is no arc $(u,v)$ with $u \in V(K)$ and $v\notin V(K)$. Given a directed graph $D=(V,E)$, we study Knot-Free Vertex Deletion (KFVD), where the goal is to remove the minimum number of vertices such that the resulting graph contains no knots. This problem naturally emerges from its application in deadlock resolution since knots are deadlocks in the OR-model of distributed computation. The fastest known exact algorithm in the literature for KFVD runs in time $\mathcal{O}^\star(1.576^n)$. In this paper, we present an improved exact algorithm running in time $\mathcal{O}^\star(1.4549^n)$, where $n$ is the number of vertices in $D$. We also prove that the number of inclusion-wise minimal knot-free vertex deletion sets is $\mathcal{O}^\star(1.4549^n)$ and construct a family of graphs with $\Omega(1.4422^n)$ minimal knot-free vertex deletion sets.
Submitted 20 March, 2023;
originally announced March 2023.
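The knot definition in this abstract can be checked directly on a small graph. The sketch below only detects knots (it is not the paper's exact algorithm for KFVD): it computes strongly connected components with Kosaraju's two-pass method and keeps those with at least two vertices and no outgoing arc. The adjacency-dict representation and the function name `find_knots` are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Set


def find_knots(graph: Dict[Hashable, Set[Hashable]]) -> List[Set[Hashable]]:
    """Return every knot: an SCC with >= 2 vertices and no arc leaving it."""
    vertices = set(graph)
    for targets in graph.values():
        vertices |= set(targets)

    # Pass 1: iterative DFS on the graph, recording vertices by finish time.
    order, visited = [], set()
    for start in vertices:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph.get(start, ())))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:  # resumes where this iterator left off
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph in decreasing finish time.
    reverse = {v: set() for v in vertices}
    for u, targets in graph.items():
        for v in targets:
            reverse[v].add(u)
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        members, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    members.add(prev)
                    stack.append(prev)
        components.append(members)

    # A knot is a non-trivial SCC whose arcs all stay inside the component.
    return [
        members
        for members in components
        if len(members) >= 2
        and all(v in members for u in members for v in graph.get(u, ()))
    ]


# {1, 2} is a knot; {3, 4} is strongly connected but the arc 3 -> 1 leaves it.
example = {1: {2}, 2: {1}, 3: {1, 4}, 4: {3}}
print(find_knots(example))
```

In the deadlock reading from the abstract, the processes in `{1, 2}` wait only on each other, whereas those in `{3, 4}` still have an arc to a vertex outside their component.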
-
Simple Disentanglement of Style and Content in Visual Representations
Authors:
Lilian Ngweta,
Subha Maity,
Alex Gittens,
Yuekai Sun,
Mikhail Yurochkin
Abstract:
Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem. Existing methods demonstrate some success but are hard to apply to large-scale vision datasets like ImageNet. In this work, we propose a simple post-processing framework to disentangle content and style in learned representations from pre-trained vision models. We model the pre-trained features probabilistically as linearly entangled combinations of the latent content and style factors and develop a simple disentanglement algorithm based on the probabilistic model. We show that the method provably disentangles content and style features and verify its efficacy empirically. Our post-processed features yield significant domain generalization performance improvements when the distribution shift occurs due to style changes or style-related spurious correlations.
Submitted 31 May, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
A Machine Learning system to monitor student progress in educational institutes
Authors:
Bibhuprasad Mahakud,
Bibhuti Parida,
Ipsit Panda,
Souvik Maity,
Arpita Sahoo,
Reeta Sharma
Abstract:
In order to track and comprehend the academic achievement of students, both private and public educational institutions devote a significant amount of resources and labour. One of the difficult issues that institutes deal with on a regular basis is understanding the exam shortcomings of students. The performance of a student is influenced by a variety of factors, including attendance, attentiveness in class, understanding of concepts taught, the teacher's ability to deliver the material effectively, timely completion of home assignments, and the concern of parents and teachers for guiding the student through the learning process. We propose a data-driven approach that makes use of Machine Learning techniques to generate a classifier, called credit score, that helps to comprehend the learning journeys of students and identify activities that lead to subpar performance. This would make it easier for educators and institute management to create guidelines for system development to increase productivity. The proposal to use credit score as a progress indicator is well suited to a Learning Management System. In this article, we demonstrate a proof of concept under simplified assumptions using simulated data.
Submitted 2 November, 2022;
originally announced November 2022.
-
RMExplorer: A Visual Analytics Approach to Explore the Performance and the Fairness of Disease Risk Models on Population Subgroups
Authors:
Bum Chul Kwon,
Uri Kartoun,
Shaan Khurshid,
Mikhail Yurochkin,
Subha Maity,
Deanna G Brockman,
Amit V Khera,
Patrick T Ellinor,
Steven A Lubitz,
Kenney Ng
Abstract:
Disease risk models can identify high-risk patients and help clinicians provide more personalized care. However, risk models developed on one dataset may not generalize across diverse subpopulations of patients in different datasets and may have unexpected performance. It is challenging for clinical researchers to inspect risk models across different subgroups without any tools. Therefore, we developed an interactive visualization system called RMExplorer (Risk Model Explorer) to enable interactive risk model assessment. Specifically, the system allows users to define subgroups of patients by selecting clinical, demographic, or other characteristics, to explore the performance and fairness of risk models on the subgroups, and to understand the feature contributions to risk scores. To demonstrate the usefulness of the tool, we conduct a case study, where we use RMExplorer to explore three atrial fibrillation risk models by applying them to the UK Biobank dataset of 445,329 individuals. RMExplorer can help researchers to evaluate the performance and biases of risk models on subpopulations of interest in their data.
Submitted 13 September, 2022;
originally announced September 2022.
-
Fragile object transportation by a multi-robot system in an unknown environment using a semi-decentralized control approach
Authors:
Dibyendu Roy,
Sreejeet Maity,
Madhubanti Maitra,
Samar Bhattacharya
Abstract:
In this paper, we introduce a semi-decentralized control technique for a swarm of robots transporting a fragile object to a destination in an uncertain occluded environment. The proposed approach is split into two parts. The initial part (Phase 1) includes a centralized control strategy for creating a specific formation among the agents so that the object to be transported can be positioned properly on top of the system. We present a novel triangle packing scheme fused with a circular region-based shape control method for creating a rigid configuration among the robots. In the later part (Phase 2), the swarm system is required to convey the object to the destination in a decentralized way, employing the region-based shape control approach. The simulation results, as well as the comparison study, demonstrate the effectiveness of our proposed scheme.
Submitted 12 September, 2022;
originally announced September 2022.